Re: [v6ops] Opsdir last call review of draft-ietf-v6ops-slaac-renum-03
Fernando Gont <fgont@si6networks.com> Thu, 10 September 2020 09:15 UTC
To: Jürgen Schönwälder <j.schoenwaelder@jacobs-university.de>, ops-dir@ietf.org
Cc: draft-ietf-v6ops-slaac-renum.all@ietf.org, last-call@ietf.org, v6ops@ietf.org
References: <159968910157.15345.3077847299653382902@ietfa.amsl.com>
From: Fernando Gont <fgont@si6networks.com>
Message-ID: <03acb49d-9c05-521a-9bf8-40da16c5f7a7@si6networks.com>
Date: Thu, 10 Sep 2020 06:00:02 -0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.1
MIME-Version: 1.0
In-Reply-To: <159968910157.15345.3077847299653382902@ietfa.amsl.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/8b6CMUEJHYQVHGyjOxnEwnZwnAY>
Subject: Re: [v6ops] Opsdir last call review of draft-ietf-v6ops-slaac-renum-03
Hi, Jürgen,

Thanks a lot for your comments! In-line....

On 9/9/20 19:05, Jürgen Schönwälder via Datatracker wrote:
[....]
>
> Perhaps indicate a bit earlier what unacceptably long means, i.e. we
> are talking about days and weeks.

This is a bit subjective. If I'm sitting at my computer doing e.g.
video-conferencing (i.e., anything interactive), probably anything over
a few minutes would be unacceptable.

In a more general case, what's acceptable is a function of how often the
problem happens and whether there's any ongoing interactive usage -- and
that's still subjective.

> The scenarios described read a bit
> like somewhat rare events and hence it is useful for the reader to
> have an idea what unacceptably long means in such events.

I'm wondering if adding something like:

"  Any definition of what is considered 'acceptable' here would be
   subjective, and would probably also depend on how often these
   flash-renumbering events occur, whether the affected hosts are
   employing any interactive applications, and other parameters.
   However, one rough estimate would be that hosts should be able to
   deal with flash-renumbering events with a similar timeliness with
   which they can deal with failing default routers."

would help?

> (BTW, I find
> the scenario not described at the beginning where a router announces
> SLAAC lifetimes that are not synchronized with obtained prefix
> lifetimes operationally the more tricky problem since this can lead to
> regular failures.)

Fair enough. How about adding this to the bulleted-list:

"  o  A router (e.g. Customer Edge router) may advertise
      autoconfiguration prefixes corresponding to prefixes learned via
      DHCPv6-PD with constant PIO lifetimes that are not synchronized
      with the DHCPv6-PD lease time (as required in Section 6.3 of
      [RFC8415]).  While this behavior violates the aforementioned
      requirement from [RFC8415], it is not an unusual behavior,
      particularly when e.g.
      DHCPv6-PD is implemented in a different software module than the
      SLAAC router component."

?

> Section 2.2 seems to confuse soft-state (this is what a learned IPv6
> prefix is for me) with certain protocol timers. There are many places
> where protocols use soft-state and implementations use timers to purge
> or refresh soft-state. That timers generally do not go off in normal
> conditions is not really correct in this context, DHCP leases are
> renewed when their lifetime expires, a normal operation.

Normally, you renew the lease before the lease expires.

> IP address
> mappings to Ethernet addresses expire when their lifetime timer goes
> off.

This one is not necessarily the best example ;-) (while RFC1122
requires that, IIRC in many implementations the entry is refreshed when
referenced, and it only expires when not referenced/refreshed frequently
enough).

But I do see where you are going and I realize that the text is a bit
sloppy in this respect. How about tweaking the text as follows:

---- cut here ----
   Many protocols, from different layers, normally employ timers for
   fault isolation/recovery.  The general logic is as follows:

   o  A timer is set with a value such that, under normal conditions,
      the timer does *not* go off.

   o  Whenever a fault condition arises, the timer goes off, and the
      protocol can perform fault recovery.

   For example, when implementing reliability mechanisms, a timer is
   normally set when a packet is transmitted and, unless a response is
   received before the timer goes off, a fault recovery action (such as
   packet re-transmission) is triggered.
---- cut here ----

?

One might also look at this same issue as the timer implying a sensible
period of time within which information should be refreshed, as you
correctly point out, though.
(I guess the only difference is that when looking at this from the
soft-state angle, you're mostly considering the case where information
changes, whereas when looking at this from the fault-recovery pov,
you're mostly thinking about failures, rather than updates).

> Switches purge forwarding state regularly when forwarding entries
> expire. Cached DNS name to IP resolutions expire. The only problem
> here seems to be that a lifetime of 7 days / 30 days is a bit
> ridiculous.

Agreed.

> Is anyone shipping the RFC 4861 defaults?

Yes, unfortunately. Some implementations override the RFC4861 defaults.
Still, RFC4861 defaults are extremely common and widespread.

> The few
> implementations I have seen do use a bit more reasonable defaults. I
> think this section should be rewritten to replace the "timer going off
> is associated with a failure" text with a discussion of soft-state in
> other protocols. (Section 2.2 is why I ticked 'has issues'.)

As a second alternative to what I've suggested above:

---- cut here ----
   Many protocols, from different layers, normally employ timers for a
   variety of purposes, such as in fault isolation/recovery mechanisms,
   and in the maintenance of data structures that contain bindings of
   some sort (e.g., the IPv6 Neighbor Cache [RFC4861]).

   In the case of fault recovery/isolation, the general logic is as
   follows:

   o  A timer is set with a value such that, under normal conditions,
      the timer does *not* go off.

   o  Whenever a fault condition arises, the timer goes off, and the
      protocol can perform fault recovery.

   For example, when implementing reliability mechanisms, a timer is
   normally set when a packet is transmitted and, unless a response is
   received before the timer goes off, a fault recovery action (such as
   packet re-transmission) is triggered.

   On the other hand, when maintaining bindings in data structures,
   timers are usually selected in a way that any bindings that become
   stale are updated in a timely manner.
---- cut here ----

?
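To make the distinction between the two timer uses in the proposed text concrete, here is a minimal Python sketch. It is only an illustration: the class names `RetransmitTimer` and `BindingCache`, their methods, and the timeout values are hypothetical and do not come from the draft or from any implementation. Time is passed in explicitly as `now` so the logic can be followed without a real clock.

```python
from typing import Dict, List, Optional


class RetransmitTimer:
    """Fault-recovery use: under normal conditions the timer never fires."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.deadline: Optional[float] = None  # None means "not armed"

    def packet_sent(self, now: float) -> None:
        # Arm the timer when a packet is transmitted.
        self.deadline = now + self.timeout

    def response_received(self) -> None:
        # A timely response disarms the timer, so it does not go off.
        self.deadline = None

    def expired(self, now: float) -> bool:
        # True only if no response arrived before the deadline,
        # i.e. a fault that should trigger recovery (e.g. retransmission).
        return self.deadline is not None and now >= self.deadline


class BindingCache:
    """Soft-state use: bindings are purged unless refreshed in time."""

    def __init__(self, lifetime: float) -> None:
        self.lifetime = lifetime
        self._expiry: Dict[str, float] = {}

    def refresh(self, key: str, now: float) -> None:
        # Creating or refreshing a binding restarts its lifetime.
        self._expiry[key] = now + self.lifetime

    def purge(self, now: float) -> List[str]:
        # Expiry here is routine maintenance, not a fault signal.
        stale = [k for k, t in self._expiry.items() if now >= t]
        for k in stale:
            del self._expiry[k]
        return stale

    def __contains__(self, key: str) -> bool:
        return key in self._expiry
```

In this sketch, the `RetransmitTimer` expiring signals a problem, whereas `BindingCache.purge()` returning entries is expected behavior. The chosen lifetime bounds how long a stale binding can linger, which is why very long lifetimes delay recovery from stale information.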
> Isn't a part of the solution (other than moving to less ridiculous
> default) that SLAAC hosts experiencing connectivity problems should
> try to validate the prefix that they have learned (and if the
> validation fails move to a newly learned prefix)?

Yes, indeed. That's what we are pursuing in draft-ietf-6man-slaac-renum
(see Section 4 of this (draft-ietf-v6ops-slaac-renum-03) document).
draft-ietf-v6ops-slaac-renum-03 contains the problem statement and
*operational* mitigations only.

> Involving the hosts
> in a resolution of the problem may be more robust than expecting that
> something in the network takes care of invalidating stale soft-state.

I agree 100%. That is and has been, indeed, the motivation for pursuing
draft-ietf-6man-slaac-renum.

Thanks!

Regards,
-- 
Fernando Gont
SI6 Networks
e-mail: fgont@si6networks.com
PGP Fingerprint: 6666 31C6 D484 63B2 8FB1 E3C4 AE25 0D55 1D4E 7492