Re: [v6ops] A common problem with SLAAC in "renumbering" scenarios

Ole Troan <otroan@employees.org> Sun, 03 February 2019 15:26 UTC


>>> 
>>> Well, the problem is that you are making a contract on the LAN side for
>>> a contract you may not have on the WAN side. If the router reboots and
>>> the CPE no longer "owns" some prefix, then that contract is void.
>> 
>> You have the contract on the WAN side. What makes you think you don’t? E.g., via PD the CPE learns that a given prefix is valid until March 1, 2020.
>> A reboot doesn’t change that. 
>> 
>>> Ideally, the CPE will advertise that the contract is void. But it is
>>> clear that for most deployed CPEs, that will not happen.
>> 
>> So a bug. 
>> What you are talking about is the case where the ISP breaks the contract. While it previously promised to delegate you a prefix until 20200301, all trace of that has gone. 
> 
> The CPE is the middle-man between the ISP and the LAN. No matter what
> you may *expect* the CPE to do, the CPE is currently not actually
> required to e.g. clean up after the contract that the ISP broke (if you
> assume/think there's such a thing), or even adjust prefix timers
> according to the DHCPv6 lease times -- talk about under-specification of
> the glue between e.g. DHCPv6-PD and SLAAC.

RFC3633:
   Each prefix has an associated valid and preferred lifetime, which
   constitutes an agreement about the length of time over which the
   requesting router is allowed to use the prefix.  A requesting router
   can request an extension of the lifetimes on a delegated prefix and
   is required to terminate the use of a delegated prefix if the valid
   lifetime of the prefix expires.

This really isn’t hard.
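
As a rough illustration of what that agreement implies on the LAN side, a CPE could cap the lifetimes it advertises in a PIO at whatever remains of the delegated lifetimes. This is only a sketch: the function name and the timestamps are made up for illustration, and the default RA lifetimes are assumed to be the RFC 4861 router defaults (30 days valid, 7 days preferred).

```python
def pio_lifetimes(now, pd_preferred_until, pd_valid_until,
                  ra_preferred=604800, ra_valid=2592000):
    """Return (preferred, valid) lifetimes in seconds for a PIO.

    now, pd_preferred_until and pd_valid_until are absolute times in
    seconds. The *_until values are when the delegated prefix's
    lifetimes run out (hypothetical bookkeeping kept by the CPE).
    """
    # Never advertise beyond what the delegation still allows.
    remaining_valid = max(0, int(pd_valid_until - now))
    remaining_preferred = max(0, int(pd_preferred_until - now))

    valid = min(ra_valid, remaining_valid)
    # The preferred lifetime must not exceed the valid lifetime.
    preferred = min(ra_preferred, remaining_preferred, valid)
    return preferred, valid
```

With a delegation that has one hour of preferred and two hours of valid lifetime left, this yields (3600, 7200); once the delegation has expired it yields (0, 0).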

> Besides, the layer-8 contract between the user and the ISP may be that
> you get dynamic prefixes. This means that whenever you request a lease,
> you get a different prefix. You might say that if you don't do another
> DHCPv6-PD request, you should be able to use the same prefix. But if you
> do ask for a new prefix, you might indeed get a new one -- and this is what
> normally happens after reboots.

Not normally. That’s ISP allocation policy.
And not recommended.
But sure, a correctly behaving DHCP PD implementation will include the current prefixes in its request message.

> The CPE should -- if possible -- be faithful to its LAN hosts, and
> advertise if previous contracts between the CPE and the LAN hosts are
> void. i.e., if the CPE does not get leased the same prefix as before,
> it should notify its "clients". However, possibly for simplicity's sake,
> CPEs don't record what information was previously advertised on the
> LAN -- they are not required to, so.... when they reboot, they may not
> be in a position to notify hosts accordingly.
> 
> That's the environment hosts operate in -- no matter whether you or me
> like it.
> 
> In that environment, hosts can and should be smarter.

Sure. Hosts should always be smarter.
That doesn’t mean that an addressing policy which breaks connections is not broken.

>>>> What you seem to be talking about is either a bug, a misconfiguration, or both.
>>> 
>>> It's neither of those. If anything, it's the result of
>>> under-specification of the necessary glue between automatic
>>> configuration on the WAN side, and automatic configuration on the LAN side.
>>> 
>>> e.g., there were no requirements for CPEs to keep track of prefixes that
>>> they have been leased in the past -- if at all possible.
>> 
>> DHCP PD will give you the old prefix back. 
> 
> Not necessarily. In fact, it may intentionally not do that. If you no
> longer own the addresses, the sessions will have to be torn down.

That’s not how IPv6 addressing and renumbering are intended to work.
I take it your goal isn’t to change that, but to improve the situation where it unfortunately happens.
(Which of course might encourage ISPs to do just that.)

>>>> If you want something like session survivability,
>>>> that’s not a trivial problem to solve.
>>> 
>>> Not sure what you mean by "session survivability"
>> 
>> Try to keep a TCP session active while changing addresses. 
> 
> Of course that's not what we're trying to solve here.

No, but it is important to understand the implications of your work.

>>>> Currently the network will give an ICMP destination unreachable code 5 and deprecate the invalid prefix if it has information to do so.
>>> 
>>> Where in RFC4443 does ICMP unreach code 5 get to invalidate prefixes?
>>> 
>>> Answer: Nowhere. They don't get to do that. All ICMPv6 error messages
>>> are soft errors. And it would be a huge mistake (and huge
>>> vulnerability!) to behave otherwise.
>> 
>> It’s a strong hint to the host stack to pick a different source address. 
> 
> You said "deprecate the invalid prefix if it has information to do so."
> -- selecting a different address is a very different thing than
> deprecating an address. In fact, for connection-less protocols that
> might not even make sense -- since it implies resending stuff that you
> might not even be able to resend (send buffer is gone).

No, I didn’t say that.
I said the network will deprecate the prefix if it has information to do so. I.e. send a PIO with preferred lifetime = 0.
And that the network will respond with a code 5 destination unreachable, if an invalid source address is used. See BCP38.
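
To make "a PIO with preferred lifetime = 0" concrete, here is a minimal sketch of the Prefix Information option as laid out in RFC 4861 (type 3, length 4 in units of 8 octets). The helper name, the documentation prefix and the two-hour valid lifetime are assumptions chosen for illustration only.

```python
import ipaddress
import struct

def build_pio(prefix, valid_lifetime, preferred_lifetime,
              on_link=True, autonomous=True):
    """Encode a 32-byte Prefix Information option (RFC 4861 layout)."""
    net = ipaddress.IPv6Network(prefix)
    # L flag is the top bit, A flag the next one; remaining bits reserved.
    flags = (0x80 if on_link else 0) | (0x40 if autonomous else 0)
    return struct.pack(
        "!BBBBIII16s",
        3,                    # option type: Prefix Information
        4,                    # option length in units of 8 octets
        net.prefixlen,
        flags,
        valid_lifetime,
        preferred_lifetime,
        0,                    # Reserved2
        net.network_address.packed,
    )

# Deprecate the prefix: preferred lifetime 0, addresses remain valid
# for a while (7200 s is an illustrative value, not from the thread).
pio = build_pio("2001:db8:1::/64", valid_lifetime=7200, preferred_lifetime=0)
```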


> 
> Besides,
> 
> * You are assuming somebody will send ICMP errors. But they may not.

Is it interesting to discuss assumptions that aren’t part of the specifications?

> 
> * You are assuming that if they do, they will send code 5. But they may not.

See above.

> 
> * You are assuming that code 5 is an indication of wrong address... but
> it may be an indication of incorrect route.

No.

> 
> * You are assuming that nodes will process icmp code 5 in one specific
> way. I don't know of any implementation that behaves in the way you
> describe.

Hosts can improve.
My point was that it isn’t obvious that very much more can be done from the network side.

>>>> Without getting into the multi-homing discussion and requiring hosts to “throw spaghetti on the wall”, I don’t see how your draft improves on that. 
>>> 
>>> Not sure what you mean. If the same router that advertised those
>>> prefixes doesn't advertise those prefixes anymore, why would you think
>>> they are still valid?
>> 
>> Because that’s what the network previously advertised. 
>> If source addresses from that prefix no longer work, that’s a good hint to the host to try something else. There’s a list of heuristics the host must use. 
>> 
>> I still don’t see how your draft improves much on this. Can you explain?
> 
> What our document wants to address is this:
> 
> * Initially, unprefer addresses for the deprecated prefix.
> 
> * subsequently, clean up the lagging addresses.
> 
> 
> One (more complex) way to achieve this would be to e.g., wait for N *
> ROUTE_ADV_INTERVAL (I've just made up the parameter name), and if at
> least M RAs with PIOs have been received, but none of them contain PIOs
> for the (now invalid) prefix, deprecate the prefix.
> 
> The solution we currently propose in the I-D is simpler, and just
> involves one additional bit per prefix in the local data structures.
> 
> For a sample scenario, please check Appendix B of our draft. The idea is
> simple: if two consecutive RAs with PIOs don't contain the
> previously-advertised prefix, un-prefer addresses for such prefix.
> Subsequently, once addresses have already been un-preferred, if you
> receive two additional RAs with PIOs that don't advertise the
> previously-advertised prefix, remove (invalidate) the corresponding
> addresses.
> 
> So, after two RAs, the lagging addresses are not preferred anymore.
> After four RAs, you get rid of them.

So the proposal is that an address is marked as not preferred after anywhere between 6 and 20 minutes?
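
For reference, the rule quoted above and the timing it implies can be sketched roughly as follows. This is a simplification, not the draft's actual algorithm: the class and function names are hypothetical, and the interval values assume the RFC 4861 defaults (MaxRtrAdvInterval = 600 s, MinRtrAdvInterval = 0.33 * 600 = 198 s) with unsolicited RAs only.

```python
class PrefixState:
    def __init__(self):
        self.missed = 0        # RAs with PIOs that did not list this prefix
        self.preferred = True
        self.valid = True

def process_ra(known, advertised):
    """Update per-prefix state for one received RA.

    known: dict mapping prefix string -> PrefixState.
    advertised: set of prefix strings carried in the RA's PIOs.
    RAs without any PIO are ignored, as in the quoted description.
    """
    if not advertised:
        return
    for prefix in advertised:
        known[prefix] = PrefixState()   # (re)advertised: start fresh
    for prefix, state in known.items():
        if prefix in advertised:
            continue
        state.missed += 1
        if state.missed >= 2:
            state.preferred = False     # after two RAs: un-prefer
        if state.missed >= 4:
            state.valid = False         # after four RAs: invalidate

# Rough time until two unsolicited RAs have been seen, in seconds.
MIN_RTR_ADV_INTERVAL = 198   # assumed default, 0.33 * 600
MAX_RTR_ADV_INTERVAL = 600   # assumed default
unprefer_window = (2 * MIN_RTR_ADV_INTERVAL, 2 * MAX_RTR_ADV_INTERVAL)
```

Under those assumptions the window is 396 to 1200 seconds, i.e. roughly 6.6 to 20 minutes before the stale addresses stop being preferred, and twice that before they are removed.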

I am not saying that we can’t and shouldn’t improve on SLAAC… but I think we can do better. And existing heuristics are better than this.

Cheers,
Ole