Re: [v6ops] A common problem with SLAAC in "renumbering" scenarios

Mark Smith <markzzzsmith@gmail.com> Sun, 03 February 2019 16:50 UTC

References: <60fabe4b-fd76-4b35-08d3-09adce43dd71@si6networks.com> <alpine.DEB.2.20.1901311236320.5601@uplift.swm.pp.se> <m1gpCcz-0000FlC@stereo.hq.phicoh.net> <ddd28787-8905-bafd-3546-2ceef436c8b0@si6networks.com> <m1gptWx-0000G3C@stereo.hq.phicoh.net> <69609C58-7205-4519-B17A-4FBC8AE2EA16@employees.org> <ac773bb5-0da8-064b-d46b-3a218b8c9e7a@si6networks.com> <CFAEACC4-BA78-4DF9-AD8A-3EB0790B8000@employees.org> <a4f6742e-f18e-3384-d4cc-06bfab49101f@si6networks.com> <FEFA99C2-4F09-4D8F-8D51-C9D9D7090637@employees.org> <a484d5de-0dce-a41a-928e-785d8d80d05d@si6networks.com>
In-Reply-To: <a484d5de-0dce-a41a-928e-785d8d80d05d@si6networks.com>
From: Mark Smith <markzzzsmith@gmail.com>
Date: Mon, 04 Feb 2019 03:49:49 +1100
Message-ID: <CAO42Z2xzYQESqqsz4AEE89vx=AhvBEf8Yzyae9o7z1U1XYyarw@mail.gmail.com>
To: Fernando Gont <fgont@si6networks.com>
Cc: Ole Troan <otroan@employees.org>, v6ops list <v6ops@ietf.org>, 6man WG <ipv6@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/oxodlyyDDQXnpM32B14VsL7_Yvw>
Subject: Re: [v6ops] A common problem with SLAAC in "renumbering" scenarios

On Mon, 4 Feb 2019 at 02:01, Fernando Gont <fgont@si6networks.com> wrote:
>
> On 3/2/19 09:27, Ole Troan wrote:
> >
> >
> >> On 3 Feb 2019, at 12:49, Fernando Gont <fgont@si6networks.com> wrote:
> >>
> >>> On 3/2/19 07:32, Ole Troan wrote:
> >>>
> >>>
> >>>> On 3 Feb 2019, at 05:29, Fernando Gont <fgont@si6networks.com> wrote:
> >>>>
> >>>> On 2/2/19 08:57, Ole Troan wrote:
> >>>>>> One question is whether it makes sense for routers to have valid lifetimes of
> >>>>>> more than a day for prefixes that are obtained using DHCP-PD.
> >>>>>>
> >>>>>> Another is whether general purpose hosts should accept lifetimes of more
> >>>>>> than a day. Maybe hosts should just truncate.
> >>>>>
> >>>>> The (original) intended lifetime for DHCP PD is a lifetime equal to the length of the contract with your ISP.
> >>>>> Lifetimes become meaningless with “flash renumbering”. Neither SLAAC nor DHCP PD is designed for that.
> >>>>>
> >>>>> The simple solution to this problem is “if it hurts, stop doing it”.
> >>>>
> >>>> FWIW, lifetimes are mostly irrelevant to the problem that *we* are
> >>>> discussing (which is rather orthogonal to the problem mentioned above):
> >>>> our case is that in which the router just reboots -- so no matter what
> >>>> the lifetime was, the information will be invalid anyway.
> >>>>
> >>>
> >>> That’s not how SLAAC and PD are designed. Lifetimes are not invalid just because of a router reboot. Look at advertised lifetimes as a sort of contract.
> >>
> >> Well, the problem is that you are making a contract on the LAN side for
> >> a contract you may not have on the WAN side. If the router reboots and
> >> the CPE no longer "owns" some prefix, then that contract is void.
> >
> > You do have the contract on the WAN side. What makes you think otherwise? E.g., via PD the CPE learns that a given prefix is valid until March 1, 2020.
> > A reboot doesn’t change that.
> >
> >> Ideally, the CPE will advertise that the contract is void. But it is
> >> clear that for most deployed CPEs, that will not happen.
> >
> > So a bug.
> > What you are talking about is the case where the ISP breaks the contract. While it previously promised to delegate you a prefix until 20200301, all trace of that has gone.
>
> The CPE is the middle-man between the ISP and the LAN. No matter what
> you may *expect* the CPE to do, the CPE is currently not actually
> required to e.g. clean up after the contract that the ISP broke (if you
> assume/think there's such a thing), or even adjust prefix timers
> according to the DHCPv6 lease times -- talk about under-specification of
> the glue between e.g. DHCPv6-PD and SLAAC.
>
> Besides, the layer-8 contract between the user and the ISP may be that
> you get dynamic prefixes. This means that whenever you request a lease,
> you get a different prefix. You might say that if you don't do another
> DHCPv6-PD request, you should be able to use the same prefix. But if you
> do ask for a new prefix, you might indeed get a new one -- and this is what
> normally happens after reboots.

That's against the architecture and design of the Internet protocols.

A router reboot, anywhere along the path between communicating
end-points, is supposed to have no more effect than a transient period
of packet loss. Recovery is supposed to be via transport layer
retransmission within the existing established connections.

A first hop CPE rebooting and being given a different PD prefix is
effectively changing a transient packet loss event into the movement
of the CPE and its hosts to a different point of attachment to the
network. That's the significance of what the ISP is imposing on their
customers by having dynamic/unstable PD prefixes. It probably seems
less significant than it really is because the links to the customers
are virtual, e.g. PPPoE, rather than physical.

>
> The CPE should -- if possible -- be faithful to its LAN hosts, and
> advertise if previous contracts between the CPE and the LAN hosts are
> void. i.e., if the CPE does not get leased the same prefix as before,
> it should notify its "clients". However, possibly for simplicity's sake,
> CPEs don't record what information was previously advertised on the
> LAN -- they are not required to, so.... when they reboot, they may not
> be in a position to notify hosts accordingly.
>
> That's the environment hosts operate in -- no matter whether you or me
> like it.
>
> In that environment, hosts can and should be smarter.
>
>


Multipath transport layer protocols for the win. They're splitting
identifier semantics off from IP addresses.


>
> >>> What you seem to be talking about is either a bug, a misconfiguration, or both.
> >>
> >> It's neither of those. If anything, it's the result of
> >> under-specification of the necessary glue between automatic
> >> configuration on the WAN side, and automatic configuration on the LAN side.
> >>
> >> e.g., there were no requirements for CPEs to keep track of prefixes that
> >> they have been leased in the past -- if at all possible.
> >
> > DHCP PD will give you the old prefix back.
>
> Not necessarily. In fact, it may intentionally not do that. If you no
> longer own the addresses, the sessions will have to be torn down.
>
>
>
>
> >>> If you want something like session survivability,
> >>> that’s not a trivial problem to solve.
> >>
> >> Not sure what you mean by "session survivability"
> >
> > Try to keep a TCP session active while changing addresses.
>
> Of course that's not what we're trying to solve here.
>
>
>
> >>> Currently the network will give an ICMP destination unreachable code 5 and deprecate the invalid prefix if it has information to do so.
> >>
> >> Where in RFC4443 does ICMP unreach code 5 get to invalidate prefixes?
> >>
> >> Answer: Nowhere. They don't get to do that. All ICMPv6 error messages
> >> are soft errors. And it would be a huge mistake (and huge
> >> vulnerability!) to behave otherwise.
> >
> > It’s a strong hint to the host stack to pick a different source address.
>
> You said "deprecate the invalid prefix if it has information to do so."
> -- selecting a different address is a very different thing than
> deprecating an address. In fact, for connection-less protocols that
> might not even make sense -- since it implies resending stuff that you
> might not even be able to resend (send buffer is gone).
>
> Besides,
>
> * You are assuming somebody will send an ICMP error. But they may not.
>
> * You are assuming that if they do, they will send code 5. But they may not.
>
> * You are assuming that code 5 is an indication of wrong address... but
> it may be an indication of incorrect route.
>
> * You are assuming that nodes will process icmp code 5 in one specific
> way. I don't know of any implementation that behaves in the way you
> describe.
>
>
>
>
> >>> Without getting into the multi-homing discussion and requiring hosts to “throw spaghetti on the wall”, I don’t see how your draft improves on that.
> >>
> >> Not sure what you mean. If the same router that advertised those
> >> prefixes doesn't advertise those prefixes anymore, why would you think
> >> they are still valid?
> >
> > Because that’s what the network previously advertised.
> > If source addresses from that prefix no longer work, that’s a good hint to the host to try something else. There’s a list of heuristics the host must use.
> >
> > I still don’t see how your draft improves much on this. Can you explain?
>
> What our document wants to address is this:
>
> * Initially, unprefer addresses for the deprecated prefix.
>
> * Subsequently, clean up the lagging addresses.
>
>
> One (more complex) way to achieve this would be to e.g., wait for N *
> ROUTE_ADV_INTERVAL (I've just made up the parameter name), and if at
> least M RAs with PIOs have been received, but none of them contain PIOs
> for the (now invalid) prefix, deprecate the prefix.
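
As a rough illustration of the timer-based alternative quoted above, here is a minimal Python sketch. It is only one possible reading, not anything taken from the I-D: the class and field names are hypothetical, N, M and ROUTE_ADV_INTERVAL are the made-up parameters from the quoted text, the window is assumed to start when the prefix was last advertised, and all RAs are assumed to come from the one router that originally advertised the prefix.

```python
from dataclasses import dataclass


@dataclass
class PrefixWatch:
    """Tracks one previously advertised prefix (hypothetical helper)."""
    prefix: str
    window: float            # N * ROUTE_ADV_INTERVAL, in seconds (assumed units)
    min_ras: int             # M: RAs with PIOs required before acting
    last_seen: float         # time the prefix last appeared in a PIO
    ras_without: int = 0     # RAs with PIOs that omitted the prefix since then
    deprecated: bool = False

    def on_ra(self, now: float, pio_prefixes: set) -> None:
        """Process one RA; pio_prefixes holds the prefixes of its PIOs."""
        if not pio_prefixes:
            return  # RAs carrying no PIOs say nothing about prefixes
        if self.prefix in pio_prefixes:
            # Prefix is still advertised: restart the observation window.
            self.last_seen = now
            self.ras_without = 0
            self.deprecated = False
            return
        self.ras_without += 1
        waited_long_enough = now - self.last_seen >= self.window
        if waited_long_enough and self.ras_without >= self.min_ras:
            self.deprecated = True
```

The sketch only evaluates the condition when an RA arrives, so it needs both a clock and a counter per prefix, which is the extra complexity the quoted text alludes to.
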
>
> The solution we currently propose in the I-D is simpler, and just
> involves one additional bit per prefix in the local data structures.
>
> For a sample scenario, please check Appendix B of our draft. The idea is
> simple: if two consecutive RAs with PIOs don't contain the
> previously-advertised prefix, un-prefer addresses for such prefix.
> Subsequently, once addresses have already been un-preferred, if you
> receive two additional RAs with PIOs that don't advertise the
> previously-advertised prefix, remove (invalidate) the corresponding
> addresses.
>
> So, after two RAs, the lagging addresses are not preferred anymore.
> After four RAs, you get rid of them.
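
The two-stage rule summarised in the quoted text (two RAs to un-prefer, two more to remove) can be sketched as a small state machine. This is an illustrative Python sketch under stated assumptions, not the algorithm from the I-D itself: the names `State`, `TrackedPrefix` and `missed_once` are hypothetical, the "one additional bit per prefix" is modelled as a single boolean, and all RAs are assumed to come from the router that originally advertised the prefix.

```python
from enum import Enum


class State(Enum):
    PREFERRED = "preferred"
    UNPREFERRED = "unpreferred"
    REMOVED = "removed"


class TrackedPrefix:
    """Host-side record for one autoconfigured prefix (hypothetical)."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.state = State.PREFERRED
        self.missed_once = False  # the single extra bit per prefix

    def on_ra(self, pio_prefixes):
        """Process one RA; pio_prefixes holds the prefixes of its PIOs."""
        if not pio_prefixes or self.state is State.REMOVED:
            return  # ignore RAs without PIOs; removed entries stay removed
        if self.prefix in pio_prefixes:
            # Prefix advertised again: clear the bit and prefer it again.
            self.missed_once = False
            self.state = State.PREFERRED
            return
        if not self.missed_once:
            self.missed_once = True  # first RA in a row without the prefix
            return
        # Second consecutive RA without the prefix: advance one stage.
        self.missed_once = False
        if self.state is State.PREFERRED:
            self.state = State.UNPREFERRED
        else:
            self.state = State.REMOVED
```

Feeding this sketch four consecutive RAs whose PIOs omit the prefix moves it from preferred to un-preferred after the second RA and to removed after the fourth, matching the counts given in the quoted text.
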
>
> --
> Fernando Gont
> SI6 Networks
> e-mail: fgont@si6networks.com
> PGP Fingerprint: 6666 31C6 D484 63B2 8FB1 E3C4 AE25 0D55 1D4E 7492