Re: [v6ops] A common problem with SLAAC in "renumbering" scenarios

On Mon, 4 Feb 2019 at 02:01, Fernando Gont <fgont@si6networks.com> wrote:
>
> On 3/2/19 09:27, Ole Troan wrote:
> >
> >
> >> On 3 Feb 2019, at 12:49, Fernando Gont <fgont@si6networks.com> wrote:
> >>
> >>> On 3/2/19 07:32, Ole Troan wrote:
> >>>
> >>>
> >>>> On 3 Feb 2019, at 05:29, Fernando Gont <fgont@si6networks.com> wrote:
> >>>>
> >>>> On 2/2/19 08:57, Ole Troan wrote:
> >>>>>> One question is whether it makes sense for routers to have valid lifetimes of
> >>>>>> more than a day for prefixes that are obtained using DHCP-PD.
> >>>>>>
> >>>>>> Another is whether general purpose hosts should accept lifetimes of more
> >>>>>> than a day. Maybe hosts should just truncate.
> >>>>>
> >>>>> The (original) intended lifetime for DHCP PD is a lifetime equal to the length of the contract with your ISP.
> >>>>> Lifetimes become meaningless with “flash renumbering”. Neither SLAAC nor DHCP PD is designed for that.
> >>>>>
> >>>>> The simple solution to this problem is “if it hurts, stop doing it”.
> >>>>
> >>>> FWIW, lifetimes are mostly irrelevant to the problem that *we* are
> >>>> discussing (which is rather orthogonal to the problem mentioned above):
> >>>> our case is that in which the router just reboots -- so no matter what
> >>>> the lifetime was, the information will be invalid anyway.
> >>>>
> >>>
> >>> That’s not how SLAAC and PD are designed. Lifetimes are not invalid just because of a router reboot. Look at advertised lifetimes as a sort of contract.
> >>
> >> Well, the problem is that you are making a contract on the LAN side for
> >> a contract you may not have on the WAN side. If the router reboots and
> >> the CPE no longer "owns" some prefix, then that contract is void.
> >
> > You have the contract on the WAN side. What makes you think otherwise? E.g., via PD the CPE learns that a given prefix is valid until March 1, 2020.
> > A reboot doesn’t change that.
> >
> >> Ideally, the CPE will advertise that the contract is void. But it is
> >> clear that for most deployed CPEs, that will not happen.
> >
> > So a bug.
> > What you are talking about is the case where the ISP breaks the contract. While it previously promised to delegate you a prefix until 20200301, all trace of that has gone.
>
> The CPE is the middle-man between the ISP and the LAN. No matter what
> you may *expect* the CPE to do, the CPE is currently not actually
> required to e.g. clean up after the contract that the ISP broke (if you
> assume/think there's such a thing), or even adjust prefix timers
> according to the DHCPv6 lease times -- talk about under-specification of
> the glue between e.g. DHCPv6-PD and SLAAC.
>
> Besides, the layer-8 contract between the user and the ISP may be that
> you get dynamic prefixes. This means that whenever you request a lease,
> you get a different prefix. You might say that if you don't do another
> DHCPv6-PD request, you should be able to use the same prefix. But if you
> do ask for a new prefix, you might indeed get a new one -- and this is what
> normally happens after reboots.

That's against the architecture and design of the Internet protocols.

A router reboot, anywhere along the path between communicating
end-points, is supposed to have no more effect than a transient period
of packet loss. Recovery is supposed to be via transport layer
retransmission within the existing established connections.

A first hop CPE rebooting and being given a different PD prefix is
effectively changing a transient packet loss event into the movement
of the CPE and its hosts to a different point of attachment to the
network. That's the significance of what the ISP is imposing on their
customers by having dynamic/unstable PD prefixes. It probably seems
less significant than it really is because the links to the customers
are virtual, e.g. PPPoE, rather than physical.

>
> The CPE should -- if possible -- be faithful to its LAN hosts, and
> advertise if previous contracts between the CPE and the LAN hosts are
> void. i.e., if the CPE is not leased the same prefix as before,
> it should notify its "clients". However, possibly for simplicity's sake,
> CPEs don't record what information was previously advertised on the
> LAN -- they are not required to -- so when they reboot, they may not be
> in a position to notify hosts accordingly.
>
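
A rough sketch of the bookkeeping described above, in Python. It assumes
the CPE can persist the set of prefixes it last advertised (the file
format, the function names and the lifetime values are all made up for
illustration) and, after a reboot, advertises any prefix it no longer
holds with zero lifetimes alongside the newly delegated one:

```python
import json
from pathlib import Path


def load_advertised(path):
    """Return the prefixes advertised before the reboot (empty set if unknown)."""
    try:
        return set(json.loads(path.read_text()))
    except (FileNotFoundError, json.JSONDecodeError):
        return set()


def save_advertised(path, prefixes):
    """Persist the prefixes currently advertised on the LAN."""
    path.write_text(json.dumps(sorted(prefixes)))


def plan_pios(previous, delegated, preferred=3600, valid=7200):
    """Map each prefix to the (preferred, valid) lifetimes for the next RA.

    The lifetime defaults are hypothetical. Prefixes that were advertised
    before but are no longer delegated get zero lifetimes, so that hosts
    learn the old "contract" is void.
    """
    plan = {prefix: (preferred, valid) for prefix in delegated}
    for stale in previous - delegated:
        plan[stale] = (0, 0)
    return plan
```

The point is only that the CPE needs some stable storage for this to
work; a CPE without it cannot know what it advertised before the reboot.
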
> That's the environment hosts operate in -- whether you or I like it
> or not.
>
> In that environment, hosts can and should be smarter.
>
>

Multipath transport layer protocols for the win. They're splitting
identifier semantics off from IP addresses.

>
> >>> What you seem to be talking about is either a bug, a misconfiguration, or both.
> >>
> >> It's neither of those. If anything, it's the result of
> >> under-specification of the necessary glue between automatic
> >> configuration on the WAN side, and automatic configuration on the LAN side.
> >>
> >> e.g., there were no requirements for CPEs to keep track of prefixes that
> >> they have been leased in the past -- if at all possible.
> >
> > DHCP PD will give you the old prefix back.
>
> Not necessarily. In fact, it may intentionally not do that. If you no
> longer own the addresses, the sessions will have to be torn down.
>
>
>
>
> >>> If you want something like session survivability,
> >>> that’s not a trivial problem to solve.
> >>
> >> Not sure what you mean by "session survivability"
> >
> > Try to keep a TCP session active while changing addresses.
>
> Of course that's not what we're trying to solve here.
>
>
>
> >>> Currently the network will give an ICMP destination unreachable code 5 and deprecate the invalid prefix if it has information to do so.
> >>
> >> Where in RFC 4443 do ICMP unreachable code 5 messages get to invalidate prefixes?
> >>
> >> Answer: Nowhere. They don't get to do that. All ICMPv6 error messages
> >> are soft errors. And it would be a huge mistake (and huge
> >> vulnerability!) to behave otherwise.
> >
> > It’s a strong hint to the host stack to pick a different source address.
>
> You said "deprecate the invalid prefix if it has information to do so."
> -- selecting a different address is a very different thing than
> deprecating an address. In fact, for connection-less protocols that
> might not even make sense -- since it implies resending stuff that you
> might not even be able to resend (send buffer is gone).
>
> Besides,
>
> * You are assuming somebody will send an ICMP error. But they may not.
>
> * You are assuming that if they do, they will send code 5. But they may not.
>
> * You are assuming that code 5 is an indication of wrong address... but
> it may be an indication of incorrect route.
>
> * You are assuming that nodes will process ICMP code 5 in one specific
> way. I don't know of any implementation that behaves in the way you
> describe.
>
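
To illustrate the distinction drawn above between "pick a different
source address" and "deprecate the address", here is a minimal Python
sketch of a host treating the error purely as a soft error. The
AddressState structure and on_icmpv6_error() are hypothetical names, not
taken from any implementation; the only protocol constants used are
ICMPv6 type 1 (Destination Unreachable) and code 5 (source address
failed ingress/egress policy) from RFC 4443:

```python
from dataclasses import dataclass

ICMPV6_DEST_UNREACH = 1    # ICMPv6 Destination Unreachable
CODE_SRC_ADDR_POLICY = 5   # source address failed ingress/egress policy


@dataclass
class AddressState:
    valid: bool = True
    preferred: bool = True
    soft_errors: int = 0   # hint counter only; never changes validity


def on_icmpv6_error(addrs, src, icmp_type, icmp_code):
    """Record a code 5 error against src as a hint.

    Returns True if the caller may want to retry the flow from another
    source address. The address itself is never deprecated or removed
    here, since the error is unauthenticated and could be forged.
    """
    entry = addrs.get(src)
    if entry is None:
        return False
    if icmp_type == ICMPV6_DEST_UNREACH and icmp_code == CODE_SRC_ADDR_POLICY:
        entry.soft_errors += 1
        return True
    return False
```
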
>
>
>
> >>> Without getting into the multi-homing discussion and requiring hosts to “throw spaghetti on the wall”, I don’t see how your draft improves on that.
> >>
> >> Not sure what you mean. If the same router that advertised those
> >> prefixes doesn't advertise those prefixes anymore, why would you think
> >> they are still valid?
> >
> > Because that’s what the network previously advertised.
> > If source addresses from that prefix no longer work, that’s a good hint to the host to try something else. There’s a list of heuristics the host must use.
> >
> > I still don’t see how your draft improves much on this. Can you explain?
>
> What our document wants to address is this:
>
> * Initially, unprefer addresses for the deprecated prefix.
>
> * Subsequently, clean up the lagging addresses.
>
>
> One (more complex) way to achieve this would be to e.g., wait for N *
> ROUTE_ADV_INTERVAL (I've just made up the parameter name), and if at
> least M RAs with PIOs have been received, but none of them contain PIOs
> for the (now invalid) prefix, deprecate the prefix.
>
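
One possible reading of the timer-based variant above, as a Python
sketch. The values of N, M and ROUTE_ADV_INTERVAL are placeholders (the
parameter name was made up in the text itself), and the sliding window
ending at "now" is my assumption about how the wait would be measured:

```python
ROUTE_ADV_INTERVAL = 200.0  # seconds; placeholder value
N = 3                       # window length, in advertisement intervals
M = 2                       # minimum number of RAs with PIOs in the window


def should_deprecate(prefix, ra_log, now, n=N, m=M,
                     interval=ROUTE_ADV_INTERVAL):
    """Decide whether prefix looks stale.

    ra_log is a list of (timestamp, set_of_prefixes) tuples, one per RA
    received from the router that originally advertised the prefix.
    The prefix is considered stale if, within the last n * interval
    seconds, at least m RAs carrying PIOs arrived and none of them
    listed it.
    """
    window_start = now - n * interval
    recent = [pios for ts, pios in ra_log
              if window_start <= ts <= now and pios]
    return len(recent) >= m and all(prefix not in pios for pios in recent)
```

Note that this needs a timestamped history of RAs per router, which is
the extra complexity compared with a per-prefix flag.
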
> The solution we currently propose in the I-D is simpler, and just
> involves one additional bit per prefix in the local data structures.
>
> For a sample scenario, please check Appendix B of our draft. The idea is
> simple: if two consecutive RAs with PIOs don't contain the
> previously-advertised prefix, un-prefer addresses for such prefix.
> Subsequently, once addresses have already been un-preferred, if you
> receive two additional RAs with PIOs that don't advertise the
> previously-advertised prefix, remove (invalidate) the corresponding
> addresses.
>
> So, after two RAs, the lagging addresses are not preferred anymore.
> After four RAs, you get rid of them.
>
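
For what it's worth, the two-then-four RA behaviour described above can
be sketched in a few lines of Python. This is only my reading of the
description in this message, not code from the draft: the names are
hypothetical, the table holds prefixes learned from a single router, and
RAs without PIOs are simply ignored.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PREFERRED = "preferred"
    DEPRECATED = "deprecated"  # still valid, not used for new flows
    REMOVED = "removed"


@dataclass
class PrefixEntry:
    state: State = State.PREFERRED
    missed_once: bool = False  # the "one additional bit per prefix"


def process_ra(table, pios):
    """Update the per-prefix state for one RA from the router.

    table maps prefix -> PrefixEntry; pios is the set of prefixes
    carried in the RA's Prefix Information Options.
    """
    if not pios:
        return  # only RAs that carry PIOs count
    for prefix in pios:
        table[prefix] = PrefixEntry()  # (re)advertised: fresh and preferred
    for prefix, entry in table.items():
        if prefix in pios or entry.state is State.REMOVED:
            continue
        if not entry.missed_once:
            entry.missed_once = True         # first RA without the prefix
        elif entry.state is State.PREFERRED:
            entry.state = State.DEPRECATED   # second miss: un-prefer
            entry.missed_once = False
        else:
            entry.state = State.REMOVED      # fourth miss: invalidate
```

A prefix that shows up again in a later RA is reset to preferred, so a
single lost or partial RA does not start the countdown for good.
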
> --
> Fernando Gont
> SI6 Networks
> e-mail: fgont@si6networks.com
> PGP Fingerprint: 6666 31C6 D484 63B2 8FB1 E3C4 AE25 0D55 1D4E 7492
>
>
>
>
> --------------------------------------------------------------------
> IETF IPv6 working group mailing list
> ipv6@ietf.org
> Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
> --------------------------------------------------------------------