Re: [v6ops] A common problem with SLAAC in "renumbering" scenarios

Philip Homburg <pch-ipv6-ietf-6@u-1.phicoh.com> Sat, 23 March 2019 20:13 UTC

To: ipv6@ietf.org
Cc: Fernando Gont <fgont@si6networks.com>
Subject: Re: [v6ops] A common problem with SLAAC in "renumbering" scenarios
From: Philip Homburg <pch-ipv6-ietf-6@u-1.phicoh.com>
In-reply-to: Your message of "Sat, 23 Mar 2019 12:14:24 -0300 ." <c0c1a784-4632-addf-20bf-eea955628c10@si6networks.com>
Date: Sat, 23 Mar 2019 21:13:04 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/wMxyd7Yo188CrGlCD-t5rw-R6lk>

>> I'm not sure that will work. If you set 'n' high to be safe, then it will
>> take a long time before the host takes action. If you set 'n' low, then you
>> still have a risk that a particular network exceeds this value.
>
>If you have another (working) address, then that's safe.

I think my very first proposal was effectively n=1:
after receiving an RA, deprecate all addresses from the same router
not covered by PIOs in the RA.
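
As a rough illustration of the n=1 rule (a sketch only; the function name
and the example addresses are made up, and it assumes the host already
knows which addresses were configured from this router's earlier RAs):

```python
from ipaddress import IPv6Address, IPv6Network

def addresses_to_deprecate(addresses, ra_prefixes):
    """Return the addresses not covered by any PIO prefix in this RA.

    `addresses` are assumed to be SLAAC addresses configured from
    earlier RAs sent by the same router.
    """
    nets = [IPv6Network(p) for p in ra_prefixes]
    return [a for a in addresses
            if not any(IPv6Address(a) in n for n in nets)]

# Hypothetical host with two addresses; the new RA only carries one prefix.
configured = ["2001:db8:1::10", "2001:db8:2::10"]
stale = addresses_to_deprecate(configured, ["2001:db8:2::/64"])
print(stale)
```

With n=1 a single RA that omits a prefix is enough to deprecate the
address, which is why a router that splits its PIOs over several RAs is a
problem for this rule.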

If I remember correctly, you didn't like it.

For higher values of n, it will take longer to deprecate addresses, which
may lead to shorter RA intervals, which has its own problems.

>That said, even if one wasn't changing the defaults but rather
>suggesting tweaking of the timers, advice is warranted: you cannot
>expect the operator to figure out all the relevant details....

But increasing multicast rates to deal with a rather obscure problem doesn't
strike me as a good trade-off.

>> I don't think that's true for two reasons:
>> - If the ISP re-assigns the prefix within the valid-lifetime of the
>>   IA_PD prefix option, then the ISP is violating the DHCPv6 RFC.
>
>You may be right in "the ISP is violating the dhcpv6 rfc"... but this
>doesn't make  what I describe an inexistent scenario.

I don't think we should introduce risky code on hosts to deal with 
broken DHCPv6 server implementations. 

And even then, the problem seems mostly theoretical.

>> - The source address selection algorithm avoids deprecated addresses.
>>   So the only way this can lead to trouble is if a new node would
>>   use the exact same address. With SLAAC that basically cannot happen.
>
>That's not correct. Normally, when a PIO has A=1, it normally has L=1.
>Therefore, that prefix will be considered on-link, and any packets meant
>for it will be sent on the local link, as opposed to forwarded to the
>local router...

That's a separate issue. Deleting an address will immediately disrupt
any flows that use that address.

Deleting a prefix from the list of onlink prefixes is more complex.
If the router still knows about the prefix, then there is no issue.
The router is supposed to forward the packet to the destination and
send a redirect.

If the router lost knowledge of the prefix and two hosts are communicating
locally then marking the prefix as offlink may disrupt that communication.

So there is no obvious way to deal with DHCPv6 servers that violate the
specs.

>1) c=0
>2) Whenever you receive a PIO from the same router that does not include
>the currently-configured prefix: c++;
>3) if c >= N1 -> address deprecated
>4) If c >= N2  (where N2>N1) -> address removed.

So for high values of N1, communication will be disrupted for a long time.
For values of N1 less than the number of RAs that a router uses to carry
its PIOs, prefixes will periodically be marked as deprecated.

If the router also exceeds N2, then addresses will periodically be removed.
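
To make the flapping concrete, here is a small simulation of the quoted
counter scheme. It is only a sketch: the quoted steps do not say when c is
reset, so resetting it when the prefix is seen again is my assumption, and
the router that alternates two PIOs over two RAs is hypothetical.

```python
def simulate(ra_sequence, prefix, n1, n2):
    """Return the state of the address for `prefix` after each RA."""
    c = 0
    states = []
    for pios in ra_sequence:
        if prefix in pios:
            c = 0  # assumption: seeing the prefix again resets the counter
        else:
            c += 1
        if c >= n2:
            states.append("removed")
        elif c >= n1:
            states.append("deprecated")
        else:
            states.append("preferred")
    return states

# Hypothetical router that alternates its two PIOs across two RAs.
ras = [{"2001:db8:a::/64"}, {"2001:db8:b::/64"}] * 3
flapping = simulate(ras, "2001:db8:a::/64", n1=1, n2=3)
print(flapping)
```

With N1=1 the address alternates between preferred and deprecated on every
RA. With N1=2 it stays preferred for this router, but a router that spreads
its PIOs over three RAs would show the same flapping again.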

>> You are right. I have an idea how to deal with this issue.
>
>I'm listening ;-)

My current idea is that it is safe to mark an address as deprecated if no
PIO that covers the address was seen for the duration given by the 'Router
Lifetime' field in the RA. If the field is zero, do nothing. This should make
sure that eventually, stale addresses are marked deprecated. An obvious
edge case is PIOs advertised only by a router that is not a default router.
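
The rule can be sketched roughly as follows. The function and parameter
names are made up for illustration, and times are plain seconds:

```python
def should_deprecate(now, last_pio_seen, router_lifetime):
    """Decide whether an address may be marked deprecated.

    now, last_pio_seen: timestamps in seconds; last_pio_seen is when a PIO
    covering the address was last received.
    router_lifetime: Router Lifetime field of the RA, in seconds.
    """
    if router_lifetime == 0:
        # Not a default router: do nothing (the edge case mentioned above).
        return False
    return now - last_pio_seen >= router_lifetime

# Example: Router Lifetime of 1800 s, covering PIO last seen at t=100.
print(should_deprecate(2000, 100, 1800))
print(should_deprecate(1000, 100, 1800))
```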

However for the case at hand, it is worth looking at deprecating an
address much quicker. So my proposal is to deprecate an address on a much
shorter time scale if the PIO is advertised by only one router and if that
router sends all PIOs in one RA.

Multiple routers advertising a single PIO is easy to detect. Detecting that
a single router puts all PIOs in a single RA is harder. My current
idea is to use a counter that counts how often a single RA covers all PIOs.
For low values of the counter, a PIO that was left out is evidence that the
router spreads its PIOs over multiple RAs. If the counter is high enough
(say above 10) then assume that a missing PIO is stale and deprecate the
address.
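
A minimal sketch of that counter, for the single-router case only. The
class name, the reset to zero when a PIO is missing at a low count, and the
exact threshold are assumptions on my part, not a worked-out design:

```python
THRESHOLD = 10  # "say above 10"; the exact value is a guess

class RouterState:
    """Per-router bookkeeping on a host (hypothetical structure)."""

    def __init__(self):
        self.known = set()       # prefixes learned from this router
        self.full_ra_count = 0   # RAs that covered every known prefix

    def on_ra(self, pios):
        """Process the prefixes of one RA; return prefixes to deprecate."""
        pios = set(pios)
        missing = self.known - pios
        if not missing:
            self.known |= pios
            self.full_ra_count += 1
            return set()
        if self.full_ra_count > THRESHOLD:
            # Long history of complete RAs: treat missing PIOs as stale.
            self.known = (self.known - missing) | pios
            return missing
        # Little history: assume the router splits PIOs over several RAs.
        self.known |= pios
        self.full_ra_count = 0
        return set()

rs = RouterState()
for _ in range(11):
    rs.on_ra({"2001:db8:a::/64"})
deprecated = rs.on_ra({"2001:db8:b::/64"})  # renumbering after 11 full RAs
print(deprecated)
```

A router that always splits its PIOs never builds up the counter, so the
host falls back to the slower Router Lifetime based rule for it.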

There are some corner cases, but they should be rare, and the effect 
relatively benign.