Re: [v6ops] A common problem with SLAAC in "renumbering" scenarios

Philip Homburg <pch-ipv6-ietf-6@u-1.phicoh.com> Tue, 19 March 2019 12:35 UTC

To: ipv6@ietf.org
Cc: Fernando Gont <fgont@si6networks.com>
Subject: Re: [v6ops] A common problem with SLAAC in "renumbering" scenarios
From: Philip Homburg <pch-ipv6-ietf-6@u-1.phicoh.com>
In-reply-to: Your message of "Wed, 27 Feb 2019 03:45:00 -0300 ." <81aa4ec3-8277-794a-4da1-9855d2b6b97f@si6networks.com>
Date: Tue, 19 Mar 2019 13:35:02 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/6qttsCC78oYoqAeDdNOeqD85bSI>

I spent some time thinking about Section 5.1.3 of
draft-gont-6man-slaac-renum-01. This is the part where the host deals with
stale prefixes.

I'm interested in the host part for two reasons:
- Solving the issue at the CPE requires persistent storage of network state,
  which may be hard to require from all devices.
- It may take quite a long time before all CPEs are replaced.

So for any host it is worth dealing with the problem locally instead of 
relying on the network.

I find the following issues in draft-gont-6man-slaac-renum-01:
- In the notes it says "The aforementioned processing assumes that while
  network configuration information might be split into multiple RAs, PIOs
  will be spread among *at most* two RAs."

  I think it is bad to deploy something on hosts that would force all
  routers to comply with this.

- The draft proposes many changes to router behavior, in particular
  increasing the frequency of RA multicasts.

  One problem is that hosts will only benefit from those changes when new
  CPEs are deployed. The draft doesn't describe what will happen in networks
  with unmodified CPEs.

- Increased multicast rates may have a negative effect on battery lifetime.

- The algorithm has a tight coupling between modifying the preferred and
  valid lifetimes of addresses. In most cases, accidentally marking an
  address as deprecated has little effect on connectivity. Deleting an
  address by setting the valid lifetime to zero immediately disrupts all
  communication that uses the address.

  What is worse, these changes to the valid lifetime are not necessary to
  deal with the problem at hand: flash renumbering situations.

  With respect to valid lifetimes, if two hosts are communicating locally
  using global addresses, is it worth breaking that communication just
  because the router ceases to send RAs for the prefix (i.e. well before
  the lifetime of the prefix or the onlink status expires)?

So I was looking at a change to host processing of RAs such that:
- The host will quickly (on a human time scale) stop using the old
  prefix
- Any mistakes because the algorithm gets confused are relatively benign
- No changes to routers are required for the algorithm to work, though
  some currently allowed router behavior may lead to sub-optimal results.

The basic concept that I came up with is the following:
- I assume that for each address that is configured using SLAAC the host
  already maintains an associated default router to deal with poor man's
  multihoming setups.
- Add to each address a timestamp value that records the last time
  a PIO (with A bit set) was seen that covers the prefix (last_time_seen).

For each RA that is received, if the RA contains any PIO with the A bit set,
process those PIOs and update the last_time_seen value for the addresses
they cover. Update the default router of those addresses as well.

For each PIO with A bit set and preferred lifetime non-zero, collect
the label values (based on the policy table in RFC 6724 "Default Address
Selection") in a set.
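The label collection step could be sketched in Python as follows. This is
only an illustration: the Pio class, label_for() and collect_labels() are
hypothetical names, and the table is the default policy table from RFC 6724,
Section 2.1. A real host would consult its configured policy table, which
may differ.

```python
import ipaddress
from dataclasses import dataclass

# Labels of the default policy table in RFC 6724, Section 2.1.
# Assumption: the host uses the default table, not a locally modified one.
DEFAULT_LABELS = [
    ("::1/128", 0),
    ("::/0", 1),
    ("::ffff:0:0/96", 4),
    ("2002::/16", 2),
    ("2001::/32", 5),
    ("fc00::/7", 13),
    ("::/96", 3),
    ("fec0::/10", 11),
    ("3ffe::/16", 12),
]
_TABLE = [(ipaddress.IPv6Network(p), label) for p, label in DEFAULT_LABELS]


@dataclass
class Pio:
    """Hypothetical, minimal view of a Prefix Information Option."""
    prefix: str              # e.g. "2001:db8:1::/64"
    a_flag: bool             # autonomous address-configuration flag
    preferred_lifetime: int  # seconds


def label_for(address: str) -> int:
    """Return the label of the longest matching policy table entry."""
    addr = ipaddress.IPv6Address(address)
    # ::/0 always matches, so the list is never empty.
    matches = [(net.prefixlen, label) for net, label in _TABLE if addr in net]
    return max(matches)[1]


def collect_labels(pios):
    """Collect labels of PIOs with A bit set and non-zero preferred lifetime."""
    labels = set()
    for pio in pios:
        if pio.a_flag and pio.preferred_lifetime > 0:
            net = ipaddress.IPv6Network(pio.prefix)
            labels.add(label_for(str(net.network_address)))
    return labels
```

With this table a global unicast prefix such as 2001:db8:1::/64 maps to
label 1 and a ULA prefix such as fd00:1::/64 maps to label 13.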

After processing all PIOs in the RA go over all addresses:
- Skip any addresses that have a different default router
- Skip any addresses that have a label not in the set that resulted
  from processing the PIOs
- Skip any addresses that have a last_time_seen value that is less than
  60 seconds ago
- If the address has a preferred lifetime of more than 60 seconds,
  reduce the lifetime to 60 seconds.
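
The per-RA processing described above could be sketched in Python as
follows. This is a rough sketch under stated assumptions, not a reference
implementation: the class names, field names and process_ra() are
hypothetical, labels are assumed to be precomputed from the RFC 6724 policy
table, preferred lifetimes are stored as remaining seconds, and the normal
RFC 4862 handling (creating addresses, refreshing lifetimes) is left out.

```python
import ipaddress
from dataclasses import dataclass

THRESHOLD = 60  # seconds, the value proposed in the text


@dataclass
class Pio:
    prefix: str              # e.g. "2001:db8:2::/64"
    a_flag: bool
    preferred_lifetime: int  # seconds
    label: int               # assumed precomputed (RFC 6724 policy table)


@dataclass
class SlaacAddress:
    address: str
    router: str              # associated default router
    label: int               # assumed precomputed (RFC 6724 policy table)
    preferred_lifetime: int  # remaining seconds
    last_time_seen: float    # when a covering PIO with A bit was last seen


def process_ra(addresses, router, pios, now):
    """Apply the stale-prefix heuristic for one RA received from `router`."""
    labels = set()
    for pio in pios:
        if not pio.a_flag:
            continue
        net = ipaddress.IPv6Network(pio.prefix)
        # Refresh the timestamp and default router of covered addresses.
        for addr in addresses:
            if ipaddress.IPv6Address(addr.address) in net:
                addr.last_time_seen = now
                addr.router = router
        if pio.preferred_lifetime > 0:
            labels.add(pio.label)

    for addr in addresses:
        if addr.router != router:
            continue  # configured via a different default router
        if addr.label not in labels:
            continue  # no replacement address with the same label
        if now - addr.last_time_seen < THRESHOLD:
            continue  # prefix was announced recently
        if addr.preferred_lifetime > THRESHOLD:
            addr.preferred_lifetime = THRESHOLD


# Example: the router starts announcing 2001:db8:2::/64 only.
old = SlaacAddress("2001:db8:1::10", "fe80::1", 1, 604800, 0.0)
new = SlaacAddress("2001:db8:2::10", "fe80::1", 1, 604800, 0.0)
ula = SlaacAddress("fd00:1::10", "fe80::1", 13, 604800, 0.0)
process_ra([old, new, ula], "fe80::1",
           [Pio("2001:db8:2::/64", True, 604800, 1)], now=1000.0)
```

In this example only the address from the old global prefix has its
preferred lifetime cut to 60 seconds. The ULA address is left alone because
no PIO with its label was seen, and the valid lifetime is never touched.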

I picked 60 seconds because it is a value that works well on a human
time scale. At the same time, if a router announces PIOs in multiple RAs,
then those RAs should all be sent within 60 seconds of each other for host
behavior to remain unchanged. (I guess that for battery lifetime it is best
to send all multicasts together.) If a router spreads RAs out over a longer
period, hosts will follow the RAs, causing load to be distributed over
multiple prefixes in a predictable fashion.

In simple CPE situations, this will cause old addresses to be deprecated
60 seconds after the CPE first announces the new prefix (most of the time).
In rare cases, where the CPE quickly reboots just after sending an RA,
it may take one extra round of RAs (at the moment 10 minutes).

In cases where a router spreads PIOs over multiple RAs, if those are 
sent within 60 seconds and there is no packet loss, all addresses remain
preferred. If the RAs are spread over a longer interval or if there is
packet loss, one or more addresses may get deprecated.

To deal with accidentally deprecated addresses, the algorithm makes sure
that an address gets deprecated only if there is a new address with the same
label. The biggest risk here is that a ULA may accidentally cause a
global address to be deprecated.