Re: [v6ops] Stability and Resilience (was Re: A common...)

Yes, at a very high level, it boils down to the same thing.  However,
"as much as possible" sets a very wide standard; providing a few
parameters to set expectations seems useful and helpful in developing a
consensus. For example, if someone says it is impossible for their solution
to ever issue the same prefix, they need to be told they should probably
find a different solution. On the other hand, if an ISP could always
provide the same prefix, "as much as possible" basically says that is what
they should do, whereas once they exceed some minimum I would want to give
the ISP flexibility to implement what makes sense for their business model.
Also, it needs to be clear that this applies "in normal circumstances"; it
should be acknowledged that there will be situations where this is not a
reasonable expectation.
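
To make "parameters to set expectations" a little more concrete, here is a
minimal sketch of how a minimum stability window could be checked against a
log of prefix delegations. Everything in it is an assumption made for
illustration: the Assignment record, the premature_changes name, the
abnormal flag for events outside normal circumstances, and the 7-day
default, which is only a placeholder and not a value anyone has proposed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from ipaddress import IPv6Network


@dataclass(frozen=True)
class Assignment:
    """One prefix delegation seen by a CPE (hypothetical record format)."""
    when: datetime
    prefix: IPv6Network
    abnormal: bool = False  # e.g. a widespread power event; exempt from the check


def premature_changes(history, min_stable=timedelta(days=7)):
    """Return the assignments that changed the prefix sooner than min_stable
    after the previous change, ignoring changes flagged as abnormal."""
    violations = []
    last_change = None  # assignment at which the current prefix was first issued
    for a in sorted(history, key=lambda x: x.when):
        if last_change is None:
            last_change = a
            continue
        if a.prefix == last_change.prefix:
            continue  # same prefix reissued: the stable case
        if not a.abnormal and a.when - last_change.when < min_stable:
            violations.append(a)
        last_change = a
    return violations
```

A reboot that gets the same prefix back never counts against the ISP here;
only a changed prefix inside the window does, and only when the change is
not flagged as abnormal.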

On Fri, Feb 22, 2019 at 12:43 PM Lee Howard <lee@asgard.org> wrote:

>
> On 2/22/19 12:53 PM, David Farmer wrote:
>
> Generally, I agree with what you are saying, but I'd like to see something
> like the following added as well:
>
> Even if an ISP intends to change the IPv6 prefix regularly in
> the longer-term, say every few months or even each month at an extreme, in
> the shorter-term IPv6 prefixes SHOULD be stable, for time periods of hours,
> days, and maybe even weeks at a time.  Or, put another way, CPE devices
> SHOULD NOT get a new IPv6 prefix every time they are rebooted.  Note: even
> in locations where utility power is generally stable, power outages
> frequently occur in clusters over a few hours or days.  This occurs when an
> emergency repair is made to restore power and then more permanent repairs
> cause short outages in the following hours or days. In this scenario, each
> of these events in the cluster SHOULD NOT result in the CPE receiving a
> different IPv6 prefix.
>
> Conversely, when widespread power events occur, affecting thousands or
> even tens of thousands of customers, it may not be practical or even
> possible for an ISP to guarantee all CPE will receive the same IPv6 prefix
> they had before.  Therefore, to the extent possible, CPE and local networks
> SHOULD be resilient to their ISP-provided IPv6 prefix changing, sometimes
> even unexpectedly.
>
> What's the difference between what you said and "ISPs should, as much as
> possible, reissue the same prefix to customers."?
>
> Lee
>
>
> Thanks.
>
> On Fri, Feb 22, 2019 at 10:36 AM Lee Howard <lee@asgard.org> wrote:
>
>> I think I have heard the following suggestions in this conversation. I
>> hope that taken all together, rather than as individual spot solutions,
>> they can be a consensus recommendation.
>>
>>
>> ISPs should, as much as possible, reissue the same prefix to customers.
>> Some things ISPs can do to increase the chances of this:
>>
>>    1. Share lease information between redundant DHCPv6 servers. Most ISPs
>>       probably have redundant servers, since this is critical provisioning
>>       infrastructure. It may be difficult to synch information between
>>       servers for millions of leases over tens of milliseconds of latency;
>>       see RFC 6853, "DHCPv6 Redundancy Deployment Considerations." Maybe
>>       DHCP vendors can report.
>>
>>    2. Aggregate above the provider edge device, so that grooming customers
>>       between Provider Edge boxes (PEs) doesn't force a renumbering. It's
>>       been a few years since I worked on CMTSs, but when I did they did not
>>       support MP-BGP well (if at all), so routes had to be aggregated on
>>       the PE, or leaked in the IGP which is bad for convergence time. Maybe
>>       PE vendors can report.
>>
>>    3. Set DHCPv6 lease timers very low prior to grooming events. A short
>>       interval during the maintenance window will increase load on the
>>       DHCPv6 server until timers have been returned to normal values.
>>
>>    4. In the case of a PE reboot, use DHCPv6 Bulk Leasequery to rebuild the
>>       routing table. I think all of the necessary information is in those
>>       responses. Again, last time I was working on CMTSs, this feature was
>>       not supported. Maybe PE vendors can report.
>>
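
As a rough illustration of why sharing lease information matters for
reissuing the same prefix, here is a minimal sketch of a delegator that
hands a returning client the prefix it held before. It is not how any
particular DHCPv6 server works. The StickyDelegator class, the use of a
DUID string as the key, and the plain dict standing in for replicated lease
state are all assumptions; a real deployment would rely on whatever
redundancy mechanism the servers support.

```python
from ipaddress import IPv6Network


class StickyDelegator:
    """Reissue the prefix a client held before, if any (illustrative only)."""

    def __init__(self, pool, delegated_len, bindings=None):
        # Lazy generator over candidate prefixes carved out of the pool.
        self._candidates = pool.subnets(new_prefix=delegated_len)
        # DUID -> prefix. This mapping is the state that redundant servers
        # would have to share for both to reissue the same prefix.
        self.bindings = {} if bindings is None else bindings

    def delegate(self, duid):
        # A known client gets its previous prefix back.
        if duid in self.bindings:
            return self.bindings[duid]
        # Otherwise pick the next candidate no client is bound to.
        in_use = set(self.bindings.values())
        for candidate in self._candidates:
            if candidate not in in_use:
                self.bindings[duid] = candidate
                return candidate
        raise RuntimeError("delegation pool exhausted")
```

If two instances are given the same bindings mapping, either one returns
the same prefix for a given DUID; with separate mappings they would not,
which is the failure mode item 1 is trying to avoid. The sketch does not
handle lease expiry or release.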
>>
>>
>> Networks should, as much as possible, be resilient to prefix changes.
>> Some things networks can do to improve resilience:
>>
>>    1. Write a learned prefix to non-volatile memory and issue a DHCPv6
>>       Renew for that prefix on reboot.
>>
>>    2. Use dynamic DNS and shorter TTLs.
>>
>>    3. Implement something like NETCONF to distribute prefix information to
>>       policy devices like firewalls or SD-WAN controllers. I think a
>>       separate document describing this application of NETCONF would make
>>       sense.
>>
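
For the first item in the list above, a minimal sketch of the
"write the learned prefix to non-volatile memory" half might look like the
following. The file format, the save_prefix and load_prefix names, and the
use of a JSON file as the non-volatile store are assumptions made for
illustration; the DHCPv6 exchange itself is not shown.

```python
import json
import os
import tempfile
from ipaddress import IPv6Network


def save_prefix(path, prefix):
    """Record the delegated prefix so that it survives a reboot."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so a power loss mid-write cannot
    # leave a truncated record at the final path.
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"prefix": str(prefix)}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename on POSIX filesystems


def load_prefix(path):
    """Return the prefix saved before the reboot, or None if unavailable."""
    try:
        with open(path) as f:
            return IPv6Network(json.load(f)["prefix"])
    except (OSError, ValueError, KeyError, TypeError):
        return None  # missing or corrupt record: start from scratch
```

On boot the CPE would call load_prefix and, if it gets a prefix back, ask
the server for that prefix again; if it gets None it simply solicits as a
new client.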
>>
>>
>> In the case of failures, it cannot be assumed that sessions will stay
>> active. We try to build in redundancy and resilience where we can, but
>> where there's a single point of failure (such as CE or PE), and it fails
>> (such as an unplanned reboot), our expectations should be appropriate.
>>
>> Is this a reasonable summary?
>>
>> Lee
>>
>>
>> --------------------------------------------------------------------
>> IETF IPv6 working group mailing list
>> ipv6@ietf.org
>> Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
>> --------------------------------------------------------------------
>>
>

-- 
===============================================
David Farmer               Email:farmer@umn.edu
Networking & Telecommunication Services
Office of Information Technology
University of Minnesota
2218 University Ave SE        Phone: 612-626-0815
Minneapolis, MN 55414-3029   Cell: 612-812-9952
===============================================