Re: [v6ops] Stability and Resilience (was Re: A common...)

David Farmer <farmer@umn.edu> Fri, 22 February 2019 21:40 UTC

From: David Farmer <farmer@umn.edu>
Date: Fri, 22 Feb 2019 15:40:26 -0600
Message-ID: <CAN-Dau2GDrYtGi+THiJD+oQ4weSyw8n2EOLaqst7en=cL_u9aw@mail.gmail.com>
To: Lee Howard <lee@asgard.org>
Cc: 6man WG <ipv6@ietf.org>, IPv6 Ops WG <v6ops@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/PStG15sE_9G06j4TEsuvQL6QsYc>
Subject: Re: [v6ops] Stability and Resilience (was Re: A common...)

Yes, at a very high level, it boils down to the same thing.  However,
"as much as possible" sets a very wide standard; providing a few
parameters to set expectations seems useful and helpful in developing a
consensus. For example, if someone says it is impossible for their solution
to ever issue the same prefix, they need to be told they should probably
find a different solution. On the other hand, if an ISP could always
provide the same prefix, "as much as possible" basically says that is what
they should do, whereas I would want to give the ISP flexibility to
implement what makes sense for their business model once they exceed some
minimum.  Also, it needs to be clear this applies "in normal
circumstances"; it should be acknowledged there will be situations where
this is not a reasonable expectation.
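
As a rough illustration of "stable up to some minimum, flexible after
that", here is a minimal Python sketch of a delegating pool that hands a
client its previous prefix on every request until a minimum period has
passed. The class name, the DUID keys, and the one-hour minimum are all
hypothetical, and 2001:db8::/48 is just the documentation prefix; this is
not how any particular DHCPv6 server works.

```python
import ipaddress


class StickyPrefixPool:
    """Toy delegation pool that prefers a client's previous prefix."""

    def __init__(self, pool, delegated_len, min_stable_seconds):
        net = ipaddress.ip_network(pool)
        self.free = list(net.subnets(new_prefix=delegated_len))
        self.bindings = {}  # client DUID -> (prefix, time it was issued)
        self.min_stable = min_stable_seconds  # assumed policy knob

    def request(self, duid, now, rotate=False):
        """Return the prefix for duid; rotate only after the minimum."""
        if duid in self.bindings:
            prefix, issued = self.bindings[duid]
            if not rotate or now - issued < self.min_stable:
                return prefix  # reboot or early rotation: same prefix
            self.free.append(prefix)  # minimum met: ISP policy may renumber
        prefix = self.free.pop(0)
        self.bindings[duid] = (prefix, now)
        return prefix


pool = StickyPrefixPool("2001:db8::/48", 56, min_stable_seconds=3600)
first = pool.request("duid-a", now=0)
after_reboot = pool.request("duid-a", now=100)
print(first, after_reboot, first == after_reboot)
```

A reboot within the minimum always gets the old prefix back; only an
explicit rotation after the minimum gives the ISP its flexibility.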

On Fri, Feb 22, 2019 at 12:43 PM Lee Howard <lee@asgard.org> wrote:

>
> On 2/22/19 12:53 PM, David Farmer wrote:
>
> Generally, I agree with what you are saying, but I'd like to see something
> like the following added as well;
>
> Even if an ISP intends to change the IPv6 prefix regularly in
> the longer-term, say every few months or even each month at an extreme, in
> the shorter-term IPv6 prefixes SHOULD be stable, for time periods of hours,
> days, and maybe even weeks at a time.  Or, put another way, CPE devices
> SHOULD NOT get a new IPv6 prefix every time they are rebooted.  Note: even
> in locations where utility power is generally stable, power outages
> frequently occur in clusters over a few hours or days.  This occurs when an
> emergency repair is made to restore power and then more permanent repairs
> cause short outages in the following hours or days. In this scenario, each
> of these events in the cluster SHOULD NOT result in the CPE receiving a
> different IPv6 prefix.
>
> Conversely, when widespread power events occur, affecting thousands or
> even tens of thousands of customers, it may not be practical or even
> possible for an ISP to guarantee all CPE will receive the same IPv6 prefix
> they had before.  Therefore to the extent possible, CPE and local networks
> SHOULD be resilient to their ISP provided IPv6 prefix changing, sometimes
> even unexpectedly changing.
>
> What's the difference between what you said and "ISPs should, as much as
> possible, reissue the same prefix to customers."?
>
> Lee
>
>
> Thanks.
>
> On Fri, Feb 22, 2019 at 10:36 AM Lee Howard <lee@asgard.org> wrote:
>
>> I think I have heard the following suggestions in this conversation. I
>> hope that taken all together, rather than as individual spot solutions,
>> they can be a consensus recommendation.
>>
>>
>> ISPs should, as much as possible, reissue the same prefix to customers.
>> Some things ISPs can do to increase the chances of this:
>>
>>    1. Share lease information between redundant DHCPv6 servers. Most ISPs
>>       probably have redundant servers, since this is critical provisioning
>>       infrastructure. It may be difficult to synch information between
>>       servers for millions of leases over tens of milliseconds of latency;
>>       see RFC6853, "DHCPv6 Redundancy Deployment Considerations." Maybe
>>       DHCP vendors can report.
>>
>>    2. Aggregate above the provider edge device, so that grooming customers
>>       between Provider Edge boxes (PEs) doesn't force a renumbering. It's
>>       been a few years since I worked on CMTSs, but when I did they did not
>>       support MP-BGP well (if at all), so routes had to be aggregated on
>>       the PE, or leaked in the IGP which is bad for convergence time. Maybe
>>       PE vendors can report.
>>
>>    3. Set DHCPv6 lease timers very low prior to grooming events. A short
>>       interval during the maintenance window will increase load on the
>>       DHCPv6 server until timers have been returned to normal values.
>>
>>    4. In the case of a PE reboot, use DHCPv6 Bulk Leasequery to rebuild the
>>       routing table. I think all of the necessary information is in those
>>       responses. Again, last time I was working on CMTSs, this feature was
>>       not supported. Maybe PE vendors can report.
>>
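
On the point about short lease timers increasing server load, a quick
back-of-the-envelope sketch may help size the effect. It assumes renewals
are spread evenly over the renewal interval (T1); the lease count and both
timer values are made-up numbers, not measurements from any network.

```python
def renew_rate_per_second(active_leases, t1_seconds):
    """Average Renew messages per second, assuming even spread over T1."""
    return active_leases / t1_seconds


leases = 1_000_000  # hypothetical number of active leases
normal = renew_rate_per_second(leases, 12 * 3600)  # assumed T1 of 12 hours
grooming = renew_rate_per_second(leases, 5 * 60)   # assumed T1 of 5 minutes

print(f"normal:   {normal:.1f} renews/s")
print(f"grooming: {grooming:.1f} renews/s")
print(f"increase: x{grooming / normal:.0f}")
```

With these assumed timers the ratio is simply 43200 / 300 = 144, so the
load scales inversely with whatever interval is chosen for the window.
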
>>
>>
>> Networks should, as much as possible, be resilient to prefix changes.
>> Some things networks can do to improve resilience:
>>
>>    1. Write a learned prefix to non-volatile memory and issue a DHCPv6
>>       Renew for that prefix on reboot.
>>
>>    2. Use dynamic DNS and shorter TTLs.
>>
>>    3. Implement something like NETCONF to distribute prefix information to
>>       policy devices like firewalls or SD-WAN controllers. I think a
>>       separate document describing this application of NETCONF would make
>>       sense.
>>
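
For the first item in that list, a minimal sketch of the CPE side might
look like the following. The JSON file layout, the function names, and the
use of an IAID field are assumptions for illustration only; a real CPE
would hand the loaded prefix to its DHCPv6 client to request again after
reboot, which is not shown here.

```python
import ipaddress
import json
import os
import tempfile


def save_prefix(path, prefix, iaid):
    """Atomically persist the delegated prefix so it survives a reboot."""
    record = {"prefix": str(ipaddress.ip_network(prefix)), "iaid": iaid}
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())  # make sure the data reaches stable storage
    os.replace(tmp, path)  # rename so a power cut never leaves a partial file


def load_prefix(path):
    """Return (prefix, iaid) saved before the reboot, or None if unusable."""
    try:
        with open(path) as f:
            record = json.load(f)
        return ipaddress.ip_network(record["prefix"]), record["iaid"]
    except (OSError, ValueError, KeyError, TypeError):
        return None  # missing or corrupt state: fall back to a fresh request
```

Writing to a temporary file and renaming it matters here because the
power-outage clusters described earlier are exactly when a half-written
file would otherwise be most likely.
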
>>
>>
>> In the case of failures, it cannot be assumed that sessions will stay
>> active. We try to build in redundancy and resilience where we can, but
>> where there's a single point of failure (such as CE or PE), and it fails
>> (such as an unplanned reboot), our expectations should be appropriate.
>>
>> Is this a reasonable summary?
>>
>> Lee
>

-- 
===============================================
David Farmer               Email:farmer@umn.edu
Networking & Telecommunication Services
Office of Information Technology
University of Minnesota
2218 University Ave SE        Phone: 612-626-0815
Minneapolis, MN 55414-3029   Cell: 612-812-9952
===============================================