Re: Stability and Resilience (was Re: [v6ops] A common...)

David Farmer <farmer@umn.edu> Fri, 22 February 2019 21:40 UTC

From: David Farmer <farmer@umn.edu>
Date: Fri, 22 Feb 2019 15:40:26 -0600
Message-ID: <CAN-Dau2GDrYtGi+THiJD+oQ4weSyw8n2EOLaqst7en=cL_u9aw@mail.gmail.com>
Subject: Re: Stability and Resilience (was Re: [v6ops] A common...)
To: Lee Howard <lee@asgard.org>
Cc: 6man WG <ipv6@ietf.org>, IPv6 Ops WG <v6ops@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/Kjn9sAT8SmjfaI2x6MEVTdfbORY>

Yes, at a very high level, it boils down to the same thing.  However,
saying "as much as possible" sets a very wide standard; providing a few
parameters to set expectations seems useful and helpful in developing a
consensus. For example, if someone says it is impossible for their solution
to ever issue the same prefix, they need to be told they should probably
find a different solution. On the other hand, if an ISP could always
provide the same prefix, that wording basically says that is what they
should do, whereas once they exceed some minimum I would want to give the
ISP flexibility to implement what makes sense for their business model.
Also, it needs to be clear this is "in normal circumstances"; it should be
acknowledged there will be situations where this is not a reasonable
expectation.
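
A minimal sketch of the "minimum plus flexibility" idea, for illustration
only. The one-week floor, the in-memory binding table, and every name here
(MIN_STABLE_SECONDS, Binding, delegate) are assumptions made up for the
example; nothing in this thread fixes a number or an implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Assumed floor for illustration: one week. The thread only says
# "hours, days, and maybe even weeks".
MIN_STABLE_SECONDS = 7 * 24 * 3600


@dataclass
class Binding:
    prefix: str          # delegated prefix, e.g. "2001:db8:0:100::/56"
    first_issued: float  # time this prefix was first given to the client


def delegate(bindings: Dict[str, Binding], client_id: str, now: float,
             rotate_after: float, allocate: Callable[[], str]) -> str:
    """Return the prefix to delegate to client_id at time `now`.

    rotate_after is the ISP's own rotation interval (its business
    decision); it is never allowed to drop below the floor.
    """
    hold = max(rotate_after, MIN_STABLE_SECONDS)
    binding = bindings.get(client_id)
    if binding is not None and now - binding.first_issued < hold:
        # Reboot or short outage inside the hold time: same prefix again.
        return binding.prefix
    # No binding (for example, state lost in a large outage) or the hold
    # time has passed: a new prefix is acceptable.
    prefix = allocate()
    bindings[client_id] = Binding(prefix, now)
    return prefix
```

Above the floor the ISP picks whatever rotate_after suits it. The branch
with no surviving binding stands in for the circumstances that are not
"normal", where a change cannot be avoided.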

On Fri, Feb 22, 2019 at 12:43 PM Lee Howard <lee@asgard.org> wrote:

>
> On 2/22/19 12:53 PM, David Farmer wrote:
>
> Generally, I agree with what you are saying, but I'd like to see something
> like the following added as well;
>
> Even if an ISP intends to change the IPv6 prefix regularly in
> the longer-term, say every few months or even each month at an extreme, in
> the shorter-term IPv6 prefixes SHOULD be stable, for time periods of hours,
> days, and maybe even weeks at a time.  Or, put another way, CPE devices
> SHOULD NOT get a new IPv6 prefix every time they are rebooted.  Note: even
> in locations where utility power is generally stable, power outages
> frequently occur in clusters over a few hours or days.  This occurs when an
> emergency repair is made to restore power and then more permanent repairs
> cause short outages in the following hours or days. In this scenario, each
> of these events in the cluster SHOULD NOT result in the CPE receiving a
> different IPv6 prefix.
>
> Conversely, when widespread power events occur, affecting thousands or
> even tens of thousands of customers, it may not be practical or even
> possible for an ISP to guarantee all CPE will receive the same IPv6 prefix
> they had before.  Therefore to the extent possible, CPE and local networks
> SHOULD be resilient to their ISP provided IPv6 prefix changing, sometimes
> even unexpectedly changing.
>
> What's the difference between what you said and "ISPs should, as much as
> possible, reissue the same prefix to customers."?
>
> Lee
>
>
> Thanks.
>
> On Fri, Feb 22, 2019 at 10:36 AM Lee Howard <lee@asgard.org> wrote:
>
>> I think I have heard the following suggestions in this conversation. I
>> hope that taken all together, rather than as individual spot solutions,
>> they can be a consensus recommendation.
>>
>>
>> ISPs should, as much as possible, reissue the same prefix to customers.
>> Some things ISPs can do to increase the chances of this:
>>
>>    1. Share lease information between redundant DHCPv6 servers. Most ISPs
>>       probably have redundant servers, since this is critical provisioning
>>       infrastructure. It may be difficult to synch information between
>>       servers for millions of leases over tens of milliseconds of latency;
>>       see RFC6853, "DHCPv6 Redundancy Deployment Considerations." Maybe
>>       DHCP vendors can report.
>>
>>    2. Aggregate above the provider edge device, so that grooming customers
>>       between Provider Edge boxes (PEs) doesn't force a renumbering. It's
>>       been a few years since I worked on CMTSs, but when I did they did
>>       not support MP-BGP well (if at all), so routes had to be aggregated
>>       on the PE, or leaked in the IGP which is bad for convergence time.
>>       Maybe PE vendors can report.
>>
>>    3. Set DHCPv6 lease timers very low prior to grooming events. A short
>>       interval during the maintenance window will increase load on the
>>       DHCPv6 server until timers have been returned to normal values.
>>
>>    4. In the case of a PE reboot, use DHCPv6 Bulk Leasequery to rebuild
>>       the routing table. I think all of the necessary information is in
>>       those responses. Again, last time I was working on CMTSs, this
>>       feature was not supported. Maybe PE vendors can report.
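
To put rough numbers on the load increase mentioned in item 3: if renewals
are spread evenly over time, the average Renew rate is about the number of
active leases divided by the renewal timer T1. The sketch below only does
that arithmetic. The lease count and timer values are made-up examples, and
renew_rate is a hypothetical helper, not part of any DHCPv6 server.

```python
def renew_rate(leases: int, t1_seconds: float) -> float:
    """Average Renew messages per second, assuming every client renews
    once per T1 and renewals are spread evenly (no synchronized bursts)."""
    if t1_seconds <= 0:
        raise ValueError("T1 must be positive")
    return leases / t1_seconds


# Example figures (assumptions, not measurements):
LEASES = 1_000_000
NORMAL_T1 = 12 * 3600   # 12 hours
GROOMING_T1 = 5 * 60    # 5 minutes

normal = renew_rate(LEASES, NORMAL_T1)      # about 23 per second
grooming = renew_rate(LEASES, GROOMING_T1)  # about 3,333 per second
print(f"normal: {normal:.1f}/s, grooming: {grooming:.1f}/s, "
      f"ratio: {grooming / normal:.0f}x")
```

A client only learns the shorter timer when it next renews, so the timers
would need to be lowered at least one normal T1 before the window for all
clients to have picked them up.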
>>
>>
>>
>> Networks should, as much as possible, be resilient to prefix changes.
>> Some things networks can do to improve resilience:
>>
>>    1. Write a learned prefix to non-volatile memory and issue a DHCPv6
>>       Renew for that prefix on reboot.
>>
>>    2. Use dynamic DNS and shorter TTLs.
>>
>>    3. Implement something like NETCONF to distribute prefix information to
>>       policy devices like firewalls or SD-WAN controllers. I think a
>>       separate document describing this application of NETCONF would make
>>       sense.
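
Item 1 of this list could look roughly like the following on a CPE. This is
a sketch under stated assumptions: the JSON record layout, the file path
handling, and the function names save_prefix and load_prefix are all
hypothetical, and the DHCPv6 exchange itself is left out. The code only
covers persisting the prefix and deciding at boot whether there is still a
prefix worth asking for.

```python
import json
import os
from typing import Optional


def save_prefix(path: str, prefix: str, valid_until: float) -> None:
    """Persist the delegated prefix so it survives a reboot.

    Writes to a temporary file and renames it, so a power cut in the
    middle of the write does not leave a truncated record behind.
    """
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"prefix": prefix, "valid_until": valid_until}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_prefix(path: str, now: float) -> Optional[str]:
    """Return the stored prefix if its lifetime has not run out, else None.

    None means the CPE has nothing to ask for and simply requests
    whatever prefix the ISP chooses to delegate.
    """
    try:
        with open(path) as f:
            record = json.load(f)
    except (OSError, ValueError):
        return None  # missing or corrupt record
    if not isinstance(record, dict):
        return None
    if record.get("valid_until", 0) <= now:
        return None  # lifetime expired while the CPE was down
    return record.get("prefix")
```

On boot the CPE would call load_prefix and, if it gets a prefix back, name
that prefix in its first message to the server. Whether the ISP honors the
request is still up to the ISP.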
>>
>>
>>
>> In the case of failures, it cannot be assumed that sessions will stay
>> active. We try to build in redundancy and resilience where we can, but
>> where there's a single point of failure (such as CE or PE), and it fails
>> (such as an unplanned reboot), our expectations should be appropriate.
>>
>> Is this a reasonable summary?
>>
>> Lee
>

-- 
===============================================
David Farmer               Email:farmer@umn.edu
Networking & Telecommunication Services
Office of Information Technology
University of Minnesota
2218 University Ave SE        Phone: 612-626-0815
Minneapolis, MN 55414-3029   Cell: 612-812-9952
===============================================