Re: [v6ops] Stability and Resilience (was Re: A common...)

David Farmer <farmer@umn.edu> Fri, 22 February 2019 17:54 UTC

From: David Farmer <farmer@umn.edu>
Date: Fri, 22 Feb 2019 11:53:56 -0600
Message-ID: <CAN-Dau0UsVpcZ4TRraQLR9GrqenrCgmh3yq67DaXBXv1SPywJg@mail.gmail.com>
To: Lee Howard <lee@asgard.org>
Cc: 6man WG <ipv6@ietf.org>, IPv6 Ops WG <v6ops@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/v6ops/4Teh_2fhw7x0tbUsjNProdZlq6s>

Generally, I agree with what you are saying, but I'd like to see something
like the following added as well:

Even if an ISP intends to change the IPv6 prefix regularly in the longer
term, say every few months or, at an extreme, every month, in the shorter
term IPv6 prefixes SHOULD be stable over periods of hours, days, and maybe
even weeks.  Put another way, CPE devices SHOULD NOT get a new IPv6 prefix
every time they are rebooted.  Note: even in locations where utility power
is generally stable, power outages frequently occur in clusters over a few
hours or days.  This happens when an emergency repair restores power and
the more permanent repairs then cause short outages in the following hours
or days.  In this scenario, the events in the cluster SHOULD NOT each
result in the CPE receiving a different IPv6 prefix.
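
To illustrate the ISP side of this, here is a rough Python sketch of a
"sticky" prefix pool. It is only an illustration under assumptions of my
own: the class name, the /56 delegation size, the 14-day hold time, and
keying on the client DUID are all hypothetical, not taken from any DHCPv6
server. The point is simply that the binding outlives any single reboot, so
a cluster of outages maps back to the same prefix.

```python
import ipaddress


class StickyPrefixPool:
    """Hypothetical pool that remembers each client's last delegated prefix."""

    def __init__(self, pool, prefix_len=56, hold_seconds=14 * 24 * 3600):
        # Generator over the not-yet-used prefixes in the pool.
        self._free = ipaddress.ip_network(pool).subnets(new_prefix=prefix_len)
        # Client DUID -> (prefix, time the client was last seen).
        self._bindings = {}
        self._hold = hold_seconds

    def assign(self, duid, now):
        """Return the client's previous prefix if it was seen recently."""
        binding = self._bindings.get(duid)
        if binding is not None and now - binding[1] <= self._hold:
            prefix = binding[0]
        else:
            # Simplification: expired prefixes are not returned to the pool,
            # and an exhausted pool raises StopIteration.
            prefix = next(self._free)
        self._bindings[duid] = (prefix, now)
        return prefix


pool = StickyPrefixPool("2001:db8::/48")
first = pool.assign("cpe-a", now=0)
after_reboot = pool.assign("cpe-a", now=3600)  # same CPE, one hour later
print(first, after_reboot)
```

With a hold time measured in weeks, several reboots within a few hours or
days all land inside the window; an ISP that wants to rotate prefixes in
the longer term can still do so on its own schedule.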

Conversely, when widespread power events occur, affecting thousands or even
tens of thousands of customers, it may not be practical or even possible
for an ISP to guarantee that all CPE will receive the same IPv6 prefix they
had before.  Therefore, to the extent possible, CPE and local networks
SHOULD be resilient to their ISP-provided IPv6 prefix changing, sometimes
even unexpectedly.
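
On the CPE side, a minimal Python sketch of what "resilient" might look
like: keep the last delegated prefix in non-volatile storage, compare it
with whatever the ISP delegates after a reboot, and tell the local network
when it differs. The file format, the function names, and the on_change
callback are hypothetical; a real CPE would hook this into its DHCPv6
client and might also offer the saved prefix back to the server as a hint.

```python
import ipaddress
import json
import os
import tempfile


def load_saved_prefix(path):
    """Return the prefix stored at path, or None if missing or unreadable."""
    try:
        with open(path) as f:
            return ipaddress.ip_network(json.load(f)["prefix"])
    except (OSError, ValueError, KeyError, TypeError):
        return None


def save_prefix(path, prefix):
    """Write the prefix atomically so a power cut cannot leave a torn file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"prefix": str(prefix)}, f)
    os.replace(tmp, path)


def handle_delegation(path, delegated, on_change):
    """Record a newly delegated prefix; call on_change if it replaced another."""
    new = ipaddress.ip_network(delegated)
    old = load_saved_prefix(path)
    if old is not None and old != new:
        # Hook for renumbering the LAN, updating DNS, firewall rules, etc.
        on_change(old, new)
    if old != new:
        save_prefix(path, new)
    return new
```

The callback is where the local network's actual resilience lives; the
sketch only shows that detecting a changed prefix after a reboot is cheap
once the previous one has been saved.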

Thanks.

On Fri, Feb 22, 2019 at 10:36 AM Lee Howard <lee@asgard.org> wrote:

> I think I have heard the following suggestions in this conversation. I
> hope that taken all together, rather than as individual spot solutions,
> they can be a consensus recommendation.
>
>
> ISPs should, as much as possible, reissue the same prefix to customers.
> Some things ISPs can do to increase the chances of this:
>
>    1. Share lease information between redundant DHCPv6 servers. Most ISPs
>       probably have redundant servers, since this is critical provisioning
>       infrastructure. It may be difficult to synch information between
>       servers for millions of leases over tens of milliseconds of latency;
>       see RFC6853, "DHCPv6 Redundancy Deployment Considerations." Maybe DHCP
>       vendors can report.
>
>    2. Aggregate above the provider edge device, so that grooming customers
>       between Provider Edge boxes (PEs) doesn't force a renumbering. It's
>       been a few years since I worked on CMTSs, but when I did they did not
>       support MP-BGP well (if at all), so routes had to be aggregated on the
>       PE, or leaked in the IGP which is bad for convergence time. Maybe PE
>       vendors can report.
>
>    3. Set DHCPv6 lease timers very low prior to grooming events. A short
>       interval during the maintenance window will increase load on the
>       DHCPv6 server until timers have been returned to normal values.
>
>    4. In the case of a PE reboot, use DHCPv6 Bulk Leasequery to rebuild the
>       routing table. I think all of the necessary information is in those
>       responses. Again, last time I was working on CMTSs, this feature was
>       not supported. Maybe PE vendors can report.
>
>
>
> Networks should, as much as possible, be resilient to prefix changes. Some
> things networks can do to improve resilience:
>
>    1. Write a learned prefix to non-volatile memory and issue a DHCPv6 Renew
>       for that prefix on reboot.
>
>    2. Use dynamic DNS and shorter TTLs.
>
>    3. Implement something like NETCONF to distribute prefix information to
>       policy devices like firewalls or SD-WAN controllers. I think a
>       separate document describing this application of NETCONF would make
>       sense.
>
>
>
> In the case of failures, it cannot be assumed that sessions will stay
> active. We try to build in redundancy and resilience where we can, but
> where there's a single point of failure (such as CE or PE), and it fails
> (such as an unplanned reboot), our expectations should be appropriate.
>
> Is this a reasonable summary?
>
> Lee


-- 
===============================================
David Farmer               Email:farmer@umn.edu
Networking & Telecommunication Services
Office of Information Technology
University of Minnesota
2218 University Ave SE        Phone: 612-626-0815
Minneapolis, MN 55414-3029   Cell: 612-812-9952
===============================================