Re: Stability and Resilience (was Re: [v6ops] A common...)

David Farmer <farmer@umn.edu> Fri, 22 February 2019 17:54 UTC

References: <6D78F4B2-A30D-4562-AC21-E4D3DE019D90@consulintel.es> <B6E2EC33-EEAF-40D0-AFCC-BDAFA9134ACD@consulintel.es> <20190220113603.GK71606@Space.Net> <28fbc2c305c640c9afb3704050f6e8d7@boeing.com> <20190220213107.GS71606@Space.Net> <019c552eb1624d348641d6930829fd1f@boeing.com> <CAKD1Yr0HBG+rhyFWg9zh0t3mW486Mjx9umjn+CRqAZg4z9r0dg@mail.gmail.com> <20190221073530.GT71606@Space.Net> <CAO42Z2wmB2W52b4MZ2h9sW5E9cQKm-HRjyf--q8C26jezS7LXQ@mail.gmail.com> <a73818d31db7422b99a524bc431b00ed@boeing.com> <CAO42Z2z9-48Gbb_Exf+oWUqDO=axSLpZBtqeDcxkAoFq5OziGw@mail.gmail.com> <0629af5e-5e1b-7e01-5bf4-b288a2d36809@asgard.org>
In-Reply-To: <0629af5e-5e1b-7e01-5bf4-b288a2d36809@asgard.org>
From: David Farmer <farmer@umn.edu>
Date: Fri, 22 Feb 2019 11:53:56 -0600
Message-ID: <CAN-Dau0UsVpcZ4TRraQLR9GrqenrCgmh3yq67DaXBXv1SPywJg@mail.gmail.com>
Subject: Re: Stability and Resilience (was Re: [v6ops] A common...)
To: Lee Howard <lee@asgard.org>
Cc: 6man WG <ipv6@ietf.org>, IPv6 Ops WG <v6ops@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ipv6/S1jGtCDaz6MnOMnD2XVGoegHQlE>

Generally, I agree with what you are saying, but I'd like to see something
like the following added as well:

Even if an ISP intends to change the IPv6 prefix regularly in the longer
term, say every few months or, at an extreme, every month, in the shorter
term IPv6 prefixes SHOULD be stable for periods of hours, days, and maybe
even weeks at a time.  Put another way, CPE devices SHOULD NOT get a new
IPv6 prefix every time they are rebooted.  Note: even in locations where
utility power is generally stable, power outages frequently occur in
clusters over a few hours or days.  This happens when an emergency repair
restores power and the more permanent repairs that follow cause further
short outages in the following hours or days.  In this scenario, the
outages in a cluster SHOULD NOT each result in the CPE receiving a
different IPv6 prefix.
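
One way to picture the short-term stability described above is a
delegating server that keeps a client's binding for a hold time after it
was last seen, so a reboot inside that window returns the same prefix.
This is only a rough sketch under assumptions of mine: the class name
StickyPrefixPool, the hold_seconds parameter, the use of a DUID string as
the key, and the documentation prefix 2001:db8::/48 are all hypothetical,
and no real DHCPv6 server is claimed to work this way.

```python
import ipaddress


class StickyPrefixPool:
    """Hypothetical delegation pool that remembers past bindings.

    Illustration only: a real server tracks lease lifetimes, persistence,
    and failover state that this sketch ignores.
    """

    def __init__(self, pool, delegated_len, hold_seconds):
        # Enumerating every subnet is fine for a small example pool only.
        net = ipaddress.ip_network(pool)
        self.free = list(net.subnets(new_prefix=delegated_len))
        self.hold_seconds = hold_seconds
        self.bindings = {}  # client DUID -> (prefix, time last seen)

    def request(self, duid, now):
        # Reclaim prefixes of other clients not seen within the hold time.
        for other, (prefix, last_seen) in list(self.bindings.items()):
            if other != duid and now - last_seen > self.hold_seconds:
                del self.bindings[other]
                self.free.append(prefix)

        if duid in self.bindings:
            # Known client, e.g. a CPE that just rebooted: same prefix.
            prefix, _ = self.bindings[duid]
        else:
            if not self.free:
                raise RuntimeError("prefix pool exhausted")
            prefix = self.free.pop(0)

        self.bindings[duid] = (prefix, now)
        return prefix


# A CPE that reboots an hour later is handed the prefix it had before.
pool = StickyPrefixPool("2001:db8::/48", 56, hold_seconds=7 * 24 * 3600)
first = pool.request("cpe-a", now=0)
again = pool.request("cpe-a", now=3600)
```

The hold time is the knob that separates the two time scales: it can be
days or weeks even if the ISP still renumbers customers every few months
by other means.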

Conversely, when widespread power events occur, affecting thousands or even
tens of thousands of customers, it may not be practical or even possible
for an ISP to guarantee all CPE will receive the same IPv6 prefix they had
before.  Therefore, to the extent possible, CPE and local networks SHOULD
be resilient to their ISP-provided IPv6 prefix changing, sometimes even
unexpectedly.
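
On the CPE side, resilience could look roughly like the following: store
the last delegated prefix in non-volatile storage, compare it with what
the ISP delegates after a reboot, and if it changed, map local addresses
into the new prefix while keeping the subnet and interface bits.  Again
this is a sketch under my own assumptions, not a description of any real
CPE: the function names, the JSON state file, and the example prefixes
are hypothetical, and it assumes the old and new prefixes have the same
length.

```python
import ipaddress
import json
import os


def load_saved_prefix(path):
    """Return the prefix saved before the last reboot, or None."""
    try:
        with open(path) as f:
            return ipaddress.ip_network(json.load(f)["prefix"])
    except (FileNotFoundError, KeyError, ValueError):
        return None


def save_prefix(path, prefix):
    """Write the prefix via a temp file so a power cut cannot truncate it."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"prefix": str(prefix)}, f)
    os.replace(tmp, path)


def renumber(addr, old, new):
    """Move addr from the old prefix to the new one, keeping the low bits."""
    addr = ipaddress.ip_address(addr)
    if addr not in old:
        raise ValueError(f"{addr} is not inside {old}")
    if old.prefixlen != new.prefixlen:
        raise ValueError("sketch assumes equal prefix lengths")
    low_bits = int(addr) & int(old.hostmask)
    return ipaddress.ip_address(int(new.network_address) | low_bits)


def on_prefix_delegated(path, new_prefix, local_addrs):
    """Record the delegated prefix; return old -> new address mappings.

    An empty dict means nothing changed (or there was no saved prefix).
    """
    new = ipaddress.ip_network(new_prefix)
    old = load_saved_prefix(path)
    save_prefix(path, new)
    if old is None or old == new:
        return {}
    return {a: str(renumber(a, old, new)) for a in local_addrs}
```

A non-empty result is the signal to update whatever depends on the old
prefix, for example DNS records or firewall rules on the local network.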

Thanks.

On Fri, Feb 22, 2019 at 10:36 AM Lee Howard <lee@asgard.org> wrote:

> I think I have heard the following suggestions in this conversation. I
> hope that taken all together, rather than as individual spot solutions,
> they can be a consensus recommendation.
>
>
> ISPs should, as much as possible, reissue the same prefix to customers.
> Some things ISPs can do to increase the chances of this:
>
>    1. Share lease information between redundant DHCPv6 servers. Most ISPs
>       probably have redundant servers, since this is critical provisioning
>       infrastructure. It may be difficult to synch information between
>       servers for millions of leases over tens of milliseconds of latency;
>       see RFC6853, "DHCPv6 Redundancy Deployment Considerations." Maybe
>       DHCP vendors can report.
>
>    2. Aggregate above the provider edge device, so that grooming customers
>       between Provider Edge boxes (PEs) doesn't force a renumbering. It's
>       been a few years since I worked on CMTSs, but when I did they did
>       not support MP-BGP well (if at all), so routes had to be aggregated
>       on the PE, or leaked in the IGP which is bad for convergence time.
>       Maybe PE vendors can report.
>
>    3. Set DHCPv6 lease timers very low prior to grooming events. A short
>       interval during the maintenance window will increase load on the
>       DHCPv6 server until timers have been returned to normal values.
>
>    4. In the case of a PE reboot, use DHCPv6 Bulk Leasequery to rebuild
>       the routing table. I think all of the necessary information is in
>       those responses. Again, last time I was working on CMTSs, this
>       feature was not supported. Maybe PE vendors can report.
>
>
>
> Networks should, as much as possible, be resilient to prefix changes. Some
> things networks can do to improve resilience:
>
>    1. Write a learned prefix to non-volatile memory and issue a DHCPv6
>       Renew for that prefix on reboot.
>
>    2. Use dynamic DNS and shorter TTLs.
>
>    3. Implement something like NETCONF to distribute prefix information to
>       policy devices like firewalls or SD-WAN controllers. I think a
>       separate document describing this application of NETCONF would make
>       sense.
>
>
>
> In the case of failures, it cannot be assumed that sessions will stay
> active. We try to build in redundancy and resilience where we can, but
> where there's a single point of failure (such as CE or PE), and it fails
> (such as an unplanned reboot), our expectations should be appropriate.
>
> Is this a reasonable summary?
>
> Lee


-- 
===============================================
David Farmer               Email:farmer@umn.edu
Networking & Telecommunication Services
Office of Information Technology
University of Minnesota
2218 University Ave SE        Phone: 612-626-0815
Minneapolis, MN 55414-3029   Cell: 612-812-9952
===============================================