Re: [GROW] Handling of LAGs in Mitigating Negative Impact of Maintenance through BGP Session Culling

Thomas King <thomas.king@de-cix.net> Thu, 11 January 2018 21:34 UTC

From: Thomas King <thomas.king@de-cix.net>
To: Job Snijders <job@instituut.net>, Will Hargrave <will@harg.net>
CC: "grow@ietf.org" <grow@ietf.org>, "draft-ietf-grow-bgp-session-culling@ietf.org" <draft-ietf-grow-bgp-session-culling@ietf.org>
Date: Thu, 11 Jan 2018 21:34:06 +0000
Message-ID: <DF40EDF7-DFF4-44EA-AF7E-A41552E5400A@de-cix.net>
References: <8BB20DB3-61E9-4CAC-B33B-B18CA12C2591@de-cix.net> <20180109113506.GA99435@vurt.meerval.net> <53E19D26-D4C0-4722-8CFE-FDC5BF5C3FBC@harg.net> <20180109155039.GD59807@vurt.meerval.net>
In-Reply-To: <20180109155039.GD59807@vurt.meerval.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/grow/7NrSilC7EmWMkpa0cEbl7lVIdrY>

Hi Job,

On 09.01.18, 16:50, "Job Snijders" <job@instituut.net> wrote:

    On Tue, Jan 09, 2018 at 03:34:46PM +0000, Will Hargrave wrote:
    > On 9 Jan 2018, at 11:35, Job Snijders wrote:
    > > > Our suggestion for handling LAGs looks like this: Typically, a
    > > > minimum number of port members can be defined for a LAG being up.
    > > > The LAG is not touched by BGP Session Culling during a maintenance
    > > > unless this number is undercut. If the number is undercut, the LAG
    > > > is handled by BGP Session Culling as defined in the Internet
    > > > Draft.
    > > > 
    > > > If no value for the minimum number of active port members is
    > > > defined for a LAG, the value 1 should be used as this is the
    > > > behaviour of LAGs today already.
    > >
    > > Is this in context of multi-chassis LAG?
    > 
    > I think if we include anything about LAGs we should make it very clear
    > that you must apply the culling ACL to *either* all ports of a LAG
    > *or* none.  Applying it to half of an MCLAG could be disastrous.
    
    Will, my reading of Thomas' message is slightly different, I don't think
    he is proposing to apply culling ACLs on half the ports of a LAG, he
    proposes that culling ACLs are only applied (to all ports) when more
    than X members of a LAG will be impacted by the maintenance (where X is
    an ixp-participant configurable number).

This is what I meant!
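
To make the rule concrete, here is a minimal sketch in Python. It follows the min-links wording from my earlier message: the LAG is left alone unless the members remaining during the maintenance drop below the configured minimum, and if no minimum is configured the value 1 is assumed. The names `Lag`, `min_links`, `effective_min_links`, and `should_cull` as well as the port names are made up for illustration; none of them come from the draft or from any vendor implementation. The function returns one decision for the whole LAG, so the culling ACL ends up on either all ports or none.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Lag:
    """Hypothetical model of a participant LAG (not defined in the draft)."""
    members: Set[str]                # names of all member ports
    min_links: Optional[int] = None  # minimum active members for the LAG to be up

    def effective_min_links(self) -> int:
        # No configured minimum: fall back to 1, as suggested in the thread.
        return 1 if self.min_links is None else self.min_links


def should_cull(lag: Lag, ports_in_maintenance: Set[str]) -> bool:
    """Return True if culling ACLs should be applied to ALL ports of the LAG."""
    remaining = lag.members - ports_in_maintenance
    return len(remaining) < lag.effective_min_links()


lag = Lag(members={"et-0/0/1", "et-0/0/2", "et-1/0/1", "et-1/0/2"}, min_links=3)
print(should_cull(lag, {"et-0/0/1"}))              # three members remain -> False
print(should_cull(lag, {"et-0/0/1", "et-0/0/2"}))  # two remain, below 3 -> True
```

With the default of 1, a LAG is only culled when every member port is affected by the maintenance.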
    
    > I didn’t realise there were IXPs using MC-LAG. Discovering this may
    > surprise some members.
    
    Thomas, in terms of IETF logistics, the RFC-To-Be draft-ietf-grow-bgp-session-culling
    document has already been submitted to the RFC Editor and is in the
    queue, information on LAGs cannot be added at this point to the
    publication.

Really? My understanding is that it could still be added. That is not typical, but it is doable.

    However, since this is a BCP, the next iteration could include
    additional information as our understanding of the culling practise
    improves, and the BCP number wouldn't change of course.
    
    From what I read in your message your organisation is still in the early
    phases of applying the 'culling' mechanism. I recommend you to (over
    time) carefully take notes of the interaction between LAG and culling,
    and whatever arises as the best current practise is documented in the
    next revision of the BCP.

From my point of view, the culling mechanism is very important for LAGs, and its handling of LAGs should be clearly defined in order for the document to be useful and comprehensive. As we see more and more IXP peers moving to LAGs (n times 10GE and also n times 100GE), this topic is already very important. It will gain more importance in the future as traffic volumes increase.

Best regards,
Thomas