Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

Keyur Patel <keyur@arrcus.com> Fri, 11 December 2020 23:30 UTC

From: Keyur Patel <keyur@arrcus.com>
To: John Scudder <jgs=40juniper.net@dmarc.ietf.org>, Job Snijders <job@sobornost.net>
CC: "idr@ietf.org" <idr@ietf.org>
Date: Fri, 11 Dec 2020 23:30:21 +0000
Message-ID: <2F238121-E468-4D0F-A0FF-9D82E44C3247@arrcus.com>
References: <X9PHRuGndvsFzQrG@bench.sobornost.net> <FCB1ADB7-AD8C-447E-82FE-2EC15B8C3FB9@juniper.net>
In-Reply-To: <FCB1ADB7-AD8C-447E-82FE-2EC15B8C3FB9@juniper.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/q_qDTLUu9QwnW8jwV8UoiTRsOK8>
Subject: Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0
List-Id: Inter-Domain Routing <idr.ietf.org>

One comment inlined #Keyur

On 12/11/20, 12:04 PM, "Idr on behalf of John Scudder" <idr-bounces@ietf.org on behalf of jgs=40juniper.net@dmarc.ietf.org> wrote:

    [all hats on]

    Hi Job,

    Thanks for bringing this up.

    To take the liberty of summarizing your wall of text :-) you’re saying that you believe BGP should tear down its session if it’s unable to send a message for the duration of the hold time. 

    Given that the conversation last time was inconclusive I think this is a good thing for the WG to discuss again. If you want to, you (or someone) could turn the idea into a short draft that updates RFC 4271, and we could have a WG adoption discussion about it. It might help focus the discussion but it’s not mandatory.

    I’ll point out a few things to start with —

    - Making it mandatory to apply hold time to the sending of messages would potentially make BGP peerings less stable. It clearly can’t make them *more* stable. Of course one can argue that if you haven’t been able to send a message for the hold time, the session has failed its metric of usefulness anyway, so any veneer of stability at this point is a harmful sham.
    - If I recall correctly, RST doesn’t work (or may not work) if you’re using the MD5 TCP option. Nothing much to be done, but be aware.
    - There is nothing stopping an implementation from doing what you describe now. The formalism that keeps you within the letter of 4271 would be that the implementation supplies a configuration option, that you set to enable the behavior. Once you’ve done that, when the implementation notices that the hold time has been exceeded in the outbound direction, it generates a ManualStop event for the session. 

#Keyur: +1 to what John said. This could very well be an implementation knob that generates a ManualStop event.
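
To make the idea concrete, here is a minimal sketch of such a knob. It is not taken from any implementation: the names SendHoldMonitor, abort_with_rst, check_session and the `enabled` option are all made up for illustration. It assumes a POSIX-style sockets API where closing with SO_LINGER set to a zero timeout aborts the connection with a RST, and it skips the Cease NOTIFICATION that ManualStop would normally send, on the assumption that the send path is already stalled. It does not address the MD5 caveat John mentions.

```python
import socket
import struct
import time


class SendHoldMonitor:
    """Tracks how long the local speaker has been unable to send anything.

    Hypothetical helper: the class name and the `enabled` knob are illustrative.
    """

    def __init__(self, hold_time, enabled=True, clock=time.monotonic):
        self.hold_time = hold_time      # negotiated Hold Time, in seconds
        self.enabled = enabled          # operator knob; off keeps today's behaviour
        self.clock = clock              # injectable so the logic can be tested
        self.last_progress = clock()    # last time the kernel accepted our bytes

    def note_bytes_sent(self, count):
        """Call with the return value of send(); only real progress resets the timer."""
        if count > 0:
            self.last_progress = self.clock()

    def expired(self):
        """True once nothing could be sent for a full hold time."""
        if not self.enabled or self.hold_time == 0:
            return False                # a Hold Time of 0 means timers are not used
        return self.clock() - self.last_progress >= self.hold_time


def abort_with_rst(sock):
    """Close the socket so that unsent data is discarded rather than drained."""
    # struct linger { l_onoff = 1, l_linger = 0 }; layout assumed to be two ints.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def check_session(monitor, sock, fire_event):
    """Generate ManualStop and drop the connection if the send side has stalled."""
    if monitor.expired():
        fire_event("ManualStop")        # RFC 4271 FSM Event 2
        abort_with_rst(sock)
        return True
    return False
```

A session loop would call note_bytes_sent() after every send attempt and check_session() on each timer tick. With the knob disabled the monitor never expires, so behaviour stays as it is today.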

Regards,
Keyur

    Thanks,

    —John

    > On Dec 11, 2020, at 2:23 PM, Job Snijders <job@sobornost.net> wrote:
    > 
    > 
    > Dear group,
    > 
    > Not too long ago an incident [1] in one Autonomous System resulted in
    > the global Internet being unusable in many parts of the world for
    > multiple hours. Some have reported the root cause was a 'configuration
    > error', however I believe much of the observed communication blackouts
    > in the global routing system stemmed from a pre-existing condition: a
    > specific implementation property present in multiple implementations
    > currently in use in the default-free zone.
    > 
    > Usually when an incident happens in one AS, affected parties can through
    > unilateral action 'route around the problem', but the ability to 'route
    > around problems' critically depends on the ability to distribute
    > WITHDRAW or UPDATE messages. When messages are not processed, what
    > generally was assumed to be a unilaterally solvable problem, now requires
    > coordination between *all* neighbors of the suffering AS.
    > 
    > The global routing system requires every participant to process BGP
    > messages, because the alternative is intervention on thousands of BGP
    > devices to manually shut down thousands of BGP sessions disconnecting the
    > AS suffering from an incident, to help the rest of the default-free
    > zone. I speak from experience when saying that coordinating a disconnection
    > of an AS at global scale is incredibly hard and slow, and many approval
    > levels must be worked through. It takes *hours* of phone calls & email
    > chains, a time window during which internet traffic is routed towards
    > stale (now blackholing) locations.
    > 
    > In the average ISP's network design using IBGP Route Reflectors, these
    > blackout effects are aggravated when BGP sessions landing in such
    > devices are not terminated when TCP causes the BGP session to stall.
    > 
    > The problem of how TCP and BGP-4 can interact has been discussed before,
    > but I'm not sure the working group followed up with any publication
    > detailing the problem and the solution.
    > 
    >    https://mailarchive.ietf.org/arch/msg/idr/q0Sx5d3zZjfOmOQ4lO2OZAHh9Lc/
    > 
    > Does everyone agree BGP-4 sessions MUST be terminated using a TCP RST
    > (instead of a BGP-4 Cease NOTIFICATION) if the peer has indicated for
    > the duration of the Hold Timer that the TCP receive window is zero?
    > I'm fine with there being buttons to make this different, but the
    > default for routers in the global Internet routing system should be to
    > consider the remote peer to be 'a lost cause' when it won't accept new
    > BGP messages for the duration of the hold timer.
    > 
    > Perhaps RFC 4271 Section 6.5 should be amended as follows:
    > 
    > OLD:
    >    If a system does not receive successive KEEPALIVE, UPDATE, and/or
    >    NOTIFICATION messages within the period specified in the Hold Time
    >    field of the OPEN message, then the NOTIFICATION message with the
    >    Hold Timer Expired Error Code is sent and the BGP connection is
    >    closed.
    > 
    > NEW:
    >    If a system does not receive (or is unable to send) successive
    >    KEEPALIVE, UPDATE, and/or NOTIFICATION messages within the period
    >    specified in the Hold Time field of the OPEN message, then the
    >    NOTIFICATION message with the Hold Timer Expired Error Code is sent
    >    and the BGP connection is closed. If the NOTIFICATION message cannot
    >    be sent, the BGP connection is closed.
    > 
    > This is an ongoing problem. I suspect the BGP Nyancat's discoloration [2] at
    > the leftmost eye might have been caused by an active TCP session
    > keeping a stale BGP session alive. But also the observations from "BGP
    > Zombies: an Analysis of Beacons Stuck Routes" [3] could be explained by
    > the problematic interaction between TCP and BGP.
    > 
    > I appreciate the work the IDR working group has done to *SOFTEN* the
    > blow from implementation defects on global routing (RFC 7606 is a
    > brilliant example of this), but I fear in this case there is no subtle
    > way to say goodbye when the peer doesn't process messages in a timely
    > fashion. It might be good to document this.
    > 
    > Kind regards,
    > 
    > Job
    > 
    > [1]: https://www.reuters.com/article/level-3-communi-outages-idUSL2N1CB00C
    > [2]: https://labs.ripe.net/Members/cteusche/bgp-meets-cat
    > [3]: https://www.iij-ii.co.jp/en/members/romain/pdf/romain_pam2019.pdf
    > 
    > _______________________________________________
    > Idr mailing list
    > Idr@ietf.org
    > https://www.ietf.org/mailman/listinfo/idr
