Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

"Jakob Heitz (jheitz)" <jheitz@cisco.com> Thu, 17 December 2020 20:53 UTC

From: "Jakob Heitz (jheitz)" <jheitz@cisco.com>
To: Enke Chen <enchen@paloaltonetworks.com>
CC: "idr@ietf. org" <idr@ietf.org>, Robert Raszuk <robert@raszuk.net>
Date: Thu, 17 Dec 2020 20:51:54 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/63adMNc4oF5owLvY4QbUSM2y-Nw>

TCP_USER_TIMEOUT times the peer's ACK.
That's only half the problem.
The other half is the zero window.
When the peer is advertising a zero window, everything is ACKed, no?
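
One way to cover the zero-window half is to time send-side progress rather than ACKs. The following is a minimal sketch under stated assumptions, not any vendor's implementation: the class name, its parameters, and the idea that the BGP process periodically samples how many bytes are still queued in the socket send buffer are all hypothetical. How the sample is obtained is platform-specific and deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SendProgressWatchdog:
    """Flags a session whose send queue has stopped draining (hypothetical helper)."""

    stall_limit: float                    # seconds without progress before giving up
    _last_queued: int = 0                 # queue depth seen at the previous sample
    _stuck_since: Optional[float] = None  # when the current stall began, if any

    def sample(self, queued_bytes: int, written_since_last: int, now: float) -> bool:
        """Record one sample; return True once the stall limit has been reached."""
        # Bytes that left the queue since the previous sample.
        drained = self._last_queued + written_since_last - queued_bytes
        if queued_bytes == 0 or drained > 0:
            # Queue is empty or shrinking: the peer is still accepting data.
            self._stuck_since = None
        elif self._stuck_since is None:
            # Data is waiting and nothing drained: start timing the stall.
            self._stuck_since = now
        self._last_queued = queued_bytes
        return (self._stuck_since is not None
                and now - self._stuck_since >= self.stall_limit)
```

With stall_limit set to something like the negotiated hold time, a peer that ACKs everything but never opens its window would still trip the watchdog, because the queued bytes never drain.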

Regards,
Jakob.

From: Idr <idr-bounces@ietf.org> On Behalf Of Brian Dickson
Sent: Thursday, December 17, 2020 12:25 PM
To: Enke Chen <enchen@paloaltonetworks.com>
Cc: idr@ietf. org <idr@ietf.org>; Robert Raszuk <robert@raszuk.net>
Subject: Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0



On Thu, Dec 17, 2020 at 11:49 AM Enke Chen <enchen@paloaltonetworks.com> wrote:
Hi, Robert:

The receiver is broken for not closing the session after the holdtime expires, and that certainly needs attention.

However, the rationale for trying to do something on the sender seems to be the following: the session is broken and should have been terminated by the other side, but it has not been, so the sender would like a way to provide an "upper bound" for the session to be terminated deterministically at the transport layer.

The TCP_USER_TIMEOUT option seems to be a good fit in this case.
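
For reference, setting the option on a socket can look roughly like this. This is a minimal sketch assuming Linux and Python; the function name and the 90-second value in the usage note are illustrative, and the fallback option number 18 is Linux-specific. The sketch makes no assumption about how a given stack treats the zero-window case.

```python
import socket

# Python builds on Linux expose the option as socket.TCP_USER_TIMEOUT.
# The fallback number 18 is the Linux value and is an assumption elsewhere.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def set_user_timeout(sock: socket.socket, timeout_ms: int) -> None:
    """Bound how long transmitted data may stay unacknowledged, in milliseconds.

    When the bound is exceeded, TCP closes the connection and the
    application sees a timeout error on the socket.
    """
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms)
```

A BGP speaker might call set_user_timeout(sock, 90_000) on the session socket so that the transport gives up at about the same time the hold timer would.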

I think this thread is suffering from "impedance mismatch".

There is a known issue where the actual TCP stacks of some routers are buggy in ways that break things.

This proposal (TCP_USER_TIMEOUT) assumes that the local speaker's TCP stack isn't buggy, at least with regard to the handling of that option.

I'm not opposed to using that option, but I don't think relying on that exclusively is sufficient.

There is a layer issue, involving BGP protocol and TCP transport, where transport and/or protocol issues (or both) are causing (or at least have caused) global problems.

Having the BGP implementation be cognizant of the state of the TCP connections, and handle behavior violations or boundary condition problems expeditiously, is probably a good idea.

(The common term is "belt and suspenders", or perhaps "trust but verify".)

I.e. if the session is "broken", use all the available mechanisms in increasing order of effectiveness (or extreme-ness) until the connection dies, possibly with some grace periods in the escalation steps.
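
Such an escalation could be sketched as follows. This is only an illustration under stated assumptions: the step names, the grace periods in LADDER, and both function names are hypothetical, and the abort helper relies on the POSIX behaviour of SO_LINGER with a zero timeout (the connection is reset and queued data is discarded).

```python
import socket
import struct
from enum import Enum


class Action(Enum):
    WAIT = "wait"
    SEND_NOTIFICATION = "send NOTIFICATION"
    CLOSE = "graceful close"
    ABORT = "abort with RST"


# (seconds past the point the session was declared broken, step to take).
# The grace periods are made-up values for illustration.
LADDER = [
    (0.0, Action.SEND_NOTIFICATION),
    (10.0, Action.CLOSE),
    (20.0, Action.ABORT),
]


def next_action(overdue_s: float) -> Action:
    """Return the strongest step whose threshold has been reached."""
    chosen = Action.WAIT
    for threshold, action in LADDER:
        if overdue_s >= threshold:
            chosen = action
    return chosen


def abort_connection(sock: socket.socket) -> None:
    """Last resort: drop queued data and reset instead of a normal close."""
    # l_onoff=1, l_linger=0 makes close() discard unsent data and send RST.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()
```

The NOTIFICATION in the first step may itself sit unsent behind a zero window, which is why the later steps do not depend on the peer accepting any more data.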

Brian