Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0
Job Snijders <job@sobornost.net> Tue, 15 December 2020 21:54 UTC
Date: Tue, 15 Dec 2020 21:54:01 +0000
From: Job Snijders <job@sobornost.net>
To: Christoph Loibl <c@tix.at>
Cc: John Scudder <jgs@juniper.net>, "idr@ietf.org" <idr@ietf.org>, Robert Raszuk <robert@raszuk.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/DQTWtKesQs6sd8M7rYN6UMCd_QE>

On Tue, Dec 15, 2020 at 09:57:47PM +0100, Christoph Loibl wrote:
> Thanks for answering my question in more detail. Maybe I was unclear
> (but reading your email I think we are talking about the same).
>
> On 15.12.2020, at 21:00, John Scudder <jgs@juniper.net> wrote:
> >
> > I think you are talking about this scenario. I’ll copy the example
> > from Rob’s message cited above:
> >
> > rtr-A                rtr-B
> > (congested c-p)      (uncongested c-p)
> > send window: >0      send window: 0
> > recv window: 0       recv window: >0
> >
> > In this case we expect:
> > a) rtr-B does not send any BGP packet (KEEPALIVE/UPDATE/NOTIFICATION)
> >    to rtr-A in normal operating circumstances.
> > b) rtr-A does not expect any KEEPALIVE/UPDATE packets from rtr-B. The
> >    session remains established even if no packet is received in the
> >    holdtime.
> > c) rtr-A continues to send KEEPALIVE packets to rtr-B.
>
> The part I have a problem to understand is b). It is clear that rtr-A
> will not receive any packets from rtr-B because rtr-B cannot send them
> (send window: 0). But does "rtr-A does not expect any KEEPALIVE/UPDATE
> packets from rtr-B" mean that rtr-A has essentially suspended its
> hold-timer until it is ready to receive new messages and opens up its
> recv window? If yes, why? I would expect timers to run independently
> of the transport protocol.

Yeah, I'd expect that too.

We've seen congested BGP implementations continue to send KEEPALIVEs
but not accept (or send!) other BGP messages. And rtr-B's attempts at
KEEPALIVE are just TCP ACKed with zero window.

I'd argue that in the above scenario rtr-A is simply broken and rtr-B
MUST proceed to close down the session towards rtr-A; rtr-B must clean
up and generate WITHDRAWs for any routes pointing to rtr-A. By doing
the clean-up rtr-B does both itself and rtr-A a favor. If the issue was
transient, rtr-A and rtr-B will re-establish a few minutes later
(IdleHoldTimer, right?) and things will normalize.
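
One way to picture the behaviour argued for above is a timer on the
sending side: if queued messages make no progress for the duration of
the hold time, rtr-B closes the session. The following is a minimal
sketch under that assumption, not code from any BGP implementation;
`PeerSession`, `tick` and the other names are hypothetical, and a real
daemon would also send a NOTIFICATION if it can, drop the TCP session
and withdraw the affected routes.

```python
import time
from collections import deque


class PeerSession:
    """Hypothetical per-peer state; all names here are illustrative."""

    def __init__(self, hold_time, clock=time.monotonic):
        self.hold_time = hold_time      # negotiated hold time, in seconds
        self.clock = clock              # injectable so the logic can be tested
        self.out_queue = deque()        # BGP messages not yet written (OutQ)
        self.last_progress = clock()    # last time OutQ was empty or drained
        self.established = True

    def enqueue(self, msg):
        # An empty queue is not a stall: restart the clock when the
        # first message is queued.
        if not self.out_queue:
            self.last_progress = self.clock()
        self.out_queue.append(msg)

    def on_message_written(self):
        # Call when the socket has accepted one whole queued message.
        self.out_queue.popleft()
        self.last_progress = self.clock()

    def send_stalled(self):
        # True when queued data has made no progress for hold_time.
        return (bool(self.out_queue)
                and self.clock() - self.last_progress >= self.hold_time)

    def tick(self):
        # Periodic check: give up on the session when sending is stalled.
        if self.established and self.send_stalled():
            self.established = False
            self.out_queue.clear()
            return "close"   # caller would tear down TCP and withdraw routes
        return "ok"
```

The check deliberately looks only at whether OutQ is draining, so it
runs independently of whatever the peer keeps sending in the other
direction.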

Arguably and measurably, rtr-A is operating its Loc-RIB (forwarding)
based on stale routing information (assuming rtr-A is working at all!):
rtr-A has not received any WITHDRAWs, UPDATEs (or somewhat less
importantly KEEPALIVEs) from rtr-B. Rtr-B is fully aware of this stale
situation, because rtr-B was not able to write these BGP messages to
the network: the messages are still in OutQ. Rtr-A didn't accept any
KEEPALIVE (or UPDATE/WITHDRAW) from rtr-B.

How to solve this? Claudio Jeker took a look at what it would take in
OpenBGPD and came up with the following (tiny!) patch, which should be
readable to most:

    https://marc.info/?l=openbsd-tech&m=160796802508185&w=2

Ben Cox helped me create an 'EBGP peer from hell': a publicly
accessible EBGP multihop instance which can reliably produce the
undesirable TCP/BGP behavior we're discussing here. This 'peer from
hell' will do the OPEN exchange but then manipulates the TCP recvwindow
towards zero. All BGP implementations tested so far (5 famous ones)
appear vulnerable because they continue to consider the BGP session
healthy & stable (meanwhile OutQ keeps growing endlessly and zero BGP
messages go across the wire).

One network operator (with thousands of EBGP sessions in the DFZ)
reported to me that the above stalled-TCP scenario is *not* a common
case on the Internet. On a normal day, a network operator will see no
(zero) sessions stuck this way, which leads me to believe 'recvwind=0'
... *for the duration of the hold timer* is a very strong indicator of
a really broken situation, one which implementations should attempt to
resolve automatically.

I believe BGP implementations are not helping any known deployment
scenarios by *not* disconnecting a stuck peer. On the other hand, we
now know about various operational examples where honoring recvwind=0
for (hours, days) longer than $holdtimer led to global scale problems.
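
To make the stalled-TCP condition concrete, here is a rough loopback
sketch of a peer that accepts a connection and then never reads, so its
receive buffer fills and the sender's queue stops draining. It is not
the 'peer from hell' itself (it speaks no BGP and does no OPEN
exchange); `stall_demo`, the buffer size and the byte limit are
assumptions chosen for illustration, and the exact byte count depends
on the operating system's buffer tuning.

```python
import socket


def stall_demo(host="127.0.0.1", limit=64 * 1024 * 1024):
    """Return how many bytes a sender could queue to a peer that never reads."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Small receive buffer so the advertised window closes quickly.
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 2048)
    srv.bind((host, 0))              # port 0: let the OS pick a free port
    srv.listen(1)
    port = srv.getsockname()[1]

    sender = socket.create_connection((host, port))
    peer, _ = srv.accept()           # the stuck side: accepts, never calls recv()
    sender.setblocking(False)

    queued = 0
    chunk = b"\xff" * 4096           # placeholder bytes, not real BGP messages
    try:
        while queued < limit:
            queued += sender.send(chunk)
    except BlockingIOError:
        pass                         # kernel refused more data: send buffer is full
    finally:
        sender.close()
        peer.close()
        srv.close()
    return queued
```

Calling `stall_demo()` returns once the kernel refuses further writes.
Because the peer never reads, the bytes queued at that point could never
drain, which is the state in which OutQ stops progressing.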

As the 'not-at-all progressing OutQ' situation seems somewhat rare in
the wild (yet continues to happen from time to time) I think it is
worth discussing & documenting how implementers can attempt to prevent
this state from happening. It might help make the Internet 1% more
robust.

BGP implementers (or operators wanting to test their equipment): feel
free to contact me off-list if you'd like to set up an EBGP multihop
session towards the 'peer from hell' testbed. Testing potential
solutions this way is quite easy; the behavior can be triggered within
a few seconds.

Kind regards,

Job

ps. At this moment we have (1) an attempt at problem description, (2) a
demonstration BGP-4 implementation of a 'problem causer', and (3) a
different BGP-4 implementation with a 'solution'. This enables IDR to
test interoperability & (potentially revised) protocol compliance,
hopefully moving the problem a bit from theoretical to practical
reality? :)