Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

Jeffrey Haas <> Fri, 18 December 2020 19:02 UTC

Date: Fri, 18 Dec 2020 14:20:23 -0500
From: Jeffrey Haas <>
To: Brian Dickson <>
Cc: Enke Chen <>, "idr@ietf. org" <>

On Fri, Dec 18, 2020 at 10:28:48AM -0800, Brian Dickson wrote:
> On Fri, Dec 18, 2020 at 10:09 AM Enke Chen <> wrote:
> > No, I am not assuming that packets are getting somewhere. The
> > TCP_USER_TIMEOUT would work as long as there is "pending data" (either
> > unacked, or locally queued). The data can be from the local BGP Keepalives
> > or the TCP_KEEPALIVE.
> Actually, my point was not only about packets getting somewhere, but also
> that the LOCAL implementation of the TCP stack should not be assumed to be
> bug-free (in relevant ways).
> Your response is still assuming that those mechanisms actually work 100%
> reliably 100% of the time.
> Yes, if the implementation works correctly, TCP_USER_TIMEOUT would work.
> However, I'm saying the BGP code should not assume that is the case, and
> put some guard-rails around the behavior.
> The overhead of some small amount of checking, regardless of how it is
> done, is likely quite low.

What's also important is that using this option (TCP_USER_TIMEOUT) takes
away the BGP implementation's ability to make its own decisions.

In the presence of some level of packet drop, the window may not be able to
advance because the ACK covering the head end isn't getting through.  So,
even if some data is getting through and helping open space in the buffer,
this feature may cause us to close the session.
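
For readers who have not used the option under discussion, here is a minimal
sketch of how an application might request it on a TCP socket. It assumes a
Linux-style stack where Python exposes socket.TCP_USER_TIMEOUT; the helper
names set_user_timeout and get_user_timeout are made up for illustration and
are not taken from any BGP implementation. Note that once the option is set,
the decision to abort is made by the kernel, which is exactly the loss of
control described above.

```python
import socket


def set_user_timeout(sock, timeout_ms):
    """Ask the stack to abort the connection if sent data stays
    unacknowledged for timeout_ms milliseconds.

    Returns True if the option was applied, False if the platform
    does not offer it (hypothetical helper, for illustration only).
    """
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)  # not defined everywhere
    if opt is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, timeout_ms)
    except OSError:
        # The constant exists but the running stack rejected it.
        return False
    return True


def get_user_timeout(sock):
    """Read the option back; None if the platform lacks it."""
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)
    if opt is None:
        return None
    return sock.getsockopt(socket.IPPROTO_TCP, opt)
```

A caller would typically pick a value related to the negotiated hold time,
for example set_user_timeout(sock, 90000) for a 90 second hold time; that
choice is an assumption here, not something the thread prescribes.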

Jakob also makes the point that zero window in the send direction isn't
really helped here.

> (If packets are flowing, as viewed by updates and/or keepalives being seen
> from the peer, for example, it might not be necessary to invoke those
> checks? Or the check might only need to be done every $INTERVAL, like every
> minute or two.)

My experience is that using the TCP information in an advisory fashion is
helpful, but hard to make work consistently or portably.  Features that tie
into the stack to try to assess liveness or close sluggish sessions are
helpful if you don't care about the impact.  BGP implementations tend to
care about being resilient, especially for ISP circumstances.
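
As a rough illustration of "advisory" use, the sketch below keeps the
teardown decision with BGP's own hold timer and treats any TCP-level stall
indication as a reason to log, not to close. Everything here is hypothetical:
the PeerState fields, the evaluate function, and the 60 second threshold are
assumptions for the example. How stalled_since would actually be populated is
platform-specific, which is the portability problem mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerState:
    hold_time: float                # negotiated BGP hold time, seconds
    last_rx: float                  # when the last KEEPALIVE/UPDATE arrived
    stalled_since: Optional[float]  # advisory: when send progress stopped


def evaluate(peer, now, stall_warn=60.0):
    """Return "close", "warn", or "ok" for one periodic check."""
    # BGP's own rule: nothing heard from the peer for a full hold time.
    if now - peer.last_rx >= peer.hold_time:
        return "close"
    # Advisory only: the send side looks stuck, but the peer is still
    # talking to us, so report it and leave the session up.
    if peer.stalled_since is not None and now - peer.stalled_since >= stall_warn:
        return "warn"
    return "ok"
```

Run every $INTERVAL, this never closes a session on TCP hints alone; an
operator or a later policy can act on the "warn" result.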

-- Jeff