Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

Claudio Jeker <> Thu, 17 December 2020 11:50 UTC

Date: Thu, 17 Dec 2020 12:49:13 +0100
From: Claudio Jeker <>
To: Enke Chen <>
Cc: Job Snijders <>,
List-Id: Inter-Domain Routing <>

On Wed, Dec 16, 2020 at 06:40:01PM -0800, Enke Chen wrote:
> Hi, Folks:
> Regarding the patch for openBGPD pointed out by Job, I do not think it
> would work. When the TCP rcv window from the remote is 0, the BGP keepalive
> can still be queued to the socket buffer. It can take a long time for the
> socket buffer to be filled up by BGP keepalives.

I agree that it can take time for the socket buffers to fill up. This
affects both the send and recv socket buffers. If a client stops calling
recv(2), it will take a long time for an idle session to fill up all
these buffers with 19-byte KEEPALIVEs. At the same time, an idle session
will not result in stale RIB entries.
In the DFZ there is enough update chatter to reduce the time significantly.
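
To put a rough number on "a long time", here is a minimal sketch of the
arithmetic. The function name and the example figures are assumptions for
illustration only: 64 KiB of combined buffer space and one KEEPALIVE every
30 seconds (a third of a 90 second hold time). Real buffer sizes depend on
the system and on both peers.

```c
#include <stdio.h>

/* A BGP KEEPALIVE is just the 19-byte header: marker, length, type. */
#define KEEPALIVE_LEN 19

/*
 * Hypothetical helper: seconds until `buffered_bytes` of queue space is
 * consumed by KEEPALIVEs sent every `interval_s` seconds on an otherwise
 * idle session. The message count is rounded up.
 */
static long
seconds_to_fill(long buffered_bytes, long interval_s)
{
	long msgs = (buffered_bytes + KEEPALIVE_LEN - 1) / KEEPALIVE_LEN;

	return msgs * interval_s;
}

/* Print the estimate in hours for the assumed figures. */
static void
print_estimate(long buffered_bytes, long interval_s)
{
	long s = seconds_to_fill(buffered_bytes, interval_s);

	printf("%ld bytes, one KEEPALIVE per %lds: about %.1f hours\n",
	    buffered_bytes, interval_s, (double)s / 3600.0);
}
```

With the assumed figures, seconds_to_fill(64 * 1024, 30) gives 103500
seconds, so an idle session would need more than a day before the send
side blocks.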

In the end, the goal is to reduce the current state of keeping the
connection open forever to something more reasonable. As mentioned,
filling up the receive buffer and closing the window to 0 already takes
time. This trigger will not be precise, but limiting the socket buffer
size can help to reduce the time.

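Limiting the buffers could look roughly like the sketch below. This is not
the OpenBGPD patch, only an illustration using the standard SO_SNDBUF and
SO_RCVBUF socket options. The function name limit_sockbuf is made up, and
the kernel is free to round, double or clamp the requested size.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: request that both socket buffers of `fd` be capped
 * at `bytes`. Returns 0 on success, -1 on error with errno set by
 * setsockopt(2). The effective size is chosen by the kernel and may
 * differ from the request.
 */
static int
limit_sockbuf(int fd, int bytes)
{
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) == -1)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == -1)
		return -1;
	return 0;
}
```

A smaller send buffer means fewer queued bytes are needed before a write
would block, so the stall shows up sooner. The price is less room for
bursts of UPDATEs on a busy session.
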
> It seems that the TCP_USER_TIMEOUT option can be used for the persistent
> zero-size window issue.  The timeout value could be multiples of the
> holdtimer (with min and max adjustments), perhaps somewhere around 5 or 6
> minutes.

Not all systems implement TCP_USER_TIMEOUT. It is one way to detect
send-side problems. With a large timeout it is probably a good mechanism.
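
For systems that do have it, a minimal sketch could look like this. The
helper names, the multiplier and the min/max bounds are assumptions for
illustration. On Linux the option takes an unsigned int in milliseconds;
where the option is not defined at build time the sketch simply reports
ENOPROTOOPT.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <errno.h>

/*
 * Hypothetical helper: derive a timeout in milliseconds as a multiple of
 * the hold time, clamped to [min_s, max_s] seconds.
 */
static unsigned int
user_timeout_ms(unsigned int holdtime_s, unsigned int multiple,
    unsigned int min_s, unsigned int max_s)
{
	unsigned int t = holdtime_s * multiple;

	if (t < min_s)
		t = min_s;
	if (t > max_s)
		t = max_s;
	return t * 1000;
}

/*
 * Hypothetical helper: set TCP_USER_TIMEOUT on `fd` if the platform
 * defines it. Returns 0 on success, -1 with errno set otherwise.
 */
static int
set_user_timeout(int fd, unsigned int timeout_ms)
{
#ifdef TCP_USER_TIMEOUT
	return setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT,
	    &timeout_ms, sizeof(timeout_ms));
#else
	(void)fd;
	(void)timeout_ms;
	errno = ENOPROTOOPT;	/* option not available on this system */
	return -1;
#endif
}
```

For example, user_timeout_ms(90, 4, 300, 360) yields 360000, i.e. six
minutes for a 90 second hold time, which is in the range suggested above.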

:wq Claudio