Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

William McCall, Sat, 19 December 2020 10:38 UTC

On Fri, Dec 18, 2020 at 10:33 PM John Scudder wrote:
> On Dec 18, 2020, at 1:09 PM, Enke Chen wrote:
> >
> > No, I am not assuming that packets are getting somewhere. The TCP_USER_TIMEOUT would work as long as there is "pending data" (either unacked, or locally queued). The data can be from the local BGP Keepalives or the TCP_KEEPALIVE.
> Apart from the other objections to relying on TCP_USER_TIMEOUT, which I think are sufficient, it’s not clear to me that implementations will provide the desired semantics. RFC 793 seems like it specifies the right semantics (“get this data to the peer within N seconds or close”):
>         The timeout, if present, permits the caller to set up a timeout
>         for all data submitted to TCP.  If data is not successfully
>         delivered to the destination within the timeout period, the TCP
>         will abort the connection.  The present global default is five
>         minutes.
> However the Linux man page documents different semantics:
>        TCP_USER_TIMEOUT (since Linux 2.6.37)
>               This option takes an unsigned int as an argument.  When the
>               value is greater than 0, it specifies the maximum amount of
>               time in milliseconds that transmitted data may remain
>               unacknowledged before TCP will forcibly close the
>               corresponding connection and return ETIMEDOUT to the
>               application.  If the option value is specified as 0, TCP will
>               use the system default.
> The important difference being that whereas 793 implies data written to the socket, the Linux man page says “transmitted” data, which seems like it must mean data TCP has written to the network. These are two very different things! If Linux (or another stack) implements what the man page seems to say, it’s not useful for our purposes.
> —John

I was curious too. I read the man page, the relevant Linux kernel code,
and the RFC, and hacked up a test case (unicast me if you want the code).
Cloudflare also published a relevant blog entry[0]. For this specific
scenario, see under the sub-heading "Zero window ESTAB is...

TCP_USER_TIMEOUT doesn't appear to kick in until there is unACKed
data, meaning data that has already been transmitted from TCP's
perspective. Data sitting in the send buffer while the connection is in
persist state doesn't seem to count, per both the test results and the
docs. I think this confirms your reading of the man page.
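
For anyone who wants to poke at this themselves, here is a minimal sketch of
how the scenario could be set up on Linux. It is not the test case mentioned
above. The helper names (set_user_timeout, fill_until_blocked), the 5000 ms
timeout, and the fallback constant 18 (the Linux value of TCP_USER_TIMEOUT)
are assumptions made for illustration. The sketch only builds the stalled
connection; it makes no claim about when or whether the kernel aborts it.

```python
import socket

# Linux value from <linux/tcp.h>; used only if this Python build lacks the name.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def set_user_timeout(sock, timeout_ms):
    """Set TCP_USER_TIMEOUT (milliseconds); return the value the kernel reports."""
    sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms)
    return sock.getsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT)


def fill_until_blocked(timeout_ms=5000, chunk=b"\x00" * 4096):
    """Connect to a local peer that never reads, then write until send() would block.

    Returns (client, peer, listener, bytes_queued). The caller closes the sockets.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    peer, _ = listener.accept()  # accepted, but deliberately never read from

    set_user_timeout(client, timeout_ms)
    client.setblocking(False)

    queued = 0
    try:
        while True:
            # A non-blocking send may accept only part of the chunk.
            queued += client.send(chunk)
    except BlockingIOError:
        pass  # the local send buffer is full because the peer is not draining data
    return client, peer, listener, queued
```

After fill_until_blocked() returns, the client has data queued that the peer
will never read, which is the state under discussion. Holding the sockets open
for longer than the timeout and then checking whether a further send() fails
with ETIMEDOUT would show what a given kernel does; results may differ between
kernel versions.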


William McCall