Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

Jeffrey Haas <jhaas@pfrc.org> Wed, 16 December 2020 21:48 UTC

Date: Wed, 16 Dec 2020 17:05:37 -0500
From: Jeffrey Haas <jhaas@pfrc.org>
To: Robert Raszuk <robert@raszuk.net>
Cc: Job Snijders <job@sobornost.net>, "idr@ietf. org" <idr@ietf.org>
Message-ID: <20201216220537.GF24940@pfrc.org>
References: <X9PHRuGndvsFzQrG@bench.sobornost.net> <CAOj+MME4OHmoqJfzNQ4Tj6+wCd1kJVHPfJsDbk_+Xh8fh5G8Dg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAOj+MME4OHmoqJfzNQ4Tj6+wCd1kJVHPfJsDbk_+Xh8fh5G8Dg@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/MZMcIAauYSGdtTGNARLaInn1hQY>
Subject: Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

Robert,

Using your post to make a meta point:

On Sat, Dec 12, 2020 at 10:22:53AM +0100, Robert Raszuk wrote:
> I went back and reread the thread:
> 
>    https://mailarchive.ietf.org/arch/msg/idr/q0Sx5d3zZjfOmOQ4lO2OZAHh9Lc/
> 
> Shouldn't it be better if we first ask implementations to provide show
> command/api to list all peers and min-max durations of TCP Window being 0
> without actually doing any automagic RST/NOTIFICATION/FIN ?
> 
> This could allow to better understand which peers are getting behind in
> their control plane and perhaps also allow to set the RST timer under such
> conditions by operator? If he chooses this to be equal to HOLD TIME so be
> it but I am not sure this would be universally an optimal choice.
> 
> Along the same lines we should perhaps also list per BGP peer number of
> DUPLICATE ACKS, RETRANSMISSIONS etc ...
> 
> Are there implementations already deployed in DFZ allowing such data to be
> displayed per each BGP peer ?

Gathering TCP state from stacks is messy at the best of times, even from
standard stacks.  Doing so usually incurs the equivalent of a system call,
and it probably isn't something you can count on doing regularly without
potential impact on the router.

That said, it's my experience that tracking the windowing state,
cross-correlated with hints of dropped packets (e.g. missing TCP ACKs), tends
to provide a good feel for the health of sessions when done on a time-series
basis.  But such machinery really belongs in the TCP stacks; BGP as an
application is simply one consumer of such telemetry.
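
As a rough illustration of the time-series idea above, here is a minimal
Python sketch.  It is not taken from any implementation: the class name
ZeroWindowTracker, its fields, and the sample format (timestamp, advertised
window, cumulative retransmit count) are all hypothetical, and it assumes
something else (ideally the TCP stack) delivers those samples.  It records
how long the window stays at zero and counts samples where a zero window
coincides with new retransmissions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ZeroWindowTracker:
    """Per-peer tracker of zero-window episodes (hypothetical helper)."""

    zero_since: Optional[float] = None      # start of the current zero-window episode
    durations: List[float] = field(default_factory=list)  # completed episodes, seconds
    last_retrans: int = 0                   # last cumulative retransmit count seen
    suspect_samples: int = 0                # samples with zero window AND new retransmits

    def observe(self, now: float, window: int, total_retrans: int) -> None:
        """Feed one sample; `total_retrans` is assumed to be a cumulative counter."""
        retrans_delta = total_retrans - self.last_retrans
        self.last_retrans = total_retrans

        if window == 0:
            if self.zero_since is None:
                self.zero_since = now       # a new zero-window episode begins
            if retrans_delta > 0:
                self.suspect_samples += 1   # window closed and packets being resent
        elif self.zero_since is not None:
            # Window reopened: close out the episode and record its length.
            self.durations.append(now - self.zero_since)
            self.zero_since = None

    def current_zero_duration(self, now: float) -> float:
        """Seconds the window has been zero right now (0.0 if it is open)."""
        return 0.0 if self.zero_since is None else now - self.zero_since

    def min_max(self) -> Optional[Tuple[float, float]]:
        """Shortest and longest completed zero-window episodes, if any."""
        if not self.durations:
            return None
        return (min(self.durations), max(self.durations))


if __name__ == "__main__":
    tracker = ZeroWindowTracker()
    # Made-up samples: (time in seconds, advertised window, cumulative retransmits)
    for sample in [(0.0, 65535, 0), (10.0, 0, 0), (20.0, 0, 2), (40.0, 4096, 2)]:
        tracker.observe(*sample)
    print(tracker.min_max(), tracker.suspect_samples)
```

An operator-facing "show" command could report min_max() and
current_zero_duration() per peer without taking any automatic action on the
session; how the samples are collected, and at what cost, is left open here.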

This has the feel of work that should be spun off to TSV, if any
standardization is to be done at all.

-- Jeff