Re: [Idr] WG LC for draft-ietf-idr-bgp-sendholdtimer-03 (3/23 to 4/12/2024) - Extended to 4/19/2024

Jeffrey Haas <jhaas@pfrc.org> Mon, 15 April 2024 17:53 UTC

Date: Mon, 15 Apr 2024 13:53:02 -0400
From: Jeffrey Haas <jhaas@pfrc.org>
To: Robert Raszuk <robert@raszuk.net>
Cc: Susan Hares <shares@ndzh.com>, "idr@ietf.org" <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/INf2rPiiSDtFvDZBkWUiqCjHPHU>

Robert,

On Mon, Apr 15, 2024 at 10:31:59AM +0200, Robert Raszuk wrote:
> Just for the record I do not support this WG LC.
> 
> The proposed solution is not granular enough. While it is better
> than nothing, why to concentrate energy on a solution which may only react
> to stuck peer for hours (send buffer fill time when only keepalives are
> sent) and only cover subset of cases ?
> 
> We have proposed a TCP extension (which is already in the main kernel
> distro for a long time) which addresses the problem in a much more
> universal way.
> 
> https://www.ietf.org/archive/id/draft-chen-idr-tcp-user-timeout-01.txt

As a reminder to the list, each of the proposals has corner cases, so none
of them fully alleviates the problem on its own.

https://mailarchive.ietf.org/arch/msg/idr/Rai4amR5Q0ZuZ60HaLWFGKhKuBA/

As a reminder from the above, in the case of tcp user timeout, excessively
short timer values can lead to unstable BGP when CPU congestion delays
timely acks.  Or, when there's sufficient packet loss to degrade TCP
throughput, the timeout may contribute to sessions dropping rather than
staying congested until the loss is repaired.
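
For readers unfamiliar with the knob in question, here is a minimal sketch
of setting it, assuming the Linux TCP_USER_TIMEOUT socket option is the
mechanism meant.  The helper name and the 30 second value are illustrative
assumptions, not taken from either draft; picking the value is exactly the
tradeoff described above.

```python
import socket

# TCP_USER_TIMEOUT is exposed by Python on Linux; it may be absent elsewhere.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", None)


def set_tcp_user_timeout(sock, timeout_ms):
    """Hypothetical helper: ask the kernel to abort the connection when
    transmitted data stays unacknowledged for timeout_ms milliseconds.

    Returns True if the option was applied, False if it is unsupported.
    """
    if TCP_USER_TIMEOUT is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # 30000 ms is an arbitrary example value, not a recommendation.
    print("applied:", set_tcp_user_timeout(s, 30000))
    s.close()
```

The False return path is the portability point: a stack without the option
gets no protection from this approach at all.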

But as you point out, a large TCP window plus minimal BGP traffic is a
problem for the sendholdtimer behavior, since the send buffer can take a
long time to fill.

Support for tcp user timeout is nowhere near pervasive enough to try to
force this as the only option to address the issue.  Everyone using POSIX
sockets can benefit from the subsets of the problem addressed by
sendholdtimer.
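
To illustrate why this needs nothing beyond ordinary sockets, here is a
rough sketch of the bookkeeping, assuming the timer tracks how long queued
BGP data has gone without the socket accepting any of it.  The class and
method names and the 480 second value are hypothetical, chosen only for
illustration; this is not the draft's text or any implementation's code.

```python
import time


class SendHoldTimer:
    """Hypothetical helper: tracks how long queued data has made no progress."""

    def __init__(self, send_hold_time, clock=time.monotonic):
        self.send_hold_time = send_hold_time  # seconds; the value is an assumption
        self.clock = clock
        self.pending = 0           # bytes queued by BGP, not yet accepted by the socket
        self.stalled_since = None  # when the current wait for progress began

    def queue(self, nbytes):
        """Record that BGP queued nbytes for transmission."""
        if nbytes > 0 and self.pending == 0:
            self.stalled_since = self.clock()
        self.pending += nbytes

    def wrote(self, nbytes):
        """Call with the byte count returned by a non-blocking send()."""
        if nbytes <= 0:
            return  # no progress; leave the timer running
        self.pending = max(0, self.pending - nbytes)
        # Any progress restarts the timer; an empty queue stops it.
        self.stalled_since = self.clock() if self.pending else None

    def expired(self):
        """True when queued data has made no progress for send_hold_time."""
        return (self.stalled_since is not None
                and self.clock() - self.stalled_since >= self.send_hold_time)


if __name__ == "__main__":
    now = [0.0]
    timer = SendHoldTimer(480, clock=lambda: now[0])
    timer.queue(19)          # one 19-octet KEEPALIVE waiting to be written
    now[0] = 480.0
    print(timer.expired())   # the socket never accepted it
```

Note that wrote() only sees what the local socket buffer accepts.  With a
large window and only keepalives queued, send() keeps succeeding and the
timer keeps restarting, which is the corner case acknowledged above.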

My suggestion to you is to rethink whether your objection is "this doesn't
work" or "this is an incomplete solution".  If it's an incomplete solution,
instead focus on moving the tcp user timeout document forward separately
rather than trying to quash other work addressing at least a portion of the
problem that every BGP implementation can do today.

With regard to this last-second appeal to force a merge of the proposals,
support for tcp user timeout was light during the poll.  I suggest spending
the effort to build support for another round of adoption call.

https://mailarchive.ietf.org/arch/msg/idr/sXpqVZMJCgkRcQ6OHKws_HMdcg0/

-- Jeff