Re: [Idr] draft-spaghetti-idr-bgp-sendholdtimer - Feedback requested

Adam Chappell <> Wed, 28 April 2021 11:02 UTC

From: Adam Chappell <>
Date: Wed, 28 Apr 2021 13:01:34 +0200
To: Ben Cox <>, IETF IDR WG <>
Subject: Re: [Idr] draft-spaghetti-idr-bgp-sendholdtimer - Feedback requested

Hi Ben, I think you have a good draft here, but personally I think
this should probably be a simple advisory to implementors, e.g. an
implementation MAY want to consider terminating sessions to peers to
which it has been unable to write() for a configurable amount of
time, maybe similar to Job's original change of wording[1] to Section
6.5. That leaves the door open for "profiles", as some have said, e.g.
"timeliness", "longevity" etc.

It is significant to me that at least two production implementations
I've seen offer visible CLI instrumentation when keepalives haven't
been written on schedule, so it seems that implementors are definitely
aware of the weakness, even if they are not acting on the situation
today and could do. Same vis-à-vis so-called update-groups and dynamic
definition of same.

I read your compelling blog post[2] and investigations but I have
missed the conclusive link between the phenomenon of zombies
(undisputed) and TCP sessions regularly stalled in this way. I
understand that you can synthesise the problem by creating a BGP
speaker that starves its peer of the oxygen to talk, thus polluting
your own RIB; but I missed the pointer to evidence that it is indeed
this that is generally occurring and causing zombies. Not disputing
that it may be. I realise it may be bad form to associate names of
operators and vendors with bad events, but I think it is important
that we investigate exceptional events so that policy and protocol
changes like this have a sound basis.
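
For anyone wanting to see the synthetic case from the sender's side,
the following is a rough sketch under stated assumptions. It uses a
local socketpair() as a stand-in for the TCP session (so it is not a
BGP speaker and not a real zero window), and the function name and
limits are hypothetical. The reader never reads, so the writer's
non-blocking send() eventually refuses more data, which is roughly
what a peer advertising a zero window looks like to the local system.

```python
import socket


def fill_until_stalled(chunk=b"\x00" * 4096, limit=64 * 1024 * 1024):
    """Write to a peer that never reads until the socket stops accepting data.

    Returns (bytes_sent, stalled). Illustrative only: a socketpair stands
    in for the TCP session, and `limit` is a safety cap on the loop.
    """
    reader, writer = socket.socketpair()
    writer.setblocking(False)
    sent = 0
    stalled = False
    try:
        while sent < limit:
            try:
                sent += writer.send(chunk)
            except BlockingIOError:
                # Buffers are full and the peer is not draining them.
                stalled = True
                break
    finally:
        reader.close()
        writer.close()
    return sent, stalled
```

This only shows that the stall is easy to create on purpose. It says
nothing about how often it happens in production, which is the
evidence I am asking about.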


On Tue, 20 Apr 2021 at 15:53, Ben Cox
<> wrote:
> Hi IDR folk,
> In response to previous discussion over BGP protocol interactions with
> TCP Zero Windows
> (
> me (who is new to the IETF, so please forgive me for my inevitable
> errors) and Job Snijders have decided to attempt to address this and
> focus discussion on handling this case with an internet draft, the gist
> of it is in the abstract of the draft so I won't re-word it here :)
> --
>    This document defines the SendHoldTimer session attribute for the
>    Border Gateway Protocol (BGP) Finite State Machine (FSM).  A session
>    should be terminated if the TCP receive window is zero for the
>    duration of the Send Hold Timer, in this situation the peer is
>    expected to terminate the connection.  For robustness, this document
>    specifies that the local system should also close the connection.
>    This document updates RFC4271.
> --
> We submitted our first draft today (
> ) and we are looking for feedback knowing that it is not complete but
> is likely in a state for some discussion.
> Since this draft tweaks the BGP FSM, we would like to make sure that
> it's done correctly and so we are soliciting help from this working
> group.
> We already have a test BGP speaker that triggers some of these zero
> window edge cases. If that helps, please get in touch with us for the
> peering information.
> Looking forward to feedback!
> Ben


-- Adam.