Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

Robert Raszuk <robert@raszuk.net> Sat, 19 December 2020 22:19 UTC

References: <CANJ8pZ-WMDotkQvhN-NuP7ivZkPRR-9S2KJSar=6463U0VKkow@mail.gmail.com> <EFC56A31-1276-4DAB-9526-9C2F24814D2C@pfrc.org> <CANJ8pZ_LnDna_jtipcLJq9rrS3MM32rLdxRW8ntC2aEi9VvzMg@mail.gmail.com> <722A787A-5B83-4802-A9F4-AB2957BB3305@juniper.net> <CA+eZshBse4g6jUBMxs4bJiE+uvWScwv7ggLNOMJbUiL1YsaisQ@mail.gmail.com> <CABNhwV1ikHAknsfNDw6GJ8BngHDNjNdCxmgipJvJ7G3rxmnZVA@mail.gmail.com>
In-Reply-To: <CABNhwV1ikHAknsfNDw6GJ8BngHDNjNdCxmgipJvJ7G3rxmnZVA@mail.gmail.com>
From: Robert Raszuk <robert@raszuk.net>
Date: Sat, 19 Dec 2020 23:18:48 +0100
Message-ID: <CAOj+MMHM0bHHL9UfVZC2QWy6=W5F7QtEq9v-rndcUG0u7CLi1Q@mail.gmail.com>
To: Gyan Mishra <hayabusagsm@gmail.com>
Cc: William McCall <william.mccall@gmail.com>, John Scudder <jgs=40juniper.net@dmarc.ietf.org>, "idr@ietf.org" <idr@ietf.org>, Enke Chen <enchen@paloaltonetworks.com>
Content-Type: multipart/alternative; boundary="00000000000063c0e105b6d89cad"
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/Fs34i6TkGaxxQij7E-TZ2WHz4Lk>
Subject: Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

Hi Gyan,

> Going down this path does seem a lot more complicated and riskier than
> using BFD.

But BFD is not going to help at all with the problem at hand.

BFD is in the vast majority of cases distributed (and that is a feature, not
a bug), and responses are handled by line cards.

Here we are dealing with bugs in RE/RP-based subsystems, regardless of
whether those are in the TCP or BGP layer.

Thx,
R.

On Sat, Dec 19, 2020 at 10:36 PM Gyan Mishra <hayabusagsm@gmail.com> wrote:

>
> Here is RFC 5482, the TCP User Timeout Option, from the TCPM WG.
>
> https://tools.ietf.org/html/rfc5482
>
> TCPM has a bis draft update to RFC 793 that has more info than the original.
>
> https://datatracker.ietf.org/wg/tcpm/documents/
>
> https://tools.ietf.org/html/draft-ietf-tcpm-rfc793bis-19#page-42
>
>
> From a quick read, there are caveats with devices supporting or not
> supporting the option.
>
> Also, I guess setting the value is tricky as well: set it too low or too
> high, and either could make matters worse with instability.
>
> Going down this path does seem a lot more complicated and riskier than
> using BFD.
>
>
> Kind Regards
>
> Gyan
>
>
> On Sat, Dec 19, 2020 at 5:38 AM William McCall <william.mccall@gmail.com>
> wrote:
>
>> On Fri, Dec 18, 2020 at 10:33 PM John Scudder
>> <jgs=40juniper.net@dmarc.ietf.org> wrote:
>> >
>> > On Dec 18, 2020, at 1:09 PM, Enke Chen <enchen@paloaltonetworks.com> wrote:
>> > >
>> > > No, I am not assuming that packets are getting somewhere. The
>> > > TCP_USER_TIMEOUT would work as long as there is "pending data" (either
>> > > unacked, or locally queued). The data can be from the local BGP Keepalives
>> > > or the TCP_KEEPALIVE.
>> >
>> > Apart from the other objections to relying on TCP_USER_TIMEOUT, which I
>> > think are sufficient, it’s not clear to me that implementations will
>> > provide the desired semantics. RFC 793 seems like it specifies the right
>> > semantics (“get this data to the peer within N seconds or close”):
>> >
>> >         The timeout, if present, permits the caller to set up a timeout
>> >         for all data submitted to TCP.  If data is not successfully
>> >         delivered to the destination within the timeout period, the TCP
>> >         will abort the connection.  The present global default is five
>> >         minutes.
>> >
>> > However the Linux man page documents different semantics:
>> >
>> >        TCP_USER_TIMEOUT (since Linux 2.6.37)
>> >               This option takes an unsigned int as an argument.  When the
>> >               value is greater than 0, it specifies the maximum amount of
>> >               time in milliseconds that transmitted data may remain
>> >               unacknowledged before TCP will forcibly close the
>> >               corresponding connection and return ETIMEDOUT to the
>> >               application.  If the option value is specified as 0, TCP will
>> >               use the system default.
>> >
>> > The important difference being that whereas 793 implies data written to
>> > the socket, the Linux man page says “transmitted” data, which seems like it
>> > must mean data TCP has written to the network. These are two very different
>> > things! If Linux (or another stack) implements what the man page seems to
>> > say, it’s not useful for our purposes.
>> >
>> > —John
>> > _______________________________________________
>> > Idr mailing list
>> > Idr@ietf.org
>> > https://www.ietf.org/mailman/listinfo/idr
>>
>> I was curious too. I read the manpage, relevant linux kernel code, the
>> RFC, and hacked up a test case (unicast me if you want the code).
>> Also, Cloudflare published a relevant blog entry[0]. For this specific
>> scenario, see under the sub-heading "Zero window ESTAB is...
>> forever?".
>>
>> TCP_USER_TIMEOUT doesn't appear to kick in until there is unACKed
>> data, meaning that it has already been transmitted from TCP's
>> perspective. Stuff hanging around in the buffers due to persist state
>> doesn't seem to count, per the test results and the docs. Confirms
>> your thoughts from the reading I think.
>>
>> [0] https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
>>
>> --
>> William McCall
>>
>>
> --
>
> <http://www.verizon.com/>
>
> *Gyan Mishra*
>
> *Network Solutions Architect*
>
> *M 301 502-1347*
> 13101 Columbia Pike, Silver Spring, MD
>
>
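
For readers who want to experiment with the socket option discussed in the quoted thread, here is a minimal Python sketch of setting TCP_USER_TIMEOUT on a Linux TCP socket. The helper names (set_user_timeout, demo) and the choice of a 90-second value mirroring a BGP hold time are assumptions made for illustration, not anything proposed in the thread. Per the man page text and the test results quoted above, the timer appears to cover only transmitted-but-unacknowledged data, so setting it may not by itself close a session stuck against a zero receive window.

```python
import socket

# Python exposes the option as socket.TCP_USER_TIMEOUT only on platforms
# that support it (Linux); on other platforms this is None.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", None)


def set_user_timeout(sock, timeout_ms):
    """Set TCP_USER_TIMEOUT (in milliseconds) and return the value read back.

    Returns None when the platform does not offer or rejects the option.
    """
    if TCP_USER_TIMEOUT is None:
        return None
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms)
        return sock.getsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT)
    except OSError:
        return None


def demo(hold_time_s=90):
    # Assumption for illustration: reuse a BGP hold time as the timeout.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        return set_user_timeout(sock, hold_time_s * 1000)


if __name__ == "__main__":
    print("TCP_USER_TIMEOUT (ms):", demo())
```

The option can be set before connect(), as here, or on an established connection; a value of 0 reverts to the system default, as the man page excerpt states.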