Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

Robert Raszuk <robert@raszuk.net> Thu, 17 December 2020 20:07 UTC

References: <CANJ8pZ_02njLOJxJPAW4vT3q0EPGB6WY1ZGemQpfiXNMhadb6A@mail.gmail.com> <CAOj+MMHC_uGRDwEmJJO0QCRXahfinbWw5wLzSQJ=C9CYAma-mw@mail.gmail.com> <CANJ8pZ-rq7MbFBLi26nb2yGJvsfrEcQZzn1ieq3LgnJM1p4ULA@mail.gmail.com>
In-Reply-To: <CANJ8pZ-rq7MbFBLi26nb2yGJvsfrEcQZzn1ieq3LgnJM1p4ULA@mail.gmail.com>
From: Robert Raszuk <robert@raszuk.net>
Date: Thu, 17 Dec 2020 21:07:09 +0100
Message-ID: <CAOj+MMHDjy44EdCYiF6zm_GZaZffJE5gNmUHKS8E+4OeKSeaJg@mail.gmail.com>
To: Enke Chen <enchen@paloaltonetworks.com>
Cc: Job Snijders <job@sobornost.net>, "idr@ietf. org" <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/dOpEAZIEfeduqxWYCRHlkD0DeKo>

Hey Enke,

> for the session to be terminated deterministically at the transport
> layer.

Yes and I would 100% agree with that.

However, the discussion seems to be about terminating it at/by the
application layer merely because it has not been able to write to the
transport socket for N seconds. This mismatch is IMHO a bit ugly.

Your suggestion to solve it at the transport layer itself is sound!
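
As a rough illustration of the transport-level approach, the sketch below
derives a TCP_USER_TIMEOUT value from the hold time and applies it to a
socket. The function names, the multiplier of 4 and the 60 s / 360 s clamps
are assumptions chosen only to land near the "5 or 6 minutes" mentioned in
the quoted text; they are not taken from any BGP implementation. The option
is exposed by Python's socket module on Linux only, so the code checks for
it before use.

```python
import socket


def user_timeout_ms(hold_time_s, multiple=4, min_s=60, max_s=360):
    """Return multiple * hold time, clamped to [min_s, max_s], in milliseconds.

    The multiplier and clamps are illustrative assumptions, not recommendations.
    """
    clamped_s = max(min_s, min(max_s, multiple * hold_time_s))
    return int(clamped_s * 1000)


def apply_user_timeout(sock, hold_time_s):
    """Set TCP_USER_TIMEOUT on sock if the platform exposes the option.

    Returns the value set in milliseconds, or None when unsupported.
    """
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)
    if opt is None:
        return None  # option not available on this platform
    timeout_ms = user_timeout_ms(hold_time_s)
    sock.setsockopt(socket.IPPROTO_TCP, opt, timeout_ms)
    return timeout_ms
```

With a 90 second hold time this yields 360000 ms, after which the kernel,
not the BGP code, is expected to abort a connection whose data stays
unacknowledged or untransmitted.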

Otherwise, with each failed write we start a counter. How often do we retry
writing the same msg? Then, after many failed attempts to queue a BGP
message, the time limit is exceeded and we RST. That triggering logic seems
quite dirty at best... just imagine dealing with 1000s of peers.
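
To put a number on how late a "failed write" trigger could fire when only
keepalives are being sent into a zero window, here is a back-of-the-envelope
sketch. The 19-byte KEEPALIVE size comes from RFC 4271; the 64 KiB send
buffer and the 30 second keepalive interval are assumptions for
illustration, and real kernels account for buffer space differently, so
treat the result as an order of magnitude only.

```python
KEEPALIVE_BYTES = 19  # a BGP KEEPALIVE is just the 19-octet message header


def seconds_until_buffer_full(send_buffer_bytes, keepalive_interval_s):
    """Estimate how long keepalives alone take to fill a send buffer.

    Assumes nothing drains from the buffer (peer advertises a zero window)
    and that no UPDATEs are queued, which would fill it much faster.
    """
    whole_messages = send_buffer_bytes // KEEPALIVE_BYTES
    return whole_messages * keepalive_interval_s


# Hypothetical values: 64 KiB socket buffer, one keepalive every 30 s.
estimate_s = seconds_until_buffer_full(64 * 1024, 30)
print(f"{estimate_s} s, about {estimate_s / 3600:.1f} hours")
```

Under these assumptions the first write would not fail for more than a day,
which is why a per-write counter is a weak signal compared with a transport
timer.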

Cheers,
R.



On Thu, Dec 17, 2020 at 8:48 PM Enke Chen <enchen@paloaltonetworks.com>
wrote:

> Hi, Robert:
>
> The receiver is broken for not closing the session after the holdtime
> expires, and that certainly needs attention.
>
> However, the rationale for trying to do something on the sender seems to be
> the following: as the session is broken and should have been terminated by
> the other side, but it's not, the sender would like to have a way that
> provides an "upper bound" for the session to be terminated
> deterministically at the transport layer.
>
> The TCP_USER_TIMEOUT option seems to be a good fit in this case.
>
> Thanks.   -- Enke
>
> On Thu, Dec 17, 2020 at 2:21 AM Robert Raszuk <robert@raszuk.net> wrote:
>
>> Good catch, Enke!
>>
>> Also, what if TCP rcv takes the BGP messages and passes them to the BGP
>> I/O InQ, which drops them for some reason right there? It looks to me like
>> we are not going to detect any event like this here, but the problem we are
>> trying to address will persist. I think in this thread we are focusing too
>> much on transport vs application level detection.
>>
>> And I will repeat the question already stated... Why would rcv not close
>> the session in spite of missing KEEPALIVEs or UPDATEs?
>>
>> Tx,
>> R.
>>
>> PS. Side note: BGP Operational Message addresses this type of
>> inconsistency by periodically comparing BGP Adj_RIB_In and _Out counters.
>>
>>
>> On Thu, Dec 17, 2020 at 3:41 AM Enke Chen <enchen@paloaltonetworks.com>
>> wrote:
>>
>>> Hi, Folks:
>>>
>>> Regarding the patch for OpenBGPD pointed out by Job, I do not think it
>>> would work. When the TCP rcv window from the remote is 0, the BGP keepalive
>>> can still be queued to the socket buffer. It can take a long time for the
>>> socket buffer to be filled up by BGP keepalives.
>>>
>>> It seems that the TCP_USER_TIMEOUT option can be used for the persistent
>>> zero-size window issue.  The timeout value could be a multiple of the
>>> holdtimer (with min and max adjustments), perhaps somewhere around 5 or 6
>>> minutes.
>>>
>>> Thanks.   -- Enke
>>>
>>> ----------
>>>
>>> Job Snijders <job@sobornost.net> Tue, 15 December 2020 21:54 UTC
>>>
>>> [snip]
>>> How to solve this? Claudio Jeker took a look at what it would take in
>>> OpenBGPD and came up with the (tiny!) following patch, should be
>>> readable to most: https://marc.info/?l=openbsd-tech&m=160796802508185&w=2
>>>