Re: [Idr] Fwd: New Version Notification for draft-spaghetti-idr-bgp-sendholdtimer-05.txt

Robert Raszuk <robert@raszuk.net> Thu, 04 August 2022 18:33 UTC

To: Jeffrey Haas <jhaas@pfrc.org>
Cc: Ben Cox <ben=40benjojo.co.uk@dmarc.ietf.org>, "idr@ietf. org" <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/VgZ_bc8TIHlQBe8eaK1B5KydLTg>

Hello Jeff,

Just to clarify a few things, as I have a feeling that I did not describe
them too well.

> > Long run we could use BGP PING and BFD and retire KEEPALIVEs.
>
> As much as I'm a fan of BFD, it is unlikely to become pervasive for all
> situations where BGP gets used.  Also, BFD is insufficiently coupled to the
> BGP session state.  By analogy, LDP still has keepalives in its TCP session
> even though it has a separate hello layer.
>

What I meant by "retire" was not retiring KEEPALIVEs in general, spec/IETF-wise.
I meant disabling/suspending them only on a given session where both BFD and
BGP PING are operating.

> I don't think the op message helps us here either.  The whole point is
> the connection is unidirectionally stalled and the upstream wants to take
> action on that.
>

Yes ... I mentioned the OP MSG only to highlight that a BGP-level PING would
be very different from it, not in any shape or form to propose it as a
candidate for this discussion.

> Where could this go into BGP?  Ideally, a clean update to the PDU format.
> As a hack?  As two uint32's embedded in the Marker, capability negotiated.
>
> The better question, does this help the situation in question?
> In my opinion, no.
>

Interesting ... Wouldn't it test the BGP -- TCP -- NETWORK -- TCP -- BGP path
bidirectionally? Wouldn't it likely detect more "stuck" elements?

Your paragraph below rather confirms that it would detect a stuck peer.


> In a "ping" or "bidirectional check" mode, the general idea becomes, "I sent
> you seq# X and I want to eventually see this come back to me".
>
> To get the echo reply, the echo must first work its way from the sender's
> TCP send buffer, across the network, through the receiver's TCP receive
> buffer, into the BGP, then the full loop again through their send and your
> receive buffers.  You'd also require a message to piggyback the reply to.
> When there are pending updates, it's easy.  Otherwise, you could force
> another message like keepalive.
>

Note that I am proposing a new BGP message, a BGP PING message, not
piggybacking in the Marker.
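
To make the distinction concrete, here is a rough sketch of how such a
standalone message could be framed. Only the 19-byte BGP header layout
(16-octet all-ones Marker, 2-octet Length, 1-octet Type) comes from RFC 4271.
The type code 250, the request/reply flag and the 32-bit sequence number are
assumptions made up for illustration; nothing like this is specified or
assigned anywhere.

```python
import struct
from typing import Tuple

BGP_MARKER = b"\xff" * 16              # RFC 4271: Marker is all ones
BGP_HEADER = struct.Struct("!16sHB")   # Marker, Length, Type (19 octets)

PING_TYPE = 250                        # HYPOTHETICAL type code, not assigned
PING_BODY = struct.Struct("!BI")       # HYPOTHETICAL body: flag, sequence number


def encode_ping(seq: int, is_reply: bool = False) -> bytes:
    """Build a hypothetical PING message: BGP header followed by flag + seq."""
    length = BGP_HEADER.size + PING_BODY.size
    header = BGP_HEADER.pack(BGP_MARKER, length, PING_TYPE)
    return header + PING_BODY.pack(int(is_reply), seq)


def decode_ping(data: bytes) -> Tuple[int, bool]:
    """Parse a hypothetical PING message and return (seq, is_reply)."""
    marker, length, msg_type = BGP_HEADER.unpack_from(data)
    if marker != BGP_MARKER or msg_type != PING_TYPE or length != len(data):
        raise ValueError("not a well-formed PING message")
    flag, seq = PING_BODY.unpack_from(data, BGP_HEADER.size)
    return seq, bool(flag)
```

The point of the sketch is only that the echo travels as its own message type
through the normal BGP message path, so the Marker stays untouched.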



> (Yes, arguably you could do a less frequent "ping", but now you're also
> adding another set of optional timer procedures.)
>
> By analogy to BFD, you're getting the result of BFD Echo mode rather than
> BFD Async mode - and it's likely far longer than is desired for this use
> case.
>

Yes, this occurred to me too, but it would not be at the BGP level, so I
dismissed it.


> Side reading, some thinking on timestamping BGP updates:
> https://datatracker.ietf.org/doc/html/draft-litkowski-idr-bgp-timestamp-00


Thx


> [1] Discussions with BGP developers over the years about various upgrade
> scenarios often come back to the fact that the holdtimer is unidirectional.
> During an upgrade where the TCP and BGP RIB state is maintained, it's useful
> to simply pause consuming incoming routing state and do nothing other than
> send your keepalives.  The challenge for this discussion is that this
> "feature" is indistinguishable from bugs observed for sessions that are
> stuck for too long.
>

Yes ... and instead of hacking what works fine in, say, 99% of deployments,
give an option to check BGP peer liveness with a BGP-level PING.
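
As a minimal sketch of what such an optional liveness check could look like on
the sending side: remember when each echo request was sent and declare the
peer stuck if a reply is overdue. The class name, the timeout value and the
32-bit sequence space are all assumptions for illustration, not anything taken
from a draft.

```python
class PingLiveness:
    """Tracks outstanding echo requests for one BGP session (illustrative)."""

    def __init__(self, timeout: float):
        self.timeout = timeout      # seconds to wait for a reply (assumed knob)
        self.next_seq = 0
        self.pending = {}           # seq -> time the request was sent

    def send(self, now: float) -> int:
        """Record a new echo request and return its sequence number."""
        seq = self.next_seq
        self.next_seq = (seq + 1) & 0xFFFFFFFF
        self.pending[seq] = now
        return seq

    def on_reply(self, seq: int) -> None:
        """The peer's BGP process echoed seq back, so it is no longer pending."""
        self.pending.pop(seq, None)

    def peer_stuck(self, now: float) -> bool:
        """True if any request has gone unanswered for longer than timeout."""
        return any(now - sent > self.timeout for sent in self.pending.values())


tracker = PingLiveness(timeout=30.0)
seq = tracker.send(now=0.0)
print(tracker.peer_stuck(now=31.0))   # no reply yet, request is overdue
tracker.on_reply(seq)
print(tracker.peer_stuck(now=31.0))   # reply arrived, nothing pending
```

Because the reply has to pass through the peer's BGP process in both
directions, an overdue reply says something about the peer itself and not only
about the TCP connection.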

Again, with all of the above discussion, I have yet to see solid proof of the
seriousness of the issue.

Best,
Robert.