Re: [Idr] BGP Message level ACKs
Robert Raszuk <robert@raszuk.net> Fri, 10 March 2023 10:13 UTC
From: Robert Raszuk <robert@raszuk.net>
Date: Fri, 10 Mar 2023 11:13:19 +0100
Message-ID: <CAOj+MMH9TjNm98p9k-5afGyQD=mxXVf17P2Jd6azhBq5zmdWDw@mail.gmail.com>
To: Mikael Abrahamsson <swmike@swm.pp.se>
Cc: Claudio Jeker <cjeker@diehard.n-r-g.com>, idr@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/xqGb2dLankLjDjNaIxlWUkeBqk0>
Subject: Re: [Idr] BGP Message level ACKs
Mikael et al.,

> You're free to go work on this (I'm not opposed), but please don't hold up
> sendholdtimer in the meantime.

No - I do not intend to block anything. I am just providing an alternative approach for all of us to consider.

With that, I spent a few minutes writing it up to better explain what I really meant by the BGP ACKs approach. So here it is:

*PROBLEM SUMMARY:*

Operators have reported that today some BGP implementations show no reaction to missing BGP KEEPALIVES or BGP UPDATE MESSAGES over an extended period of time. At the same time they keep generating and sending BGP KEEPALIVES or BGP UPDATE MESSAGES to the other side. That results in a phenomenon called "stuck peer", where the healthy side of the connection has no mechanism to bring such a connection down - possibly introducing a level of routing inconsistency in the network.

The other stated assumption here is that network connectivity between such peers is healthy and the TCP connection is also sound.

*DIAGNOSIS:*

The root of the stated problem is clearly in the buggy side of the BGP session. However, today the protocol does not have a way to detect it. Moreover, BGP KEEPALIVE generation and handling is very often separated from other BGP processing. Smart implementations also try to offload BGP KEEPALIVE processing to independent CPUs or perhaps even hardware (DPUs).

*PROPOSAL - BGP ENHANCED KEEPALIVES MESSAGE:*

When exploring the solution space it appeared that what we are really facing here is a lack of acknowledgements sent by the BGP peer that it is receiving our data.

I originally thought about generating and sending BGP ACK messages, but after digesting the input expressed on the list it appeared that, for the problem at hand, this would be far too complex and involved an extension.

Instead I would suggest defining the exact mirror of today's BGP KEEPALIVE MESSAGE with only one difference - it is sent towards the peer only if the RCV_PEER_MSG flag is set.
And let's make it crystal clear up front that this is NOT in any shape or form a REPLACEMENT of current BGP KEEPALIVES. It is envisioned that both can operate in parallel - as both have different roles. Of course the frequency of both MAY, and in fact SHOULD, be different, and so should the multiplier (counter) after which a peer may act when such BGP ENHANCED KEEPALIVES are missing.

When acting here it is RECOMMENDED to take a phased approach. After missing N ENHANCED KEEPALIVES the peer should be moved out of its current update/peer group and declared a slow peer. After missing N+X ENHANCED KEEPALIVES BGP should inject a syslog WARNING level message indicating that we are dealing with a stuck peer. After missing N+Y ENHANCED KEEPALIVES BGP should close the session.

The number of missing messages selected for each step, as well as the activation of a given step, should be subject to local configuration.

Clearly we can almost fully reuse the current BGP KEEPALIVES code for it - as the only difference (as stated above) is the conditional send when the RCV_PEER_MSG flag is set. That flag (1 bit per peer) should be set every time we receive any BGP message from the peer. Sending a BGP ENHANCED KEEPALIVE MSG resets the flag back to 0.

The frequency of the BGP KEEPALIVE TIMER can be fixed by the specification as a multiplier of the negotiated BGP HOLD TIME VALUE. In addition, BGP CAPABILITIES MUST be used to signal it.

That's about it :) Simplicity!

Cheers,
Robert
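
The flag handling and phased reaction described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class, method, and action names are hypothetical, the values chosen for N, X and Y are arbitrary examples of local configuration, and it assumes X < Y. Timers, capability negotiation and the message encoding are left out.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Hypothetical reactions to missing ENHANCED KEEPALIVES."""
    NONE = "none"
    MARK_SLOW_PEER = "move out of update/peer group, declare slow peer"
    LOG_WARNING = "syslog WARNING: stuck peer"
    CLOSE_SESSION = "close the session"


@dataclass
class Thresholds:
    """Locally configured values (example numbers, assumes x < y)."""
    n: int = 3
    x: int = 2
    y: int = 4


@dataclass
class PeerState:
    rcv_peer_msg: bool = False  # the 1-bit RCV_PEER_MSG flag, per peer
    missed: int = 0             # consecutive missing ENHANCED KEEPALIVES

    def on_message_received(self) -> None:
        # Set every time any BGP message arrives from the peer.
        self.rcv_peer_msg = True

    def on_enhanced_timer_expiry(self) -> bool:
        # Send an ENHANCED KEEPALIVE only if the flag is set;
        # sending resets the flag back to 0.
        if self.rcv_peer_msg:
            self.rcv_peer_msg = False
            return True
        return False

    def on_enhanced_keepalive_received(self) -> None:
        # The peer confirmed it is receiving our data.
        self.missed = 0

    def on_enhanced_keepalive_missed(self, t: Thresholds) -> Action:
        # Phased reaction: slow peer at N, warning at N+X, close at N+Y.
        self.missed += 1
        if self.missed >= t.n + t.y:
            return Action.CLOSE_SESSION
        if self.missed == t.n + t.x:
            return Action.LOG_WARNING
        if self.missed == t.n:
            return Action.MARK_SLOW_PEER
        return Action.NONE
```

With the example thresholds, the third consecutive miss marks the peer as slow, the fifth logs the warning, and the seventh closes the session; any ENHANCED KEEPALIVE received in between resets the count.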