Re: [Idr] Questions regarding BGP capabilities (RFC5492)

Jeffrey Haas <jhaas@pfrc.org> Thu, 18 April 2024 14:45 UTC

From: Jeffrey Haas <jhaas@pfrc.org>
In-Reply-To: <ZiEspNe+RcgX9noV@diehard.n-r-g.com>
Date: Thu, 18 Apr 2024 10:45:26 -0400
Cc: idr@ietf.org
Message-Id: <C1EC10D7-835E-49D3-B33A-7BED9B85318C@pfrc.org>
References: <ZiD21ViYxwCLju8p@diehard.n-r-g.com> <2119777B-672A-4B3E-928C-84D4FA5FCCE5@pfrc.org> <ZiEspNe+RcgX9noV@diehard.n-r-g.com>
To: Claudio Jeker <cjeker@diehard.n-r-g.com>
X-Mailer: Apple Mail (2.3696.120.41.1.8)
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/WrE92qTQBC5g9OJ8HUcyqxBIMDg>
Subject: Re: [Idr] Questions regarding BGP capabilities (RFC5492)

Claudio,


> On Apr 18, 2024, at 10:22 AM, Claudio Jeker <cjeker@diehard.n-r-g.com> wrote:
> 
> On Thu, Apr 18, 2024 at 09:02:10AM -0400, Jeffrey Haas wrote:
>> 
>> 
>>> On Apr 18, 2024, at 6:32 AM, Claudio Jeker <cjeker@diehard.n-r-g.com <mailto:cjeker@diehard.n-r-g.com>> wrote:
>>> Are there still BGP implementations out there that do not support
>>> capabilities (RFC5492)?
>> 
>> "Implementations", yes.  From current vendors or major open source packages, probably not.
>> 
>> BGP-4 is simple enough for basic ipv4 applications that people continually make their own stuff even in spite of available tooling.
> 
> Guess people need to slowly update their own stuff :)

A strong lesson of BGP, partially covered in the IETF 119 technical deep dive talk I gave with John Scudder, is the importance of incremental deployment.  Part of that lesson is that old software lingers for a very long time.


> While we can configure session to have no capabilities (you need to turn
> of 4-byte ASN, open policy and rrefresh) it is not our default. So people
> with their own stuff need to fumble extra config knobs. Also I wonder how
> many of those "implementations" do proper error handling :)

My implementation advice would be that most general purpose BGP stacks should have the ability to fall back to "vanilla" RFC 4271.
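
As a rough illustration of that fallback: RFC 5492 says a speaker learns that its peer lacks capabilities support when an OPEN carrying the Capabilities Optional Parameter is answered with a NOTIFICATION whose subcode is Unsupported Optional Parameter, and that it SHOULD then retry without the parameter. The sketch below assumes that reading; the function name and its arguments are hypothetical and not taken from any particular stack.

```python
OPEN_MESSAGE_ERROR = 2              # NOTIFICATION error code (RFC 4271)
UNSUPPORTED_OPTIONAL_PARAMETER = 4  # OPEN Message Error subcode (RFC 4271)


def should_retry_without_capabilities(sent_capabilities, error_code, error_subcode):
    """Decide whether the next OPEN should omit the Capabilities parameter.

    Hypothetical helper: True only when our OPEN carried capabilities and
    the peer rejected it with OPEN Message Error / Unsupported Optional
    Parameter, which is how a plain RFC 4271 speaker is expected to react.
    """
    return bool(
        sent_capabilities
        and error_code == OPEN_MESSAGE_ERROR
        and error_subcode == UNSUPPORTED_OPTIONAL_PARAMETER
    )
```

A real stack would probably also remember the result per peer, so that it does not flap between the two OPEN styles on every reconnect.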

Special purpose stacks can do whatever they wish, although I'd similarly remind implementors that Bad Things done in a local stack can leak into the Internet and cause damage if you're not careful.  Based on some of our outages over the years, "not careful" is too common.

>> Generally, the desired behavior is you list the state you don't like.  
>> 
>> State you want but isn't present shouldn't be sent back in the notification.
> 
> Isn't that the inverse of the text in the RFC?
>    ... the Unsupported Capability NOTIFICATION is a way for a BGP speaker
>    to complain that its peer does not support a required capability 

It's a regular point of contention, yes.  I suspect John Scudder even has some specific stuff in his archive covering the point.

The use cases are mixed:
- A router requires you to have a specific capability supported to let the session come up.  Pick, for example, 4-byte ASes.  That's something you could signal in your notification.
- For features like multiprotocol that send a vector of things, what do you send back?  Your own set, which the peer has already seen?  Their set, which you didn't like? (Juniper's behavior).

A thing that has complicated this over the years is that if you don't like a capability, you're welcome to ignore it.  Most features signaled by capability require both sides to assent to their use.
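
To make the first use case concrete, here is a minimal sketch. It assumes the "required capability is missing" reading of RFC 5492, which is the contested point in this thread and not the only behavior seen in the field. The helper names and the `required` mapping are hypothetical. The numeric codes (error code 2, subcode 7, capability codes 1, 2 and 65) and the message layout come from RFC 4271, RFC 5492 and the capability registry.

```python
import struct

OPEN_MESSAGE_ERROR = 2       # NOTIFICATION error code (RFC 4271)
UNSUPPORTED_CAPABILITY = 7   # OPEN Message Error subcode (RFC 5492)
MSG_NOTIFICATION = 3         # BGP message type (RFC 4271)

CAP_MULTIPROTOCOL = 1        # RFC 4760
CAP_ROUTE_REFRESH = 2        # RFC 2918
CAP_FOUR_OCTET_AS = 65       # RFC 6793


def encode_capability(code, value=b""):
    """Encode one capability as code, length, value, as in the OPEN message."""
    return struct.pack("!BB", code, len(value)) + value


def negotiated(local_codes, peer_codes):
    """Capabilities both sides advertised; anything else is simply not used."""
    return set(local_codes) & set(peer_codes)


def missing_required(required, peer_codes):
    """Required capabilities (code -> value) that the peer did not advertise."""
    return {code: value for code, value in required.items()
            if code not in peer_codes}


def unsupported_capability_notification(capabilities):
    """Build a NOTIFICATION whose Data field lists the given capabilities."""
    data = b"".join(encode_capability(code, value)
                    for code, value in sorted(capabilities.items()))
    length = 19 + 2 + len(data)  # 19-byte header, code + subcode, then data
    header = b"\xff" * 16 + struct.pack("!HB", length, MSG_NOTIFICATION)
    error = struct.pack("!BB", OPEN_MESSAGE_ERROR, UNSUPPORTED_CAPABILITY)
    return header + error + data
```

For example, a speaker that insists on 4-byte ASes could call `missing_required({CAP_FOUR_OCTET_AS: struct.pack("!I", local_as)}, peer_codes)` and send the notification only if the result is non-empty. The multiprotocol case in the second bullet has no equally obvious answer, so the sketch does not attempt it.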


> 
> At least my understanding is that 'Unsupported Capability NOTIFICATION' is
> used to inform the peer that a required capability is missing. Stuff you
> don't like falls under:
>    It MUST NOT be used when a BGP speaker receives a capability that it
>    does not understand; such capabilities MUST be ignored.
> 
> In fact my interpretation of that sentence is that even badly encoded
> capabilities need to be ignored and no NOTIFICATION should be sent back in
> that case.

Syntactically invalid capabilities are a good reason not to permit the session to come up.

If the high-level enveloping of the capabilities is good, you only know that a capability is syntactically invalid if you understand the capability in question.
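
The distinction between the envelope and the contents can be sketched as two passes. This is only an illustration: `KNOWN_LENGTHS`, `MalformedCapabilities`, `parse_capabilities` and `classify` are hypothetical names, and the "understood" set is limited to three capabilities whose value lengths are fixed by their RFCs. What to do with the `invalid` list (which notification, if any) is exactly the judgment call discussed here.

```python
import struct

# Expected value lengths for the capabilities this sketch "understands".
KNOWN_LENGTHS = {
    1: 4,    # Multiprotocol Extensions (RFC 4760): AFI, reserved, SAFI
    2: 0,    # Route Refresh (RFC 2918): no value
    65: 4,   # 4-octet AS number (RFC 6793)
}


class MalformedCapabilities(ValueError):
    """The code/length/value envelope itself could not be parsed."""


def parse_capabilities(param_value):
    """Split a Capabilities parameter value into (code, value) pairs.

    Only the envelope is checked here: each entry needs a 2-byte header
    and a value that fits inside the parameter.
    """
    caps = []
    offset = 0
    while offset < len(param_value):
        if offset + 2 > len(param_value):
            raise MalformedCapabilities("truncated capability header")
        code, length = struct.unpack_from("!BB", param_value, offset)
        offset += 2
        if offset + length > len(param_value):
            raise MalformedCapabilities(f"capability {code} overruns the parameter")
        caps.append((code, param_value[offset:offset + length]))
        offset += length
    return caps


def classify(caps):
    """Sort parsed capabilities into understood, ignored, and invalid codes."""
    understood, ignored, invalid = [], [], []
    for code, value in caps:
        if code not in KNOWN_LENGTHS:
            ignored.append(code)      # not understood: MUST be ignored
        elif len(value) != KNOWN_LENGTHS[code]:
            invalid.append(code)      # understood, but syntactically wrong
        else:
            understood.append(code)
    return understood, ignored, invalid
```

Note that an unknown capability with a nonsensical value still lands in `ignored`: without understanding it, there is nothing to validate beyond the envelope.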

Juniper's choice for reporting such malformations is a bit peculiar, but picking the most appropriate code/subcode for notifications is a favorite discussion point for implementors.  Fundamentally, if the code/subcode doesn't trigger an FSM behavior, any notification is good enough to terminate the session, and if you're lucky the code/subcode gives you an opportunity to encode some hint to the peer about what you didn't like.

If the peer didn't encode the right thing... too bad.  Reacting to bad notifications when the session is down is largely impossible.  

> 
> Guess our handling is good enough.

Which is often how such conversations go.

-- Jeff (still filing the bug on the error data in his own implementation)