Re: [Idr] AD Review of draft-ietf-idr-wide-bgp-communities-08

Alvaro Retana <aretana.ietf@gmail.com> Mon, 27 March 2023 13:26 UTC

Return-Path: <aretana.ietf@gmail.com>
X-Original-To: idr@ietfa.amsl.com
Delivered-To: idr@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0F99BC151B07; Mon, 27 Mar 2023 06:26:43 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.096
X-Spam-Level:
X-Spam-Status: No, score=-2.096 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_ZEN_BLOCKED_OPENDNS=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_DBL_BLOCKED_OPENDNS=0.001, URIBL_ZEN_BLOCKED_OPENDNS=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([50.223.129.194]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id k7LpPhUizK_Y; Mon, 27 Mar 2023 06:26:39 -0700 (PDT)
Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 2470DC15C528; Mon, 27 Mar 2023 06:26:39 -0700 (PDT)
Received: by mail-pl1-x633.google.com with SMTP id ix20so8434065plb.3; Mon, 27 Mar 2023 06:26:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; t=1679923598; h=content-transfer-encoding:cc:to:subject:message-id:date :mime-version:references:in-reply-to:from:from:to:cc:subject:date :message-id:reply-to; bh=p/tBPu84tGELXGBpYA7KHACYJIXjXIyXJGj5j3WZvNY=; b=Ttjos2fy7BA9D8Y2AufCE2f5fftnMcIORH+qokKnR6ueiyf8LEBC71pA4yEdMK2D7X b20F/oNJhTVOFvOgd8mAffv82QRDwVxq/5e/2RPisOvQ9bLiSZJUra4BIwgfG9VpR47O ODivndsgdD6QONcYR+c7oeG5q0UKDOcjh6G1MqxrYaQn5OXjIHhUZgCyDExazjf1fzR/ fWppNUy5tDIEf2e00FL0NoEONHuwiW+FgFtTCFfsMdXEv9ieYgAeCTNcPSqieMZff1WM wvXmXq69EUrTf7Yhc84vqGT+84UFM2krQ9Mj9Id/rl/mWC4KSj0U+sZErpgibeQF3guz p98w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679923598; h=content-transfer-encoding:cc:to:subject:message-id:date :mime-version:references:in-reply-to:from:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=p/tBPu84tGELXGBpYA7KHACYJIXjXIyXJGj5j3WZvNY=; b=p7I0JvzuFmonLfDTJiwlX/872mzakwYakiajrBbeM3tYtOQCAfu5QrRzA30hmILGny lYZkj/UL5IVomXnwo7jMXnbi1nhXJwM8T6uGSw+iVRLBrao28SscwwIoKuUSMNRJeiv4 cG06YgwRYG2p6j666c+2lFpv+Gg2if4jVWEmgwEwy4RHnarlFSyA7qOIz+4gQphHBFFC krD7bDoa6sXOXdnDx3OkwZ7yczEJxRAgYDXOmqcOJ82izimSXXoKgLRTKGQn3/0XD3Ir 54MJWw5mMkPxBhIUyKwl0VwZwV/0OUF/TpzEu+mhXoEuTbbp0SGHEb7VMak04Kdj5xkP fS/A==
X-Gm-Message-State: AAQBX9dTDXqE/LfjRXt38IwrhflJDyhyFkEoIbv8N8i9O7+ilk37QcNg lgdJowMN4mj55O0rhs3DpNKfGe+a0sMIzBfOrss=
X-Google-Smtp-Source: AKy350aE06akoHC64m6W8RWcTX1FgpMrjS3vzZ12+QgzamKh6CoDQ4GqStBNIGRS1PeQ8ehL0ZPEfQtPVuGjEgVYFNw=
X-Received: by 2002:a17:902:ec90:b0:1a0:763d:6c2a with SMTP id x16-20020a170902ec9000b001a0763d6c2amr4443137plg.10.1679923597931; Mon, 27 Mar 2023 06:26:37 -0700 (PDT)
Received: from 1058052472880 named unknown by gmailapi.google.com with HTTPREST; Mon, 27 Mar 2023 09:26:37 -0400
From: Alvaro Retana <aretana.ietf@gmail.com>
In-Reply-To: <CAOj+MMG6AqcJZbAL57ynoj4bXg=JfSTgUPPx8nbaCBd=DVHY6Q@mail.gmail.com>
References: <CAMMESsw_kC_V-7px1xNbgxsinWW0RFUrugDmu2ziF8ioxqxCRQ@mail.gmail.com> <CAOj+MMG6AqcJZbAL57ynoj4bXg=JfSTgUPPx8nbaCBd=DVHY6Q@mail.gmail.com>
MIME-Version: 1.0
Date: Mon, 27 Mar 2023 09:26:37 -0400
Message-ID: <CAMMESswSYtnQJUwpdmG=R_dJ4eXpGg+FS5TV1-YBe-0b2t2+7g@mail.gmail.com>
To: Robert Raszuk <robert@raszuk.net>
Cc: idr-chairs@ietf.org, draft-ietf-idr-wide-bgp-communities@ietf.org, idr@ietf.org, Keyur Patel <keyur@arrcus.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/4X7pyjE1IICxXih7sbSVCR4puHk>
Subject: Re: [Idr] AD Review of draft-ietf-idr-wide-bgp-communities-08
X-BeenThere: idr@ietf.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: Inter-Domain Routing <idr.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/idr>, <mailto:idr-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/idr/>
List-Post: <mailto:idr@ietf.org>
List-Help: <mailto:idr-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 27 Mar 2023 13:26:43 -0000

On March 1, 2023 at 9:35:29 AM, Robert Raszuk wrote:


Robert:

Hi!

> Part 1 of 2 below. Part 2 will be done tomorrow.

Some replies below.  I'll get to Part 2 soon.

Thanks!

Alvaro.


...
> Fixed (moved explanation of fields to section 3.1).
>
> > 180 Container Type:
> > 181 Container Type 1, BGP Wide Community is defined in this document.

[minor] This is not a definition of the field.

   Container Type:  Indicates the Type of the container.



...
> > [major] Allowing only one Community Container doesn't permit the
> > ability to create both local *and* external policies for a given
> > prefix. All BGP Wide Communities would either be transitive or not.
> > If transitive, any local policy would be propagated. There is a
> > mention (in §10.1) of the ability to remove unwanted received
> > communities, but the document says nothing about controlling the
> > advertisement by filtering. Should this functionality exist? Please
> > add some text.
>
> There is no restriction on the number of Community Containers in BGP
> Community Container Attribute.
>
> Each container (on a per container granularity) is also subject to
> selective propagation rules.
>
> Section 3 contains this text already which IMHO describes it pretty well:
>
> "The attribute contains a set of typed containers. Any given container
> type may appear multiple times, unless that container type's
> definition specifies otherwise."

Yes, it does.

Using the hierarchy that you showed later - very helpful!

> Attribute
> |
> --------- Container Type
> |
> ------------ Community Value
> |
> -------------- TLVs with ATOMs

(1) The attribute itself can only occur once, which may result in a
very big attribute.  Maybe we don't have to worry about this just yet,
but what is the expected behavior if the attribute can't be propagated
any longer because it won't fit in the BGP message?  Specifically, is
this attribute eligible for "attribute discard" (§4/rfc8654)?  I hope
the answer is no.


(2) The container types can occur multiple times.  How are conflicts
resolved?  For example, several Wide Communities can be present -- if
they all have the same Community Value, Source/Context AS, and Target
TLVs, but different Parameters TLV (i.e. different actions).  In this
case, the scope of the container may be different, but that doesn't
determine where the actions take place.


(3) There's only one Community Value per Container Type, right?


(4) The Wide Community TLVs can only appear once.


(5) The Atoms (like the TLVs) can only occur once.



> I added the below to further clarify this:
>
> "Each such container can have a unique propagation scope and can be
> subject to local filtering."

Good.

The Security Considerations section talks about limiting the
propagation to avoid memory use, etc.  What about the case where the
propagation is limited (by a rogue node or misconfiguration) resulting
in the expected actions not being executed?  Please include something
about this in the Security Considerations.

This is not a new threat, just something that continues here.



> > 242 3.1. BGP Community Container Attribute Common Header
> >
> > 244 Containers always start with the following common header:
> >
> > 246 0 1 2 3
> > 247 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> > 248 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> > 249 | Type | Flags |C|T| Reserved |
> > 250 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> > 251 | Length |
> > 252 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...
> > [major] Should IANA manage the potential allocation of the reserved
> > bits? If so, please include the request for a new registry in §11.
>
> After some time digesting it I don't think that IANA should manage it. Reason
> being is that new definitions require much more than new IANA type number and
> perhaps new bit(s).
>
> It requires a new draft/RFC. And the definition of those bits would be per
> Type (some bits would not necessarily need to be legal in other types) so the
> new doc could extend the 3..7 bits.

I think you just provided justification for the bit space being managed: ;-)

- requires a new draft/RFC
- the definition would be per type

I suggest you expand the "BGP Community Container Types" registry to
include the bits for each type.  For an example of something similar,
take a look at the PIM Message Types registry:
https://www.iana.org/assignments/pim-parameters/pim-parameters.xhtml#message-types




> > 278 Bit 0 (T bit) Transitivity bit:
> >
> > 280 When not set (value 0), the community in the container is
> > 281 transitive across AS boundaries, but not across an administrative
> > 282 boundary.
> >
> > [major] "transitive across AS boundaries, but not across an
> > administrative boundary"
> >
> > This definition is confusing, specially for the case where the
> > "administrative boundary" is defined as only containing the local AS
> > -- the texts seems to contradict itself: transitive across AS
> > boundaries, but not "this" AS boundary. Please clarify.
>
> Yes indeed. I limited the definition of T bit to administrative boundary.
>
> Moreover I moved the definition of the administrative boundary to
> new section 2.4 as it applies to the entire document.

=====
188	2.4.  Propagation of BGP Community Containers

190	   Propagation of BGP Community Containers is scoped to the
191	   administrative boundary.  The definition of an administrative
192	   boundary consists of arbitrary set of connected ASes, possibly under
193	   control of a single entity.  How such an administrative boundary is
194	   determined is out of scope of this document.

[nit] s/scoped to the/scoped by an
=====

[major] I get the gist of what you mean, but the specification is not clear.

What does T = 0 mean?  The definition is: "not transitive across [an]
administrative boundary."  I interpret that to mean that the container
can be advertised to other EBGP peers inside the "arbitrary set of
connected ASes", but not to others.  Is that correct?

What should an implementation do when an UPDATE is to be sent to an
EBGP peer and T=0?  How does it know whether the remote AS belongs to
the "arbitrary set of connected ASes"?

I assume the answer has to do with configuration.  Please talk about
this somewhere, maybe in the Operational Considerations.  It seems to
me that it there's no indication (by the operator) of where the
administrative boundary is then an implementation has to assume that
the "arbitrary set of connected ASes" includes only the local AS.

For this document, explaining the configuration-related requirements
and leaving the specific mechanism out of scope should be enough.



...
> > [major] What about the interaction between the T and C bits?
> > Specifically, if the T bit is set ("transitive across all ASes") and
> > the C bit is not set ("not transitive across confederation sub-AS
> > boundary"). A Member-AS is an AS, so it would fit in the description
> > of "all ASes".
>
> That got fixed automatically by defining T bit to only control behaviour
> across administrative boundary not an AS.

How?  Aren't the Member ASes in a Confederation inside the same
administrative boundary?



...
> > 333 The Community Value indicates what set of actions a router is
> > 334 requested to take upon reception of a route containing this
> > 335 community. The semantics of this value depend on whether this is a
> > 336 private/local community or IANA registered.
> >
> > 338 When the high order bit of the Community Value field - I - is set,
> > 339 the value is IANA Registered and has a well defined meaning with
> > 340 underlying semantics. See the documentation for each Registered BGP
> > 341 Wide Community for its semantics and validation requirements.
> >
> > 343 When the high order bit of the Community Value field is clear, the
> > 344 value is Locally defined and has semantics solely within the control
> > 345 of the AS defining that community. The Context AS Number provides
> > 346 the namespace in which this Community Value is interpreted. It is
> > 347 that AS's responsibility to provide the semantics and validation
> > 348 requirements for the BGP Wide Community.
> >
> > 350 See Section 11.5 for code point space partitioning.
> >
> > [major] Given how the ranges are allocated, the definition of the I
> > bit is not providing any additional information.
>
> Well it actually does. Notice the ranges:
>
> 0x00000001-0x7FFFFFFF
> vs
> 80000001-0xFFFFFEFF
>
> And that is the crux of it.

Yeah, the point was that the differentiation is already part of the
registry.  From the point of view of the operation of the protocol,
there's no difference whether the value is in a registry or not.



...
> > It is also not
> > correct to say that it indicates that the value is "IANA Registered
> > and has a well defined meaning with underlying semantics" for two
> > reasons: (1) values in the experimental range are not recorded by
> > IANA,
>
> We are talking about FCFS for registered use. Experimental here would
> be also registered. I recognize that even for experiment we could use early
> allocation model from 80000001-0xFFFFFEFF range but this is just leaving
> extra space in IANA for some new category more like EFT vs experimental.

The text talks about "the high order bit of the Community Value field
- I - is set" -- which includes the Experimental Range.  IANA doesn't
register Experimental values, so there's no early allocation process.


> Note that split of the specs into protocol encoding (this one) and actual
> definitions draft-ietf-idr-registered-wide-bgp-communities made this as
> separate documents hence likely generated some confusion.
>
> > and (2) there may be undefined values...
>
> Not sure what you mean. Undefined should not have I bit set.

FCFS doesn't have any documentation requirement -- so anyone can
request a value without defining it.

[This is akin to your comments about the Capability codes having an FCFS range.]



...
> > What should a router do
> > if the Context ASN doesn't correspond to the local ASN?
>
> That's actually quite expected. It should do exactly as
> the spec says .. :)
>
> Propagate if T/C bits allow and execute when Context
> AS matches.

When the Context AS matches what?  The local AS?

I'm missing where this is clearly specified.  The description of the
Context AS indicates that it "provides the semantics to interpret this
Community", but not that it is the only AS that can take an action.
Among potentially others, the Source AS also knows the semantics...

Also, the definition of Context AS doesn't seem to match the intent of
"registered" communities as their semantics are defined in a document,
not by a particular AS.



...
> > [major] What should a receiver do if it encounters an unknown Community
> > Value?
>
> Ignore it.
>
> I added an explicit sentence (below) to highlight it into section 4.1:
>
> "When Community Value is not recognized by the BGP speaker it should
> ignore it while maintaining propagation rules as specified by T/C bits.""

[] "should ignore"

Can we make that Normative?

Suggestion>
   If a Community Values is not recognized, the receiving BGP
   speaker MUST ignore it and propagate it according to the T/C
   bits.



...
> > 352 4.2. Source AS Number
> > ...
> > 356 The Autonomous System number indicates the AS originating this BGP
> > 357 Wide Community.
> >
> > [major] Should this field always have a value?
>
> Hmm ... I am under the opinion that if we do not mark any field in the
> protocol encoding spec as option it is mandatory.

The question is not about whether the field is optional: it is part of
the format so there's no option, it has to be included.

The question is about whether a valid value needs to be included.
IOW, what is the Source AS used for?  I know it can be used for
filtering (§10.2), so it would seem that a value has to exist.

What should a receiver to if the Source AS is set to 0?


...
> > What should a receiver do if the field is
> > empty (zero)?
>
> Impossible :)

Not impossible -- that's what bugs are made for! :-(

Seriously, if an ASN is expected then please indicate what should be
done if one is not received.

rfc7607 reserved AS 0 and says (§2):

   Authors of future protocol extensions that carry the Autonomous
   System number are encouraged to keep in mind that AS 0 is reserved
   and to provide clear direction on how to handle AS 0.

rfc6793 reserved AS_TRANS and says (§4.2.1):

   ...AS_TRANS MUST be used when the NEW BGP speaker does not have a
   two-octet AS number (even if multiple Autonomous Systems would
   use it).

So AS_TRANS is a valid ASN.  What are the considerations/effect (if
any) if AS_TRANS is used as the Source ASN?


rfc5398 reserved a range of ASNs (64496-64511) for documentation
purposes and says (§3):

   ...BGP operational configurations should not peer with neighboring
   ASes that are numbered from this reserved AS number set.

It says nothing about other uses, but I would assume that they
shouldn't be used here.


rfc7300 reserved the "Last ASNs" and says (§4):

   Operators SHOULD NOT use these Last ASNs for any other purpose or as
   Private Use ASNs.

Is there a reason to use these as a Source AS?


It may not be necessary to list each case individually.  Perhaps a
general statement can be used; something like this:

   The Source AS MUST NOT be set to any reserved value in [rfc7607],
   [rfc6793], [rfc5398], or [rfc7300].  If such a value is received,
   the community MUST/SHOULD be xxx.



...
> > Should there be any additional verification -- for
> > example, comparing the AS_PATH to the Source ASN?
>
> Interesting suggestion (ala flow spec validation), but I am not sure about it
> here.

Ok -- if it is not compared, then filtering, for example, may not work
as expected (the communities originated by a specific ASN may be
allowed if a different number is used) with results that may not be
expected.  This potentially allows for policy to be injected by rogue
actors.

I don't think this is a new issue: it exists in other communities that
expect the ASN to mean something.  Mentioning it in the Security
Considerations would be ideal.



> > 359 4.3. Context AS Number
> > ...
> > 363 This identifies the AS that provides the semantics to interpret this
> > 364 Community.

[] The questions/comments above about the value of the ASN also apply here.


> > [major] It is clear from above that this field should be considered
> > when a Private/Experimental value is used. But, what about when
> > registered values are used? Please indicate what should be done (I
> > assume ignore, but better to be explicit).
>
> The intention is that we always have it and we always act on given community
> container if it matches local AS.

See the comments above about this point: context AS = local AS.



...
> I do believe that this spec would be useful far beyond the local scope.
>
> Of course being aware that peers could drop it at will ...

... or originate communities with incorrect Source ASNs.



...
> > 366 4.4. BGP Wide Community TLVs
> > ...
> > 382 Sub-Type:
> > 383 The sub-type of the BGP Wide Community TLV. A given Sub-Type
> > 384 MUST NOT appear more than once.
> >
> > [nit] These are not sub-TLVs... Just a nit.
>
> Not sure I get it.
>
> If BGP Community Container, Type 1: BGP Wide Community defines a space for
> TLVs and one of the TLV is BGP Wide Community TLVs it will contain a number
> of sub-TLVs.
>
> Again this spec uses nesting like really complex Recursive Russian Dolls.

The point was that the title of §4.4 is "BGP Wide Community TLVs" (not
sub-TLVs), so calling them "sub-types" (which I realize is different
than sub-TLVs) is just confusing.

Yes, from the point of view of the overall Container these are
sub-TLVs.  From the point of view of the Wide Community these are
TLVs.  It depends on which Russian Doll is your reference. ;-)

Note that later the text says things like "the Wide Community
Target(s) TLV (Sub-Type 1) is a series of Atom TLVs."  You're changing
the reference even in mid-sentence.

As I said above, this is just a nit.  An long as the reader adjusts to
the changing reference point we'll be ok. :-)



...
> > 392 4.4.1. Sub-Type 1, BGP Wide Community Target(s) TLV
> >
> > [minor] Even if only one Target can be configured, should the name of
> > this TLV be generalized by using "Targets" (and not "Target(s)"? It
> > would make the text clearer.
>
> Fixed.
>
> Likewise done the same in other TLV titiles.

Check the text too -- there are some instances of "(s)" left.



...
> > 439 If the BGP Wide Community Target(s) TLV and the BGP Wide Community
> > 440 Exclude Target(s) TLV have conflicting semantics, priority MUST be
> > 441 given to the Wide Community Exclude Target(s) TLV.
> >
> > [major] "conflicting semantics...priority MUST be given"
>
> > I think you may mean "conflicting results" (or something like that) as
> > the semantics (include vs exclude) are already conflicting. To be
> > clear, I mean that the Targets TLV matched on a set of targets, but
> > the Exclude Targets TLV also matches on the same. Is that an example
> > of that you mean?
> >
> > If so, then it seems to me that the result is to ignore the matches
> > from the Targets TLV and act (exclude) on the matched from the Exclude
> > Targets TLV. I'm sure there must be a succinct way of specifying the
> > behavior clearly.
> >
> > All this comes from the fact that "priority" implies (to me) that the
> > Targets may be considered *after* the Exclude Targets.
>
> I reworded this to say:
>
> "If the BGP Wide Community Target(s) TLV and the BGP Wide Community Exclude
> Target(s) TLV result in conflicting actions, priority MUST be given to the
> Wide Community Exclude Target(s) TLV."

"priority" is still bothering me.   Maybe just a nit...

Maybe s/priority MUST.../the exclude action MUST be considered for the
overlapping set.

[EoR]