[Idr] John Scudder's Discuss on draft-ietf-idr-rfc7752bis-14: (with DISCUSS and COMMENT)

John Scudder via Datatracker <noreply@ietf.org> Tue, 13 December 2022 01:29 UTC

Return-Path: <noreply@ietf.org>
X-Original-To: idr@ietf.org
Delivered-To: idr@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id A2615C14CE4E; Mon, 12 Dec 2022 17:29:47 -0800 (PST)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: John Scudder via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-idr-rfc7752bis@ietf.org, idr-chairs@ietf.org, idr@ietf.org, shares@ndzh.com, jhaas@pfrc.org, aretana.ietf@gmail.com, jhaas@pfrc.org
X-Test-IDTracker: no
X-IETF-IDTracker: 9.2.1
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: John Scudder <jgs@juniper.net>
Message-ID: <167089498765.45764.5586703047949203559@ietfa.amsl.com>
Date: Mon, 12 Dec 2022 17:29:47 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/afIPIEQa-3Mdy53lGGvzYKl2zmw>
Subject: [Idr] John Scudder's Discuss on draft-ietf-idr-rfc7752bis-14: (with DISCUSS and COMMENT)
X-BeenThere: idr@ietf.org
X-Mailman-Version: 2.1.39
List-Id: Inter-Domain Routing <idr.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/idr>, <mailto:idr-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/idr/>
List-Post: <mailto:idr@ietf.org>
List-Help: <mailto:idr-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 13 Dec 2022 01:29:47 -0000

John Scudder has entered the following ballot position for
draft-ietf-idr-rfc7752bis-14: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/ 
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-idr-rfc7752bis/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

# John Scudder, RTG AD, comments for draft-ietf-idr-rfc7752bis-14
CC @jgscudder

Thanks for the work you've put into this needed document. I reviewed the diff
vs. RFC 7752, but didn't focus on any unchanged sections.

Thanks also to the shepherd, Jeff Haas, for the meticulous and helpful shepherd
write-up.

As an aside to the author, my review would have been easier if a native HTML
rendering had been available. I assume you must have uploaded txt instead of
XML -- please consider uploading XML in the future? Thanks.

## DISCUSS

### Section 5.2.1, using ASN to disambiguate Router ID

```
   BGP-LS uses the Autonomous System (AS) Number to
   disambiguate the Router-IDs, as described in Section 5.2.1.1.
```
My concern with this is straightforward: Section 5.2.1.1 doesn't describe the
use of ASN to disambiguate Router ID. In fact, it's a little hard to understand
WHAT Section 5.2.1.1 describes :-( but it's not that.

I see that this claim is a carry forward from RFC 7752, but since you're
touching it, I think you need to make it right, sorry. If I were confident I
knew what the intent was, I'd propose a rewrite, but since I'm not, I'm just
flagging the problem.

### Section 8.2.2, treat-as-withdraw

I have a concern about this:

```
   When the error that is determined allows for the router to skip the
   malformed NLRI(s) and continue the processing of the rest of the BGP
   UPDATE message (e.g. when the TLV ordering rule is violated), then it
   MUST handle such malformed NLRIs as 'Treat-as-withdraw'.
```
It's novel, in the context of RFC 7606, to allow malformed NLRI to result in
something other than a session reset (or equivalent). I see why you are OK with
it in the context of TLV misordering, however I think it's better to limit the
exception to the exact case, instead of making it a general guideline open to
implementor creativity.

A second comment (not DISCUSS level but I'll include it here) is that you don't
have to characterize this as treat-as-withdraw. Since an implementation that
follows this rule will never accept such an NLRI, there will never be anything
to withdraw. So, following the example of RFC 7606 Section 5.4, we'd have
something like this as a replacement for the quoted sentence: "An NLRI that
violates the TLV ordering rule MUST be discarded." There's probably a little
knock-on rewriting that would suggest, too, as in:

```
   An NLRI that violates the TLV ordering rule MUST be discarded. Other NLRI
   error cases SHOULD be handled using 'AFI/SAFI disable', when...
```
If you think there are other NLRI error cases that should get the more lenient
treatment, my preference would be to enumerate them specifically, but if I'm
missing some reason this should be left as written, let's discuss.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

## COMMENT

### Throughout, use of "new"

The first four occurrences of "new" (up to but not including Section 5.2) might
have been appropriate in RFC 7752 but these aren't new any more, you could
remove them without harm.

### Section 3, bullet 1

```
           If R1 and
      R2 are in the same IGP flooding domain, then they should originate
      the same link-state information into BGP-LS.
```
I assume the "should" above indicates expectation. It's a perfectly good
English sentence, but I think there would be less scope for ambiguity if it
were re-worded like "... then they would ordinarily originate the same..."  The
"ordinarily" is suggested because as the document mentions elsewhere, BGP-LS is
subject to policy so it would be possible, though not wise, to configure R1 and
R2 with different policies leading to origination of different information into
LS.

It's true that formally speaking, the RFC 2119 boilerplate moots this point,
but the difference between theory and practice is greater in practice than it
is in theory. I haven't picked on every lower-case "may" and "should" in your
changeset, only the one that struck me as having genuine potential to lead to
confusion (and a few more noted later in their own points).

And since I'm commenting on this paragraph anyway, "source... sources" is a
little funny: ```
              R1 may also source
              information from sources other than IGP
```
especially since one is a verb and one is a noun. Consider "R1 can also
originate information into BGP-LS from sources other than IGP"?

### Section 3, bullet 3

```
      The BGP implementation on RRm is doing the propagation of BGP-LS
      UPDATE messages and performing BGP Decision Process.
```
It's not, strictly speaking, propagating the messages, it's propagating the
information from the messages, right? So perhaps, "The BGP implementation on
RRm is propagating BGP-LS information. It performs the BGP Decision Process as
part of deciding what information is to be propagated."

I realize this is a fairly picky point but I've spent enough time, over the
years, explaining to people that no we do not blindly forward UPDATE messages
around, that I'd like to avoid propagating [*] the meme further.

[*] Oops, sorry.

### Section 4

You set up, but do not resolve, a conflict, in your first two paragraphs:

```
   The origination and propagation of IGP link-state information via BGP
   needs to provide a consistent and true view of the topology of the
   IGP domain.
```
and,

```
   The link-state information advertised in BGP-LS from the IGPs is
   derived from the IGP LSDB built using the OSPF Link State
   Advertisements (LSAs) or the IS-IS Link State Packets (LSPs).
   However, it does not serve as a true reflection of the originating
   router's LSDB.
```
What's a reader to think when they encounter this contradiction, between "needs
to provide... a true view" and "does not serve as a true reflection"? I guess
the former talks about the *topology* and the latter talks about the *LSDB*,
but if you intend to make something of this subtle difference, it's escaped me.
I presume that having set up this dichotomy, you have some resolution in mind
for it -- please don't leave it as a conundrum to be solved by the reader.

### Section 5.1, malformed

```
          NLRIs having TLVs
   which do not follow the above ordering rules MUST be considered as
   malformed by a BGP-LS Propagator.  This ensures that multiple copies
   of the same NLRI from multiple BGP-LS Producers and the ambiguity
   arising therefrom is prevented.
```
This is OK as written but the last sentence might be even clearer with a bit of
additional explanatory text, as in "This insistence on canonical ordering
ensures that multiple variant copes of..."?

### Section 5.2, alleged deviation from RFC 7606

```
   does not support a particular Link-State NLRI type.  To allow the
   introduction of new Link-State NLRI types seamlessly in the future,
   without the need for upgrading all BGP speakers in the propagation
   path (e.g., a route reflector), this document deviates from the
   default handling behavior specified by [RFC7606] for Link-State
   address-family.  An implementation MUST handle unknown Link-State
   NLRI types as opaque objects and MUST preserve and propagate them.
```
I think you must be referring to the generic advice from Section 5.4 of RFC
7606:

   A BGP speaker advertising support for such a typed address family
   MUST handle routes with unrecognized NLRI types within that address
   family by discarding them, unless the relevant specification for that
   address family specifies otherwise.

Right? If so I guess it would be helpful to cite that specifically, "...
specified by Section 5.4, paragraph 2, of [RFC7606]" (paragraph-level citation
optional but why not).

(And thanks for taking care with this case!)

### Section 5.2.1.4, deprecate harder!

You note in Table 3 that the BGP-LS Identifier is deprecated, but in the
description that follows you just call it optional. Regrettably, experience
shows that although everyone thinks they know what "deprecated" means,
different people have different assumptions. The paragraph at the end of the
section helps, somewhat, but I think more could be done.

```
   BGP-LS Identifier:  Opaque value (32-bit ID).  This is an optional
      TLV.  Its original purpose was that, in conjunction with
      Autonomous System Number (ASN), it would uniquely identify the
      BGP-LS domain and that the combination of ASN and BGP-LS ID would
      be globally unique.  However, the BGP-LS Instance-ID carried in
      the Identifier field in the fixed part of the NLRI also provides a
      similar functionality.  Hence the inclusion of the BGP-LS
      Identifier TLV is not necessary.  If advertised, all BGP-LS
      speakers within an IGP flooding-set (set of IGP nodes within which
      an LSP/LSA is flooded) MUST use the same (ASN, BGP-LS ID) tuple
      and if an IGP domain consists of multiple flooding-sets, then all
      BGP-LS speakers within the IGP domain SHOULD use the same (ASN,
      BGP-LS ID) tuple.
```
It seems to me this could be cut down considerably, since a bunch of it is
historical trivia and justification, and especially since the final paragraph
repeats some of the needed information, and more clearly too.

```
   BGP-LS Identifier:  Opaque value (32-bit ID).  This is an optional
      TLV.  By default, it SHOULD NOT be advertised, but MAY be advertised
      when enabled by configuration, for compatibility with [RFC7752]
      implementations. See the final paragraph of this section for further
      considerations and recommended default value.
```
The curious reader can find the former definition and rules (that you're
deprecating) by referring back to RFC 7752, I don't think it's needed to inline
them, and indeed I think it's unhelpful for document usability. If you think
it's desirable to retain the justification for deprecating it, that could go in
Appendix A.

### Section 5.2.2, more lower-case "may"

In this paragraph, I have a similar concern to the one I raised about Section 3
--

```
   An implementation may end up suppressing the advertisement of a Link
   NLRI, corresponding to a half-link, from a link-state IGP unless the
   IGP has verified that the link is being reported in the IS-IS LSP or
   OSPF Router LSA by both the nodes connected by that link.  This 'two-
   way connectivity check' is performed by link-state IGPs during their
   computation and may be leveraged before passing information for any
   half-link that is reported from these IGPs into BGP-LS.  This ensures
   that only those Link State IGP adjacencies which are established get
   reported via Link NLRIs.  Such a 'two-way connectivity check' may be
   also required in certain cases (e.g., with OSPF) to obtain the proper
   link identifiers of the remote node.
```
There are three "may" in this paragraph. I think the first ("may end up")
expresses expectation, but is that your desired meaning? If so, perhaps reword
as "could end up"? If you're actually expressing permission, that's something
more like an RFC 2119 MAY, perhaps either use the MAY, or at least get rid of
"end up". The second doesn't concern me, though it could be "can". The third
also is not very concerning but could be "could".

### Section 5.3, limiting the size of the BGP-LS Attribute

Thanks for this discussion. One of the points you raise is:

```
         BGP-LS Producers MUST
   ensure that they limit the TLVs included in the BGP-LS Attribute to
   ensure that a BGP UPDATE message for a single Link-State NLRI does
   not cross the maximum limit for a BGP message.  The determination of
   the types of TLVs to be included may be made by the BGP-LS Producer
   based on the BGP-LS Consumer applications requirement and is outside
   the scope of this document.
```
In other words, it's a policy decision. But that means that in this scenario
such policy filtering of TLVs would be expected on a per-producer basis -- does
this motivate any concern that inconsistent policy being applied at different
BGP-LS Producers could lead to redundant but non-comparable information being
propagated into BGP-LS? If so, is this worth noting?

### Section 5.3.1.1, reserved bits

I notice the reserved bits are no longer flagged as such, which is OK, I see
they're still called out as unallocated in the IANA registry. Is there a reason
you haven't provided the customary SHOULD be zero on Tx, MUST be disregarded on
Rx guidance, though? I guess you might have refrained simply because the advice
isn't in RFC 7752, but it seems like good advice nonetheless and wouldn't
introduce any backward compatibility issues.

### Section 5.3.1.5, here's "may" again

```
   In the case of OSPF, this TLV may be used only to advertise the TLVs
   in the OSPF Router Information (RI) LSA [RFC7770].
```
It sounds like you're trying to forbid use of the TLV for anything else, in
which case this really does seem like a candidate for RFC 2119 treatment, as in
something like,

```
   In the case of OSPF, this TLV MUST NOT be used to advertise TLVs
   other than those in the OSPF Router Information (RI) LSA [RFC7770].
```
If you're not trying to do that, then this is yet again a case of the confusing
"may".

### Section 5.3.2.4, small metrics wording

This wording is a little funny, as written it's contradictory:

```
        IS-IS small metrics have a length of 1 octet.  Since the
   ISIS small metrics are of 6-bit size, the two most significant bits
   MUST be set to 0 and MUST be ignored by the receiver.
```
I guess you mean something like "IS-IS small metrics are of 6-bit size, but are
encoded in a 1 octet field; therefore the two most significant bits of the
field MUST be set to 0 by the originator and MUST be ignored by the receiver."
(You could also use a formulation around the legal range of small metrics being
0-63, but that makes it a little harder to talk coherently about the top two
bits of the field.)

### Section 5.3.2.6, "may" again

Same comment as for 5.3.1.5, mutatis mutandis.

### Section 5.3.3.1, reserved bits

same comment as for 5.3.1.1.

### Section 5.3.3.6, "may" again

Same comment as for 5.3.1.5, mutatis mutandis.

### Section 5.4, needs a reference

I'm glad you've started to introduce this practice into BGP! But I think this
needs to be a reference, not a naked URL embedded in the body:

```
   The Enterprise Codes are listed at <http://www.iana.org/assignments/
   enterprise-numbers>
```
Apart from other reasons to put this in the (Normative) references, the way
it's embedded in the body right now ends up making it wrap as shown, leading
the unwary reader to the wrong landing page.

### Section 5.7, expand acronyms

VRF should be expanded on first use. (Not starred on
https://www.rfc-editor.org/materials/abbrev.expansion.txt)

## NITS

## Notes

This review is in the ["IETF Comments" Markdown format][ICMF], You can use the
[`ietf-comments` tool][ICT] to automatically convert this review into
individual GitHub issues.

[ICMF]: https://github.com/mnot/ietf-comments/blob/main/format.md
[ICT]: https://github.com/mnot/ietf-comments