Re: [Idr] Document shepherd review for draft-ietf-idr-rfc7752bis

Jeffrey Haas <jhaas@pfrc.org> Tue, 09 November 2021 21:45 UTC

Return-Path: <jhaas@slice.pfrc.org>
X-Original-To: idr@ietfa.amsl.com
Delivered-To: idr@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 1EC0E3A117A; Tue, 9 Nov 2021 13:45:08 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.901
X-Spam-Level:
X-Spam-Status: No, score=-1.901 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id NKIy9j6_r5FW; Tue, 9 Nov 2021 13:45:05 -0800 (PST)
Received: from slice.pfrc.org (slice.pfrc.org [67.207.130.108]) by ietfa.amsl.com (Postfix) with ESMTP id C5B763A117C; Tue, 9 Nov 2021 13:45:02 -0800 (PST)
Received: by slice.pfrc.org (Postfix, from userid 1001) id 125881E2D5; Tue, 9 Nov 2021 16:45:02 -0500 (EST)
Date: Tue, 09 Nov 2021 16:45:01 -0500
From: Jeffrey Haas <jhaas@pfrc.org>
To: "Ketan Talaulikar (ketant)" <ketant@cisco.com>
Cc: "draft-ietf-idr-rfc7752bis@ietf.org" <draft-ietf-idr-rfc7752bis@ietf.org>, "idr@ietf.org" <idr@ietf.org>
Message-ID: <20211109214501.GF28677@pfrc.org>
References: <20210827200116.GJ19054@pfrc.org> <MW3PR11MB45701D09EF6D4483BAE06226C1BF9@MW3PR11MB4570.namprd11.prod.outlook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <MW3PR11MB45701D09EF6D4483BAE06226C1BF9@MW3PR11MB4570.namprd11.prod.outlook.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/Ck00-gzBju4rEx8ximxnOZSEoUs>
Subject: Re: [Idr] Document shepherd review for draft-ietf-idr-rfc7752bis
X-BeenThere: idr@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Inter-Domain Routing <idr.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/idr>, <mailto:idr-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/idr/>
List-Post: <mailto:idr@ietf.org>
List-Help: <mailto:idr-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 09 Nov 2021 21:45:08 -0000

Ketan,

Thanks for your continued patience as we iterate.

I've deleted prior text where I accept the changes you've done in -09.

There remain a small number of changes you've proposed to address my
comments.  There is one somewhat larger point covering NLRI opacity that
deserves Working Group scutiny.  That larger point may result in no changes
in text.

On Thu, Oct 21, 2021 at 04:48:05PM +0000, Ketan Talaulikar (ketant) wrote:
> NLRI keying consistency:
[...]
> [KT] I agree with your point. How about we add the follow text towards the
> end of Sec 4.1 for both this one and the next comment? We want to cover
> this aspect but not really try to specify the behavior of a BGP-LS
> Consumer as that is outside the scope of this document.
> 
> <NEW>
> When there are multiple BGP-LS Producers originating the same link-state
> information, implementation variations of BGP-LS Producers may result in
> the generation of different and inconsistent BGP-LS updates for the same
> link-state object based on the inclusion or exclusion of optional TLVs. An
> inconsistency between BGP-LS Producers with regards to the inclusion of
> optional TLVs in the NLRI results in multiple NLRIs being generated for
> the same link-state object. A BGP-LS Consumer would need the ability to
> merge such duplicate updates to handle such situations. An inconsistency
> between BGP-LS Producers with regards to the inclusion of optional TLVs in
> the BGP-LS Attribute results in one of them being delivered to a BGP-LS
> Consumer as part the BGP propagation and best-path selection procedures in
> most typical deployments. This can result in a BGP-LS Consumer missing out
> on some of the information in a potentially unpredictable manner. The use
> of BGP-LS Producers that have a consistent support for the origination of
> optional TLVs between them can help mitigate such situations for the
> BGP-LS Consumers.
> </NEW> 

This text was merged into -09 and I think addresses the issue reasonably.

A critical detail is you've added "A BGP-LS Consumer would need the ability
to merge..." and we're not asking routers carrying this information to do
the work.

> The remainder of my comments have to do with Error Handling.  My intent is
> not to re-litigate prior decisions, it's to note that the consequences of
> those prior decisions are not adequately discussed in the document.

[...]

> The strict TLV structuring of BGP-LS makes the most types of syntactic
> errors impossible.  However, BGP-LS describes a number of semantic
> validations using RFC 2119/8174 keywords without adequately describing
> what to do when those behaviors are violated at the BGP-LS Consumer.

> [KT] The specification of behavior and handling at a BGP-LS Consumer was
> considered to be out of scope of this draft.

I think I'll add one final comment on this particular matter, then likely
let it rest:

The issue we're discussing here is opacity of the attributes as routes
traverse BGP implementations.  Flowspec, for example, has been injured by
exactly this issue.

If an implementation, upon receiving PDUs that it didn't originate, simply
checks the structure of the PDU for well-formedness and then largely stores
the information in a datastore as opaque bit-strings, these things all work
out okay.  

However, that's not how many real-world implementations behave.

A BGP implementation, that is not the BGP-LS Consumer, that receives and
parses the PDU and locally stores its contents in data structures can have
problems when things are "invalid" in some form.  Such implementations tend
to regenerate the PDU from internal state.  This includes taking optional
elements that aren't parsed, but stored opaquely and inserting them
appropriately.

The implicit advice that is given is, "Regardless of what you do with this
data, keep a full copy of it in original form as well".  This point deserves
explicit acknowledgment by the Working Group if that's our effective intent.

The similar advice is, "Be careful about being a BGP-LS implementation that
serves as a distributing router of BGP-LS while also being a Consumer".

If that's what we really mean, that's what we should say.  Unlike flowspec
v1, we at least have strong TLV rules to ensure it's well-formed before
propagating.

> "Make the same mistakes everywhere" is not a great recipe.
> 
> I strongly recommend that this issue be addressed either in section 7.2.2 or in a new section.
> [KT] Give, the previous comment, I am open to inputs on what can be incorporated into the document in this respect.

Let's find a place to add text warning about the above.

> Some comments on the draft text are noted below.  This is from the idnits output.
> 
> 407	4.1.  TLV Format
> [...]
> 435	   type MUST be in ascending order based on the value field.  Comparison
> 436	   of the value fields is performed by treating the entire field as an
> 437	   opaque hexadecimal string.  Standard string comparison rules apply.
> 
> I suspect what is intended here is opaque binary data rather than
> representing it in hexadecimal format.  "Standard string comparison" might
> make sense if you were doing strcmp() on an actual hexadecimal string.
> The intent here is probably lexicographically ordered since "standard
> string comparison" isn't really defined here.
>
> [KT] Yes, I believe so. How about:
> 
> <OLD>
> Comparison of the value fields is performed by treating the entire field as an opaque hexadecimal string.  Standard string comparison rules apply. 
> </OLD>
> <NEW>
> Comparison of the value fields is performed by treating the entire field as opaque binary data and ordered lexicographically.
> </NEW>

That would be good.  (This didn't make it into -09.)

> 447	   The TLVs within the BGP-LS Attribute MAY be ordered in ascending
> 448	   order by TLV type.  BGP-LS Attribute with unordered TLVs MUST NOT be
> 449	   considered malformed.
> 
> Probably "SHOULD be ordered"?  If it's MAY be ordered and MUST NOT be malformed, why mention it at all?
> [KT] I am happy to drop both the sentences. The current text is more to clarify given the ambiguity in this regards in RFC7752.

I'd suggest leaving it and update to SHOULD.  Reinforcing that unordered
isn't malformed will likely save us implementations that didn't deal with
that.

> 525	                            Table 1: NLRI Types
> 
> Note that the contents of this table do not match the IANA Considerations section.
> [KT] I think you are referring to 0 and the Private Use? Do they need to match since the IANA is more about allocation and this is about listing what is being introduced in the document.

Ok, that seems reasonable.

> 753	   Autonomous System:  Opaque value (32-bit AS Number).  This is an
> 754	      optional TLV.  The value SHOULD be set to the AS Number associated
> 755	      with the BGP process originating the link-state information.  An
> 756	      implementation MAY provide a configuration option on the BGP-LS
> 757	      Producer to use a different value.
> 
> Some text discussing private AS numbers would be helpful.  The item of concern is when BGP-LS is used between cooperating ASes not under a single administrative control.  In such cases, avoiding private AS collisions is an edge case.
> [KT] I am not sure if I've understood your point correctly, but how about add the following at the end of the current text. Please clarify or suggest text if you meant something different.
> 
> E.g., to avoid collisions when using private AS numbers.

That'd work.  I'd suggest:
"to use a different value; e.g., to avoid..."

> 988	   where the Type and Length fields of the TLV are defined in Table 5.
> 989	   The OSPF Route Type field values are defined in the OSPF protocol and
> 990	   can be one of the following:
> 
> 992	   o  Intra-Area (0x1)
> 993	   o  Inter-Area (0x2)
> 
> 995	   o  External 1 (0x3)
> 
> 997	   o  External 2 (0x4)
> 
> 999	   o  NSSA 1 (0x5)
> 
> 1001	   o  NSSA 2 (0x6)
> 
> The semantics of these are from OSPF, but what's the matching RFC section reference for the code points?
> [KT] These are actually BGP-LS code points. We don't anticipate new route types in OSPF (been so for decades) and that is why I skipped creating an IANA registry for them. But we can if that is the right thing to do.

I understand now that these are assigned by BGP-LS.  I'm fine that they may
not have an IANA considerations section.  The confusing point is the
sentence, "... are defined in the OSPF protocol".  

I immediately went to cross-check these vs. OSPF specifications and IANA.  

It'd be helpful to clarify that these are features from the OSPF protocol
but that the code points are not defined within OSPF - or at least what OSPF
code point feature they're intended to cover.
> 

-- Jeff