[Lsvr] AD Review of draft-ietf-lsvr-bgp-spf-19

Alvaro Retana <aretana.ietf@gmail.com> Tue, 07 March 2023 17:32 UTC

Return-Path: <aretana.ietf@gmail.com>
X-Original-To: lsvr@ietfa.amsl.com
Delivered-To: lsvr@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 6859BC1524A3; Tue, 7 Mar 2023 09:32:41 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.094
X-Spam-Level:
X-Spam-Status: No, score=-2.094 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_BLOCKED=0.001, RCVD_IN_ZEN_BLOCKED_OPENDNS=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001, URIBL_DBL_BLOCKED_OPENDNS=0.001, URIBL_ZEN_BLOCKED_OPENDNS=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([50.223.129.194]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id R01B4XHJtSQO; Tue, 7 Mar 2023 09:32:39 -0800 (PST)
Received: from mail-pg1-x52e.google.com (mail-pg1-x52e.google.com [IPv6:2607:f8b0:4864:20::52e]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id CDC30C151B1A; Tue, 7 Mar 2023 09:32:33 -0800 (PST)
Received: by mail-pg1-x52e.google.com with SMTP id h31so8054635pgl.6; Tue, 07 Mar 2023 09:32:33 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; t=1678210353; h=content-transfer-encoding:cc:to:subject:message-id:date :mime-version:from:from:to:cc:subject:date:message-id:reply-to; bh=9JsVZ134eJDehYPWLpXeH/LLrZIbngmfEEGqecH/nBA=; b=enPpeIsyEZ5Nag5t3NsicywvuvNdlHsU4tbOG0/pu6yM8URE9hbAicVfttA3nNpL3F HVVo3MPKhHjYWR3mS7Pdi6F/MVPWMK5ArTZ+RYe2gk6ta0be3DY6aTk568xtqGBMIKP9 0sXGBxrlTcOy5abyoAc4axIGg9MRSBdgG+NTOL1hNh38No/Whs2/6nua8O6lJy6igRoD iWyQP0m7RF+gLCsoSkV/aZr2dVUItJ0cCQXzCR5TbAQ+ANaVcnPplHcgpryDikveNVbn ryfDjJYwDnyIwY9O315HT9DtcUKFTDurRsU1L/QX41wcp+WNRHbiTJdP2+Yp+JdWELHs uDGQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678210353; h=content-transfer-encoding:cc:to:subject:message-id:date :mime-version:from:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=9JsVZ134eJDehYPWLpXeH/LLrZIbngmfEEGqecH/nBA=; b=WAJSlkhxrFsFFie0Oz4ZShnKU1IFIRt3SzRt1cOaEFBE9Hp7BV13bw86w5KH9VKKqd gZVupkwkYTRc/CnSSxGP6OuG2HZNp0RT37Jf6PwjTSCZB+NEekZ6LtOaQG4HXxrjdyJb HendQx39oi2caVX8yX+bA+toG0lyx2LTZbRQTWhS7ldsWE/ovwtumz4sYgMGOOP6hHdR joYq+T4hx/K5TVfTFrKiFYkcnYzD5LQPgX83FhX2RNmYCUPHs+AxyKhLkjgOLysItSys 69ZRonSK5A906vvq4Rmlj6dMVuG7lLQC5rpywk1bQvcaji8JaK76meKnAo8Kp5aBTjlW lD3A==
X-Gm-Message-State: AO0yUKVxnBtjADcq4qYvzhIx4dzTbPWCgTqdktoe6HANoTpYusRjwvjN kIN4qo1yUqnlljbXbCMtRWwmJ1H9RQCcpIFND8YGYuEi
X-Google-Smtp-Source: AK7set/5XYQ8TGgerbZiF2BCi/b90CpYi12o12n5BfEibNeSmKhcmMLi9G34hhNIkbQ026TL1JTEJPXBSdO6rXyRj/4=
X-Received: by 2002:a62:8247:0:b0:5a8:bdd2:f99c with SMTP id w68-20020a628247000000b005a8bdd2f99cmr6478072pfd.1.1678210349407; Tue, 07 Mar 2023 09:32:29 -0800 (PST)
Received: from 1058052472880 named unknown by gmailapi.google.com with HTTPREST; Tue, 7 Mar 2023 09:32:28 -0800
From: Alvaro Retana <aretana.ietf@gmail.com>
MIME-Version: 1.0
Date: Tue, 07 Mar 2023 09:32:28 -0800
Message-ID: <CAMMESsyqvrTH70NXGBpB9DLW6VHvpyY8TSm2m_rovXoxZKPyVQ@mail.gmail.com>
To: draft-ietf-lsvr-bgp-spf@ietf.org
Cc: Victor Kuarsingh <victor@jvknet.com>, lsvr-chairs@ietf.org, "lsvr@ietf.org" <lsvr@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsvr/80Awa3weqFJC5G--ep0EZVYFfmc>
Subject: [Lsvr] AD Review of draft-ietf-lsvr-bgp-spf-19
X-BeenThere: lsvr@ietf.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: Link State Vector Routing <lsvr.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/lsvr>, <mailto:lsvr-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/lsvr/>
List-Post: <mailto:lsvr@ietf.org>
List-Help: <mailto:lsvr-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/lsvr>, <mailto:lsvr-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 07 Mar 2023 17:32:41 -0000

Dear authors:

I have now finished looking at the diffs between my last review (-11)
and -19.  I focused mainly on the changes and additions.  I know -20
is out -- I was already half way through -19 and decided to not
transcribe the comments as the diffs (-19 to -20) are small.

The document looks a lot better.  Thank you for the hard work.


My main concern is still the lack of semantic checking (see my
comments in §7.*).  BGP-LS (rfc7752bis) only does syntactic validation
because any semantic checking is left to the BGP-LS Consumer (and the
specifics left out of scope).  The way BGP SPF uses the information is
different (all the nodes run SPF), so the validity of the contents
need to be verified.

Thanks!

Alvaro.



[Line numbers from idnits.]


...
119	1.  Introduction

121	   Many Massively Scaled Data Centers (MSDCs) have converged on
122	   simplified layer 3 routing.  Furthermore, requirements for
123	   operational simplicity have led many of these MSDCs to converge on
124	   BGP [RFC4271] as their single routing protocol for both their fabric
125	   routing and their Data Center Interconnect (DCI) routing [RFC7938].
126	   This document describes an alternative solution which leverages BGP-
127	   LS [I-D.ietf-idr-rfc7752bis] and the Shortest Path First algorithm
128	   used by Internal Gateway Protocols (IGPs) such as OSPF [RFC2328].

[major] Because BGP SPF is it's own protocol, please don't compare it
to other protocols.  I'll try to point at cases where I find them, but
please take a look too.

In this case, the SPF is not the same as what OSPF uses because the
details are different, which is why it is specified later on.



...
151	1.1.  Terminology
...
167	   Dijkstra Algorithm:  An algorithm for computing the shortest path
168	      from a given node in a graph to every other node in the graph.  At
169	      each iteration of the algorithm, there is a list of candidate
170	      vertices.  Paths from the root to these vertices have been found,
171	      but not necessarily the shortest ones.  However, the paths to the
172	      candidate vertex that is closest to the root are guaranteed to be
173	      shortest; this vertex is added to the shortest-path tree, removed
174	      from the candidate list, and its adjacent vertices are examined
175	      for possible addition to/modification of the candidate list.  The
176	      algorithm then iterates again.  It terminates when the candidate
177	      list becomes empty.  [RFC2328]

[minor] This seems to be a very long definition of a term -- IMO you
only need the first sentence.  Because there is a full definition of
how it works later, a reference to that would be better.

BTW, the other references to this term can be simplified:

s/Dijkstra algorithm Section 1.1/Dijkstra algorithm/g


[major] The reference to rfc2328 is not needed.



179	1.2.  BGP Shortest Path First (SPF) Motivation
...
191	   A primary advantage is that all BGP SPF speakers in the BGP SPF
192	   routing domain will have a complete view of the topology.  This will
193	   allow support for ECMP, IP fast-reroute (e.g., Loop-Free
194	   Alternatives), Shared Risk Link Groups (SRLGs), and other routing
195	   enhancements without advertisement of additional BGP paths [RFC7911]
196	   or other extensions.  In short, the advantages of an IGP such as OSPF
197	   [RFC2328] are availed in BGP.

[] "In short, the advantages of an IGP such as OSPF [RFC2328] are
availed in BGP."

Even if this is a positive comparison, no need to mention OSPF.  Also,
there are (good) aspects of OSPF that are not in BGP SPF: areas, for
example.



199	   With the simplified BGP decision process as defined in Section 6,
200	   NLRI changes can be disseminated throughout the BGP routing domain
201	   much more rapidly (equivalent to IGPs with the proper
202	   implementation).  The added advantage of BGP using TCP for reliable
203	   transport leverages TCP's inherent flow-control and guaranteed in-
204	   order delivery.

[minor] s/simplified BGP decision process/BGP SPF decision process


[] s/ (equivalent to IGPs with the proper implementation)./.

You seem to claim that IGPs are not implemented properly. :-(



...
217	   With controller and route-reflector peering models, BGP SPF
218	   advertisement and distributed computation require a minimal number of
219	   sessions and copies of the NLRI since only the latest version of the
220	   NLRI from the originator is required.  Given that verification of the
221	   adjacencies is done outside of BGP (see Section 4), each BGP SPF
222	   speaker will only need as many sessions and copies of the NLRI as
223	   required for redundancy (see Section 4).  Additionally, a controller
224	   could inject topology that is learned outside the BGP SPF routing
225	   domain.

[nit] "Given that verification of the adjacencies is done outside of
BGP (see Section 4), each BGP SPF speaker will only need as many
sessions and copies of the NLRI as required for redundancy (see
Section 4)."

No need to refer to §4 twice in the same sentence.


[minor] Also, I didn't see anything in §4 (or anywhere else) that
talked about adjacency verification "outside of BGP".



227	   Given that controllers are already consuming BGP-LS NLRI
228	   [I-D.ietf-idr-rfc7752bis], this functionality can be reused for BGP-
229	   LS-SPF NLRI.

[minor] rfc7752bis clarified the role of a Consumer (§3) and it avoids
calling it a controller...in fact it leaves the interface to the
Consumer as out of scope.  Also, the consuming functionality is
different from running SPF, etc.  All this is to say that I think you
should drop this sentence.



...
264	2.  Base BGP Protocol Relationship

266	   With the exception of the decision process, the BGP SPF extensions
267	   leverage the BGP protocol [RFC4271] without change.  This includes
268	   the BGP protocol Finite State Machine, BGP messages and their
269	   encodings, processing of BGP messages, BGP attributes and path
270	   attributes, BGP NLRI encodings, and any error handling defined in the
271	   [RFC4271] and [RFC7606].

[nit] s/defined in the/defined in



273	   Due to the changes to the decision process, there are mechanisms and
274	   encodings that are no longer applicable.  While not necessarily
275	   required for computation, the ORIGIN, AS_PATH, MULTI_EXIT_DISC,
276	   LOCAL_PREF, and NEXT_HOP path attributes are mandatory [RFC4271] and
277	   will be validated.  Unless explicitly specified in the context of BGP
278	   SPF, all other attributes SHOULD NOT be advertised.  However, if they
279	   are advertised, they will be accepted, validated, and propagated
280	   consistent with the BGP protocol.

[major] "NEXT_HOP path attributes are mandatory [RFC4271]"

I know that I suggested this text, but rfc4760 converts the NEXT_HOP
to optional.



...
290	   The BGP SPF extension fundamentally changes the decision process, as
291	   described herein, to be more like a link-state protocol (e.g., OSPF
292	   [RFC2328]).  Specifically:

[] Take out references to OSPF.  In this case, it is not only an
example, but the description is not specific.

s/
   The BGP SPF extension fundamentally changes the decision process, as
   described herein, to be more like a link-state protocol (e.g., OSPF
   [RFC2328]). Specifically:
/
   The BGP SPF extension fundamentally changes the decision process, as
   described herein:



294	   1.  BGP advertisements are readvertised to neighbors immediately
295	       without waiting or dependence on the route computation as
296	       specified in phase 3 of the base BGP decision process.  Multiple
297	       peering models are supported as specified in Section 4.

299	   2.  Determining the degree of preference for BGP routes for the SPF
300	       calculation as described in phase 1 of the base BGP decision
301	       process is replaced with the mechanisms in Section 6.1.

303	   3.  Phase 2 of the base BGP protocol decision process is replaced
304	       with the Shortest Path First (SPF) algorithm, also known as the
305	       Dijkstra algorithm Section 1.1.

[nit] Reorder these so the Phases are in sequence.



307	3.  BGP Link-State (BGP-LS) Relationship

309	   [I-D.ietf-idr-rfc7752bis] describes a mechanism by which link-state
310	   and TE information can be collected from networks and shared with
311	   external entities using BGP.  This is achieved by defining NLRI
312	   advertised using the BGP-LS AFI.  The BGP-LS extensions defined in
313	   [I-D.ietf-idr-rfc7752bis] make use of the decision process defined in
314	   [RFC4271].  This document reuses NLRI and TLVs defined in
315	   [I-D.ietf-idr-rfc7752bis].  Rather than reusing the BGP-LS SAFI, the
316	   BGP-LS-SPF SAFI Section 5.1 is introduced to insure backward
317	   compatibility for the BGP-LS SAFI usage.

[nit] s/SAFI Section 5.1/SAFI (Section 5.1)



319	   The BGP SPF extensions reuse the Node, Link, and Prefix NLRI defined
320	   in [I-D.ietf-idr-rfc7752bis].  The usage of the BGP-LS NLRI,
321	   attributes, and attribute extensions is described in Section 5.2.
322	   The usage of others BGP-LS attribute TLVs is not precluded and is, in
323	   fact, expected.  However, the details are beyond the scope of this
324	   document and will be specified in future documents.

[nit] s/will be specified/may be specified


[] "others BGP-LS attributes"

There is only one BGP-LS Attribute.  This got me thinking about
clarifying, simplifying and avoiding some redundancy.

Suggestion>

   [Eliminate this sentence from the previous paragraph: "This document
   reuses NLRI and TLVs defined in [RFC7752]."]

   The BGP SPF extensions reuse the format of the Link-State NLRI, the
   BGP-LS Attribute, and the TLVs defined in [RFC7752bis]. The usage of
   is described in Section 5.2. The usage of other BGP-LS TLVs or
   extensions is not precluded and is, in fact, expected. However, the
   details are beyond the scope of this document and may be specified
   in future documents.

   [Eliminate the next paragraph.]



326	   Support for Multiple Topology Routing (MTR) similar to the OSPF MTR
327	   computation described in [RFC4915] is beyond the scope of this
328	   document.  Consequently, the usage of the Multi-Topology TLV as
329	   described in section 3.2.1.5 of [I-D.ietf-idr-rfc7752bis] is not
330	   specified.

[] See the suggestion above.


[major] Also, unless used as part of the specification, please avoid
references to other protocols -- specially comparing them.



332	   The rules for setting the NLRI next-hop path attribute for the BGP-
333	   LS-SPF SAFI will follow the BGP-LS SAFI as specified in section 3.4
334	   of [I-D.ietf-idr-rfc7752bis].

[] "section 3.4 of [I-D.ietf-idr-rfc7752bis]"

That section doesn't exist.

This pointer contradicts what §5.3 says.  Please see comments there.



336	4.  BGP Peering Models

338	   Depending on the topology, scaling, capabilities of the BGP SPF
339	   speakers, and redundancy requirements, various peering models are
340	   supported.  The only requirements are that all BGP SPF speakers in
341	   the BGP SPF routing domain exchange BGP-LS-SPF NLRI, run an SPF
342	   calculation, and update their routing table appropriately.

[minor] These are not the only possible deployments or the only ones
supported, right?  For example, a mixture is also possible.  It would
be good to add something about these being examples of common
deployments.



344	4.1.  BGP Single-Hop Peering on Network Node Connections

346	   The simplest peering model is the one where EBGP single-hop sessions
347	   are established over direct point-to-point links interconnecting the
348	   nodes in the BGP SPF routing domain.  Once the single-hop BGP session
349	   has been established and the BGP-LS-SPF AFI/SAFI capability has been
350	   exchanged [RFC4760] for the corresponding session, then the link is
351	   considered up from a BGP SPF perspective and the corresponding BGP-
352	   LS-SPF Link NLRI is advertised.  Depending on the topology and
353	   convergence requirements, it MAY be required that BGP End-of-RIB
354	   (EoR) marker [RFC4724] is received for the BGP-LS-SPF SAFI prior to
355	   advertisement.  This is analogous to the IGP requirement that the
356	   Link State Databases be synchronized prior to advertisement of the
357	   router link.  Enforcement of requiring the EoR constraint SHOULD be
358	   configurable.  If the session goes down, the corresponding Link NLRI
359	   will be withdrawn.  Topologically, this would be equivalent to the
360	   peering model in [RFC7938] where there is a BGP session on every link
361	   in the data center switch fabric.  The content of the Link NLRI is
362	   described in Section 5.2.2.

[major] "BGP-LS-SPF AFI/SAFI capability"

This capability doesn't exist.  The Multiprotocol Extensions
capability including the BGP-LS-SPF AFI/SAFI pair does...

In the context of this document and in this section, it is not
necessary to talk about the capability at all because it is obvious
that the text is talking about a session used for BGP-LS-SPF...  Also,
the specification of the requirement to exchange the capability is
already in §5.1.  The text just adds complexity to what should be
simple text.


[**] I noticed the changes in -20 related to the EoR text.  Two comments:

- You only changed the text in this section, but the same applies to
the next two models.

- Because the text applies to all peering models, it would be better
to put it in §10.1.2 and keep it in a single place.



364	4.2.  BGP Peering Between Directly-Connected Nodes

366	   In this model, BGP SPF speakers peer with all directly-connected
367	   nodes but the sessions may be between loopback addresses (i.e., two-
368	   hop sessions) and the direct connection discovery and liveliness
369	   detection for the interconnecting links are independent of the BGP
370	   protocol.  For example, liveliness detection could be done using the
371	   BFD protocol [RFC5880].  Precisely how discovery and liveliness
372	   detection is accomplished is outside the scope of this document.
373	   Consequently, there will be a single BGP session even if there are
374	   multiple direct connections between BGP SPF speakers.  BGP-LS-SPF
375	   Link NLRI is advertised as long as a BGP session has been
376	   established, the BGP-LS-SPF AFI/SAFI capability has been exchanged
377	   [RFC4760], the link is operational as determined using liveliness
378	   detection mechanisms, and, optionally, the EoR Marker has been
379	   received as described in the Section 4.1.  This is much like the
380	   previous peering model only peering is between loopback addresses and
381	   the interconnecting links can be unnumbered.  However, since there
382	   are BGP sessions between every directly-connected node in the BGP SPF
383	   routing domain, there is only a reduction in BGP sessions when there
384	   are parallel links between nodes.  Hence, this peering model is
385	   RECOMMENDED over the single-hop peering model Section 4.1.

[nit] s/BGP-LS-SPF Link NLRI is/The BGP-LS-SPF Link NLRI is/g



387	4.3.  BGP Peering in Route-Reflector or Controller Topology
...
396	   This peering model, known as sparse peering, allows for fewer BGP
397	   sessions and, consequently, fewer instances of the same NLRI received
398	   from multiple peers.  Normally, the route-reflectors or controller
399	   BGP sessions would be on directly-connected links to avoid dependence
400	   on another routing protocol for session connectivity.  However,
401	   multi-hop peering is not precluded.  The number of BGP sessions is
402	   dependent on the redundancy requirements and the stability of the BGP
403	   sessions.  This is discussed in greater detail in
404	   [I-D.ietf-lsvr-applicability].

[minor] Does I-D.ietf-lsvr-applicability only cover this model or all
of them?  Maybe point at that draft at the start of §4.



...
415	5.1.  BGP-LS Shortest Path Routing (SPF) SAFI

417	   In order to replace the existing BGP decision process with an SPF-
418	   based decision process in a backward compatible manner by not
419	   impacting the BGP-LS SAFI, this document introduces the BGP-LS-SPF
420	   SAFI.  The BGP-LS-SPF (AFI 16388 / SAFI 80) [RFC4760] is allocated by
421	   IANA as specified in the Section 8.  In order for two BGP SPF
422	   speakers to exchange BGP SPF NLRI, they MUST exchange the
423	   Multiprotocol Extensions Capability [RFC5492] [RFC4760] to ensure
424	   that they are both capable of properly processing such NLRI.  This is
425	   done with AFI 16388 / SAFI 80 for BGP-LS-SPF advertised within the
426	   BGP SPF Routing Domain.  The BGP-LS-SPF SAFI is used to carry IPv4
427	   and IPv6 prefix information in a format facilitating an SPF-based
428	   decision process.

[minor] "to replace the existing BGP decision process with an
SPF-based decision process in a backward compatible manner by not
impacting the BGP-LS SAFI"

The replacement is not backwards compatible with anything.  Instead,
the SPF only applies to the new SAFI.

Suggestion>

   This document introduces the BGP-LS-SPF SAFI with a value of 80.
   The SPF-based decision process (Section 6) applies only to the
   BGP-LS-SPF SAFI and MUST NOT be used with other combinations of
   the BGP-LS AFI (16388).  In order for two BGP SPF speakers to
   exchange BGP-LS-SPF NLRI, they MUST exchange the Multiprotocol
   Extensions Capability [RFC4760] to ensure that they are both
   capable of properly processing such NLRI.  This is done with AFI
   16388 / SAFI 80.  The BGP-LS-SPF SAFI is used to carry IPv4 and
   IPv6 prefix information in a format facilitating an SPF-based
   decision process.



430	5.1.1.  BGP-LS-SPF NLRI TLVs

432	   The NLRI format of BGP-LS-SPF SAFI uses exactly same format as the
433	   BGP-LS AFI [I-D.ietf-idr-rfc7752bis].  In other words, all the TLVs
434	   used in BGP-LS AFI are applicable and used for the BGP-LS-SPF SAFI.
435	   These TLVs within BGP-LS-SPF NLRI advertise information that
436	   describes links, nodes, and prefixes comprising IGP link-state
437	   information.

[minor] The first sentence is already covered in §3, but with slightly
different words.  Using "exactly same format" and having all the TLVs
being applicable are not the same thing ("In other words").

Suggestion>
   All the TLVs defined for BGP-LS [rfc7752bis] are applicable and
   can be used with the BGP-LS-SPF SAFI to describe links, nodes,
   and prefixes comprising IGP link-state information.



439	   In order to compare the NLRI efficiently, it is REQUIRED that all the
440	   TLVs within the given NLRI must be ordered in ascending order by the
441	   TLV type.  For multiple TLVs of same type within a single NLRI, it is
442	   REQUIRED that these TLVs are ordered in ascending order by the TLV
443	   value field.  Comparison of the value fields is performed by treating
444	   the entire value field as a hexadecimal string.  NLRIs having TLVs
445	   which do not follow the ordering rules MUST be considered as
446	   malformed and discarded with appropriate error logging.

[major] This is (almost) exactly the same text that is in
§5.1/rfc7752bis.  If the behavior is the same as what rfc7752bis
specified, please don't respecify it again.

In this case, there has been a discussion during IESG Evaluation about
this specific text, which is already different in rfc7752bis-15 (and
may change some more).

   https://mailarchive.ietf.org/arch/msg/idr/j7Iuc6ZY22A4zeDPFOSMGgZ37Ds/
   https://mailarchive.ietf.org/arch/msg/idr/y9YPmWclAxRi-27UQ_d44ApUU7E/



448	   [I-D.ietf-idr-rfc7752bis] defines certain NLRI TLVs as a mandatory
449	   TLVs.  These TLVs are considered mandatory for the BGP-LS-SPF SAFI as
450	   well.  All the other TLVs are considered as an optional TLVs.

[minor] This is similar to rfc7752bis, even if the wording is
different; from §5.1:

   For both the NLRI and BGP-LS Attribute parts, all TLVs are considered
   as optional except where explicitly specified as mandatory or
   required in specific conditions.

I would prefer it if only changes or exceptions are listed.


[major] In rfc7752bis, there is no specific meaning to "mandatory": it
doesn't matter if the TLVs are there or not.  Does this "tag" matter
on BGP SPF?  What happens if a mandatory TLV is not present?

I know that I asked related questions in my last review.  I did it
with the assumption that in this case (BGP SPF) it mattered, even if
it didn't for BGP-LS.



452	   Given that there is a single BGP-LS Attribute for all the BGP-LS-SPF
453	   NLRI in a BGP Update, Section 3.3, [I-D.ietf-idr-rfc7752bis], a BGP
454	   Update for BGP-LS-SPF SAFI SHALL contain a single BGP-LS-SPF NLRI.
455	   It is unlikely that multiple NLRI would have identical attributes so,
456	   for simplicity, there will be one BGP-LS_SPF SAFI NLRI per BGP
457	   Update.

[major] "Section 3.3, [I-D.ietf-idr-rfc7752bis]"

rfc7752bis has changed and §3.3 doesn't exist.  To avoid reference
changes, don't include the section (for work in progress).



459	5.1.2.  BGP-LS Attribute

461	   The BGP-LS attribute of the BGP-LS-SPF SAFI uses exactly same format
462	   of the BGP-LS AFI [I-D.ietf-idr-rfc7752bis].  In other words, all the
463	   TLVs used in BGP-LS attribute of the BGP-LS AFI are applicable and
464	   used for the BGP-LS attribute of the BGP-LS-SPF SAFI.  This attribute
465	   is an optional, non-transitive BGP attribute that is used to carry
466	   link, node, and prefix properties and attributes.  The BGP-LS
467	   attribute is a set of TLVs.

[] "In other words" -- see my comment in the last section.



469	   The BGP-LS attribute may potentially grow large in size depending on
470	   the amount of link-state information associated with a single Link-
471	   State NLRI.  The BGP specification [RFC4271] mandates a maximum BGP
472	   message size of 4096 octets.  It is RECOMMENDED that an
473	   implementation support [RFC8654] in order to accommodate larger size
474	   of information within the BGP-LS Attribute.  BGP SPF speakers MUST
475	   ensure that they limit the TLVs included in the BGP-LS Attribute to
476	   ensure that a BGP update message for a single Link-State NLRI does
477	   not cross the maximum limit for a BGP message.  The determination of
478	   the types of TLVs to be included by the BGP SPF speaker originating
479	   the attribute is outside the scope of this document.  When a BGP SPF
480	   speaker finds that it is exceeding the maximum BGP message size due
481	   to addition or update of some other BGP Attribute (e.g., AS_PATH), it
482	   MUST consider the BGP-LS Attribute to be malformed and the attribute
483	   discard handling of [RFC7606] applies.

[major] As above, most of this text is a copy from rfc7752bis.


[major] "MUST consider the BGP-LS Attribute to be malformed and the
attribute discard handling of [RFC7606] applies."

In BGP SPF this is a serious problem because it results in loss of
synchronization: after a router discards the attribute all the other
routers won't have the same information as the ones before the
discard.

§7.3 has some text (also copied from rfc7752bis) that tries to say
that it is ok to discard the BGP-LS Attribute because the remains can
be used for troubleshooting.  Even if that was true (see my comments
there), the lack of synchronization remains.

Can this issue be mitigated?



485	   In order to compare the BGP-LS attribute efficiently, it is REQUIRED
486	   that all the TLVs within the given attribute must be ordered in
487	   ascending order by the TLV type.  For multiple TLVs of same type
488	   within a single attribute, it is REQUIRED that these TLVs are ordered
489	   in ascending order by the TLV value field.  Comparison of the value
490	   fields is performed by treating the entire value field as a
491	   hexadecimal string.  Attributes having TLVs which do not follow the
492	   ordering rules MUST NOT be considered as malformed.

[] See the comment above about the general text.


[major] This is a change in BGP-LS.  Also, there's a Normative
mismatch in that the order is REQUIRED, but not following that has no
effect.  This is the related text from §5.1/rfc7752bis:

   The TLVs within the BGP-LS Attribute SHOULD be ordered in ascending
   order by TLV type.  BGP-LS Attribute with unordered TLVs MUST NOT be
   considered malformed.



...
497	5.2.  Extensions to BGP-LS

499	   [I-D.ietf-idr-rfc7752bis] describes a mechanism by which link-state
500	   and TE information can be collected from IGPs and shared with
501	   external components using the BGP protocol.  It describes both the
502	   definition of the BGP-LS NLRI that advertise links, nodes, and
503	   prefixes comprising IGP link-state information and the definition of
504	   a BGP path attribute (BGP-LS attribute) that carries link, node, and
505	   prefix properties and attributes, such as the link and prefix metric
506	   or auxiliary Router-IDs of nodes, etc.  This document extends the
507	   usage of BGP-LS NLRI for the purpose of BGP SPF calculation via
508	   advertisement in the BGP-LS-SPF SAFI.

[] Some of this text is redundant with the new §3.



510	   The protocol identifier specified in the Protocol-ID field
511	   [I-D.ietf-idr-rfc7752bis] will represent the origin of the advertised
512	   NLRI.  For Node NLRI and Link NLRI, this MUST be the direct protocol
513	   (4).  Node or Link NLRI with a Protocol-ID other than direct will be
514	   considered malformed.  For Prefix NLRI, the specified Protocol-ID
515	   MUST be the origin of the prefix.  The local and remote node
516	   descriptors for all NLRI MUST include the BGP Identifier (TLV 516)
517	   and the AS Number (TLV 512) [I-D.ietf-idr-rfc7752bis].  The BGP
518	   Confederation Member (TLV 517) [I-D.ietf-idr-rfc7752bis] is not
519	   applicable and SHOULD NOT be included.  If TLV 517 is included, it
520	   MUST be ignored.

[] "If TLV 517 is included, it MUST be ignored."

Because there may be use for confederations in the future, let's drop
this sentence.



...
527	5.2.1.1.  BGP-LS-SPF Node NLRI Attribute SPF Capability TLV

529	   The SPF capability is an additional Node Attribute TLV.  This
530	   attribute TLV MUST be included with the BGP-LS-SPF SAFI and SHOULD
531	   NOT be used for other SAFIs.  The TLV type 1180 will be assigned by
532	   IANA.  The Node Attribute TLV will contain a single-octet SPF
533	   algorithm as defined in [RFC8665].

[minor] "...will be assigned by IANA."

It has been assigned already.  This note about IANA is no needed -- we
just need the requests in the IANA section.  Other sections have
similar text.



...
543	   The SPF algorithm inherits the values from the IGP Algorithm Types
544	   registry [RFC8665].  Algorithm 0, (Shortest Path Algorithm (SPF)
545	   based on link metric, is supported and described in Section 6.3.
546	   Support for other algorithm types is beyond the scope of this
547	   specification.

[major] For this initial definition, only 0 is valid.  But, what is
the general definition of the "SPF Algorithm" field?

Suggestion>
   The SPF Algorithm field is used to advertise the algorithm used by
   the router to calculate the paths to other routers in the BGP SPF
   routing domain.  This field inherits the values from...


[minor] Even though only algorithm 0 is supported at this time, it
should be called out (§10.1 maybe) that it is the operator's
responsibility to make sure that all the nodes support the same SPF
Algorithm.



...
556	5.2.1.2.  BGP-LS-SPF Node NLRI Attribute SPF Status TLV

[] The text in other sections describing the SPF Status TLV is very
similar, so the comments apply there too.



558	   A BGP-LS Attribute TLV of the BGP-LS-SPF Node NLRI is defined to
559	   indicate the status of the node with respect to the BGP SPF
560	   calculation.  This will be used to rapidly take a node out of service
561	   Section 6.5.2 or to indicate the node is not to be used for transit
562	   (i.e., non-local) traffic Section 6.3.  If the SPF Status TLV is not
563	   included with the Node NLRI, the node is considered to be up and is
564	   available for transit traffic.  The SPF status is acted upon with the
565	   execution of the next SPF calculation Section 6.3.  A single TLV type
566	   will be shared by the BGP-LS-SPF Node, Link, and Prefix NLRI.  The
567	   TLV type 1184 will be assigned by IANA.

[nit] s/Section 6.5.2...Section 6.3...Section 6.3/(Section
6.5.2)...(Section 6.3)...(Section 6.3)

There are several places where the section is mentioned without a
preposition (?): "Section x" instead of "in Section x".  Not all
occurrences would flow better by adding "in", so I suspect this may be
a formatting issue where the parenthesis were left out.  There are
several throughout, please take a look.



569	   0                   1                   2                   3
570	   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
571	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
572	   |   Type (1184)                 |       Length (1 Octet)        |
573	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
574	   | SPF Status    |
575	   +-+-+-+-+-+-+-+-+

577	   BGP Status Values: 0 - Reserved
578	   1 - Node Unreachable with respect to BGP SPF
579	   2 - Node does not support transit with respect
580	   to BGP SPF
581	   3-254 - Undefined
582	   255 - Reserved

[nit] s/BGP Status/SPF Status


[major] Should IANA manage the assignment of these values?  For this
TLV, the values defined for each NLRI are slightly different, so I'm
assuming there will really be 3 registries -- or one with different
columns...



584	   The BGP-LS-SPF Node Attribute SPF Status TLV, Link Attribute SPF
585	   Status TLV, and Prefix Attribute SPF Status TLV use the same TLV Type
586	   (1184).  This implies that a BGP Update cannot contain multiple NLRI
587	   with differing status.  If the BGP-LS-SPF Status TLV is advertised
588	   and the advertised value is not defined for all NLRI included in the
589	   BGP update, then the SPF Status TLV is ignored and not used in SPF
590	   computation but is still announced to other BGP SPF speakers.  An
591	   implementation MAY log an error for further analysis.

[nit] The first sentence is duplicated from above.


[major] "This implies that a BGP Update cannot contain multiple NLRI
with differing status."

§5.1.1 already says that "a BGP Update for BGP-LS-SPF SAFI SHALL
contain a single BGP-LS-SPF NLRI".  I guess you mean that the NLRIs
associated with a specific node must have the same status.  Is that
it?


[major] "for all NLRI included in the BGP update"

Again, each UPDATE will have a different NLRI, so I'm assuming you
mean "the NLRI's associated with a specific node".


[major] "the advertised value is not defined"

Is this action only followed when the value is undefined/unknown, or
also if they're different?

Are there any cases where the status can be different?  Can a link be
taken out of service even if the (resf of) the node is available?  Can
a prefix on a link be taken out of service even if the (resf of) the
link is available?


[major] "the SPF Status TLV is ignored and not used in SPF computation
but is still announced to other BGP SPF speakers."

It seems (because of the announcement part) that the text means that
the NLRI is not used for SPF; i.e. it implies unreachable.  Or did you
really mean that the TLV is ignored; i.e. the node is reachable?

BTW, I've been assuming that the UPDATES are propagated ("flooded")
without change to guarantee synchronization.


[] The Normative text below repeats some of what is above.



593	   If a BGP SPF speaker received the Node NLRI but the SPF Status TLV is
594	   not received, then any previously received information is considered
595	   as implicitly withdrawn and the update is propagated to other BGP SPF
596	   speakers.  A BGP SPF speaker receiving a BGP Update containing a SPF
597	   Status TLV in the BGP-LS attribute [I-D.ietf-idr-rfc7752bis] with a
598	   value that is outside the range of defined values SHOULD be processed
599	   and announced to other BGP SPF speakers.  However, a BGP SPF speaker
600	   MUST NOT use the Status TLV in its SPF computation.  An
601	   implementation MAY log this condition for further analysis.

[] "...and the update is propagated to other BGP SPF speakers...and
announced to other BGP SPF speakers."

The UPDATE is propagated anyway, right?


[] The Normative text in this paragraph is a restatement of what was
in the last one.


[minor] Is "outside the range of defined values" the same as undefined
or unknown?  Is so, s/outside the range of defined values/unknown


[] These comments apply to the other sections describing the Status TLV.



603	5.2.2.  Link NLRI Usage
...
608	   Link NLRI is advertised with unique local and remote node descriptors
609	   dependent on the IP addressing.  For IPv4 links, the link's local
610	   IPv4 (TLV 259) and remote IPv4 (TLV 260) addresses will be used.  For
611	   IPv6 links, the local IPv6 (TLV 261) and remote IPv6 (TLV 262)
612	   addresses will be used.  For unnumbered links, the link local/remote
613	   identifiers (TLV 258) will be used.  For links supporting having both
614	   IPv4 and IPv6 addresses, both sets of descriptors MAY be included in
615	   the same Link NLRI.  The link identifiers are described in table 5 of
616	   [I-D.ietf-idr-rfc7752bis].

[] In rfc7752bis Table 5 is "Prefix Descriptor TLVs".  Also, I think
you may want to point at "link descriptors", not "link identifiers".



618	   For a link to be used in Shortest Path Tree (SPT) for a given address
619	   family, i.e., IPv4 or IPv6, both routers connecting the link MUST
620	   have an address in the same subnet for that address family.  However,
621	   an IPv4 or IPv6 prefix associated with the link MAY be installed
622	   without the corresponding address on the other side of link.

[major] "both routers connecting the link MUST have an address in the
same subnet...an IPv4 or IPv6 prefix associated with the link MAY be
installed without the corresponding address on the other side of
link."

This is a Normative inconsistency: the text requires both sides to
have an address, but then there's an option to only use one.  Is there
a difference between using the link in the SPT and "installing" an
address?



624	   The link IGP metric attribute TLV (TLV 1095) MUST be advertised.  If
625	   a BGP SPF speaker receives a Link NLRI without an IGP metric
626	   attribute TLV, then it SHOULD consider the received NLRI as a
627	   malformed and the receiving BGP SPF speaker MUST handle such
628	   malformed NLRI as 'Treat-as-withdraw' [RFC7606].  The BGP SPF metric
629	   length is 4 octets.  Like OSPF [RFC2328], a cost is associated with
630	   the output side of each router interface.  This cost is configurable
631	   by the system administrator.  The lower the cost, the more likely the
632	   interface is to be used to forward data traffic.  One possible
633	   default for metric would be to give each interface a cost of 1 making
634	   it effectively a hop count.  Algorithms such as setting the metric
635	   inversely to the link speed as supported in the OSPF MIB [RFC4750]
636	   MAY be supported.  However, this is beyond the scope of this
637	   document.  Refer to Section 10.1.1 for operational guidance.

[minor] s/link IGP metric attribute TLV/IGP Metric TLV/g
That's what rfc7752 calls it.


[major] "SHOULD consider the received NLRI as a malformed NLRI"

When is it ok to not consider the NLRI as malformed?  Why is this
behavior recommended and not required?


[minor] "...a Link NLRI without an IGP metric attribute TLV...MUST
handle such malformed NLRI as 'Treat-as-withdraw'"

On one hand, I like that you discuss errors while talking about the
TLV.  On the other, there's a section later on which covers some of
the other TLVs.  Please be consistent.


[minor] s/Like OSPF [RFC2328], a cost is associated with the output
side of each router interface./The metric is associated with the
output side of each router interface.


[minor] The TLV talks about a "metric", but the description is about
"cost"...and in some cases both are used: "default for metric would be
to give each interface a cost of 1".   Other sections, like §10.1.1
also have similar consistency issues.  Please pick one and stick to
it.


[major] Please keep configuration-related guidance in one place --
§10.1.1 seems more appropriate.  Note that declaring metric setting as
out of scope but also using Normative language is also inconsistent
(from above):

   Algorithms such as setting the metric inversely to the link speed as
   supported in the OSPF MIB [RFC4750] MAY be supported.  However, this
   is beyond the scope of this document.

BTW, the OSPF MIB is not related to BGP SPF, please don't use it as a reference.


[?] About the metric default being out of scope...  Interoperability
still concerns me.  For example, if just using default metrics,
replacing a BGP SPF node with one from a different vendor may result
in a change in the traffic patterns.  It may not be clear to an
operator that the default configuration may be different.  Can we say
something like this?

   Setting a default metric of 1 is RECOMMENDED.  An implementation is
   expected to document any changes in this setting.



...
642	5.2.2.1.  BGP-LS-SPF Link NLRI Attribute Prefix-Length TLVs
...
662	   The Prefix-Length TLV is only relevant to Link NLRIs.  The Prefix-
663	   Length TLVs MUST be discarded as an error and not passed to other BGP
664	   peers as specified in [RFC7606] when received with any NLRIs other
665	   than Link NRLIs.  An implementation MAY log an error for further
666	   analysis.

[major] "TLVs MUST be discarded...as specified in [RFC7606]"

rfc7606 talks about attribute discard, which is not the same as
discarding a TLV.  There are similar comments in §7.

The next two paragraphs say the same thing.


668	   The maximum prefix-length for IPv4 Prefix-Length TLV is 32 bits.  A
669	   prefix-length field indicating a larger value than 32 bits MUST be
670	   discarded as an error and the received TLV is not passed to other BGP
671	   peers as specified in [RFC7606].  The corresponding Link NLRI is
672	   considered as malformed and MUST be handled as 'Treat-as-withdraw'.
673	   An implementation MAY log an error for further analysis.

[major] "MUST be discarded as an error...The corresponding Link NLRI
is considered as malformed and MUST be handled as
'Treat-as-withdraw'."

If the whole UPDATE is considered malformed, there's no point in also
discarding the TLV.

The next paragraph contains similar text.


675	   The maximum prefix-length for IPv6 Prefix-Length Type is 128 bits.  A
676	   prefix-length field indicating a larger value than 128 bits MUST be
677	   discarded as an error and the received TLV is not passed to other BGP
678	   peers as specified in [RFC7606].  The corresponding Link NLRI is
679	   considered as malformed and MUST be handled as 'Treat-as-withdraw'.
680	   An implementation MAY log an error for further analysis.



...
726	5.2.3.  IPv4/IPv6 Prefix NLRI Usage

728	   IPv4/IPv6 Prefix NLRI is advertised with a Local Node Descriptor and
729	   the prefix and length.  The Prefix Descriptors field includes the IP
730	   Reachability Information TLV (TLV 265) as described in
731	   [I-D.ietf-idr-rfc7752bis].  The Prefix Metric attribute TLV (TLV
732	   1155) MUST be advertised.  The IGP Route Tag TLV (TLV 1153) MAY be
733	   advertised.  The usage of other BGP-LS attribute TLVs is beyond the
734	   scope of this document.  For loopback prefixes, the metric should be
735	   0.  For non-loopback prefixes, the setting of the metric is a local
736	   matter and beyond the scope of this document.

[minor] s/The prefix metric attribute TLV/The Prefix Metric TLV


[] See the comments in §5.2.2 about the metric -- and let's keep all
the related recommendations/requirements in one place (§10.1.1).


[] "For loopback prefixes, the metric should be 0."

As suggested in §5.2.2, this could be part of the RECOMMENDED default settings.



...
781	5.2.4.  BGP-LS Attribute Sequence-Number TLV

783	   A BGP-LS Attribute TLV of the BGP-LS-SPF NLRI types is defined to
784	   assure the most recent version of a given NLRI is used in the SPF
785	   computation.  The Sequence-Number TLV is mandatory for BGP-LS-SPF
786	   NLRI.  The TLV type 1181 has been assigned by IANA.  The BGP-LS
787	   Attribute TLV will contain an 8-octet sequence number.  The usage of
788	   the Sequence Number TLV is described in Section 6.1.

[nit] s/mandatory for BGP-LS-SPF/mandatory for the BGP-LS-SPF


[nit] Phrases in the future, such as "will contain" are used in many
places in this document (there are 75 instances of the word "will"),
but the specification should be in the present tense ("the TLV
contains").  Just a nit, maybe something the RFC Editor can address,
but it would be nice if you did.



...
819	   The Sequence-Number TLV is mandatory for BGP-LS-SPF NLRI.  If the
820	   Sequence-Number TLV is not received then the corresponding Link NLRI
821	   is considered as malformed and MUST be handled as 'Treat-as-
822	   withdraw'.  An implementation MAY log an error for further analysis.

[minor] The first sentence is a repeat of text above.


[major] "TLV is not received then the corresponding Link NLRI is
considered as malformed"

Only the Link NLRI?  Or all the NLRIs?



824	5.3.  NEXT_HOP Manipulation

[nit] rfc4760 uses Next Hop to refer to the contents of the MP_*_NLRI
attributes, please be consistent with the terminology.
s/Next-Hop/Next Hop/g

That way the word could be distinguished from the "next-hop" calculated by SPF.


826	   All BGP peers that support SPF extensions would locally compute the
827	   Local-RIB Next-Hop as a result of the SPF process.  Consequently, the
828	   Next-Hop is always ignored on receipt.  The Next-Hop address MUST be
829	   encoded as described in [RFC4760].  BGP SPF speakers MUST interpret
830	   the Next-Hop address of MP_REACH_NLRI attribute as an IPv4 address
831	   whenever the length of the Next-Hop address is 4 octets, and as a
832	   IPv6 address whenever the length of the Next-Hop address is 16
833	   octets.

[major] This first paragraph says that the "Next-Hop is always ignored
on receipt", but also talks about requirements for it to be set ("MUST
be encoded as described in [RFC4760]") and interpreted when received
("MUST interpret the Next-Hop address of MP_REACH_NLRI attribute...").
Which one is it?  If ignored, does it matter what it is set to?
Should it be checked when received?  What happens if it is not set as
expected?

The text above doesn't say when IPv4 or IPv6 should be used.  I can
see how the Prefix NLRI may provide a hint, but what about the other
NLRIs?

Note that rfc7752bis says that "the next-hop will be set to the local
endpoint address of the BGP session."  For BGP SPF, it doesn't seem to
matter whether the BGP session runs over IPv4 or IPv6...so that seems
like a nice "compromise": set the next-hop according to the address of
the session.

Assuming you want to set it and check it, here's a suggestion:

   All BGP SPF speakers locally compute the Local-RIB Next-Hop as a
   result of the SPF process (Section 6). Consequently, the Next Hop
   information in the BGP UPDATE is not used as described in [RFC4760].
   The Next Hop address is set to the local endpoint address of the
   BGP session [rfc7752bis].

rfc7752bis doesn't say what to do if the Next Hop doesn't match the
BGP session endpoint, so I didn't add anything here.

Another option would be to simply set the Length of Next Hop Network
Address field to 0 and forget about it.  It is not clear from rfc4760
that 0 would be ok -- and I'm not sure if it would break any
implementations.  OTOH, the AFI/SAFI combination determines what the
Next Hop looks like, so for the BGP-LS-SPF SAFI it could be that:
nothing.


835	   [RFC4760] modifies the rules of NEXT_HOP attribute whenever the
836	   multiprotocol extensions for BGP-4 are enabled.  BGP SPF speakers
837	   MUST set the NEXT_HOP attribute according to the rules specified in
838	   [RFC4760] as the BGP-LS-SPF routing information is carried within the
839	   multiprotocol extensions for BGP-4.

[major] This paragraph talks about the NEXT_HOP attribute and requires
be set "according to the rules specified in [RFC4760]".  However,
rfc4760 says:

   An UPDATE message that carries no NLRI, other than the one encoded in
   the MP_REACH_NLRI attribute, SHOULD NOT carry the NEXT_HOP attribute.
   If such a message contains the NEXT_HOP attribute, the BGP speaker
   that receives the message SHOULD ignore this attribute.

Given that BGP SPF inherits that, I don't think the paragraph above is
needed.  Also, the requirement can cause confusion as rfc4760 doesn't
specify what to set the value to.



841	6.  Decision Process with SPF Algorithm
...
881	   The Phase 3 decision function of the Decision Process [RFC4271] is
882	   also simplified since under normal SPF operation, a BGP SPF speaker
883	   MUST advertise the changed NLRIs to all BGP peers with the BGP-LS-SPF
884	   AFI/SAFI and install the changed routes in the Global RIB.  The only
885	   exception are unchanged NLRIs or stale NLRIs, i.e., NLRI received
886	   with a less recent (numerically smaller) sequence number.

[nit] s/Global RIB/GLOBAL-RIB



888	6.1.  BGP NLRI Selection

[minor] s/BGP NLRI Selection/BGP SPF NLRI Selection



890	   The rules for all BGP-LS-SPF NLRIs selection for phase 1 of the BGP
891	   decision process, section 9.1.1 [RFC4271], no longer apply.

[minor] Some type of transition is missing between the initial
paragraph and the bullets below.  Maybe remind people that this first
step is about selecting the advertisements that will be used in the
calculation.



893	   1.  Routes originated by directly connected BGP SPF peers are
894	       preferred.  This condition can be determined by comparing the BGP
895	       Identifiers in the received Local Node Descriptor and OPEN
896	       message.  This rule will assure that stale NLRI is updated even
897	       if a BGP-LS router loses its sequence number state due to a cold-
898	       start.

900	   2.  The NLRI with the most recent Sequence Number TLV, i.e., highest
901	       sequence number is selected.

903	   3.  The route received from the BGP SPF speaker with the numerically
904	       larger BGP Identifier is preferred.

[] These steps mix "route" and "NLRI".  At this point we're not
talking about "routes" (which haven't been calculated), but about NLRI
advertisements, so maybe using NLRI consistently helps.



906	   When a BGP SPF speaker completely loses its sequence number state,
907	   i.e., due to a cold start, or in the unlikely possibility that 64-bit
908	   sequence number wraps, the BGP routing domain will still converge.
909	   This is due to the fact that BGP SPF speakers adjacent to the router
910	   will always accept self-originated NLRI from the associated speaker
911	   as more recent (rule # 1).  When a BGP SPF speaker reestablishes a
912	   connection with its peers, any existing session will be taken down
913	   and stale NLRI will be replaced.  The adjacent BGP SPF speaker will
914	   update their NLRI advertisements, hop by hop, until the BGP routing
915	   domain has converged.

[nit] "The adjacent BGP SPF speaker will update their NLRI
advertisements, hop by hop, until the BGP routing domain has
converged."

This sentence sounds as if the BGP SPF speaker advertises hop by hop...

Suggestion>
   The adjacent BGP SPF speaker updates their NLRI advertisements
   and the BGP routing domain converges as they are propagated.



917	   The modified SPF Decision Process performs an SPF calculation rooted
918	   at the BGP SPF speaker using the metrics from the Link Attribute IGP
919	   Metric TLV (1095) and the Prefix Attribute Prefix Metric TLV (1155)
920	   [I-D.ietf-idr-rfc7752bis].  As a result, any other BGP attributes
921	   that would influence the BGP decision process defined in [RFC4271]
922	   including ORIGIN, MULTI_EXIT_DISC, and LOCAL_PREF attributes are
923	   ignored by the SPF algorithm.  The NEXT_HOP attribute is discussed in
924	   Section 5.3.  The AS_PATH and AS4_PATH [RFC6793] attributes are
925	   preserved and used for loop detection [RFC4271].  They are ignored
926	   during the SPF computation for BGP-LS-SPF NRLIs.

[minor] s/rooted at the BGP SPF speaker/rooted at the local BGP SPF speaker



928	6.1.1.  BGP Self-Originated NLRI

930	   Node, Link, or Prefix NLRI with Node Descriptors matching the local
931	   BGP SPF speaker are considered self-originated.  When self-originated
932	   NLRI is received and it doesn't match the local node's NLRI content
933	   (including sequence number), special processing is required.

935	   *  If a self-originated NLRI is received and the sequence number is
936	      more recent (i.e., greater than the local node's sequence number
937	      for the NLRI), the NLRI sequence number will be advanced to one
938	      greater than the received sequence number and the NLRI will be
939	      readvertised to all peers.

[nit] s/If a self-originated/If self-originated



941	   *  If self-originated NLRI is received and the sequence number is the
942	      same as the local node's sequence number but the attributes
943	      differ, the NLRI sequence number will be advanced to one greater
944	      than the received sequence number and the NLRI will be
945	      readvertised to all peers.

947	   *  If self-originated Link or Prefix NLRI is received and the Link or
948	      Prefix NLRI is no longer being advertised by the local node, the
949	      NLRI will be withdrawn.

[major] All 3 conditions say that something "will be" done.  Can they
be made Normative (MUST?)?



...
969	6.3.  SPF Calculation based on BGP-LS-SPF NLRI

971	   This section details the BGP-LS-SPF local routing information base
972	   (RIB) calculation.  The router will use BGP-LS-SPF Node, Link, and
973	   Prefix NLRI to compute routes using the following algorithm.  This
974	   calculation yields the set of routes associated with the BGP SPF
975	   Routing Domain.  A router calculates the shortest-path tree using
976	   itself as the root.  Optimizations to the BGP-LS-SPF algorithm are
977	   possible but MUST yield the same set of routes.  The algorithm below
978	   supports Equal Cost Multi-Path (ECMP) routes.  Weighted Unequal Cost
979	   Multi-Path routes are out of scope.  The organization of this section
980	   owes heavily to section 16 of [RFC2328].

[minor] "The organization of this section owes heavily to section 16
of [RFC2328]."

Nice, but there's no need to mention OSPF.



982	   The following abstract data structures are defined in order to
983	   specify the algorithm.

985	   *  Local Route Information Base (Local-RIB) - This routing table
986	      contains reachability information (i.e., next hops) for all
987	      prefixes (both IPv4 and IPv6) as well as BGP-LS-SPF node
988	      reachability.  Implementations may choose to implement this with
989	      separate RIBs for each address family and/or Prefix versus Node
990	      reachability.  While slightly different in scope and semantics, It
991	      is analogous to the Loc-RIB specified in [RFC4271].

[minor] "While slightly different in scope and semantics, It is
analogous to the Loc-RIB specified in [RFC4271]."

Not needed.



993	   *  Global Routing Information Base (GLOBAL-RIB) - This is Routing
994	      Information Base (RIB) containing the current routes that are
995	      installed in the router's forwarding plane.  This is commonly
996	      referred to in networking parlance as "the RIB".

[nit] s/This is Routing Information Base (RIB) containing/This is a
Routing Information Base (RIB) containing



...
1001	   *  Candidate List (CAN-LIST) - This is a list of candidate Node NLRIs
1002	      used during the BGP SPF calculation Section 6.3.  The list is
1003	      sorted by the cost to reach the Node NLRI with the Node NLRI with
1004	      the lowest reachability cost at the head of the list.  This
1005	      facilitates execution of the Dijkstra algorithm Section 1.1 where
1006	      the shortest paths between the local node and other nodes in graph
1007	      area computed.  The CAN-LIST is typically implemented as a heap
1008	      but other data structures have been used.

[nit] Delete "Section 6.3" as this is that section.



...
1022	   3.  The Node NLRI with the lowest cost is removed from the candidate
1023	       list for processing.  If the BGP-LS Node attribute doesn't
1024	       include an SPF Capability TLV (Section 5.2.1.1, the Node NLRI is
1025	       ignored and the next lowest cost Node NLRI is selected from
1026	       candidate list.  The If the BGP-LS Node attribute includes an SPF
1027	       Status TLV (Section 5.2.1.1) indicating the node is unreachable,
1028	       the Node NLRI is ignored and the next lowest cost Node NLRI is
1029	       selected from candidate list.  The Node corresponding to this
1030	       NLRI will be referred to as the Current-Node.  If the candidate
1031	       list is empty, the SPF calculation has completed and the
1032	       algorithm proceeds to step 6.

[nit] s/candidate list/CAN-LIST/g
For consistency.


[major] "If the BGP-LS Node attribute doesn't include an SPF
Capability TLV (Section 5.2.1.1, the Node NLRI is ignored and the next
lowest cost Node NLRI is selected from candidate list."

This condition is already considered in §6: "The SPF...process starts
with selecting only those Node NLRI whose SPF capability TLV matches
with the local BGP SPF speaker's SPF capability TLV value."  You don't
need to repeat here.


[?] "The If the BGP-LS Node attribute includes an SPF Status TLV
(Section 5.2.1.1) indicating the node is unreachable, the Node NLRI is
ignored and the next lowest cost Node NLRI is selected from candidate
list."

Why aren't all the NLRI with an SPT Status TLV indicating unreachable
taken out of consideration upfront (in §6.1)?

s/5.2.1.1/5.2.1.2



1034	   4.  All the Prefix NLRI with the same Node Identifiers as the
1035	       Current-Node will be considered for installation.  The next-
1036	       hop(s) for these Prefix NLRI are inherited from the Current-Node.
1037	       The cost for each prefix is the metric advertised in the Prefix
1038	       Attribute Prefix Metric TLV (1155) added to the cost to reach the
1039	       Current-Node.  The following will be done for each Prefix NLRI
1040	       (referred to as the Current-Prefix):

[major] s/Node Identifiers/Local Node Descriptors (?)

§5.2.1.1/rfc7752bis (Globally Unique Node/Link/Prefix Identifiers)
talks about the Identifier field (BGP-LS Instance-ID) and how that +
an  "IGP representation" (area id, etc.) can uniquely identify a node.
This document (in §5.2) says that:

   The local and remote node descriptors for all NLRI MUST include
   the BGP Identifier (TLV 516) and the AS Number (TLV 512)
   [I-D.ietf-idr-rfc7752bis].

This requirement is consistent with rfc6286 to uniquely identify a BGP node.



...
1067	       *  If the Current-Prefix's corresponding prefix is in the Local-
1068	          RIB and the cost is the same as the Local-RIB route's metric,
1069	          the Current-Node's next-hops will be merged with Local-RIB
1070	          route's next-hops.  If the number of merged next-hops exceeds
1071	          the Equal-Cost Multi-Path (ECMP) limit, the number of next-
1072	          hops is reduced with next-hops on numbered links preferred
1073	          over next-hops on unnumbered links.  Among next-hops on
1074	          numbered links, the next-hops with the highest IPv4 or IPv6
1075	          addresses are preferred.  Among next-hops on unnumbered links,
1076	          the next-hops with the highest Remote Identifiers are
1077	          preferred [RFC5307].  If the IGP Route Tag TLV (1153) is
1078	          included in the Current-Prefix's NLRI Attribute, the tag(s)
1079	          are merged into the Local-RIB route's current tags.

[minor] "next-hops will be merged"

Perhaps something like: the next-hops will be added to ...


[minor] This is the only place that talks about an "(ECMP) limit",
which is not part of the spec, but a potential limitation of the
underlying hw.  It would be ideal if the limit can be explicitly
called out as out of scope, maybe in §6.3 (where ECMP is mentioned).

Suggestion>
OLD>
   The algorithm below supports Equal Cost Multi-Path (ECMP) routes.
   Weighted Unequal Cost Multi-Path routes are out of scope.

NEW (in a new paragraph)>
   The algorithm below supports Equal Cost Multi-Path (ECMP) routes.
   Some platforms or implementations may have limits on the number of
   ECMP routes that can be supported.  The setting or identification
   of any limitations is outside the scope if this document.
   Nonetheless, step 4 (below) includes a set of recommendations in
   case such as limit is encountered.  Weighted Unequal Cost Multi-Path
   routes are out of scope as well.



1081	   5.  All the Link NLRI with the same Node Identifiers as the Current-
1082	       Node will be considered for installation.  Each link will be
1083	       examined and will be referred to in the following text as the
1084	       Current-Link.  The cost of the Current-Link is the advertised IGP
1085	       Metric TLV (1095) from the Link NLRI BGP-LS attribute added to
1086	       the cost to reach the Current-Node.  If the Current-Node is for
1087	       the local BGP Router, the next-hop for the link will be a direct
1088	       next-hop pointing to the corresponding local interface.  For any
1089	       other Current-Node, the next-hop(s) for the Current-Link will be
1090	       inherited from the Current-Node.  The following will be done for
1091	       each link:

[] "If the Current-Node is for the local BGP Router, the next-hop for
the link will be a direct next-hop pointing to the corresponding local
interface."

Similar text should also exist in step 4.



1093	       a.  The prefix(es) associated with the Current-Link are installed
1094	           into the Local-RIB using the same rules as were used for
1095	           Prefix NLRI in the previous steps.  Optionally, in
1096	           deployments where BGP-SPF routers have limited routing table
1097	           capacity, installation of these subnets can be suppressed.
1098	           Suppression will have an operational impact as the IPv4/IPv6
1099	           link endpoint addresses will not be reachable and tools such
1100	           as traceroute will display addresses that are not reachable.

[major] "The prefix(es) associated with the Current-Link..."

I may very well be confused, please help me out.

A node has links and prefixes.  Prefixes may be associated with the
node (step 4) or a link (this step), or both (?).

The Link Descriptor TLVs from rfc7752bis include only the host-address
on the interface (carried in the IPv4/IPv6 interface/neighbor address
TLVs (259/260/261/262)), without an indication of the prefix length,
which is carried only in the Prefix NLRI using the IP Reachability
Information TLV (256).  Even though it is out of scope in rfc7752bis,
the BGP-LS Consumer can use the IP Address (from the Link TLVs) and
the Node Descriptors to find the relationship between the a Prefix
NLRI and the corresponding Link.

This document also uses TLV 256 in the Prefix NLRI.  But it adds the
Attribute Prefix-Length TLVs to the Link NLRI, which now results in a
prefix (address/length) also being associated directly with a link.
This is where the "prefix(es) associated with the Current-Link" come
from, right?

I see how a Prefix can be advertised "without" a Link: associated to
the loopback, for example.  Or how a prefix is associated with a link:
it's the address on the link.

But a prefix can also be advertised in both the Prefix NLRI and Link
NLRI.  Is that the intent here?  What if the information is not the
same?

I guess that the length of the prefix can de different -- for example,
/30 for the link and /24 for the prefix.  These would be different
prefixes.  Is the other way around (a more specific Prefix NLRI) ok?

The Link NLRI is required to use the IGP Metric TLV (1095) to
advertise the metric, while the Prefix NLRI uses the Prefix Metric
attribute TLV (1155).  Even if using different TLVs, the values could
be compared.  What if a prefix is advertised in both NLRIs with
different metric (by the same node)?

The second part of the paragraph talks about "link endpoint
addresses", which I take to be host routes.  If these (host routes)
are the prefixes we're talking about here, please be clear about it.
I wonder why the Attribute Prefix-Length TLVs are needed and whether
it matters (or makes sense) if the prefix length is set to anything
other than 32/128.



1102	       b.  If the Current-Node NLRI attributes includes the SPF status
1103	           TLV (Section 5.2.1.2) and the status indicates that the Node
1104	           doesn't support transit, the next link for the Current-Node
1105	           is processed in Step 5.

[?] In this case, should any associated prefixes be considered or not (step a)?



1107	       c.  If the Current-Link's NLRI attribute includes an SPF Status
1108	           TLV indicating the link is down, the BGP-LS-SPF Link NLRI is
1109	           considered down and the next link for the Current-Node is
1110	           examined in Step 5.

[major] This step should be examined before a (and any prefixes are installed).



1112	       d.  The Current-Link's Remote Node NLRI is accessed (i.e., the
1113	           Node NLRI with the same Node identifiers as the Current-
1114	           Link's Remote Node Descriptors).  If it exists, it will be
1115	           referred to as the Remote-Node and the algorithm will proceed
1116	           as follows:

1118	           *  If the Remote-Node's NLRI attribute includes an SPF Status
1119	              TLV indicating the node is unreachable, the next link for
1120	              the Current-Node is examined in Step 5.

1122	           *  All the Link NLRI corresponding the Remote-Node will be
1123	              searched for a Link NLRI pointing to the Current-Node.
1124	              Each Link NLRI is examined for Remote Node Descriptors
1125	              matching the Current-Node and Link Descriptors matching
1126	              the Current-Link.  For numbered links to match, the Link
1127	              Descriptors MUST share a common IPv4 or IPv6 subnet.  For
1128	              unnumbered links to match, the Current Link's Local
1129	              Identifier MUST match the Remote Node Link's Remote
1130	              Identifier and the Current Link's Remote Identifier MUST
1131	              the Remote Node Link's Local Identifier [RFC5307].  If
1132	              these conditions are satisfied for one of the Remote-
1133	              Node's links, the bi-directional connectivity check
1134	              succeeds and the Remote-Node may be processed further.
1135	              The Remote-Node's Link NLRI providing bi-directional
1136	              connectivity will be referred to as the Remote-Link.  If
1137	              no Remote-Link is found, the next link for the Current-
1138	              Node is examined in Step 5.

[major] "[RFC5307]"

I know that's where the contents of the TLV are detailed.  Instead of
pointing at an IS-IS specification, let's use "(see TLV 258
[rfc7752bis])".


[major] "the Remote-Node may be processed further"

What does this mean?  Does it mean that we now process the Remote Node
NLRI starting in Step 3, or simply that we move on?

s/If these conditions are satisfied for one of the Remote-Node's
links, the bi-directional connectivity check succeeds and the
Remote-Node may be processed further./If these conditions are
satisfied for one of the Remote-Node's links, the bi-directional
connectivity check succeeds.



...
1179	       *  If the route was not installed during the current BGP SPF
1180	          computation, remove the route from both the GLOBAL-RIB and the
1181	          Local-RIB.

[minor] "If the route was not installed...remove the route from...the
Local-RIB."

It is was not installed then it is not in the Local-RIB.



1183	6.4.  IPv4/IPv6 Unicast Address Family Interaction

1185	   While the BGP-LS-SPF address family and the IPv4/IPv6 unicast address
1186	   families MAY install routes into the same device routing tables, they
1187	   will operate independently much the same as OSPF and IS-IS would
1188	   operate today (i.e., "Ships-in-the-Night" mode).  There is no
1189	   implicit route redistribution between the BGP address families.

[] How can IPv6 routes be redistributed into an IPv4 address family --
regardless of the protocol?  This paragraph doesn't make any sense to
me.

Also, s/MAY/may because it is stating a fact.  No need to even mention
OSPF/IS-IS as different address families already operate
independently.



1191	   It is RECOMMENDED that BGP-LS-SPF IPv4/IPv6 route computation and
1192	   installation be given scheduling priority by default over other BGP
1193	   address families as these address families are considered as underlay
1194	   SAFIs.  Similarly, it is RECOMMENDED that the route preference or
1195	   administrative distance give active route installation preference to
1196	   BGP-LS-SPF IPv4/IPv6 routes over BGP routes from other AFI/SAFIs.
1197	   However, this preference MAY be overridden by an operator-configured
1198	   policy.

[major] "RECOMMENDED that the route preference or administrative
distance give active route installation preference...over [other] BGP
routes...MAY be overridden by an operator-configured policy."

I see how the the initial recommendation (scheduling) is
implementation-dependent.  But the second one is an operational
consideration.  When is it ok to not prefer BGP SPF routes?  What is
the effect?  What should an operator consider when deciding to change
the configuration?  The answers could belong in a subsection of §10.1.



...
1201	6.5.1.  Link/Prefix Failure Convergence
...
1210	   An IGP such as OSPF [RFC2328] will stop using the link as soon as the
1211	   Router-LSA for one side of the link is received.  With a BGP
1212	   advertisement, the link would continue to be used until the last copy
1213	   of the BGP-LS-SPF Link NLRI is withdrawn.  In order to avoid this
1214	   delay, the originator of the Link NLRI SHOULD advertise a more recent
1215	   version with an increased Sequence Number TLV for the BGP-LS-SPF Link
1216	   NLRI including the SPF Status TLV (Section 5.2.2.2) indicating the
1217	   link is down with respect to BGP SPF.  The configurable
1218	   LinkStatusDownAdvertise timer controls the interval that the BGP-LS-
1219	   LINK NLRI is advertised with SPF Status indicating the link is down
1220	   prior to withdrawal.  If the link becomes available in that period,
1221	   the originator of the BGP-LS-SPF LINK NLRI SHOULD advertise a more
1222	   recent version of the BGP-LS-SPF Link NLRI without the SPF Status TLV
1223	   in the BGP-LS Link Attributes.  The suggested default value for the
1224	   LinkStatusDownAdvertise timer is 2 seconds.

[] "An IGP such as OSPF..."

Don't compare with other protocols.


[major] "With a BGP advertisement, the link would continue to be used
until the last copy of the BGP-LS-SPF Link NLRI is withdrawn."

I don't think this is true because Step 5.d checks the Remote
Node...so if only one side was withdrawn it would still not find the
link.  What did I miss?


[major] "In order to avoid this delay, the originator of the Link NLRI
SHOULD advertise..."

When is it ok to not advertise?  Why is this behavior recommended and
not required?


[major] "If the link becomes available in that period, the originator
of the BGP-LS-SPF LINK NLRI SHOULD advertise..."

When is it ok to not advertise?  Why is this behavior recommended and
not required?


[major] The combination of the SHOULDs above results in no specific
requirement to advertise the failure "fast".  Or the recovery.  Given
that BGP SPF is a link state protocol, I would expect the link up/down
notifications to happen immediately.

The text above also doesn't mention when the Link NLRI should be withdrawn.


[?] Just wondering...  Does using the Status TLV vs withdrawing the
Link NLRI result in any type of performance improvement?  It seems
(from the SPF description) that not having a link or finding a Status
TLV is roughly the same.



1226	   Similarly, when a prefix becomes unreachable, a more recent version
1227	   of the BGP-LS-SPF Prefix NLRI SHOULD be advertised with the SPF
1228	   Status TLV (Section 5.2.3.1) indicating the prefix is unreachable in
1229	   the BGP-LS Prefix Attributes and the prefix will be considered
1230	   unreachable with respect to BGP SPF.  The configurable
1231	   PrefixStatusDownAdvertise timer controls the interval that the BGP-
1232	   LS-Prefix NLRI is advertised with SPF Status indicating the prefix is
1233	   unreachable prior to withdrawal.  If the prefix becomes reachable in
1234	   that period, the originator of the BGP-LS-SPF Prefix NLRI SHOULD
1235	   advertise a more recent version of the BGP-LS-SPF Prefix NLRI without
1236	   the SPF Status TLV in the BGP-LS Prefix Attributes.  The suggested
1237	   default value for the PrefixStatusDownAdvertise timer is 2 seconds.

[] Same comment/questions as above.



1239	6.5.2.  Node Failure Convergence

1241	   With BGP without graceful restart [RFC4724], all the NLRI advertised
1242	   by a node are implicitly withdrawn when a session failure is
1243	   detected.  If fast failure detection such as BFD is utilized, and the
1244	   node is on the fastest converging path, the most recent versions of
1245	   BGP-LS-SPF NLRI may be withdrawn.  This will result into an older
1246	   version of the NLRI being used until the new versions arrive and,
1247	   potentially, unnecessary route flaps.  For the BGP-LS-SPF SAFI, NLRI
1248	   SHOULD NOT be implicitly withdrawn immediately to prevent such
1249	   unnecessary route flaps.  The configurable
1250	   NLRIImplicitWithdrawalDelay timer controls the interval that NLRI is
1251	   retained prior to implicit withdrawal after a BGP SPF speaker has
1252	   transitioned out of Established state.  This will not delay
1253	   convergence since the adjacent nodes will detect the link failure and
1254	   advertise a more recent NLRI indicating the link is down with respect
1255	   to BGP SPF (Section 6.5.1) and the BGP SPF calculation will fail the
1256	   bi-directional connectivity check Section 6.3.  The suggested default
1257	   value for the NLRIImplicitWithdrawalDelay timer is 2 seconds.

[minor] s/implicitly withdrawn/withdrawn

The withdraw would be implicit if there was a replacement route -- but
it would be explicit if there wasn't one.  The important point here is
to point that the NLRI is withdrawn.


[nit] I find the mention of GR weird -- the default behavior [rfc4271]
is to clear all the routes...which is what GR was put in place to
address, so it feels like a long way to talk about the default.

Suggestion>
   By default [rfc4271], all the NLRI advertised by a node are
   withdrawn when a session failure is detected.


[?] I had to read this several times:

   ...the node is on the fastest converging path, the most recent
   versions of BGP-LS-SPF NLRI may be withdrawn.  This will result
   into an older version of the NLRI being used until the new versions
   arrive and, potentially, unnecessary route flaps.

This text assumes some type of strange lack of synchronization, and it
feels to me to be a corner case: when the node fails as soon as it
advertised NLRIs, faster than the NLRIs can be propagated.  Is this
the common case?  Do you really expect the network to take 2 seconds
to converge (NLRIImplicitWithdrawalDelay)?

I would have assumed that the Sequence Number would take care of
removing all NLRIs, old and new.  Why isn't that used?

Unlike a "plain" rfc4271 withdraw, rfc4760 doesn't preclude carrying
other attributes when MP_UNREACH_NLRI is present.


[minor] "For the BGP-LS-SPF SAFI, NLRI SHOULD NOT be...NLRI is
retained...advertise a more recent NLRI"

Please be specific about which NLRI you're talking about.  I realize
that this section is about node failure, so I can guess that for the
first 2 mentions you mean "Node NLRI" and "Link NLRI" for the last
one...but I shouldn't have to guess.


[major] "NLRI SHOULD NOT be implicitly withdrawn immediately"

When is it ok to withdraw it?  Why is this a recommendation and not a
requirement?



...
1265	7.1.  Processing of BGP-LS-SPF TLVs

1267	   When a BGP SPF speaker receives a BGP Update containing a malformed
1268	   Node NLRI SPF Status TLV in the BGP-LS Attribute
1269	   [I-D.ietf-idr-rfc7752bis], it MUST ignore the received TLV and MUST
1270	   NOT pass it to other BGP peers as specified in [RFC7606].  When
1271	   discarding an associated Node NLRI with a malformed TLV, a BGP SPF
1272	   speaker SHOULD log an error for further analysis.

[major] "malformed...TLV in the BGP-LS Attribute...it MUST ignore the
received TLV and MUST NOT pass it to other BGP peers as specified in
[RFC7606]."

rfc7606 doesn't talk about TLVs.  It does offer "attribute discard" as
an option.  Did you mean that the BGP-LS Attribute must be discarded,
or are you really talking about just discarding the TLV?

The last sentence makes things more confusing as it introduces
"discarding an associated Node NLRI with a malformed TLV" -- is the
NLRI to be discarded too or ??

Note that rfc7606 says this:

   o  Attribute discard: In this approach, the malformed attribute MUST
      be discarded and the UPDATE message continues to be processed.
      This approach MUST NOT be used except in the case of an attribute
      that has no effect on route selection or installation.

Regardless of what is discarded (the TLV or the attribute), the action
goes against rfc7606 because the Status TLV has a direct impact on
both route selection and installation.

Also, ignoring the Status TLV will most likely result in the incorrect
SPF decision as the TLV is used to indicate some type of
unavailability -- if ignored and SPF runs assuming that the Node is
available...

The last sentence talks about "discarding an associated Node NLRI with
a malformed TLV", but it doesn't say that this action is to be taken.
The rfc7607 approach to "discarding" NLRI is to treat-as-withdraw.  Is
this the action?


** These comments apply to all the other entries in this section
related to the Status TLV.



1274	   When a BGP SPF speaker receives a BGP Update containing a malformed
1275	   Link NLRI SPF Status TLV in the BGP-LS Attribute
1276	   [I-D.ietf-idr-rfc7752bis], it MUST ignore the received TLV and MUST
1277	   NOT pass it to other BGP peers as specified in [RFC7606].  When
1278	   discarding an associated Link NLRI with a malformed TLV, a BGP SPF
1279	   speaker SHOULD log an error for further analysis.

1281	   When a BGP SPF speaker receives a BGP Update containing a malformed
1282	   Prefix NLRI SPF Status TLV in the BGP-LS Attribute
1283	   [I-D.ietf-idr-rfc7752bis], it MUST ignore the received TLV and MUST
1284	   NOT pass it to other BGP peers as specified in [RFC7606].  When
1285	   discarding an associated Prefix NLRI with a malformed TLV, a BGP SPF
1286	   speaker SHOULD log an error for further analysis.

1288	   When a BGP SPF speaker receives a BGP Update containing a malformed
1289	   SPF Capability TLV in the Node NLRI BGP-LS Attribute
1290	   [I-D.ietf-idr-rfc7752bis], it MUST ignore the received TLV and the
1291	   Node NLRI and MUST NOT pass it to other BGP peers as specified in
1292	   [RFC7606].  When discarding a Node NLRI with a malformed TLV, a BGP
1293	   SPF speaker SHOULD log an error for further analysis.

[major] "MUST ignore the received TLV and...MUST NOT pass it to other BGP peers"

I was going to make the same comment as above, but then the text also
says that the Node NLRI is also ignored/discarded (see the next
comment).  This means that it doesn't seem to matter what happens to
the TLV.  Is there a reason to keep that text?


[major] "MUST ignore...the Node NLRI...as specified in [RFC7606].
When discarding a Node NLRI"

rfc7606 doesn't talk about ignoring an UPDATE, but it does present
"AFI/SAFI disable: ... "ignore all the subsequent routes with that
AFI/SAFI received over that session"."  I'm assuming that is not what
you mean, right?

Take a look also at §5.4/rfc7606 which presents a version of "NLRI
discard", but I don't think that is something you want to do either.

What do you mean?  The text above requires the Node NLRI to be
ignored, but it also hints at discarding it.  In my mind discarding is
different than ignoring...  ??



1295	   When a BGP SPF speaker receives a BGP Update containing a malformed
1296	   IPv4 Prefix-Length TLV in the Link NLRI BGP-LS Attribute
1297	   [I-D.ietf-idr-rfc7752bis], it MUST ignore the received TLV and the
1298	   Node NLRI and MUST NOT pass it to other BGP peers as specified in
1299	   [RFC7606].  The corresponding Link NLRI is considered as malformed
1300	   and MUST be handled as 'Treat-as-withdraw'.  An implementation MAY
1301	   log an error for further analysis.

[major] "MUST ignore the received TLV and...MUST NOT pass it to other BGP peers"

I was going to make the same comment as above, but then the text also
says that the "Link NLRI is considered as malformed and MUST be
handled as 'Treat-as-withdraw'".  This means that it doesn't matter
what happens to the TLV.  Is there a reason to keep that text?


[major] "MUST ignore...the Node NLRI"

Why is the Node NLRI ignored if the error was in the Link NLRI?



1303	   When a BGP SPF speaker receives a BGP Update containing a malformed
1304	   IPv6 Prefix-Length TLV in the Link NLRI BGP-LS Attribute
1305	   [I-D.ietf-idr-rfc7752bis], it MUST ignore the received TLV and the
1306	   Node NLRI and MUST NOT pass it to other BGP peers as specified in
1307	   [RFC7606].  The corresponding Link NLRI is considered as malformed
1308	   and MUST be handled as 'Treat-as-withdraw'.  An implementation MAY
1309	   log an error for further analysis.

[] Same comments as above.


[minor] Assuming that the action to be taken for all the cases is
treat-as-withdraw, does this allow the originator to somehow fix the
problem?  I imagine that the originator will eventually receive a
message withdrawing the NLRI.  What should it do?  §6.1.1 (BGP
Self-Originated NLRI) didn't say anything about receiving a withdraw
for a self-originated NLRI.



1311	7.2.  Processing of BGP-LS-SPF NLRIs

[major] The text in this section was borrowed from the base BGP-LS
spec, but that text has changed (see §8.2.2/rfc7752bis).

If the processing is the same as rfc7752bis, then just say so, don't
copy the text.  Please make sure that it applies.  See more comments
below.



1313	   A Link-State NLRI MUST NOT be considered as malformed or invalid
1314	   based on the inclusion/exclusion of TLVs or contents of the TLV
1315	   fields (i.e., semantic errors), as described in Section 5.1 and
1316	   Section 5.1.1.

[major] This statement ("NLRI MUST NOT be considered as malformed or
invalid based on the inclusion/exclusion of TLVs") is in normative
conflict with §5.2 (and probably other sections) where it says that
"node descriptors for all NLRI MUST include the BGP Identifier (TLV
516) and the AS Number (TLV 512)".  IOW, what happens if 516/512 are
not present?  Is the NLRI still ok?



1318	   A BGP-LS-SPF Speaker MUST perform the following syntactic validation
1319	   of the BGP-LS-SPF NLRI to determine if it is malformed.

[major] The validation in rfc7752bis is syntactic, checking that the
lengths are ok, etc.  What about semantic validation?  For example, if
TLV 516 is present, but the value of the ID is 0, what should the
receiver do?  Is the TLV valid?  Is the Node Descriptor valid?  Is the
NLRI valid?

This is just an example -- we should go through all the TLVs.  Note
that none of the BGP-LS documents talk about semantic validation, so
there isn't a place to point to :-( because the Consumer is expected
to take care of that -- and how it operates is out of scope.  IOW,
BGP-LS can be a garbage-in-garbage-out transport from a semantic point
of view, but BGP SPF can't!



...
1339	   When the error determined allows for the router to skip the malformed
1340	   NLRI(s) and continue processing of the rest of the update message
1341	   (e.g., when the TLV ordering rule is violated), then it MUST handle
1342	   such malformed NLRIs as 'Treat-as-withdraw'.  In other cases, where
1343	   the error in the NLRI encoding results in the inability to process
1344	   the BGP update message (e.g., length related encoding errors), then
1345	   the router SHOULD handle such malformed NLRIs as 'AFI/SAFI disable'
1346	   when other AFI/SAFI besides BGP-LS are being advertised over the same
1347	   session.  Alternately, the router MUST perform 'session reset' when
1348	   the session is only being used for BGP-LS-SPF or when its 'AFI/SAFI
1349	   disable' action is not possible.

[major] Note that rfc7752bis moved from treat-as-withdraw (in this
paragraph) to "NLRI discard" (§5.4/rfc7606); that is fine because the
BGP-LS nodes are not checking the contents of the updates.  But that
approach doesn't work with BGP SPF.



1351	7.3.  Processing of BGP-LS Attribute

[nit] s/of BGP-LS Attribute/of the BGP-LS Attribute



1353	   A BGP-LS Attribute MUST NOT be considered as malformed or invalid
1354	   based on the inclusion/exclusion of TLVs or contents of the TLV
1355	   fields (i.e., semantic errors), as described in Section 5.1 and
1356	   Section 5.1.1.

1358	   A BGP-LS-SPF Speaker MUST perform the following syntactic validation
1359	   of the BGP-LS Attribute to determine if it is malformed.

[] Same observations/comments/questions as above.



...
1370	   When the detected error allows for the router to skip the malformed
1371	   BGP-LS Attribute and continue processing of the rest of the update
1372	   message (e.g., when the BGP-LS Attribute length and the total Path
1373	   Attribute Length are correct but some TLV/sub-TLV length within the
1374	   BGP-LS Attribute is invalid), then it MUST handle such malformed BGP-
1375	   LS Attribute as 'Attribute Discard'.  In other cases, when the error
1376	   in the BGP-LS Attribute encoding results in the inability to process
1377	   the BGP update message, then the handling is the same as described
1378	   above for malformed NLRI.

[major] As mentioned before, using "attribute discard" with the BGP-LS
Attribute for BGP SPF doesn't work because of the impact of the TLVs
on route selection and installation.



1380	   Note that the 'Attribute Discard' action results in the loss of all
1381	   TLVs in the BGP-LS Attribute and not the removal of a specific
1382	   malformed TLV.  The removal of specific malformed TLVs may give a
1383	   wrong indication to a BGP SPF speaker that the specific information
1384	   is being deleted or is not available.

[] ...but that is what is written above.



1386	   When a BGP SPF speaker receives an update message with Link-State
1387	   NLRI(s) in the MP_REACH_NLRI but without the BGP-LS-SPF Attribute, it
1388	   is most likely an indication that a BGP SPF speaker preceding it has
1389	   performed the 'Attribute Discard' fault handling.  An implementation
1390	   SHOULD preserve and propagate the Link-State NLRIs in such an update
1391	   message so that the BGP SPF speaker can detect the loss of link-state
1392	   information for that object and not assume its deletion/withdrawal.
1393	   This also makes it possible for a network operator to trace back to
1394	   the BGP SPF speaker which actually detected a problem with the BGP-LS
1395	   Attribute.

[nit] s/BGP-LS-SPF Attribute/BGP-LS Attribute


[major] Nothing in this paragraph applies.



...
1401	   IGP metric TLV in the Link NLRI BGP-LS Attribute
1402	   [I-D.ietf-idr-rfc7752bis], it MUST ignore the received TLV and the
1403	   Link NLRI and MUST NOT pass it to other BGP peers as specified in
1404	   [RFC7606].  When discarding a Link NLRI with a malformed TLV, a BGP
1405	   SPF speaker SHOULD log an error for further analysis.

[] This paragraph seems to be out of place.  Also, see the comments in §7.1.



1407	8.  IANA Considerations

1409	   This document defines the use of SAFI (80) for BGP SPF operation
1410	   Section 5.1, and requests IANA to assign the value from the First
1411	   Come First Serve (FCFS) range in the Subsequent Address Family
1412	   Identifiers (SAFI) Parameters registry.

[] I know I suggested this text before...but IANA has already assigned
a value, so there's no need to request again.  However, we should ask
IANA to update the registry to point to this document.

Suggestion>
   IANA has assigned value 80 for BGP-LS-SPF from the First Come First
   Served range in the Subsequent Address Family Identifiers (SAFI)
   Parameters registry.  IANA is requested to update the registration
   to reference only [This Document].



1414	   This document also defines five attribute TLVs of BGP-LS-SPF NLRI.
1415	   We request IANA to assign types for the SPF capability TLV, Sequence
1416	   Number TLV, IPv4 Link Prefix-Length TLV, IPv6 Link Prefix-Length TLV,
1417	   and SPF Status TLV from the "BGP-LS Node Descriptor, Link Descriptor,
1418	   Prefix Descriptor, and Attribute TLVs" Registry.

[] Same comment.  Note that rfc7752bis renames the registry to "BGP-LS
NLRI and Attribute TLVs".



1420	    +=========================+=================+====================+
1421	    | Attribute TLV           | Suggested Value | NLRI Applicability |
1422	    +=========================+=================+====================+
1423	    | SPF Capability          | 1180            | Node               |
1424	    +-------------------------+-----------------+--------------------+
1425	    | SPF Status              | 1184            | Node, Link, Prefix |
1426	    +-------------------------+-----------------+--------------------+
1427	    | IPv4 Link Prefix Length | 1182            | Link               |
1428	    +-------------------------+-----------------+--------------------+
1429	    | IPv6 Link Prefix Length | 1183            | Link               |
1430	    +-------------------------+-----------------+--------------------+
1431	    | Sequence Number         | 1181            | Node, Link, Prefix |
1432	    +-------------------------+-----------------+--------------------+

1434	                       Table 1: NLRI Attribute TLVs

[major] This table, in this section, should reflect the structure of
the registry with these columns: TLV Code Point, Description, IS-IS
TLV/Sub-TLV, and Reference.

If you want to keep a summary table of the TLVs described in this
document, please put it in a different section.  For example, see
§9/rfc7752bis.



1436	9.  Security Considerations

1438	   This document defines a BGP SAFI, i.e., the BGP-LS-SPF SAFI.  This
1439	   document does not change the underlying security issues inherent in
1440	   the BGP protocol [RFC4271].  The Security Considerations discussed in
1441	   [RFC4271] apply to the BGP SPF functionality as well.  The analysis
1442	   of the security issues for BGP mentioned in [RFC4272] and [RFC6952]
1443	   also applies to this document.  The analysis of Generic Threats to
1444	   Routing Protocols done in [RFC4593] is also worth noting.  As the
1445	   modifications described in this document for BGP SPF apply to IPv4
1446	   Unicast and IPv6 Unicast as underlay SAFIs in a single BGP SPF
1447	   Routing Domain, the BGP security solutions described in [RFC6811] and
1448	   [RFC8205] are somewhat constricted as they are meant to apply for
1449	   inter-domain BGP where multiple BGP Routing Domains are typically
1450	   involved.  The BGP-LS-SPF SAFI NLRI described in this document are
1451	   typically advertised between EBGP or IBGP speakers under a single
1452	   administrative domain.

[nit] Move the rfc6811/rfc8205 considerations into their own paragraph.


[major] "[RFC6811] and [RFC8205] are somewhat constricted as they are
meant to apply for inter-domain BGP where multiple BGP Routing
Domains"

It is not clear from the text if origin validation and BGPSec are
applicable or not.  I know you're comparing "multiple BGP Routing
Domains" to a "single administrative domain" to somehow not consider
them.  However, they are applicable in the common case of multiple
ASNs managed by the same entity (which I'm sure many would argue is a
form of a single administrative domain).

Besides overhead (certificates, RPKI, etc.), what prevents the use of
rfc6811/rfc8205 in a BGP SPF network?

Maybe declaring its use out of scope is better.



1454	   The BGP SPF protocol and the BGP-LS-SPF SAFI inherit the encoding
1455	   from BGP-LS [I-D.ietf-idr-rfc7752bis], and consequently, inherit the
1456	   security considerations for BGP-LS.  Additionally, given that the BGP
1457	   SPF protocol is used to install IPv4 and IPv6 Unicast routes, the BGP
1458	   SPF protocol is vulnerable to attacks to the routing control plane
1459	   that aren't applicable to BGP-LS.  One notable Denial-of-Service
1460	   attack, would be to include malformed BGP attributes in a replicated
1461	   BGP Update, causing the receiving peer to treat the advertised BGP-
1462	   LS-SPF to a withdrawal [RFC7606].

[] "inherit the security considerations for BGP-LS."

No.  Except for the first paragraph (already considered above), the
security considerations in rfc7752bis are all about the BGP Consumer
-- which doesn't exist in the BGP SPF model.



1464	   In the context of the BGP peering associated with this document, a
1465	   BGP speaker MUST NOT accept updates from a peer that is not within
1466	   any administrative control of an operator.  That is, a participating
1467	   BGP speaker SHOULD be aware of the nature of its peering
1468	   relationships.  Such protection can be achieved by manual
1469	   configuration of peers at the BGP speaker.

[] This paragraph was also borrowed/modified from rfc7752bis.  Note
that the updated text there doesn't use normative language  because it
is a configuration issue.


[major] "BGP speaker MUST NOT accept updates from a peer that is not
within any administrative control of an operator"

By "from a peer" I assume you're referring to only directly connected
peers, right?  Why would the BGP session be established in the first
place?


[major] "BGP speaker SHOULD be aware of the nature of its peering relationships"

How is a BGP speaker made aware?  In the usual BGP deployments, the
ASN is used to easily determine if the peer is in the local AS or not.
IOW, this seems to be a configuration issue better left to the
operator.

Also, "SHOUD be aware" is in normative conflict with the requirement
of not accepting updates.



1471	   In order to mitigate the risk of peering with BGP speakers
1472	   masquerading as legitimate authorized BGP speakers, it is recommended
1473	   that the TCP Authentication Option (TCP-AO) [RFC5925] be used to
1474	   authenticate BGP sessions.  If an authorized BGP peer is compromised,
1475	   that BGP peer could advertise modified Node, Link, or Prefix NLRI
1476	   will result in misrouting, repeating origination of NLRI, and/or
1477	   excessive SPF calculations.  When a BGP speaker detects that its
1478	   self-originated NLRI is being originated by another BGP speaker, an
1479	   appropriate error should be logged so that the operator can take
1480	   corrective action.

[major] "it is recommended that the TCP Authentication Option (TCP-AO)
[RFC5925] be used"

Any transport-level mechanism should be left to the base spec.
rfc4271 already required authentication, so there's no need to
re-specify here.  Also, the language above is not the same as in (and
some would say in normative conflict with) rfc4271.


[minor] There are several other vectors: advertise an incorrect SPF
Algorithm, remove the BGP-LS Attribute, omit the Sequence-Number TLV,
advertise unordered TLVs, etc..   We don't need to be exhaustive and
perhaps the current text is enough.  Please just add that this risk is
not new -- I'm sure some of the RFCs mentioned above already talks
about this type of stuff.



...
1492	10.1.1.  Link Metric Configuration

1494	   Within a BGP SPF Routing Domain, the IGP metrics for all advertised
1495	   links SHOULD be configured or defaulted consistently.  For example,
1496	   if a default metric is used for one router's links, then a similar
1497	   metric should be used for all router's links.  Similarly, if the link
1498	   cost is derived from using the inverse of the link bandwidth on one
1499	   router, then this SHOULD be done for all routers and the same
1500	   reference bandwidth should be used to derive the inversely
1501	   proportional metric.  Failure to do so will not result in correct
1502	   routing based on link metric.

[major] "SHOULD be configured or defaulted consistently"

When is it ok to not do this?  Why is this action recommended and not
required?  There seem to be possible suboptimal routing in some
topologies -- without having to go into details, maybe s/SHOULD/should


[major] "then this SHOULD be done for all routers"

In this case, it looks like this sentence is just part of the example
and a statement of fact.  s/SHOULD/should



...
1513	10.1.3.  backoff-config

1515	   In addition to configuration of the BGP-LS-SPF address family,
1516	   implementations SHOULD support the "Shortest Path First (SPF) Back-
1517	   Off Delay Algorithm for Link-State IGPs" [RFC8405].  If supported,
1518	   configuration of the INITIAL_SPF_DELAY, SHORT_SPF_DELAY,
1519	   LONG_SPF_DELAY, TIME_TO_LEARN, and HOLDDOWN_INTERVAL MUST be
1520	   supported [RFC8405].  Section 6 of [RFC8405] recommends consistent
1521	   configuration of these values throughout the IGP routing domain and
1522	   this also applies to the BGP SPF Routing Domain.

[major] Even if consistent with rfc8405, the second sentence is not
needed because it re-specifies what rfc8405 already specifies.


[EoR -19]