Re: [Idr] WG LC for draft-ietf-idr-bgp-ct-09.txt (6/26 to 7/17) - extending call to 7/23

Ketan Talaulikar <ketant.ietf@gmail.com> Sat, 22 July 2023 19:39 UTC

From: Ketan Talaulikar <ketant.ietf@gmail.com>
Date: Sat, 22 Jul 2023 12:38:44 -0700
Message-ID: <CAH6gdPww7cFoiK66jtWzF_RkDPetUDV5UoZSwszLZ4dbe1pn3w@mail.gmail.com>
To: Susan Hares <shares@ndzh.com>
Cc: "idr@ietf.org" <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/q1PBTKAnlsuyjYwd4fwpAgCEssI>

Hi All,



This is a detailed review of the draft as asked for by Sue.



Summary:



   1. The document has improved with the multiple recent versions; thanks
   to the authors. I hope the comments and suggestions below help improve the
   document further.
   2. I have provided comments to cross-check and fix the inconsistent and
   confusing use of terminology (e.g., mapping community, color vs transport
   class, tunnel vs encapsulation, etc.) and the introduction of new terms
   where well-known ones already exist.
   3. The document can be trimmed significantly by moving the text
   related to functionality that is provided by other individual drafts
   (e.g., MNH, MPLS private labels, SRv6 inter-domain, etc.) to their
   respective drafts. While those drafts are cited as informative
   references, that is incorrect: the functionality described cannot be
   realized without those features, so the references are normative. Not to
   mention, it is always better to provide the IESG with a precise document
   that focuses on the specification (even if experimental).
   4. The draft does not cover the use of BGP CT with SRv6 on its own
   without dependencies on features in other individual drafts. There are also
   some details missing. One option may be to remove SRv6 at this point and
   have a separate document for it when ready.
   5. There are some technical issues identified which need to be fixed.
   6. I’ve also provided some suggestions and minor comments or questions.
   7. The document has many warnings and some errors as reported by IDnits
   - these should be easy to fix. There are also spelling and grammatical
   errors which can be identified and fixed. I've focused only on the
   technical aspects.



Thanks,

Ketan



The detailed review is below as inline comments; the quoted draft text
carries the IDnits line numbers for reference.





24            This document specifies protocol procedures for BGP that
enable

25             dissemination of such service mapping information in a
network that

26             may span multiple cooperating administrative domains.  These
domains

27             may be administered either by the same provider or by closely

28             coordinating providers.  A new BGP address family that
leverages



[minor] Coordinating providers do provide service spanning their domains

(ASes) but I believe their transport reachability is limited to their
domains.

Since we are talking of transport, perhaps what is meant here is a single
AS

or multiple ASes that are under a single administrative control?



215           The constructs and procedures defined in this document apply

216           homogenously to Intra-AS as well as Inter-AS Option A, Option
B and

217           Option C style deployments in provider networks.



[major] From the transport perspective, for both Inter-AS Option A & B,
there is no requirement for inter-AS transport reachability. The classful
transport routes are essentially intra-AS. Therefore, BGP-CT provides a
classful transport solution for Intra-AS and Inter-AS Option C deployments.
As I’ve stated in further comments, the authors seem to treat the TRDB and
similar internal implementation constructs that they have introduced for
BGP-CT as something to “standardize”. That is clearly not appropriate,
since implementations should be free to realize them in other ways;
therefore, where BGP-CT has no role to play, there is nothing that needs
to be said in this document.



227           The mechanisms defined in this document are agnostic to the
tunneling

228           technologies.  These can be applied homogenously to
intra-domain

229           tunneling technologies used in brownfield networks (e.g.  MPLS

230           Traffic Engineering) as well as greenfield networks (e.g.
Segment

231           Routing).



[minor] The terms like brownfield and greenfield are subjective. Can we
avoid

such usage? I don't believe SR networks are greenfield anymore and some

operators may choose to use MPLS-TE for their new/greenfield networks.



265           SN : Service Node



267           eSN : Egress Service Node



269           iSN : Ingress Service Node



[minor] I suggest using the customary term PE instead of Service Node.
If an SN is different from a PE, then please clarify what the difference is.



271           BN : Border Node



[minor] Isn't BN an ABR/ASBR? If so, please consider using that term.



273           TN : Transport Node, P-router



[minor] Consider using the term P-router instead of introducing the new
term TN.



294           PNH : Protocol Next hop address carried in a BGP Update
message



[minor] Consider using BGP NH instead of PNH as is the usual practice unless

it is different for some reason.



308           EP : Endpoint, a loopback address in the network



310           SEP : Service Endpoint, the PNH of a Service route



[minor] The terms EP and SEP are hardly used. Consider using "loopback"

instead or simply Prefix as appropriate for that context.





334           Transport Family : BGP address family used for advertising
tunnels,

335           which are in turn used by service routes for resolution
(e.g.  AFI/

336           SAFIs 1/4 or 1/76).



[minor] Shouldn’t this include AFI/SAFI 2/1 that can be used as transport
for SRv6?



338           Transport Tunnel : A tunnel over which a service may place
traffic

339           (e.g.  GRE, UDP, LDP, RSVP-TE, IGP FLEX-ALGO or SRTE).



[major] There is the notion of an encapsulation and the notion of a tunnel.
A tunnel is often modeled as an interface. In the above list, IGP SR or IGP
Flex-Algo is not really a tunnel but provides an encapsulation (MPLS or
SRv6), whereas RSVP-TE is a tunnel. Can the usage of the term "tunnel" be
reviewed and perhaps replaced with "encapsulation" where appropriate?
Another option is to use the term "path" instead of tunnel.



356           Transport Route Database (TRDB): At the SN and BN, a
Transport Class

357           has an associated Transport Route Database that collects its
tunnel

358           ingress routes.



[question] Does TRDB contain only the "tunnel" routes or also the CT
routes?



363           Mapping Community : Any BGP Community/Extended-community on a
BGP

364           route that maps to a Resolution Scheme.  E.g. color:0:100,
transport-

365           target:0:100.



[major] A new term, "Mapping Community", is introduced. It is not an actual
BGP community, and it denotes different types of communities in different
contexts. This makes the spec confusing and ambiguous. I will point out a
few instances in further comments. Suggestion: do away with this term and
instead directly call out the actual type of community in the specific
context in the document.





421           Figure 1, depicts the intra-AS and inter-AS application of
these

422           constructs.



[minor] Suggestion: introduce the base reference figure first; then add the

new constructs and CT on top in additional figures after having introduced

them. This would be easier for a reader to comprehend.



452           Overlay routes carry sufficient indication of the desired
Transport

453           Classes using a BGP community which assumes the role of as a
"Mapping

454           community".  A Resolution Scheme is identified by its "Mapping

455           Community", where its configuration can either be
auto-generated or

456           done manually.



[question] So, is the "resolution scheme" applicable to overlay routes, CT
routes, or both? This is confusing because the mapping community for
overlay routes is the Color ExtComm while for CT routes it is the Transport
RT, and the two are different. Using the same term "mapping community" for
both is problematic in such contexts.



489           A Transport Class is identified by a unique 32-bit "Transport
Class"

490           identifier, that is assigned by the operator.  An operator may



[major] Is this 32-bit TC unique on a given node or across the network (one
or more AS)? I assume it is normally unique across the multi-domain network
since it is getting encoded into the TC RT but in some cases it can be
per-domain with mapping/rewriting across domain boundaries. This is covered
further in the document but is missing here. This is a general issue that I
see in this document, where the formal specification portions of the text
are not detailed normatively but they are buried in verbose use-case or
deployment design related sections.



491           configure an SN/BN to classify a tunnel into an appropriate
Transport

492           Class.  How exactly these tunnels are made Transport Class
aware is

493           implementation specific and outside the scope of this
document.



[question] Can the same "tunnel" be part of two TCs? Please check my
comment

on sec 13.1.3.1 for further information.



522           [RFC8664] extends Path Computation Element Communication
Protocol

523           (PCEP) to carry SRTE Color.  This color association learnt
from PCEP



[major] RFC8664 does not talk about color.



565           An implementation may realize the TRDB for e.g., as a
"Routing Table"

566           referred in Section 9.1.2.1 of RFC4271 (https://www.rfc-

567           editor.org/rfc/rfc4271#section-9.1.2.1) which is "only" used
for

568           resolving nexthop reachability in control plane with no
footprint in

569           forwarding plane.  However, an implementation may choose a
different

570           methodology to realize this logical construct while still
adhering to

571           the procedures defined in this document.



[major] Please specify what an operator needs to do to
create/enable/instantiate this TRDB. E.g., the table needs to be created,
an RD needs to be assigned to it, a TC RT that is consistent and unique
across the domain needs to be configured for it, along with its import
policy and, I am guessing, an export policy? I believe this is pretty much
similar to how operators provision VRFs for L3VPN services, but it is still
better to specify.



585           "Transport Class" Route Target Extended Community is a
transitive

586           extended community EXT-COMM [RFC4360] of extended type, which
has the

587           format as shown in Figure 2.



[question] I see that the new RT EC is also being registered as
non-transitive.

Why is that so?



622           A BGP speaker that implements RT Constraint Route Target
Constraints

623           [RFC4684] MUST apply the RT Constraint procedures to the
Transport

624           Class Route Target Extended community as well.



[major] So, there is a need to enhance/extend the RTC implementation to

support BGP CT. It is not automatically and seamlessly applied to TC RT
without code

changes. Right? If so, please state the same since I believe RFC4684 does
not

cover this?



626           The Transport Class Route Target Extended community is
carried on

627           Classful Transport family routes and is used to associate
them with

628           appropriate TRDBs at receiving BGP speakers.



[minor] Please clarify if the TC RT ExtCom is to be limited to SAFI 76.



637        5.  Resolution Scheme



639           This section defines the Resolution Scheme construct that is
used to

640           specify how a service route or a BGP CT route can resolve its
next

641           hop using its associated Mapping Community over a specific
TRDB or an

642           ordered set of TRDBs.



[minor] Suggest to split the service route resolution and then the CT route

resolution into two separate sections for clarity.



644           Resolution Schemes enable a BGP speaker to resolve next hop

645           reachability for overlay routes over the appropriate underlay
tunnels

646           within the scope of the TRDBs identified by the Mapping
Community.



[minor] Here Mapping Community refers to something like Color ExtCom?



660           Mapping community is a "role" and not a new type of
community; any

661           BGP Community or Extended Community may play this role.  A
Mapping

662           Community maps to exactly one Resolution Scheme.



664           An example of mapping community is "color:0:100", described in

665           [RFC9012], or the "transport-target:0:100" described in
Section 4.3

666           in this document.



[minor] So far this section has been referring to overlay routes and
therefore

something like Color ExtCom. Suddenly, we see TC RT coming into the

conversation which is confusing.



668           A BGP route is associated with a resolution scheme during
import

669           processing.  The first community on the route that matches a
Mapping

670           Community of a locally configured Resolution Scheme is
considered the

671           effective Mapping Community for the route.  The Resolution
Scheme

672           thus found is used when resolving the route's PNH.  If a route

673           contains more than one Mapping Community, it indicates that
the route

674           considers these distinct Mapping Communities as equivalent in
Intent.

675           So, the first community that maps to a Resolution Scheme is
chosen as

676           the effective Mapping Community.



[major] If this is about Color ExtCom, then it conflicts with RFC9256 and

existing implementations. This may be OK for TC RT or BGP CT routes which
are

new. Is that the intention? Perhaps the confusion is due to use of the
Mapping

Community term.
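
A minimal sketch of the difference at stake, assuming (as RFC9256 is
commonly read) that a route carrying multiple Color ExtComs prefers the
highest numerical Color value that is usable, whereas the quoted draft text
picks the first community that maps to a Resolution Scheme. The function
names and the plain-integer modelling of colors are illustrative
assumptions, not taken from either document.

```python
def first_match(route_colors, configured_colors):
    """Quoted draft text: the first community on the route that maps to a
    locally configured Resolution Scheme is the effective one."""
    for color in route_colors:
        if color in configured_colors:
            return color
    return None


def highest_color(route_colors, configured_colors):
    """Assumed RFC9256-style behavior: among the usable colors, prefer the
    highest numerical value, regardless of their order on the route."""
    usable = [c for c in route_colors if c in configured_colors]
    return max(usable, default=None)


# Same route, same local configuration, two different answers.
route = [100, 200]
configured = {100, 200}
chosen_by_draft = first_match(route, configured)
chosen_by_sr_policy = highest_color(route, configured)
```

With both colors configured, the two rules select different colors for the
same route, which is the kind of conflict an existing implementation would
run into.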



704           The procedures described for AFI/SAFIs 1/4 or 1/128 and
AFI/SAFIs 2/4

705           or 2/128 in Section 2 of [RFC8277] apply for AFI/SAFI 1/76
and AFI/

706           SAFI 2/76 respectively as well.  BGP CT routes may carry
multiple

707           labels in the NLRI, by negotiating the Multiple Labels
Capability as

708           described in Section 2.1 of [RFC8277]



[major] Can the use of multiple labels with BGP CT be specified here? There
were interop issues with the use of that mechanism for BGP-LU due to
underspecification, and I hope we don't get into the same situation with
this new SAFI.



743        Label:



745             The Label field is a 20-bit field containing an MPLS label
value

746             (see [RFC3032]).



748        Rsrv:



750             This 3-bit field SHOULD be set to zero on transmission and
MUST be

751             ignored on reception.



753        S:

754             When single label is advertised, this 1-bit field MUST be
set to

755             one on transmission and MUST be ignored on reception.



[major] Please consider just referring to RFC8277 for the Label, Rsrv and
S fields. E.g., this description does not say anything about setting the
S bit when there are multiple labels in the stack.
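
To make the S-bit point concrete, here is a minimal sketch, assuming the
field layout quoted above (20-bit Label, 3-bit Rsrv, 1-bit S in a 3-octet
field) and assuming, per the usual reading of RFC8277, that with multiple
labels the S bit is set only on the last one. The function names are
illustrative and do not come from the draft.

```python
def encode_label_stack(labels):
    """Encode MPLS labels as consecutive 3-octet fields:
    20-bit label, 3-bit Rsrv (sent as zero), 1-bit S."""
    if not labels:
        raise ValueError("at least one label is required")
    out = bytearray()
    for index, label in enumerate(labels):
        if not 0 <= label < (1 << 20):
            raise ValueError(f"label {label} does not fit in 20 bits")
        # S is set only on the last label of the stack.
        s_bit = 1 if index == len(labels) - 1 else 0
        out += ((label << 4) | s_bit).to_bytes(3, "big")
    return bytes(out)


def decode_label_stack(data):
    """Read 3-octet fields until one with the S bit set is found.
    Returns the labels and the number of octets consumed."""
    labels = []
    for offset in range(0, len(data) - 2, 3):
        field = int.from_bytes(data[offset:offset + 3], "big")
        labels.append(field >> 4)  # Rsrv bits are ignored on reception
        if field & 1:
            return labels, offset + 3
    raise ValueError("no label with the S bit set")
```

A receiver written this way depends on the sender clearing S on every label
but the last, which is exactly the rule the quoted field description leaves
out.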



788        6.1.  Carrying multiple Encapsulation Information



790           To ease interoperability between nodes supporting different

791           forwarding technologies, a BGP CT route allows carrying
multiple

792           encapsulation information.



[major] I assume the intention above is to say that a BGP CT route

can be used for different encapsulation types. However, the text gives

an impression that multiple encapsulation types may be carried

simultaneously without any specification/procedures for how that works.

i.e., NH selection, different GW metric for different encapsulation, etc.



794           An MPLS Label is carried using the encoding in [RFC8277] . A
node

795           that does not support MPLS forwarding advertises the special
label 3

796           (Implicit Null) in the RFC 8277 MPLS Label field.



[major] The above text implies that the Imp-Null label indicates that MPLS
forwarding is not used, whereas what it actually means is that no label
needs to be pushed on the label stack. Imp-Null is a perfectly valid label
for the Egress PE to advertise for its BGP CT route to its penultimate
Border Router. If so, then the Imp-Null label cannot be used as an
indication of not using the MPLS forwarding plane. What is the way to
indicate that a specific node does not support MPLS encapsulation for
BGP CT?



802           The SRv6 SID is carried using Prefix SID attribute as
specified in

803           [RFC9252], without Transposition Scheme.  The Transposition
Length is

804           set to 0 and Transposition Offset is set to 0 to indicate
nothing is

805           transposed and that the entire SRv6 SID value is encoded in
the SID

806           Information Sub-TLV.



[major] The BGP Prefix SID attribute carries multiple TLVs and its mere
presence does not indicate the use of an SRv6 SID; e.g., it is also used
for MPLS for the BGP Prefix SID (RFC8669). Please indicate which specific
TLV of this attribute is going to be used to indicate the SRv6 dataplane
for CT. Also, what SRv6 Endpoint Behavior is used for BGP CT?



811        6.2.  Comparison with Other Families using RFC-8277 Encoding



[minor] This section is not really a part of the protocol specification

but gives the rationale for the NLRI design choices. As such, perhaps it

may be either removed or at least moved into an Appendix section?





833           In this document, SAFI 76 (BGP CT) is used instead of reusing
SAFI

834           128 (BGP VPN) for AFIs 1 or 2 to carry these transport routes
because

835           it is operationally advantageous to segregate transport and
service

836           prefixes into separate address families.  For e.g., such an
approach

837           allows operators to safely enable "per-prefix" label
allocation

838           scheme for Classful Transport prefixes, typically with a space

839           complexity of O(1K), without affecting SAFI 128 service
prefixes,

840           with a space complexity of O(1M).  The "per prefix" label
allocation

841           scheme keeps the routing churn local during topology changes.



[major] I read the above as saying that the authors expect BGP CT to be
deployed typically in small networks with nodes in the order of 1000s. If
this is indeed so, it needs to be called out clearly. The transport scale
requirements in draft-hr-spring-intentaware-routing-using-color are several
orders of magnitude higher for current and upcoming networks.



870           For e.g., unique "RDx:EP1" prefixes can be advertised by an
SN for an

871           EP1 to different upstream BNs with unique forwarding specific

872           encapsulation (e.g., Label), in order to collect traffic
statistics

873           at the SN for each BN.  In absence of RD, duplicated
Transport Class/

874           Color values will be needed in the transport network to
achieve such

875           use cases.



[minor] The above paragraph describes a use-case or deployment design
rather than a protocol specification. The use-case is not described fully,
along with its implications on forwarding state. More importantly, IMHO
this is not a very important or common use-case. Others may disagree, in
which case it is better that this use-case be described in detail, with a
topology, design details and implications, in an appendix and a reference
added here. Another option is to simply omit this paragraph.



877           The allocation of RDs is done at the point of origin of the
BGP CT

878           route.  This can either be an Egress SN or a BN.  The default
RD

879           allocation mode is to use a unique RD per originating node
for an EP.

880           This mode allows for the ingress to uniquely identify each
originated

881           path.  Alternatively, the same RD may be provisioned for
multiple

882           originators of the same EP.  This mode can be used when the
ingress

883           does not require full visibility of all nodes originating an
EP.



[minor] These are very important considerations. I suggest listing the
different options along with their pros/cons as a bullet list.
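
One of the trade-offs such a list would capture is path visibility at the
ingress. The following is a rough sketch under simplifying assumptions: a
Route Reflector without ADDPATH advertises a single path per distinct
"RD:EP" NLRI (modelled here as the first one received), and the names and
tuple layout are hypothetical, not from the draft.

```python
from collections import defaultdict


def paths_visible_via_rr(advertisements, addpath=False):
    """advertisements: list of (rd, endpoint, originator) tuples.
    Returns, per RD:EP NLRI, the originators an ingress behind the RR sees."""
    by_nlri = defaultdict(list)
    for rd, endpoint, originator in advertisements:
        by_nlri[(rd, endpoint)].append(originator)
    visible = {}
    for nlri, originators in by_nlri.items():
        # Without ADDPATH only one path per NLRI is reflected.
        visible[nlri] = list(originators) if addpath else originators[:1]
    return visible


# Unique RD per originator: two NLRIs, both originators visible.
unique_rd = [("RD1", "EP1", "BN1"), ("RD2", "EP1", "BN2")]
# Same RD on both originators: one NLRI, one originator hidden by the RR.
same_rd = [("RD1", "EP1", "BN1"), ("RD1", "EP1", "BN2")]
```

In this model the unique-RD mode gives the ingress full visibility at the
cost of more NLRIs, while the same-RD mode keeps the NLRI count down but
needs ADDPATH to avoid hiding paths.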



919              Implementations MAY provide automatic generation and
assignment of

920              RD, RT values; they MAY also provide a way to manually
override

921              the automatic mechanism in order to deal with any
conflicts that

922              may arise with existing RD, RT values in different network
domains

923              participating in the deployment.



[major] How would TC RTs be automatically generated on a router when they
need

to be unique and identical across the network for a given TC? Doesn’t TC
value

need to be configured by the operator?



949              This route SHOULD NOT be advertised to the IBGP core that
contains

950              the tunnel, using policy configuration.  Impact of not
prohibiting

951              such advertisements is outside the scope of this document.



[minor] It should not be very difficult to explain in short the consequences

since the document is going into such deployment design details.



978              If the resolution process does not find a matching route
in any of

979              the associated TRDBs, the received BGP CT route MUST be
considered

980              unusable for forwarding purpose and be withdrawn.



[major] I believe it should be considered ineligible for best path

computation. If it was the only path then it would be withdrawn.
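
A minimal sketch of the suggested behavior, assuming a heavily simplified
decision process in which local preference stands in for all tie-breakers.
The Path type and function name are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Path:
    next_hop: str
    local_pref: int
    resolved: bool  # True if the next hop resolved over an associated TRDB


def select_best(paths):
    """Unresolvable paths are skipped rather than withdrawn outright; the
    prefix is withdrawn only when no eligible path remains (None)."""
    eligible = [p for p in paths if p.resolved]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p.local_pref)
```

For example, if the preferred path fails resolution while a second path
still resolves, the second path becomes best and the prefix stays
advertised.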



1026      7.6.  Avoiding Path Hiding Through Route Reflectors



1028            When multiple BNs exist such that they advertise a "RD:EP"
prefix

1029            to Route Reflectors (RRs), the RRs may hide all but one of
the

1030            BNs, unless ADDPATH [RFC7911] is used for the Classful
Transport

1031            family.  This is similar to L3VPN Option B scenarios.
Hence,

1032            ADDPATH SHOULD be used for Classful Transport family, to
avoid

1033            path-hiding through RRs.  This improves convergence time
when path

1034            via one of the multiple BNs fails.


[minor] Is this the "same (or duplicate as it was called previously) RD"
design being referred to here? Suggestion: Stick consistently to the one
recommended deployment design in the main body of the document. Then, add
in the appendix the other variations with their pros/cons. The issue in the
document is that there are flip-flops between various design options all
along the flow of the text in individual sections which affect the clarity
of the spec.



1077      7.8.  Ingress Nodes Receiving Service Routes with a Mapping
Community



1079            Upon receipt of a BGP service route with a PNH that is not

1080            directly connected (e.g. an IBGP-route), a Mapping
Community on

1081            the route (e.g, Color Extended Community) is used to decide
to

1082            which resolution scheme this route is to be mapped.  The

1083            resolution scheme for a Color Extended Community with Color
"C1"

1084            contains TRDB for Transport Class with same ID, followed by
Best-

1085            Effort TRDB.  The administrator MAY customize the resolution

1086            scheme to map to a different ordered list of TRDBs.



[major] Until this point, we had a Resolution Scheme that associates a TC RT
with an ordered set of TRDBs. Now, there is another Resolution Scheme
required that maps the Color value from the Color ExtCom to an ordered set
of TRDBs. This is not coming out clearly in the document. Hence the need to
treat the two types of resolution schemes distinctly: Overlay/Service Route
Resolution and Underlay/CT Route Resolution. Along the same lines, remove
the Mapping Community term and be specific about which community is being
referred to in each context.
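
A minimal sketch of keeping the two lookups separate. The service-route
default follows the quoted text (the TRDB of the Transport Class with the
same ID, then the Best-Effort TRDB); the CT-route default of "only the
TRDB of that Transport Class" is an assumption made for illustration, as
are all names and the dictionary-based TRDB model.

```python
BEST_EFFORT = "best-effort"


def service_route_trdbs(color, overrides=None):
    """Service route: Color ExtCom value -> ordered list of TRDB names."""
    overrides = overrides or {}
    return overrides.get(color, [f"tc-{color}", BEST_EFFORT])


def ct_route_trdbs(transport_class, overrides=None):
    """CT route: TC RT value -> ordered list of TRDB names.
    Assumed default: no fallback to best effort."""
    overrides = overrides or {}
    return overrides.get(transport_class, [f"tc-{transport_class}"])


def resolve(next_hop, trdb_order, trdbs):
    """Return (trdb_name, tunnel) from the first TRDB that has a route for
    next_hop, or None if none of the listed TRDBs can resolve it."""
    for name in trdb_order:
        tunnel = trdbs.get(name, {}).get(next_hop)
        if tunnel is not None:
            return name, tunnel
    return None


trdbs = {
    "tc-100": {"10.0.0.1": "rsvp-gold"},
    BEST_EFFORT: {"10.0.0.1": "ldp", "10.0.0.2": "ldp"},
}
```

Keyed this way, it is always explicit whether a lookup was driven by a Color
ExtCom on a service route or by a TC RT on a CT route.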





1154            Implementations MAY provide configuration to selectively
install

1155            BGP CT routes to the Forwarding Information Base (FIB), to
provide

1156            reachability for control plane peering towards endpoints in
other

1157            domains.



[question] Isn't the default table the TRDB for best effort, and isn't it
usually programmed into the FIB anyway?



1167         The mechanisms described in BGP MultiNexthop Attribute

1168         [MULTI-NH-ATTR] allow a BGP route to carry multiple next hop

1169         addresses.  It also allows specifying 'Transport Class ID' as a

1170         qualifier for each next hop address.



[major] Since this document is being considered for WGLC and the MNH draft
is

an individual draft, I wonder why this document is at all talking about MNH.

Can't the applicability and/or improvement to BGP CT with the use of MNH be

covered in the MNH draft? Please tighten the spec to focus on what can be

achieved with existing standards or at least WG adopted mechanisms.



1172         It should be noted that in such cases "Transport Class/Color"
can

1173         exist in multiple places on the same route, and a precedence
order

1174         needs to be established to determine which Transport Class the

1175         route's next hop should resolve over.  This document suggests
the

1176         following order of precedence, more preferred first:



1178            Transport Class ID SubTLV, in MultiNexthop Attribute.



1180            Color SubTLV, in Tunnel Encapsulation Attribute.



1182            Transport Target Extended community, on BGP CT route.



1184            Color Extended community, on BGP service route.



1186         The above precedence order follows more specific scoping of
Color to

1187         less specific scoping.



[major] I understand the motivation for the above, however, there is a

problem. This experimental draft can specify rules for BGP CT (except for
the

MNH step). It should not include such rules for other BGP services - that is

beyond the scope of this document. There is also the problem of mixing up
BGP

CT and BGP Service routes in the above order - at any point, the document

should talk about either the service route resolution or the CT route

resolution and be clear about this.



1201         Such Flowspec BGP routes with Redirect to IP next hop MAY be
attached

1202         with a Mapping Community (e.g.  Color:0:100), which allows

1203         redirecting the flow traffic over a tunnel to the IP next hop

1204         satisfying the desired SLA (e.g.  Transport Class color 100).



[major] There is again the issue of using this vague new terminology of

"mapping community" for BGP FlowSpec. Let this document stick to BGP CT and

not try to creep into specifying for BGP FlowSpec. What is being proposed
here

is conflicting with draft-ietf-idr-ts-flowspec-srv6-policy (passed WGLC)
that

has the semantics that association of Color ExtCom with BGP FlowSpec

along with a NH is an indication of redirecting into an SR Policy to that
NH

with the Color in the ColorExtCom. Therefore, this document needs to be

very specific about what Community is being referred to in each and

every context.





1290      8.3.  Limiting The Visibility Scope of PE Loopback as PNHs



1292            It may be even more desirable to limit the number of PNHs
that are

1293            globally visible in the network.  This is possible using
mechanism

1294            described in Appendix D



1296            Such that advertisement of PE loopback addresses as
next-hop in

1297            BGP service routes is confined to the region they belong
to.  An

1298            anycast IP-address called "Context Protocol Nexthop Address"

1299            (CPNH, Appendix D.3) abstracts the SNs in a region from
other

1300            regions in the network, swapping the SN scoped service
label with

1301            a CPNH scoped private namespace label.



1303            This provides much greater advantage in terms of scaling and

1304            convergence.  Changes to implement this feature are
required only

1305            on the local region's BNs and RRs.



[major] This is again a "plug-in" for a private draft that is not even
adopted

by the WG. Same as in the case of MNH, this does not seem to be integral to

the BGP CT proposal - if it were, the BGP CT work would get blocked waiting

for the adoption and progression of those other mechanisms. Suggestion:
please

move these extraneous aspects into those other individual drafts.



1307      9.  OAM Considerations



1309         MPLS OAM procedures specified in [RFC8029] also apply to BGP
Classful

1310         Transport.



1312         The 'Target FEC Stack' sub-TLV for IPv4 Classful Transport has
a Sub-

1313         Type of 31744, and a length of 13.  The Value field consists
of the

1314         RD advertised with the Classful Transport prefix, the IPv4
prefix

1315         (with trailing 0 bits to make 32 bits in all) and a prefix
length

1316         encoded as shown in below in Figure 4.



[major] How is it validated that the path taken is actually for the given
TC, since there is no TC here? How much value is there in determining just
reachability without verifying the "intent"? Also, how would this work in
a network with different RD and TC RT numbering in different domains?
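
To make the first question concrete, here is a minimal Python sketch of the
sub-TLV Value as the quoted text describes it. The field order (RD, IPv4
prefix, prefix length) is taken from the prose at lines 1313-1316; Figure 4
is not reproduced here, so the exact layout is an assumption, as are the
helper name and the sample values.

```python
import socket
import struct

SUB_TYPE_IPV4_CT = 31744  # Sub-Type quoted at line 1313 of the draft


def build_ct_ipv4_fec_value(rd: bytes, prefix: str, prefix_len: int) -> bytes:
    """Pack the sub-TLV Value: 8-octet RD, 4-octet IPv4 prefix, 1-octet length.

    Hypothetical helper; field order assumed from the prose, not from Figure 4.
    """
    if len(rd) != 8:
        raise ValueError("RD must be 8 octets")
    if not 0 <= prefix_len <= 32:
        raise ValueError("prefix length must be 0..32")
    addr = int.from_bytes(socket.inet_aton(prefix), "big")
    # Zero the trailing host bits so the prefix is padded to 32 bits with 0s.
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return rd + struct.pack("!IB", addr & mask, prefix_len)


# Type 1 RD 192.0.2.11:100 (2-octet type, 4-octet IPv4 admin, 2-octet number).
rd = struct.pack("!H4sH", 1, socket.inet_aton("192.0.2.11"), 100)
value = build_ct_ipv4_fec_value(rd, "192.0.2.11", 32)
print(len(value))
```

The Value comes to 8 + 4 + 1 = 13 octets, matching the stated length, and
none of the three fields is a Transport Class ID.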



1373      11.  SRv6 Support



1375         This section describes how BGP CT family (e.g.  AFI/SAFI 1/76)
may be

1376         used to set up inter domain tunnels of a certain Transport
Class,

1377         when using Segment Routing over IPv6 (SRv6) data plane on the
inter-

1378         AS links or as an intra-AS tunneling mechanism.



1380         [RFC8986] specifies the SRv6 Endpoint behaviors (End USD,
End.BM,

1381         End.B6.Encaps).  [SRV6-INTER-DOMAIN] specifies the SRv6
Endpoint

1382         behaviors (END.REPLACE, END.REPLACEB6 and END.DB6).  These are

1383         leveraged for BGP CT routes with SRv6 data plane.



[major] The draft-salih is also an individual draft. Is the SRv6 support

dependent on that? If so, it may be better to take SRv6 out of scope for
this

document and add it in a separate document when the base is adopted.



1385         The BGP Classful Transport route update for SRv6 MUST include
an

1386         attribute containing SRv6 SID information.  This may be either
the

1387         BGP Prefix-SID attribute as specified in [RFC9252] or the BGP

1388         MultiNexthop attribute as specified in BGP MultiNexthop
Attribute

1389         [MULTI-NH-ATTR] Section 5.4.3.3.  If the Prefix-SID attribute
is

1390         used, it MUST NOT include SRv6 SID structure for Transposition

1391         described in [RFC9252].



[major] Same comment as before for MNH.





1538         Similarly, these transport classes are also configured on
ASBRs, ABRs

1539         and PEs with same Transport Route Target and unique RDs.



[minor] It would help to clarify what provisioning of TCs entails at all
these routers. I believe this is similar to VRF configuration with
everything that goes with it (RD, TC-RT, import/export policies), plus the
resolution mapping for CT over the underlay?





1693         Assuming ASBR22_to_ASBR13 link goes down, such that traffic
with Gold

1694         SLA going to PE11 needs repair.  ASBR22 has an alternate BGP
CT route

1695         for 192.0.2.11:100:192.0.2.11 from ASBR14.  This has been

1696         preprogrammed in forwarding by ASBR22 as FRR backup next hop
for

1697         label B-L4.  This allows the Gold SLA traffic to be locally
repaired

1698         at ASBR22 without the failure event propagated in the BGP CT
network.

1699         In this case, ingress node PE25 will not know there was a
failure,

1700         and traffic restoration will be independent of prefix scale
(PIC).



[major] It is misleading to call this "locally repaired" and "without
failure event propagated in the BGP CT network" when the impact is
network-wide churn in the BGP CT control plane. Sure, the data plane
undergoes FRR, but the BGP CT control plane still has to withdraw the
route that was previously announced all across the network. I am referring
to the recommended design option. There may be other ways to implement
this, but the document keeps flip-flopping between these options and does
not provide a full picture of the implications of each design option.





1723         Traffic repair to absorb the failure happens at ingress node
PE25, in

1724         a service prefix scale independent manner.  This is called PIC

1725         (Prefix scale Independent Convergence).  The repair time will
be

1726         proportional to time taken for withdrawing the BGP CT route.



[major] This has major implications for NH resolution that have not been
covered in this document so far. I get the usual BGP PIC, where we have a
primary NH and a less preferred backup NH; however, both of them are from
the same TRDB. This brings up a new requirement: resolving the same BGP
service route in the fallback TRDB even when there is a path in the
preferred TRDB. There may be a "classical" backup in the preferred TRDB as
well, so how does one decide where to pick the backup from? This does not
seem to fit into the resolution mapping construct described previously.





1745         This section describes how BGP CT is deployed in such
scenarios to

1746         preserve end to end Intent.  Example described in this section
use

1747         Inter-AS Option C domains.  But similar mechanisms will work
for

1748         Inter-AS Option A and Inter-AS Option B scenarios as well.



[major] I don’t follow what role BGP CT has to play in option A or B since

transport is intra-AS?



1750      13.1.1.  Service Layer Color Management



1752         At the service layer, it is recommended that a global color
namespace

1753         be maintained across multiple co-operating domains.  BGP CT
allows

1754         indirection using resolution schemes to be able to maintain a
global

1755         namespace in the service layer.  This is possible even if each
domain

1756         independently maintains its own local transport color
namespace.



[major] Not sure that I follow this. So it is OK and possible to maintain
a global namespace for Color, which indicates intent at the service level,
but not possible to do the same for TC, which indicates intent at the
transport level? Aren't they going to be the same intent?



1758         As explained in next hop Resolution Scheme (Section 5) ,
mapping

1759         community carried on service route maps to a resolution
scheme.  The

1760         mapping community values for the service route can be abstract
and

1761         does not require to match the transport color namespace.  This

1762         abstract mapping community value representing a global service
layer

1763         intent is mapped to a local transport layer intent available
in each

1764         domain.



1766         In this manner, it is recommended to keep color namespace
management

1767         at the service layer and the transport layer decoupled from
each

1768         other.  In the following sections the service layer agrees on a

1769         single global namespace.



[major] This is again conflicting. Why would an operator not keep Color = TC

(as has been mentioned in a few examples/designs in this document)? And then

both are consistent. If there are scenarios which make it difficult to

maintain a network wide TC namespace, then the same should also apply to the

Color namespace for services.



1771      13.1.2.  Non-Agreeing Color Transport Domains



1773         Non-agreeing color domains require a mapping community rewrite
on

1774         each domain boundary.  This rewrite helps to map one domain's

1775         namespace to another.



[major] True, but more importantly they also need a similar mapping for the

Color ExtCom for service routes as well. Right?



1777         The below example illustrates how traffic is stitched and SLA
is

1778         preserved when domains don't use the same namespace at the
transport

1779         layer.  Each domain specifies the same SLA using different
color

1780         values.



1782                  Gold(100)              Gold(300)              Gold(500)



1784       [PE11]----[ASBR11]---[ASBR21------[ASBR22]---[ASBR31-------[PE31]

1785                    AS1                     AS2                    AS3



1787                  Bronze(200)          Bronze(400)            Bronze(600)



1789                    ----------- Packet Forwarding Direction -------->



1791              Figure 7: Transport Layer with Non-agreeing Color Domains



1793         In the above topology shown in Figure 7, we have three
Autonomous

1794         Systems.  All the nodes in the topology supports BGP CT.



1796         In AS1 Gold SLA is represented by color 100 and Bronze by 200.



1798         In AS2 Gold SLA is represented by color 300 and Bronze by 400.



1800         In AS3 Gold SLA is represented by color 500 and Bronze by 600.



1802         Though the color values are different, they map to tunnels with

1803         sufficiently similar TE characteristics in each domain.



[major] The use of "color" in this section seems to actually mean TC. Why is

that being used when until now it was all about TC? Looks like there is a
need

to review the use of the term "color" here and replace with TC where

appropriate. My understanding is that "color" is only what is there in the

Color ExtCom or SR Policy.



1805         The service route carries an abstract mapping community that
maps to

1806         the required SLA.  For example, Service routes that need to
resolve

1807         over gold transport tunnels, carries a mapping community

1808         color:0:100500.  In AS3 it maps to a resolution scheme
containing

1809         TRDB with color 500 whereas in AS2 it maps to a TRDB with
color 300

1810         and in AS1 it maps to a TRDB with color 100.  Co-ordination is
needed

1811         to provision the resolution schemes in each domain as explained

1812         above.



[major] Very confusing! How does TRDB get a color suddenly?



1827         Transport-target re-write requires co-ordination of color
values

1828         between domains in the transport layer.  This method avoids
the need

1829         to re-write service route mapping community, keeping the
service

1830         layer homogenous and simple to manage.  Coordinating Transport
Class

1831         RT between adjacent domains is easier than coordinating
service layer

1832         colors deployed in various non-adjacent domains.



[major] Right! So, if the operator has to do the hard thing at service level

which is network wide, why would they not keep the same consistency in the
TC

namespace at the transport layer which is (as you say) a much simpler task!

And, then is the notion of "non-cooperating" color domains probably

only theoretical?





1876         These tunnels will be installed in TRDBs corresponding to
transport

1877         classes of color 101, 102.



[major] Please see previous comment in Sec 4. Aspects such as the same
tunnel being added to more than one TRDB need to be clearly specified, yet
they are buried within use-cases and solution descriptions like this one,
which makes consistent implementations hard to ensure. Since this document
is primarily a protocol spec, it would be good if all such aspects were
specified clearly in the "normative" portion of the document.



1879         Service routes received with mapping community (eg:
transport-target

1880         or color community) can resolve over these tunnels in the TRDB
with

1881         matching color by using resolution schemes.



[major] I believe Transport RT is for BGP CT and not Service Routes. The use

of this term "mapping community" is very confusing again.



1883         This approach consumes more resources in the transport and
forwarding

1884         layer, because of the duplicate tunnels.



1886      13.1.3.2.  Customized Resolution Schemes Approach



[major] This section brings in a lot of rather complicated requirements
for the resolution scheme/mapping, import/export policies and the TRDB
construct. Yet these are not covered in an appropriately normative manner
in the previous sections where those constructs were introduced. My
concern with this spec is that it is large and embeds key spec details
within descriptive use-cases and design options, which makes it difficult
for implementors to get things right.



2058      13.2.2.1.  Interop Between MPLS and SRv6 Nodes.



2060         BGP speakers may carry MPLS label and SRv6 SID in BGP CT SAFI
76 for

2061         AFIs 1 or 2 routes using protocol encoding as described in
Carrying

2062         Multiple Encapsulation information (Section 6.1)



[major] There is no description of procedures for how both MPLS and SRv6
are

carried together for the same CT Route.



2064         MPLS Labels are carried using RFC 8277 encoding, and SRv6 SID
is

2065         carried using Prefix SID attribute as specified in [RFC9252].



[major] RFC9252 does not specify carrying both MPLS label and SRv6 SID

together - it only covers carrying SRv6 SID.





2106         R1 and R4 send and receive SRv6 SID in the BGP CT control plane

2107         routes using BGP Prefix-SID attribute, without Transposition
Scheme.

2108         This allows them to be ingress and egress for SRv6 data
plane.  R4

2109         will carry the special MPLS Label with value 3 (Implicit-NULL)
in RFC

2110         8277 encoding, which tells R1 not to push any MPLS label
towards R4.

2111         The MPLS Label advertised by R1 in RFC 8277 NLRI will not be
used by

2112         R4.  SRv6 forwarding will be used between R1 and R4.



[major] As mentioned in a previous comment, implicit-null is a valid label
for BGP CT and cannot be used to indicate the absence of an MPLS label.
How does R1 know whether it cannot send MPLS-labeled traffic to R4 at all,
or whether it can but does not need to push a label for BGP CT?
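
To illustrate the ambiguity, here is a short Python sketch of the 3-octet
label field used in RFC 8277 NLRI (20-bit label, 3 reserved bits, 1
bottom-of-stack bit). The function names and the two "cases" are
hypothetical and only serve to frame the question.

```python
IMPLICIT_NULL = 3  # reserved MPLS label value


def encode_label_field(label: int, bottom_of_stack: bool = True) -> bytes:
    """Encode one 3-octet label field: 20-bit label, 3 reserved bits, S bit."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    return ((label << 4) | int(bottom_of_stack)).to_bytes(3, "big")


def decode_label(field: bytes) -> int:
    """Extract the 20-bit label value from a 3-octet label field."""
    return int.from_bytes(field, "big") >> 4


# Case A: R4 supports MPLS and only asks R1 not to push a BGP CT label.
# Case B: R4 has no MPLS forwarding and uses label 3 as a placeholder.
case_a = encode_label_field(IMPLICIT_NULL)
case_b = encode_label_field(IMPLICIT_NULL)
print(case_a.hex(), case_a == case_b)
```

Both cases produce the same octets on the wire, so the NLRI alone gives R1
nothing to tell them apart.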





2161         R1 and R4 send and receive UDP tunneling info in the BGP CT
control

2162         plane routes using BGP TEA attribute.  This allows them to be
ingress

2163         and egress for UDP tunneled data plane.  R4 will carry special
MPLS

2164         Label with value 3 (Implicit-NULL) in RFC 8277 encoding, which
tells

2165         R1 not to push any MPLS label towards R4.  The MPLS Label
advertised

2166         by R1 will not be used by R4.  UDP tunneled forwarding will be
used

2167         between R1 and R4.



[major] Same as previous comment; applies for UDP as well.



2174      13.3.  Managing Transport Route Visibility



2176         This section details the usage of BGP CT RD and label
allocation

2177         modes to calibrate the level of path visibility and the amount
of

2178         route churn in a multi-domain network.



[minor] I don't follow the use of the term "route churn"; did you mean
"route

and label scale" instead? There is no description in the document for how

routes churn due to various events such as failures or metric changes for

the various design options and flavors.



2211               +--------+------+-------+-------+---------+---------+

2212               |EP-type |Origin|RD-Mode|PP-Mode|CT Routes|CT Labels|

2213               +--------+------+-------+-------+---------+---------+

2214               |Unicast |SN    |Unique |TC,EP  |    16   |    8    |

2215               |Unicast |SN    |Unique |RD,EP  |    16   |   16    |

2216               |Unicast |BN    |Unique |TC,EP  |    16   |    8    |

2217               |Unicast |BN    |Unique |RD,EP  |    16   |   16    |

2218               |--------|------|-------|-------|---------|---------|

2219               |Anycast |SN    |Unique |TC,EP  |    16   |    2    |

2220               |Anycast |SN    |Unique |RD,EP  |    16   |   16    |

2221               |Anycast |SN    |Same   |TC,EP  |     2   |    2    |

2222               |Anycast |SN    |Same   |RD,EP  |     2   |    2    |

2223               |Anycast |BN    |Unique |TC,EP  |     4   |    2    |

2224               |Anycast |BN    |Unique |RD,EP  |     4   |    4    |

2225               |Anycast |BN    |Same   |TC,EP  |     2   |    2    |

2226               |Anycast |BN    |Same   |RD,IP  |     2   |    2    |

2227               +--------+------+-------+-------+---------+---------+



2229                  Figure 13: Route and Path Visibility at Ingress Node



[minor] What is PP-Mode?



2422      14.4.  Best Effort Transport Class ID



2424         This document reserves the Transport class ID value 0 to
represent

2425         "Best Effort Transport Class ID".  This is used in the
'Transport

2426         Class ID' field of Transport Route Target extended community
that

2427         represents best effort transport class.  Please create a new
registry

2428         for this.



2430          Registry Group: BGP CT Parameters



2432          Registry Name: Transport Class ID



2434           Value                 Name

2435          -----------------+--------------------------------

2436            0                Best Effort Transport Class ID



[major] The protocol spec can by itself associate the "best-effort"
semantics with the value 0. We don’t need a registry for this.





2632      16.2.  Informative References



[major] Many of these informative references are actually normative
because the spec relies on them to provide key functionality. There are a
couple of options: (a) remove those aspects (if they are optional) and
move them into those other documents so as not to block this document, or
(b) keep them in the normative section and wait for their progression.





2733      A.1.  Signaling Intent over PE-CE Attachment Circuit



[major] This entire section (and its subsections) does not belong in this
document and can be moved into the MNH draft.





2856      A.2.  BGP CT Egress TE



2858         Mechanisms described in [BGP-LU-EPE] also applies to BGP CT
family.



2860         The Peer/32 or Peer/128 EPE route MAY be originated in BGP CT
family

2861         with appropriate Mapping Community (e.g.
transport-target:0:100),

2862         thus allowing an EPE path to the peer that satisfies the
desired SLA.



[major] This section also does not belong in this document; similarly, it
can be moved into the individual BGP-LU-EPE draft.



2864      Appendix B.  Applicability to Intra-AS and different Inter-AS

2865                   deployments.



[major] The entire Appendix B has nothing to do with BGP-CT. The use-cases
described below have been realized in networks and implementations using
mechanisms other than TRDB. Keeping this section (even in the appendix)
gives the impression that this is something provided by the BGP-CT
infrastructure, while it is just a local implementation matter. Please
remove Appendix B.





3206         This is a more pragmatic approach, rather than abandoning time
tested

3207         design pattern like RFC 4364 and RFC 8277, just to invent
something

3208         completely new that is not backward compatible with existing

3209         deployments.  Overloading RFC 8277 NLRI MPLS Label field with

3210         information related to non MPLS data plane leads to backward

3211         compatibility issues.



[major] I assume all of the above in Appendix C is really no longer
required to be in this document?



3225         The document Intent-aware Routing using Color
[Intent-Routing-Color]

3226         Section 6.3.2.1 suggests scaling numbers for transport network
where

3227         BGP CT can be deployed.  Experiments were conducted with this
scale

3228         to find the convergence time with BGP CT for those scaling
numbers.

3229         Scenarios involving BGP CT carrying IPv4 and IPv6 endpoints
with MPLS

3230         label, and IPv6 endpoints with SRv6 SID were tested.



[question] Can you please add the case where RFC8669 is used and re-evaluate?



3246      Appendix D.  Scaling using BGP MPLS Namespaces



[major] The entire Appendix D needs to be removed from this document;
perhaps

move it into the MPLS private namespaces individual draft.





3433      Appendix E.  BGP CT deployment in SRv6 networks



3435         This section describes BGP CT deployment in SRv6 multi-domain
network

3436         using Inter-AS Option C architecture.



3438      E.1.  SID stacking approach



[major] The section E.1 and its subsections should be removed from this

document and moved to the SRv6-inter-domain document.





3741      E.2.  Color-encoded Service SID (CPR) Approach



[major] This section E.2 and its subsections can also be removed; not sure

what is the motivation for putting that comparison in this document.



[ end of review ]