Re: [Idr] WG LC for draft-ietf-idr-bgp-ct-09.txt (6/26 to 7/17) - extending call to 7/23
Ketan Talaulikar <ketant.ietf@gmail.com> Sat, 22 July 2023 19:39 UTC
References: <BYAPR08MB487225C256745CDDA29C57C1B33BA@BYAPR08MB4872.namprd08.prod.outlook.com>
In-Reply-To: <BYAPR08MB487225C256745CDDA29C57C1B33BA@BYAPR08MB4872.namprd08.prod.outlook.com>
From: Ketan Talaulikar <ketant.ietf@gmail.com>
Date: Sat, 22 Jul 2023 12:38:44 -0700
Message-ID: <CAH6gdPww7cFoiK66jtWzF_RkDPetUDV5UoZSwszLZ4dbe1pn3w@mail.gmail.com>
To: Susan Hares <shares@ndzh.com>
Cc: "idr@ietf.org" <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/q1PBTKAnlsuyjYwd4fwpAgCEssI>
Hi All,

This is a detailed review of the draft as asked for by Sue.

Summary:

1. The document has improved with the multiple recent versions; thanks to the authors. I hope the comments and suggestions below help improve the document further.

2. I have provided comments to cross-check and fix the inconsistent and confusing use of terminology (e.g., mapping community, color vs transport class, tunnel vs encapsulation, etc.) and the use of some new terms where we already have well-known ones.

3. The document can be trimmed significantly by moving the text related to functionality that is provided by other individual drafts (e.g., MNH, MPLS private labels, SRv6 inter-domain, etc.) to their respective drafts. While those are being referred to as informative references, that is incorrect since the functionality described cannot be realized without those features; they are therefore normative. Not to mention, it is always better to provide a precise document that focuses on the specification (even if experimental) to the IESG.

4. The draft does not cover the use of BGP CT with SRv6 on its own without dependencies on features in other individual drafts. There are also some details missing. One option may be to remove SRv6 at this point and have a separate document for it when ready.

5. There are some technical issues identified which need to be fixed.

6. I’ve also provided some suggestions and minor comments or questions.

7. The document has many warnings and some errors as reported by IDnits - these should be easy to fix. There are also spelling and grammatical errors which can be identified and fixed. I've focused only on the technical aspects.

Thanks,
Ketan

The detailed review is below as comments with IDnits as reference.

24 This document specifies protocol procedures for BGP that enable
25 dissemination of such service mapping information in a network that
26 may span multiple cooperating administrative domains.
These domains
27 may be administered either by the same provider or by closely
28 coordinating providers. A new BGP address family that leverages

[minor] Coordinating providers do provide service spanning their domains (ASes), but I believe their transport reachability is limited to their own domains. Since we are talking of transport, perhaps what is meant here is a single AS or multiple ASes that are under a single administrative control?

215 The constructs and procedures defined in this document apply
216 homogenously to Intra-AS as well as Inter-AS Option A, Option B and
217 Option C style deployments in provider networks.

[major] From the transport perspective, for both Inter-AS Option A & B, there is no requirement for inter-AS transport reachability. The classful transport routes are essentially intra-AS. Therefore, BGP-CT provides a classful transport solution for Intra-AS and Inter-AS Option C deployments. As I’ve stated in further comments, the authors seem to treat the TRDB and similar internal implementation constructs that they have introduced for BGP-CT as something to “standardize”. This is clearly not so, as implementations should be free to realize them in other ways. Therefore, where BGP-CT has no role to play, nothing needs to be said in this document.

227 The mechanisms defined in this document are agnostic to the tunneling
228 technologies. These can be applied homogenously to intra-domain
229 tunneling technologies used in brownfield networks (e.g. MPLS
230 Traffic Engineering) as well as greenfield networks (e.g. Segment
231 Routing).

[minor] Terms like brownfield and greenfield are subjective. Can we avoid such usage? I don't believe SR networks are greenfield anymore, and some operators may choose to use MPLS-TE for their new/greenfield networks.

265 SN : Service Node
267 eSN : Egress Service Node
269 iSN : Ingress Service Node

[minor] Suggest using the term PE instead of Service Node, as is normally done.
If it is different from PE then please clarify what is different.

271 BN : Border Node

[minor] Isn't BN an ABR/ASBR? If so, please consider using that term.

273 TN : Transport Node, P-router

[minor] Consider using the term P-router instead of introducing this new TN term.

294 PNH : Protocol Next hop address carried in a BGP Update message

[minor] Consider using BGP NH instead of PNH, as is the usual practice, unless it is different for some reason.

308 EP : Endpoint, a loopback address in the network
310 SEP : Service Endpoint, the PNH of a Service route

[minor] The terms EP and SEP are hardly used. Consider using "loopback" instead, or simply Prefix, as appropriate for that context.

334 Transport Family : BGP address family used for advertising tunnels,
335 which are in turn used by service routes for resolution (e.g. AFI/
336 SAFIs 1/4 or 1/76).

[minor] Shouldn’t this include AFI/SAFI 2/1 that can be used as transport for SRv6?

338 Transport Tunnel : A tunnel over which a service may place traffic
339 (e.g. GRE, UDP, LDP, RSVP-TE, IGP FLEX-ALGO or SRTE).

[major] There is the notion of encapsulation and the notion of tunnel. A tunnel is often modeled as an interface. In the above list, IGP SR or IGP Flex-Algo is not really a tunnel but provides an encapsulation - MPLS or SRv6. RSVP-TE is a tunnel. Can the usage of the term "tunnel" be reviewed and perhaps replaced with encapsulation where appropriate? Another option is to use the term "path" instead of tunnel.

356 Transport Route Database (TRDB): At the SN and BN, a Transport Class
357 has an associated Transport Route Database that collects its tunnel
358 ingress routes.

[question] Does TRDB contain only the "tunnel" routes or also the CT routes?

363 Mapping Community : Any BGP Community/Extended-community on a BGP
364 route that maps to a Resolution Scheme. E.g. color:0:100, transport-
365 target:0:100.
[major] A new term, "Mapping Community", is introduced which is not an actual BGP community and which implies different types of communities in different contexts. This makes the spec confusing and ambiguous. I will point out a few instances in further comments. Suggestion: do away with this term and instead directly call out the actual type of community in the specific context in the document.

421 Figure 1, depicts the intra-AS and inter-AS application of these
422 constructs.

[minor] Suggestion: introduce the base reference figure first; then add the new constructs and CT on top in additional figures after having introduced them. This would be easier for a reader to comprehend.

452 Overlay routes carry sufficient indication of the desired Transport
453 Classes using a BGP community which assumes the role of as a "Mapping
454 community". A Resolution Scheme is identified by its "Mapping
455 Community", where its configuration can either be auto-generated or
456 done manually.

[question] So, is the "resolution scheme" applicable to overlay routes or CT routes or both? This is confusing because the mapping community for overlay routes is the Color ExtComm and for CT routes it is the Transport RT, and they are different. Using the same term "mapping community" for both is problematic in such contexts.

489 A Transport Class is identified by a unique 32-bit "Transport Class"
490 identifier, that is assigned by the operator. An operator may

[major] Is this 32-bit TC unique on a given node or across the network (one or more AS)? I assume it is normally unique across the multi-domain network since it is getting encoded into the TC RT, but in some cases it can be per-domain with mapping/rewriting across domain boundaries. This is covered further in the document but is missing here.
This is a general issue that I see in this document, where the formal specification portions of the text are not detailed normatively but are buried in verbose use-case or deployment design related sections.

491 configure an SN/BN to classify a tunnel into an appropriate Transport
492 Class. How exactly these tunnels are made Transport Class aware is
493 implementation specific and outside the scope of this document.

[question] Can the same "tunnel" be part of two TCs? Please check my comment on sec 13.1.3.1 for further information.

522 [RFC8664] extends Path Computation Element Communication Protocol
523 (PCEP) to carry SRTE Color. This color association learnt from PCEP

[major] RFC8664 does not talk about color.

565 An implementation may realize the TRDB for e.g., as a "Routing Table"
566 referred in Section 9.1.2.1 of RFC4271 (https://www.rfc-
567 editor.org/rfc/rfc4271#section-9.1.2.1) which is "only" used for
568 resolving nexthop reachability in control plane with no footprint in
569 forwarding plane. However, an implementation may choose a different
570 methodology to realize this logical construct while still adhering to
571 the procedures defined in this document.

[major] Please specify what an operator needs to do to create/enable/instantiate this TRDB. E.g., the table needs to be created, an RD needs to be assigned to it, a TC RT needs to be configured for it that is consistent and unique across the domain, its import policy needs to be set, and I am guessing an export policy as well? I believe this is pretty much similar to how operators provision VRFs for L3VPN services, but it is still better to specify.

585 "Transport Class" Route Target Extended Community is a transitive
586 extended community EXT-COMM [RFC4360] of extended type, which has the
587 format as shown in Figure 2.

[question] I see that the new RT EC is also being registered as non-transitive. Why is that so?
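
To make the TRDB provisioning question on draft lines 565-571 a little more concrete, here is a minimal sketch of the items I would expect an operator to supply per node (a Transport Class ID, an RD and a TC RT) together with a consistency check across a domain. The record layout, the `TrdbConfig` and `check_domain` names and the example addresses are my own assumptions for illustration; the draft does not define any of them.

```python
from dataclasses import dataclass

MAX_TC_ID = 2**32 - 1  # the draft describes the Transport Class ID as 32-bit


@dataclass(frozen=True)
class TrdbConfig:
    """Hypothetical provisioning record for one Transport Class on one node."""
    node: str
    transport_class_id: int
    rd: str               # route distinguisher assigned to this TRDB
    tc_route_target: str  # e.g. "transport-target:0:100"


def check_domain(configs):
    """Return a list of consistency problems found across TRDB configs."""
    problems = []
    rt_by_tc = {}     # Transport Class ID -> first TC RT seen for it
    owner_by_rd = {}  # RD -> (node, Transport Class ID) that first used it
    for cfg in configs:
        if not 0 <= cfg.transport_class_id <= MAX_TC_ID:
            problems.append(f"{cfg.node}: TC ID does not fit in 32 bits")
        # Assumption: every node uses the same TC RT for a given Transport Class.
        expected_rt = rt_by_tc.setdefault(cfg.transport_class_id, cfg.tc_route_target)
        if cfg.tc_route_target != expected_rt:
            problems.append(f"{cfg.node}: TC RT differs from {expected_rt}")
        # Assumption: an RD is not shared between different node/TC pairs.
        owner = owner_by_rd.setdefault(cfg.rd, (cfg.node, cfg.transport_class_id))
        if owner != (cfg.node, cfg.transport_class_id):
            problems.append(f"{cfg.node}: RD {cfg.rd} already used by {owner[0]}")
    return problems


# Illustrative values only; node names and addresses are assumptions.
gold = [
    TrdbConfig("PE11", 100, "192.0.2.11:100", "transport-target:0:100"),
    TrdbConfig("ASBR13", 100, "192.0.2.13:100", "transport-target:0:100"),
]
print(check_domain(gold))
```

The check encodes the "same TC RT, unique RDs" deployment mode only; the alternative mode in which the same RD is provisioned on multiple originators would need the RD check relaxed.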
622 A BGP speaker that implements RT Constraint Route Target Constraints
623 [RFC4684] MUST apply the RT Constraint procedures to the Transport
624 Class Route Target Extended community as well.

[major] So, there is a need to enhance/extend the RTC implementation to support BGP CT. It is not automatically and seamlessly applied to TC RT without code changes. Right? If so, please state the same since I believe RFC4684 does not cover this?

626 The Transport Class Route Target Extended community is carried on
627 Classful Transport family routes and is used to associate them with
628 appropriate TRDBs at receiving BGP speakers.

[minor] Please clarify if the TC RT ExtCom is to be limited to SAFI 76.

637 5. Resolution Scheme
639 This section defines the Resolution Scheme construct that is used to
640 specify how a service route or a BGP CT route can resolve its next
641 hop using its associated Mapping Community over a specific TRDB or an
642 ordered set of TRDBs.

[minor] Suggest to split the service route resolution and then the CT route resolution into two separate sections for clarity.

644 Resolution Schemes enable a BGP speaker to resolve next hop
645 reachability for overlay routes over the appropriate underlay tunnels
646 within the scope of the TRDBs identified by the Mapping Community.

[minor] Here Mapping Community refers to something like Color ExtCom?

660 Mapping community is a "role" and not a new type of community; any
661 BGP Community or Extended Community may play this role. A Mapping
662 Community maps to exactly one Resolution Scheme.
664 An example of mapping community is "color:0:100", described in
665 [RFC9012], or the "transport-target:0:100" described in Section 4.3
666 in this document.

[minor] So far this section has been referring to overlay routes and therefore something like Color ExtCom. Suddenly, we see TC RT coming into the conversation which is confusing.

668 A BGP route is associated with a resolution scheme during import
669 processing.
The first community on the route that matches a Mapping
670 Community of a locally configured Resolution Scheme is considered the
671 effective Mapping Community for the route. The Resolution Scheme
672 thus found is used when resolving the route's PNH. If a route
673 contains more than one Mapping Community, it indicates that the route
674 considers these distinct Mapping Communities as equivalent in Intent.
675 So, the first community that maps to a Resolution Scheme is chosen as
676 the effective Mapping Community.

[major] If this is about Color ExtCom, then it conflicts with RFC9256 and existing implementations. This may be OK for TC RT or BGP CT routes which are new. Is that the intention? Perhaps the confusion is due to use of the Mapping Community term.

704 The procedures described for AFI/SAFIs 1/4 or 1/128 and AFI/SAFIs 2/4
705 or 2/128 in Section 2 of [RFC8277] apply for AFI/SAFI 1/76 and AFI/
706 SAFI 2/76 respectively as well. BGP CT routes may carry multiple
707 labels in the NLRI, by negotiating the Multiple Labels Capability as
708 described in Section 2.1 of [RFC8277]

[major] Can the use of multiple labels with BGP CT be specified here? There were interop issues with the use of that mechanism for BGP-LU due to under-specification and I hope we don't get into the same situation with this new SAFI.

743 Label:
745 The Label field is a 20-bit field containing an MPLS label value
746 (see [RFC3032]).
748 Rsrv:
750 This 3-bit field SHOULD be set to zero on transmission and MUST be
751 ignored on reception.
753 S:
754 When single label is advertised, this 1-bit field MUST be set to
755 one on transmission and MUST be ignored on reception.

[major] Please consider just referring to RFC8277 for the Label, Rsrv and S fields. E.g., this description does not say how the S bit is to be set when there are multiple labels in the stack.

788 6.1.
Carrying multiple Encapsulation Information
790 To ease interoperability between nodes supporting different
791 forwarding technologies, a BGP CT route allows carrying multiple
792 encapsulation information.

[major] I assume the intention above is to say that a BGP CT route can be used for different encapsulation types. However, the text gives an impression that multiple encapsulation types may be carried simultaneously without any specification/procedures for how that works, i.e., NH selection, different GW metric for different encapsulation, etc.

794 An MPLS Label is carried using the encoding in [RFC8277] . A node
795 that does not support MPLS forwarding advertises the special label 3
796 (Implicit Null) in the RFC 8277 MPLS Label field.

[major] The above text implies that the Imp-Null label indicates that MPLS forwarding is not used, while what it actually means is that no label is required to be pushed on the label stack. Imp-Null is a perfectly valid label for the Egress PE to advertise for its BGP CT route to its penultimate Border Router. If so, then the Imp-Null label cannot be used as an indication of not using the MPLS forwarding plane. What is the way to indicate that a specific node does not support MPLS encapsulation for BGP CT?

802 The SRv6 SID is carried using Prefix SID attribute as specified in
803 [RFC9252], without Transposition Scheme. The Transposition Length is
804 set to 0 and Transposition Offset is set to 0 to indicate nothing is
805 transposed and that the entire SRv6 SID value is encoded in the SID
806 Information Sub-TLV.

[major] The BGP Prefix SID attribute carries multiple TLVs and its mere presence does not indicate the use of an SRv6 SID; e.g., it is also used for MPLS for the BGP Prefix SID of RFC8669. Please indicate which specific TLV of this attribute is going to be used to indicate the SRv6 dataplane for CT. Also, what SRv6 Endpoint Behavior is used for BGP CT?

811 6.2.
Comparison with Other Families using RFC-8277 Encoding

[minor] This section is not really a part of the protocol specification but gives the rationale for the NLRI design choices. As such, perhaps it may be either removed or at least moved into an Appendix section?

833 In this document, SAFI 76 (BGP CT) is used instead of reusing SAFI
834 128 (BGP VPN) for AFIs 1 or 2 to carry these transport routes because
835 it is operationally advantageous to segregate transport and service
836 prefixes into separate address families. For e.g., such an approach
837 allows operators to safely enable "per-prefix" label allocation
838 scheme for Classful Transport prefixes, typically with a space
839 complexity of O(1K), without affecting SAFI 128 service prefixes,
840 with a space complexity of O(1M). The "per prefix" label allocation
841 scheme keeps the routing churn local during topology changes.

[major] I read the above as the authors expecting BGP CT to be deployed typically in small networks with nodes in the order of 1000s. If this is indeed so, it needs to be called out clearly. The transport scale requirements in draft-hr-spring-intentaware-routing-using-color are several orders of magnitude higher for current and upcoming networks.

870 For e.g., unique "RDx:EP1" prefixes can be advertised by an SN for an
871 EP1 to different upstream BNs with unique forwarding specific
872 encapsulation (e.g., Label), in order to collect traffic statistics
873 at the SN for each BN. In absence of RD, duplicated Transport Class/
874 Color values will be needed in the transport network to achieve such
875 use cases.

[minor] The above paragraph describes a use-case or deployment design rather than a protocol specification. The use-case is not described fully along with its implications on forwarding state. More importantly, IMHO this is not a very important or common use-case.
Others may disagree, in which case it is better that this use-case be described in an appendix with a topology, design details and implications, and a reference added here. Another option is to simply omit this paragraph.

877 The allocation of RDs is done at the point of origin of the BGP CT
878 route. This can either be an Egress SN or a BN. The default RD
879 allocation mode is to use a unique RD per originating node for an EP.
880 This mode allows for the ingress to uniquely identify each originated
881 path. Alternatively, the same RD may be provisioned for multiple
882 originators of the same EP. This mode can be used when the ingress
883 does not require full visibility of all nodes originating an EP.

[minor] These are very important considerations. Suggest listing the different options along with their pros/cons as a bullet list.

919 Implementations MAY provide automatic generation and assignment of
920 RD, RT values; they MAY also provide a way to manually override
921 the automatic mechanism in order to deal with any conflicts that
922 may arise with existing RD, RT values in different network domains
923 participating in the deployment.

[major] How would TC RTs be automatically generated on a router when they need to be unique and identical across the network for a given TC? Doesn’t the TC value need to be configured by the operator?

949 This route SHOULD NOT be advertised to the IBGP core that contains
950 the tunnel, using policy configuration. Impact of not prohibiting
951 such advertisements is outside the scope of this document.

[minor] It should not be very difficult to explain the consequences in short, since the document is going into such deployment design details.

978 If the resolution process does not find a matching route in any of
979 the associated TRDBs, the received BGP CT route MUST be considered
980 unusable for forwarding purpose and be withdrawn.
[major] I believe it should be considered ineligible for best path computation. If it was the only path then it would be withdrawn.

1026 7.6. Avoiding Path Hiding Through Route Reflectors
1028 When multiple BNs exist such that they advertise a "RD:EP" prefix
1029 to Route Reflectors (RRs), the RRs may hide all but one of the
1030 BNs, unless ADDPATH [RFC7911] is used for the Classful Transport
1031 family. This is similar to L3VPN Option B scenarios. Hence,
1032 ADDPATH SHOULD be used for Classful Transport family, to avoid
1033 path-hiding through RRs. This improves convergence time when path
1034 via one of the multiple BNs fails.

[minor] Is this the "same (or duplicate as it was called previously) RD" design being referred to here? Suggestion: Stick consistently to the one recommended deployment design in the main body of the document. Then, add in the appendix the other variations with their pros/cons. The issue in the document is that there are flip-flops between various design options all along the flow of the text in individual sections which affect the clarity of the spec.

1077 7.8. Ingress Nodes Receiving Service Routes with a Mapping Community
1079 Upon receipt of a BGP service route with a PNH that is not
1080 directly connected (e.g. an IBGP-route), a Mapping Community on
1081 the route (e.g, Color Extended Community) is used to decide to
1082 which resolution scheme this route is to be mapped. The
1083 resolution scheme for a Color Extended Community with Color "C1"
1084 contains TRDB for Transport Class with same ID, followed by Best-
1085 Effort TRDB. The administrator MAY customize the resolution
1086 scheme to map to a different ordered list of TRDBs.

[major] Until this point, we had a Resolution Scheme that associates a TC RT with an ordered set of TRDBs. Now, there is another Resolution Scheme required that maps the Color value from the Color ExtCom to an ordered set of TRDBs. This is not coming out clearly in the document.
Hence the need to treat the two types of resolution schemes - Overlay/Service Route Resolution and Underlay/CT Route Resolution - distinctly. On the same lines, remove the use of the Mapping Community term and be specific about which Community is being referred to in what context.

1154 Implementations MAY provide configuration to selectively install
1155 BGP CT routes to the Forwarding Information Base (FIB), to provide
1156 reachability for control plane peering towards endpoints in other
1157 domains.

[question] Isn't the default table the TRDB for best effort and this is usually always programmed into the FIB?

1167 The mechanisms described in BGP MultiNexthop Attribute
1168 [MULTI-NH-ATTR] allow a BGP route to carry multiple next hop
1169 addresses. It also allows specifying 'Transport Class ID' as a
1170 qualifier for each next hop address.

[major] Since this document is being considered for WGLC and the MNH draft is an individual draft, I wonder why this document is talking about MNH at all. Can't the applicability and/or improvement to BGP CT with the use of MNH be covered in the MNH draft? Please tighten the spec to focus on what can be achieved with existing standards or at least WG adopted mechanisms.

1172 It should be noted that in such cases "Transport Class/Color" can
1173 exist in multiple places on the same route, and a precedence order
1174 needs to be established to determine which Transport Class the
1175 route's next hop should resolve over. This document suggests the
1176 following order of precedence, more preferred first:
1178 Transport Class ID SubTLV, in MultiNexthop Attribute.
1180 Color SubTLV, in Tunnel Encapsulation Attribute.
1182 Transport Target Extended community, on BGP CT route.
1184 Color Extended community, on BGP service route.
1186 The above precedence order follows more specific scoping of Color to
1187 less specific scoping.

[major] I understand the motivation for the above, however, there is a problem.
This experimental draft can specify rules for BGP CT (except for the MNH step). It should not include such rules for other BGP services - that is beyond the scope of this document. There is also the problem of mixing up BGP CT and BGP Service routes in the above order - at any point, the document should talk about either the service route resolution or the CT route resolution and be clear about this.

1201 Such Flowspec BGP routes with Redirect to IP next hop MAY be attached
1202 with a Mapping Community (e.g. Color:0:100), which allows
1203 redirecting the flow traffic over a tunnel to the IP next hop
1204 satisfying the desired SLA (e.g. Transport Class color 100).

[major] There is again the issue of using this vague new terminology of "mapping community" for BGP FlowSpec. Let this document stick to BGP CT and not try to creep into specifying for BGP FlowSpec. What is being proposed here is conflicting with draft-ietf-idr-ts-flowspec-srv6-policy (passed WGLC) that has the semantics that association of Color ExtCom with BGP FlowSpec along with a NH is an indication of redirecting into an SR Policy to that NH with the Color in the ColorExtCom. Therefore, this document needs to be very specific about what Community is being referred to in each and every context.

1290 8.3. Limiting The Visibility Scope of PE Loopback as PNHs
1292 It may be even more desirable to limit the number of PNHs that are
1293 globally visible in the network. This is possible using mechanism
1294 described in Appendix D
1296 Such that advertisement of PE loopback addresses as next-hop in
1297 BGP service routes is confined to the region they belong to. An
1298 anycast IP-address called "Context Protocol Nexthop Address"
1299 (CPNH, Appendix D.3) abstracts the SNs in a region from other
1300 regions in the network, swapping the SN scoped service label with
1301 a CPNH scoped private namespace label.
1303 This provides much greater advantage in terms of scaling and
1304 convergence.
Changes to implement this feature are required only
1305 on the local region's BNs and RRs.

[major] This is again a "plug-in" for a private draft that is not even adopted by the WG. Same as in the case of MNH, this does not seem to be integral to the BGP CT proposal - if it were, the BGP CT work would get blocked waiting for the adoption and progression of those other mechanisms. Suggestion: please move these extraneous aspects into those other individual drafts.

1307 9. OAM Considerations
1309 MPLS OAM procedures specified in [RFC8029] also apply to BGP Classful
1310 Transport.
1312 The 'Target FEC Stack' sub-TLV for IPv4 Classful Transport has a Sub-
1313 Type of 31744, and a length of 13. The Value field consists of the
1314 RD advertised with the Classful Transport prefix, the IPv4 prefix
1315 (with trailing 0 bits to make 32 bits in all) and a prefix length
1316 encoded as shown in below in Figure 4.

[major] How is it validated that the path taken is actually for the given TC since there is no TC here? How much value is there to determine just the reachability without verification of the "intent". Also, how would this work in a network with different RD and TC RT numbering in different domains?

1373 11. SRv6 Support
1375 This section describes how BGP CT family (e.g. AFI/SAFI 1/76) may be
1376 used to set up inter domain tunnels of a certain Transport Class,
1377 when using Segment Routing over IPv6 (SRv6) data plane on the inter-
1378 AS links or as an intra-AS tunneling mechanism.
1380 [RFC8986] specifies the SRv6 Endpoint behaviors (End USD, End.BM,
1381 End.B6.Encaps). [SRV6-INTER-DOMAIN] specifies the SRv6 Endpoint
1382 behaviors (END.REPLACE, END.REPLACEB6 and END.DB6). These are
1383 leveraged for BGP CT routes with SRv6 data plane.

[major] The draft-salih is also an individual draft. Is the SRv6 support dependent on that? If so, it may be better to take SRv6 out of scope for this document and add it in a separate document when the base is adopted.
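
Coming back to the OAM sub-TLV quoted at draft lines 1312-1316: the stated length of 13 is consistent with an 8-octet RD, a 4-octet IPv4 prefix and a 1-octet prefix length, which also shows that there is no room in the Value for a Transport Class. A minimal sketch of that arithmetic follows. The `build_ct_ipv4_fec` name is hypothetical, and the 2-octet type / 2-octet length header with zero padding to a 4-octet boundary is my assumption based on the general RFC 8029 TLV layout, since Figure 4 of the draft is not reproduced here.

```python
import socket
import struct

CT_IPV4_SUBTYPE = 31744  # Sub-Type value quoted from the draft


def build_ct_ipv4_fec(rd: bytes, prefix: str, prefix_len: int) -> bytes:
    """Pack RD + IPv4 prefix + prefix length as a Target FEC Stack sub-TLV."""
    if len(rd) != 8:
        raise ValueError("RD must be exactly 8 octets")
    if not 0 <= prefix_len <= 32:
        raise ValueError("IPv4 prefix length must be 0..32")
    # Value: 8 (RD) + 4 (prefix) + 1 (prefix length) = 13 octets, no TC field.
    value = rd + socket.inet_aton(prefix) + bytes([prefix_len])
    header = struct.pack("!HH", CT_IPV4_SUBTYPE, len(value))
    padding = b"\x00" * (-len(value) % 4)  # assumed 4-octet alignment
    return header + value + padding


# Type 1 RD (IPv4 address : assigned number), here 192.0.2.11:100.
rd = struct.pack("!H4sH", 1, socket.inet_aton("192.0.2.11"), 100)
tlv = build_ct_ipv4_fec(rd, "192.0.2.11", 32)
print(len(tlv))  # 4-octet header + 13-octet value + 3 padding octets
```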
1385 The BGP Classful Transport route update for SRv6 MUST include an
1386 attribute containing SRv6 SID information. This may be either the
1387 BGP Prefix-SID attribute as specified in [RFC9252] or the BGP
1388 MultiNexthop attribute as specified in BGP MultiNexthop Attribute
1389 [MULTI-NH-ATTR] Section 5.4.3.3. If the Prefix-SID attribute is
1390 used, it MUST NOT include SRv6 SID structure for Transposition
1391 described in [RFC9252].

[major] Same comment as before for MNH.

1538 Similarly, these transport classes are also configured on ASBRs, ABRs
1539 and PEs with same Transport Route Target and unique RDs.

[minor] It would help to clarify what is entailed in this provisioning of TCs at all these routers. I believe this is similar to VRF configuration with everything that goes with it (RD, TC-RT, import/export policies), but then also the resolution mapping for CT over the underlay?

1693 Assuming ASBR22_to_ASBR13 link goes down, such that traffic with Gold
1694 SLA going to PE11 needs repair. ASBR22 has an alternate BGP CT route
1695 for 192.0.2.11:100:192.0.2.11 from ASBR14. This has been
1696 preprogrammed in forwarding by ASBR22 as FRR backup next hop for
1697 label B-L4. This allows the Gold SLA traffic to be locally repaired
1698 at ASBR22 without the failure event propagated in the BGP CT network.
1699 In this case, ingress node PE25 will not know there was a failure,
1700 and traffic restoration will be independent of prefix scale (PIC).

[major] It is misleading to call this "locally repaired" and "without failure event propagated in the BGP CT network" when the impact results in network-wide churn in the BGP CT control plane. Sure, the data plane is undergoing FRR, but there is still the job for the BGP CT control plane to withdraw the route that was previously announced all across the network. I am referring to the recommended design option.
There may be other ways to implement, but the document keeps flip-flopping between these options and does not provide a full picture with implications of each design option.

1723   Traffic repair to absorb the failure happens at ingress node PE25, in
1724   a service prefix scale independent manner. This is called PIC
1725   (Prefix scale Independent Convergence). The repair time will be
1726   proportional to time taken for withdrawing the BGP CT route.

[major] This has major implications for the NH resolution that have not been covered in this document so far. I get the usual BGP PIC where we have a primary NH and then a less preferred backup NH. However, both of them are from the same TRDB. Now, this is bringing up a new requirement to resolve for the same BGP Service Route in the fallback TRDB even when there is a path in the preferred TRDB. Now, there may be a "classical" backup in the preferred TRDB as well - how does one decide where to pick the backup from? This does not seem to fit into the resolution mapping construct described previously.

1745   This section describes how BGP CT is deployed in such scenarios to
1746   preserve end to end Intent. Example described in this section use
1747   Inter-AS Option C domains. But similar mechanisms will work for
1748   Inter-AS Option A and Inter-AS Option B scenarios as well.

[major] I don’t follow what role BGP CT has to play in option A or B since transport is intra-AS?

1750   13.1.1. Service Layer Color Management

1752   At the service layer, it is recommended that a global color namespace
1753   be maintained across multiple co-operating domains. BGP CT allows
1754   indirection using resolution schemes to be able to maintain a global
1755   namespace in the service layer. This is possible even if each domain
1756   independently maintains its own local transport color namespace.

[major] Not sure that I follow this.
It is ok and possible to maintain a global namespace for Color that indicates intent at service level but it is not possible to do the same for TC which indicates intent at transport level. Aren't they going to be the same intent?

1758   As explained in next hop Resolution Scheme (Section 5), mapping
1759   community carried on service route maps to a resolution scheme. The
1760   mapping community values for the service route can be abstract and
1761   does not require to match the transport color namespace. This
1762   abstract mapping community value representing a global service layer
1763   intent is mapped to a local transport layer intent available in each
1764   domain.

1766   In this manner, it is recommended to keep color namespace management
1767   at the service layer and the transport layer decoupled from each
1768   other. In the following sections the service layer agrees on a
1769   single global namespace.

[major] This is again conflicting. Why would an operator not keep Color = TC (as has been mentioned in a few examples/designs in this document)? And then both are consistent. If there are scenarios which make it difficult to maintain a network wide TC namespace, then the same should also apply to the Color namespace for services.

1771   13.1.2. Non-Agreeing Color Transport Domains

1773   Non-agreeing color domains require a mapping community rewrite on
1774   each domain boundary. This rewrite helps to map one domain's
1775   namespace to another.

[major] True, but more importantly they also need a similar mapping for the Color ExtCom for service routes as well. Right?

1777   The below example illustrates how traffic is stitched and SLA is
1778   preserved when domains don't use the same namespace at the transport
1779   layer. Each domain specifies the same SLA using different color
1780   values.
1782            Gold(100)          Gold(300)          Gold(500)

1784   [PE11]----[ASBR11]---[ASBR21------[ASBR22]---[ASBR31-------[PE31]
1785            AS1                AS2                AS3

1787            Bronze(200)        Bronze(400)        Bronze(600)

1789     ----------- Packet Forwarding Direction -------->

1791   Figure 7: Transport Layer with Non-agreeing Color Domains

1793   In the above topology shown in Figure 7, we have three Autonomous
1794   Systems. All the nodes in the topology supports BGP CT.

1796   In AS1 Gold SLA is represented by color 100 and Bronze by 200.

1798   In AS2 Gold SLA is represented by color 300 and Bronze by 400.

1800   In AS3 Gold SLA is represented by color 500 and Bronze by 600.

1802   Though the color values are different, they map to tunnels with
1803   sufficiently similar TE characteristics in each domain.

[major] The use of "color" in this section seems to actually mean TC. Why is that being used when until now it was all about TC? Looks like there is a need to review the use of the term "color" here and replace with TC where appropriate. My understanding is that "color" is only what is there in the Color ExtCom or SR Policy.

1805   The service route carries an abstract mapping community that maps to
1806   the required SLA. For example, Service routes that need to resolve
1807   over gold transport tunnels, carries a mapping community
1808   color:0:100500. In AS3 it maps to a resolution scheme containing
1809   TRDB with color 500 whereas in AS2 it maps to a TRDB with color 300
1810   and in AS1 it maps to a TRDB with color 100. Co-ordination is needed
1811   to provision the resolution schemes in each domain as explained
1812   above.

[major] Very confusing! How does TRDB get a color suddenly?

1827   Transport-target re-write requires co-ordination of color values
1828   between domains in the transport layer. This method avoids the need
1829   to re-write service route mapping community, keeping the service
1830   layer homogenous and simple to manage.
Coordinating Transport Class
1831   RT between adjacent domains is easier than coordinating service layer
1832   colors deployed in various non-adjacent domains.

[major] Right! So, if the operator has to do the hard thing at service level which is network wide, why would they not keep the same consistency in the TC namespace at the transport layer which is (as you say) a much simpler task! And, then is the notion of "non-cooperating" color domains probably only theoretical?

1876   These tunnels will be installed in TRDBs corresponding to transport
1877   classes of color 101, 102.

[major] Please see previous comment in Sec 4. There are aspects such as the same tunnel being added into more than one TRDB that need to be clearly specified which are buried within such use-cases and solution descriptions which would be hard to ensure consistent implementations. Since this document is primarily a protocol spec, it would be good if all such aspects are specified clearly in the "normative" portion of the document.

1879   Service routes received with mapping community (eg: transport-target
1880   or color community) can resolve over these tunnels in the TRDB with
1881   matching color by using resolution schemes.

[major] I believe Transport RT is for BGP CT and not Service Routes. The use of this term "mapping community" is very confusing again.

1883   This approach consumes more resources in the transport and forwarding
1884   layer, because of the duplicate tunnels.

1886   13.1.3.2. Customized Resolution Schemes Approach

[major] This section brings about a lot of rather complicated requirements for the resolution scheme/mapping, import/export policies and TRDB construct. Yet, these are not covered in an appropriate normative manner in the previous sections where those constructs were introduced.
My concern with this spec is that it is large and discusses key spec details embedded within such descriptive use-cases and design options, which makes it difficult for implementors to get things right.

2058   13.2.2.1. Interop Between MPLS and SRv6 Nodes.

2060   BGP speakers may carry MPLS label and SRv6 SID in BGP CT SAFI 76 for
2061   AFIs 1 or 2 routes using protocol encoding as described in Carrying
2062   Multiple Encapsulation information (Section 6.1)

[major] There is no description of procedures for how both MPLS and SRv6 are carried together for the same CT Route.

2064   MPLS Labels are carried using RFC 8277 encoding, and SRv6 SID is
2065   carried using Prefix SID attribute as specified in [RFC9252].

[major] RFC9252 does not specify carrying both MPLS label and SRv6 SID together - it only covers carrying SRv6 SID.

2106   R1 and R4 send and receive SRv6 SID in the BGP CT control plane
2107   routes using BGP Prefix-SID attribute, without Transposition Scheme.
2108   This allows them to be ingress and egress for SRv6 data plane. R4
2109   will carry the special MPLS Label with value 3 (Implicit-NULL) in RFC
2110   8277 encoding, which tells R1 not to push any MPLS label towards R4.
2111   The MPLS Label advertised by R1 in RFC 8277 NLRI will not be used by
2112   R4. SRv6 forwarding will be used between R1 and R4.

[major] As mentioned in previous comment, impl-null is a valid label for BGP-CT and cannot be used to indicate a lack of presence of MPLS label. How does R1 know that it cannot send MPLS-labeled traffic to R4, versus that it can send MPLS-labeled traffic to R4 but does not need to slap a label for BGP CT?

2161   R1 and R4 send and receive UDP tunneling info in the BGP CT control
2162   plane routes using BGP TEA attribute. This allows them to be ingress
2163   and egress for UDP tunneled data plane. R4 will carry special MPLS
2164   Label with value 3 (Implicit-NULL) in RFC 8277 encoding, which tells
2165   R1 not to push any MPLS label towards R4.
The MPLS Label advertised
2166   by R1 will not be used by R4. UDP tunneled forwarding will be used
2167   between R1 and R4.

[major] Same as previous comment; applies for UDP as well.

2174   13.3. Managing Transport Route Visibility

2176   This section details the usage of BGP CT RD and label allocation
2177   modes to calibrate the level of path visibility and the amount of
2178   route churn in a multi-domain network.

[minor] I don't follow the use of the term "route churn"; did you mean "route and label scale" instead? There is no description in the document for how routes churn due to various events such as failures or metric changes for the various design options and flavors.

2211   +--------+------+-------+-------+---------+---------+
2212   |EP-type |Origin|RD-Mode|PP-Mode|CT Routes|CT Labels|
2213   +--------+------+-------+-------+---------+---------+
2214   |Unicast |SN    |Unique |TC,EP  |   16    |    8    |
2215   |Unicast |SN    |Unique |RD,EP  |   16    |   16    |
2216   |Unicast |BN    |Unique |TC,EP  |   16    |    8    |
2217   |Unicast |BN    |Unique |RD,EP  |   16    |   16    |
2218   |--------|------|-------|-------|---------|---------|
2219   |Anycast |SN    |Unique |TC,EP  |   16    |    2    |
2220   |Anycast |SN    |Unique |RD,EP  |   16    |   16    |
2221   |Anycast |SN    |Same   |TC,EP  |    2    |    2    |
2222   |Anycast |SN    |Same   |RD,EP  |    2    |    2    |
2223   |Anycast |BN    |Unique |TC,EP  |    4    |    2    |
2224   |Anycast |BN    |Unique |RD,EP  |    4    |    4    |
2225   |Anycast |BN    |Same   |TC,EP  |    2    |    2    |
2226   |Anycast |BN    |Same   |RD,IP  |    2    |    2    |
2227   +--------+------+-------+-------+---------+---------+

2229   Figure 13: Route and Path Visibility at Ingress Node

[minor] What is PP-Mode?

2422   14.4. Best Effort Transport Class ID

2424   This document reserves the Transport class ID value 0 to represent
2425   "Best Effort Transport Class ID". This is used in the 'Transport
2426   Class ID' field of Transport Route Target extended community that
2427   represents best effort transport class. Please create a new registry
2428   for this.
2430   Registry Group: BGP CT Parameters

2432   Registry Name: Transport Class ID

2434   Value              Name
2435   -----------------+--------------------------------
2436   0                  Best Effort Transport Class ID

[major] The protocol spec can by itself associate the "best-effort" semantics for the value 0. We don’t need a registry for this.

2632   16.2. Informative References

[major] Many of these informative references are actually normative because the spec relies on them to provide key functionality. There are a couple of options: (a) remove those aspects (if they are optional) and move them into those other documents so as not to block this document, or (b) keep them in the normative section and wait for their progression.

2733   A.1. Signaling Intent over PE-CE Attachment Circuit

[major] This entire section (and its subsections) does not belong in this document and can be moved into the MNH draft.

2856   A.2. BGP CT Egress TE

2858   Mechanisms described in [BGP-LU-EPE] also applies to BGP CT family.

2860   The Peer/32 or Peer/128 EPE route MAY be originated in BGP CT family
2861   with appropriate Mapping Community (e.g. transport-target:0:100),
2862   thus allowing an EPE path to the peer that satisfies the desired SLA.

[major] This section also does not belong in this document; similarly, it can be moved into the individual BGP-LU-EPE draft.

2864   Appendix B. Applicability to Intra-AS and different Inter-AS
2865   deployments.

[major] The entire Appendix B has got nothing to do with BGP-CT. The use-cases described below have been realized in networks and implementations using mechanisms other than TRDB. Keeping this section (even if in the appendix) gives an impression that this is something provided by BGP-CT infra, while this is just a local implementation matter. Please remove appendix B.
3206   This is a more pragmatic approach, rather than abandoning time tested
3207   design pattern like RFC 4364 and RFC 8277, just to invent something
3208   completely new that is not backward compatible with existing
3209   deployments. Overloading RFC 8277 NLRI MPLS Label field with
3210   information related to non MPLS data plane leads to backward
3211   compatibility issues.

[major] I assume all of the above in Appendix C is really no longer required to be in this document?

3225   The document Intent-aware Routing using Color [Intent-Routing-Color]
3226   Section 6.3.2.1 suggests scaling numbers for transport network where
3227   BGP CT can be deployed. Experiments were conducted with this scale
3228   to find the convergence time with BGP CT for those scaling numbers.
3229   Scenarios involving BGP CT carrying IPv4 and IPv6 endpoints with MPLS
3230   label, and IPv6 endpoints with SRv6 SID were tested.

[question] Can you please add the case where RFC8669 is used and re-evaluate?

3246   Appendix D. Scaling using BGP MPLS Namespaces

[major] The entire Appendix D needs to be removed from this document; perhaps move it into the MPLS private namespaces individual draft.

3433   Appendix E. BGP CT deployment in SRv6 networks

3435   This section describes BGP CT deployment in SRv6 multi-domain network
3436   using Inter-AS Option C architecture.

3438   E.1. SID stacking approach

[major] Section E.1 and its subsections should be removed from this document and moved to the SRv6-inter-domain document.

3741   E.2. Color-encoded Service SID (CPR) Approach

[major] Section E.2 and its subsections can also be removed; not sure what the motivation is for putting that comparison in this document.

[ end of review ]