Re: AD review of draft-ietf-rtgwg-bgp-pic-18
Ahmed Bashandy <abashandy.ietf@gmail.com> Mon, 31 October 2022 15:35 UTC
Return-Path: <abashandy.ietf@gmail.com>
X-Original-To: rtgwg@ietfa.amsl.com
Delivered-To: rtgwg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A8856C152578; Mon, 31 Oct 2022 08:35:56 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.139
X-Spam-Level:
X-Spam-Status: No, score=0.139 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, FREEMAIL_REPLY=1, HTML_MESSAGE=0.001, NORMAL_HTTP_TO_IP=0.001, NUMERIC_HTTP_ADDR=1.242, RCVD_IN_ZEN_BLOCKED_OPENDNS=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, URIBL_BLOCKED=0.001, URIBL_DBL_BLOCKED_OPENDNS=0.001, URIBL_ZEN_BLOCKED_OPENDNS=0.001] autolearn=no autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([50.223.129.194]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id R1uyVItEPiSJ; Mon, 31 Oct 2022 08:35:50 -0700 (PDT)
Received: from mail-pf1-x434.google.com (mail-pf1-x434.google.com [IPv6:2607:f8b0:4864:20::434]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 37C9BC1524D9; Mon, 31 Oct 2022 08:35:50 -0700 (PDT)
Received: by mail-pf1-x434.google.com with SMTP id f140so11015784pfa.1; Mon, 31 Oct 2022 08:35:50 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=KQnL0qKadcE1fVqZmIZvGMfQfrPw6cbZSVZKsTtIRzo=; b=hUrLlS9kfXsz+pMedda6flwFgt4tNzQlZDTQS0D25v2w5+oKPm1x6g6aXtzlkBOzLM NHECLU4w51hiq1Gb/p5vqXmWIpJYngcIdv3w2v0gxisEOZEdCquOAFa8WBsvgUTovteL QIVH4jgh4ejafI1cbKiqZ8sq32OPCLafzTU7rVfrByWIRXT7tRhhDbmEtPhM8+Wt/ggU OUmlbbHwViYY/x+DI6d4W+ffB2G5HLX+EQXFE+p9YhiwnZus5iuyw2T4KE4IyzNqgRz7 fPdfx2YtqeV/CcYrvZL/tKfZw1trMhkzx0U2z1Z6DVbWSWRjX0VEOI5pze32guZ/Fnss N0lg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=KQnL0qKadcE1fVqZmIZvGMfQfrPw6cbZSVZKsTtIRzo=; b=KHeGKY46tM9s4ay119HDAONbKxT6EM5A3eW6z1BAfoyOtgUKJbdYW8XF0CdGDgoI0V im46j8b0A9dmfRla3iIqzqj5G/dpSfFnXCX5sbE5vBwMPZRRCkGNWPXZqYwa2WQfXh4A BOfU/qoDzZ2Hvpe3FM05css6/AMnWjvJg8ngKy63dz5GfeQ1ccpe3yFyFaHXXesfy4f2 kZlZZdnEgpKmC4MJeqoqYc7MJJyv+yvP6yP+w+u3kTa1qCsnoZ+ozhBQcVcOwjBrWFu1 iAMXv49ovSntoY0IGNdl0PyK6yqy252gWjjKAqPITcxLIcTchweLfyLn5lAIrdoHjejB a0EA==
X-Gm-Message-State: ACrzQf3OqWUoFVqpNrPE1ZKnLZ4duIwHaOYtFiZROARynTpwfpyyz1/Z 18b2Mt5nKY0hta0goObemofwuqDUOBYRU0tVngo=
X-Google-Smtp-Source: AMsMyM5vrAdyk7Qv5QenCyRv1Dz4c+lG5oi/uiClYKWQXHHmJYGwPctMG8iItrBwScbmVfZk1cNJ1y6wFzDHBDeiTHs=
X-Received: by 2002:a63:4d0c:0:b0:46f:932:450a with SMTP id a12-20020a634d0c000000b0046f0932450amr12981605pgb.204.1667230549052; Mon, 31 Oct 2022 08:35:49 -0700 (PDT)
MIME-Version: 1.0
References: <CAMMESsw1bdYc-p7TD7GtHNHa6zrO3Ms20VZX89Hza_JtWpK=Bw@mail.gmail.com> <CAMMESszTT8bRMt1UMS+OvsqePhZDJMHtOJ9sfcUs8zu19rtjsQ@mail.gmail.com>
In-Reply-To: <CAMMESszTT8bRMt1UMS+OvsqePhZDJMHtOJ9sfcUs8zu19rtjsQ@mail.gmail.com>
From: Ahmed Bashandy <abashandy.ietf@gmail.com>
Date: Mon, 31 Oct 2022 08:35:36 -0700
Message-ID: <CAAgvS4pkKBnezbbzGd4uiRezJFEGOWDBShd26v2vR-18tM3a8g@mail.gmail.com>
Subject: Re: AD review of draft-ietf-rtgwg-bgp-pic-18
To: Alvaro Retana <aretana.ietf@gmail.com>
Cc: draft-ietf-rtgwg-bgp-pic@ietf.org, Yingzhen Qu <yingzhen.ietf@gmail.com>, RTGWG <rtgwg@ietf.org>, rtgwg-chairs <rtgwg-chairs@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000a291f605ec565c5a"
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/jnU3efuusiSZkBwEEpuQlD9Iagc>
X-BeenThere: rtgwg@ietf.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: Routing Area Working Group <rtgwg.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/rtgwg>, <mailto:rtgwg-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/rtgwg/>
List-Post: <mailto:rtgwg@ietf.org>
List-Help: <mailto:rtgwg-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/rtgwg>, <mailto:rtgwg-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 31 Oct 2022 15:35:56 -0000
Sorry for the late reply. I am wrote busy I will try to address comments in the next 1-2 weeks Ahmed On Thu, Sep 8, 2022, 8:46 AM Alvaro Retana <aretana.ietf@gmail.com> wrote: > Ping! > > On August 2, 2022 at 4:23:53 PM, Alvaro Retana (aretana.ietf@gmail.com) > wrote: > > > Dear authors: > > Thank you for documenting this important feature. > > After reading the document, I think that it still needs work: > > (1) The terminology used is not aligned with rfc4271. Of major importance > is > the description of the Routing Table, the Decision Process, the use of > best > routes (not paths!), etc. I pointed out multiple occurrences below, > buy I > need you to check the whole docuent for consistency. > > (2) The choice to describe the mechanism using a complex scenario (VPNs, > etc.) > has resulted in the text being more complex than it needs to be. > > It is probably too late for this, but a better approach (especially > because > this is an Informational document) would have been to describe the > functionality using a simple IP-only network. A more elaborate > example > could have then been included in an Appendix. > > (3) The text (mainly in §5) related to hardware limitations is interesting > but > also highly speculative. Given that it is not part of the main > mechanism, > I would suggest putting this information in an appendix. > > > I put detailed comments below. > > > To the Chairs: I couldn't find a request to the idr WG for review. Did I > miss it? Given the subject, we need to give idr the opportunity to > comment. > > > Thanks! > > Alvaro. > > > [Line numbers from idnits.] > > ... > 12 Abstract > > 14 In a network comprising thousands of BGP peers exchanging millions of > 15 routes, many routes are reachable via more than one next-hop. Given > 16 the large scaling targets, it is desirable to restore traffic after > 17 failure in a time period that does not depend on the number of BGP > 18 prefixes. In this document we proposed an architecture by which > 19 traffic can be re-routed to ECMP or pre-calculated backup paths in a > 20 timeframe that does not depend on the number of BGP prefixes. The > 21 objective is achieved through organizing the forwarding data > 22 structures in a hierarchical manner and sharing forwarding elements > 23 among the maximum possible number of routes. The proposed technique > 24 achieves prefix independent convergence while ensuring incremental > 25 deployment, complete automation, and zero management and provisioning > 26 effort. It is noteworthy to mention that the benefits of BGP Prefix > 27 Independent Convergence (BGP-PIC) are hinged on the existence of more > 28 than one path whether as ECMP or primary-backup. > > [style nit] "we proposed" > > Please don't write using the first person ("we propose"), use the third > person instead ("this document proposes"). > > > [minor] s/we proposed an architecture/describes an architecture > > Other places also say that the document "proposes" -- at this point in the > process we're past proposals. :-) > > > [minor] Please expand ECMP in the Abstract, and later on first use. > > > [] "The proposed technique achieves prefix independent convergence while > ensuring incremental deployment, complete automation, and zero management > and provisioning effort." > > Great marketing! This is a technical document, please avoid selling the > solution. > > > [minor] Consider breaking the Abstract into two paragraphs. > > > > ... > 109 1. Introduction > > 111 BGP speakers exchange reachability information about > 112 prefixes[1][2] and, for labeled address families, namely AFI/SAFI > 113 1/4, 2/4, 1/128, and 2/128, an edge router assigns local labels to > 114 prefixes and associates the local label with each advertised > 115 prefix using technologies such as L3VPN [9], 6PE [10], and > 116 Softwire [8] using BGP label unicast (BGP-LU) technique[3]. A BGP > 117 speaker then applies the path selection steps to choose the best > 118 path. In modern networks, it is not uncommon to have a prefix > 119 reachable via multiple edge routers. In addition to proprietary > 120 techniques, multiple techniques have been proposed to allow for > 121 BGP to advertise more than one path for a given prefix > 122 [7][12][13], whether in the form of equal cost multipath or > 123 primary-backup. Another common and widely deployed scenario is > 124 L3VPN with multi-homed VPN sites with unique Route Distinguisher. > 125 It is advantageous to utilize the commonality among paths used by > 126 NLRIs[1] to significantly improve convergence in case of topology > 127 modifications. > > [] I still haven't read the rest of the document... PIC is a valuable and > simple technique. Just from the first sentence (and peeking ahead at some > of the figures and examples), it seems to me that the description could > have been significantly simpler. :-( > > > [major] rfc7322 requires that RFCs be referenced as [RFCXXXX], please > update the citations. > > > [minor nit] Multiple citations don't have spaces around the brackets > ("[]"). > > > [minor] The reference to rfc4271 is the general reference for BGP, no need > to include a reference to rfc4760 with it. > > > [] "for labeled address families...L3VPN with multi-homed VPN sites with > unique Route Distinguisher" > > The detail about labeled AFs/L3VPN seems unnecessary given that PIC > doesn't just apply to them. > > > [major] "BGP label unicast (BGP-LU) technique[3]" > > Note that rfc8277 doesn't use the terms BGP-LU, "label unicast", or > "labeled unicast". I know that BGP-LU is commonly used, but it is not > defined in any IETF document (that I could find). Please define it in the > terminology section. Also, don't refer to rfc8277 every time you use the > term. > > > [major] "A BGP speaker then applies the path selection steps to choose the > best path." > > rfc4271 doesn't talk about "path selection" or finding a "best path". It > specifies a Decision Process that results in best routes. Please use > consistent terminology. > > > [minor] "In addition to proprietary techniques..." > > These proprietary techniques are not described anywhere. Please don't > mention them -- it will only raise unnecessary questions. > > > [nit] "techniques have been proposed...[7][12][13]" > > Add-path and Diverse paths are not just "proposed". > > > [minor] "It is advantageous to utilize the commonality among paths used by > NLRIs[1] to significantly improve convergence in case of topology > modifications." > > You may have to be more explicit about which commonality -- mention it > explicitly. > > > > 129 This document proposes a hierarchical and shared forwarding chain > 130 organization that allows traffic to be restored to pre-calculated > 131 alternative equal cost primary path or backup path in a time > 132 period that does not depend on the number of BGP prefixes. The > 133 technique relies on internal router behavior that is completely > 134 transparent to the operator and can be incrementally deployed and > 135 enabled with zero operator intervention. In other words, once it > 136 is implemented and deployed on a router, nothing is required from > 137 the operator to make it work. It is noteworthy to mention that > 138 this document describes FIB architecture that can be implemented > 139 in both hardware and/or software. > > [nit] s/to pre-calculated/to a pre-calculated > > > [nit] s/describes FIB/describes a FIB > > > [minor] Expand FIB on first mention. > > > > 141 1.1. Terminology > > [major] Please don't deviate (at least) from terminology used in standards > track specifications. Please don't redefine terms to mean something > different in this document. > > If using terminology defined elsewhere, please just reference those > documents, and don't redefine the terms here. > > > > 143 This section defines the terms used in this document. For ease of > 144 use, we will use terms similar to those used by L3VPN [9]. > > [major] Similar? If you're using the terminology from rfc4364, then > please just use it. Redefining terms doesn't make anything easier. > > To be clear: remove any terms that are already defined in rfc4364 and make > the reference Normative. > > > > 146 o BGP prefix: A prefix P/m (of any AFI/SAFI) that a BGP speaker > 147 has a path for. > > [major] rfc4271 uses the concept of an "IP prefix" carried in a "route" > and described by a "path" (from §1.1): > > Route > A unit of information that pairs a set of destinations with the > attributes of a path to those destinations. The set of > destinations are systems whose IP addresses are contained in one > IP address prefix carried in the Network Layer Reachability > Information (NLRI) field of an UPDATE message. The path is the > information reported in the path attributes field of the same > UPDATE message. > > > > 149 o IGP prefix: A prefix P/m (of any AFI/SAFI) that is learnt via > 150 an Interior Gateway Protocol (IGP), such as OSPF and ISIS. The > 151 prefix may be learnt directly through the IGP or redistributed > 152 from other protocol(s). > > [major] rfc4271 uses the term "IGP route". > > > [minor] The "P/m" notation is not explained -- and is not used outside of > this section. > > > [major] "is learnt via an...IGP...may be learnt directly through the IGP > or redistributed from other protocol(s)" > > There's a conflict in the definition: does it come directly from the IGP > or from something else? Might might those "other protocol(s)" be? > > > > 154 o CE[7]: An external router through which an egress PE can reach > 155 a prefix P/m. > > [minor] Please expand CE. > > > [major] The reference to draft-ietf-idr-best-external seems incorrect; > that draft doesn't use CE at all. This seems to be a term that comes from > RFC4364. > > > [minor] Please expand PE in first use. > > > > 157 o Egress PE[7], "ePE": A BGP speaker that learns about a prefix > 158 through an eBGP peer and chooses that eBGP peer as the next-hop > 159 for that prefix. > > [major] The reference to draft-ietf-idr-best-external seems incorrect; > that draft doesn't use PE at all. This seems to be a term that comes from > RFC4364. > > > [minor] s/eBGP/EBGP/g > That is how rfc4271 uses it. > > > [minor] Expand EBGP on first use. > > > > 161 o Ingress PE, "iPE": A BGP speaker that learns about a prefix > 162 through a iBGP peer and chooses an egress PE as the next-hop for > 163 the prefix. > > [minor] s/iBGP/IBGP/g > That is how rfc4271 uses it. > > > [minor] Expand IBGP on first use. > > > > 165 o Path: The next-hop in a sequence of nodes starting from the > 166 current node and ending with the destination node or network > 167 identified by the prefix. The nodes may not be directly > 168 connected. > > [major] See the definition above from rfc4271. > > > [major] Some of the definitions refer to BGP, but others to the FIB > characteristics....differentiate. "FIB Path" vs "BGP Path", for example. > Note that rfc4271 uses the term "Routing Table" as follows (from §3.2: > Routing Information Base): > > Routing information that the BGP speaker uses to forward packets (or > to construct the forwarding table used for packet forwarding) is > maintained in the Routing Table. The Routing Table accumulates > routes to directly connected networks, static routes, routes learned > from the IGP protocols, and routes learned from BGP. Whether a > specific BGP route should be installed in the Routing Table, and > whether a BGP route should override a route to the same destination > installed by another source, is a local policy decision, and is not > specified in this document. In addition to actual packet forwarding, > the Routing Table is used for resolution of the next-hop addresses > specified in BGP updates (see Section 5.1.3). > > > > 170 o Recursive path: A path consisting only of the IP address of the > 171 next-hop without the outgoing interface. Subsequent lookups are > 172 necessary to determine the outgoing interface and a directly > 173 connected next-hop. > > [major] rfc4271 already used "recursive route". > > > > 175 o Non-recursive path: A path consisting of the IP address of a > 176 directly connected next-hop and outgoing interface. > > [] See above. > > > > ... > 181 o Primary path: A recursive or non-recursive path that can be > 182 used all the time as long as a walk starting from this path can > 183 end to an adjacency. A prefix can have more than one primary > 184 path. > > [minor] "can be used all the time" > > I'm not sure what you mean here. > > > [minor] The concept of a "walk" hasn't been introduced -- please put a > pointer to where that is discussed. > > > > ... > 189 o Leaf: A container data structure for a prefix or local label. > 190 Alternatively, it is the data structure that contains prefix > 191 specific information. > > [] So a leaf can be many things: "prefix or local label...[or] prefix > specific information". Hmmm.... > > > > ... > 198 o Pathlist: An array of paths used by one or more prefixes to > 199 forward traffic to destination(s) covered by an IP prefix. Each > 200 path in the pathlist carries its "path-index" that identifies its > 201 position in the array of paths. In general, the value of the > 202 "path-index" stored in path may not necessarily have the same > 203 value of the location of the path in the pathlist. For example > 204 the 3rd path may carry path-index value of 1. A pathlist may > 205 contain a mix of primary and backup paths. > > [nit] s/its "path-index"/a "path-index" > > > > 207 o OutLabel-List: Each labeled prefix is associated with an > 208 OutLabel-List. The OutLabel-List is an array of one or more > 209 outgoing labels and/or label actions where each label or label > 210 action has 1-to-1 correspondence to a path in the pathlist. > 211 Label actions[6] are: push the label, pop the label, swap the > 212 incoming label with the label in the OutLabel-List entry, or > 213 don't push anything at all in case of "unlabeled". The prefix > 214 may be an IGP or BGP prefix. > > [major] If the label label actions are defined in rfc3031, please just > point to it -- is the "swap the incoming label with the label in the > OutLabel-List entry" defined there too? > > It seems to me that the action defined here is part of a general action > taken by the FIB, and not specific only to labels (as in a different > encapsulation, for example). > > > > 216 o Forwarding chain: It is a compound data structure consisting of > 217 multiple connected block that a forwarding engine walks one > 218 block at a time to forward the packet out of an interface. > 219 Section 2.2 explains an example of a forwarding chain. > 220 Subsequent sections provide additional examples > > [nit] s/multiple connected block/multiple connected blocks > > > [nit] s/additional examples/additional examples. > > > [minor] Note that both "forwarding chain" and "forwarding Chain" are used > throughout the text. Please be consistent. > > > > ... > 229 o Route: A prefix with one or more paths associated with it. The > 230 minimum set of objects needed to construct a route is a leaf > 231 and a pathlist. > > [major] See the comments above about rfc4271 and the FIB in general. > > > > 233 2. Overview > > 235 The idea of BGP-PIC is based on two pillars > > [minor] The descriptions below are very thick (hard to read through) > without more context. Suggestion: keep the description simple in this > section and point to where there is more detail. > > > > 236 o A shared hierarchical forwarding chain: It is not uncommon to see > 237 multiple destinations reachable via the same list of next-hops. > 238 Instead of having a separate list of next-hops for each > 239 destination, all destinations sharing the same list of next-hops > 240 can point to a single copy of this list thereby allowing fast > 241 convergence by making changes to a single shared list of next- > 242 hops rather than possibly a large number of destinations. Because > 243 paths in a pathlist may be recursive, a hierarchy is formed > 244 between pathlist and the resolving prefix whereby the pathlist > 245 depends on the resolving prefix. > > 247 o A forwarding plane that supports multiple levels of indirection: > 248 A forwarding chain that starts with a destination and ends with > 249 an outgoing interface is not a simple flat structure. Instead a > 250 forwarding entry is constructed via multiple levels of > 251 dependency. A BGP NLRI uses a recursive next-hop, which in turn > 252 resolves via an IGP next-hop, which in turn resolves via an > 253 adjacency consisting of one or more outgoing interface(s) and > 254 next-hop(s). > > [nit] s/Instead a/Instead, a > > > [minor] Are "multiple levels of indirection" and "multiple levels of > dependency" the same thing? The latter is used to describe the former. > > > [] Next-hop is also a term that needs to be differentiated. In this > paragraph is it used in the context of a BGP route, an IGP route, and a > local address on the FIB. See rfc4271. > > > > 256 Designing a forwarding plane that constructs multi-level forwarding > 257 chains with maximal sharing of forwarding objects allows rerouting a > 258 large number of destinations by modifying a small number of objects > 259 thereby achieving convergence in a time frame that does not depend > 260 on the number of destinations. For example, if the IGP prefix that > 261 resolves a recursive next-hop is updated there is no need to update > 262 the possibly large number of BGP NLRIs that use this recursive next- > 263 hop. > > [] The example is orders of magnitue simpler that the actual explanation. > BTW, none of the words used ("multi-level forwarding chains with maximal > sharing of forwarding objects") are mentioned in the description of the > pillars. > > > > 265 2.1. Dependency > > 267 This section describes the required functionalities in the > 268 forwarding and control planes to support BGP-PIC described in this > 269 document. > > [nit] s/BGP-PIC described/BGP-PIC as described > > > > 271 2.1.1. Hierarchical Hardware FIB (Forwarding Information Base) > > [major] The Introduction says that "this document describes FIB > architecture that can be implemented in both hardware and/or software", but > this section requires a hardware implementation. ?? > > > > 273 BGP-PIC requires a hierarchical hardware FIB support: for each BGP > 274 forwarded packet, a BGP leaf is looked up, then a BGP Pathlist is > 275 consulted, then an IGP Pathlist, then an Adjacency. > > [major] "BGP forwarded packet" > > I think you mean something along the lines of a packet destined to an > external destination, or to a destination learned through BGP, or ... ?? > > > [major] The definitions of both a leaf and a pathlist don't assume > protocol origin. IOW, the reader has to make significant assumptions to > understand that you mean "a leaf/pathlist that resulted from > BGP/IGP-learned information" (or something like that). If the > identification of the origin protocol is required, please also say so. > > > [major] "Pathlist" and "Adjancency" are capitalized, but "leaf" isn't. > Personally, I like capitalizing terms with specific meaning. Not all > instances (throughout the document) are treated the same. Please be > consistent throughout. > > > [major] The general process above: lookup the destination (BGP leaf), > which points to possible alternatives (BGP pathlist), which recursively > resolve... assumes knowledge of how a FIB is generally built and the > operations that forwarding a packet requires. This may not be common > knowledge to every reader. Please either explain the assumptions or > reference a place where the explanation exists already. > > > > 277 An alternative method consists in "flattening" the dependencies when > 278 programming the BGP destinations into HW FIB resulting in > 279 potentially eliminating both the BGP Path-List and IGP Path-List > 280 consultation. Such an approach decreases the number of memory > 281 lookups per forwarding operation at the expense of HW FIB memory > 282 increase (flattening means less sharing thereby less duplication), > 283 loss of ECMP properties (flattening means less pathlist entropy) and > 284 loss of BGP-PIC properties. > > [minor] HW is mentioned here again...and the concept of programming, which > goes back to the general knowledge assumptions that you're making. > > > [minor] s/Path-List/Pathlist/g > > > [major] Am I to assume that this "alternative method" shouldn't be > considered/used, especially if it results in loss of the "BGP-PIC > properties"?? This section started by mentioning a requirement -- if this > other approach is not what is required the why even mention it? > > > > 286 2.1.2. Availability of more than one BGP next-hops > > 288 When the primary BGP next-hop fails, BGP-PIC depends on the > 289 availability of one or more pre-computed and pre-installed secondary > 290 BGP next-hop(s) in the BGP Pathlist. > > [major] "primary BGP next-hop" > > At this point in the document you've hinted at the possibility of having > multiple BGP routes (with potentially different NEXT_HOPs), but the concept > of "primary BGP next-hop" hasn't been explained. I'm assuming that by this > you mean the BGP best route [rfc4271]. (Again, you're making assumptions > of the knowledge of the reader.) > > I can see that this section talks about "secondary next-hops" later. > Perhaps change the order of the text so that the conditions precede the > explanation of the actions. > > > [major] "BGP next-hop fails" > > rfc4271 talks about the the address in the NEXT_HOP attribute becoming not > resolvable. Please use established terminology! > > > [major] "pre-computed and pre-installed...in the BGP Pathlist" > > There's an example of "pre-computed" later on, but I couldn't find an > explanation of "pre-installed...in the BGP Pathlist". > > > > 292 The existence of a secondary next-hop is clearly required for the > 293 following reason: a service caring for network availability will > 294 require two disjoint network connections resulting in two BGP next- > 295 hops. > > [minor] You mean "clearly required" for BGP-PIC, right? Not for the > service. > > > > 297 The BGP distribution of secondary next-hops is available thanks to > 298 the following BGP mechanisms: Add-Path [12], BGP Best-External [7], > 299 diverse path [13], and the frequent use in VPN deployments of > 300 different VPN RD's per PE. Another option to learn multiple BGP > 301 NH/path is simply to receive IBGP paths from multiple BGP RR > 302 selection a different path as best. It is noteworthy to mention that > 303 the availability of another BGP path does not mean that all failure > 304 scenarios can be covered by simply forwarding traffic to the > 305 available secondary path. The discussion of how to cover various > 306 failure scenarios is beyond the scope of this document. > > [minor] "BGP NH/path" is a new term. Even if using "NH" seems obvious, > please mention it at least once. Also, you've been talking (incorrectly) > about BGP paths (instead of routes) or BGP next-hops and now you're > combining the two. This is the only instance. > > > [nit] s/is simply to receive/is to receive > > > [nit] s/multiple BGP RR selection/multiple BGP RRs selecting > > > [minor] You might want to reference rfc9107 as an example of "multiple BGP > RRs selecting a different path as best". > > > > 308 2.2. BGP-PIC Illustration > > 310 To illustrate the two pillars above as well as the platform > 311 dependency, we will use an example of a simple multihomed L3VPN [9] > 312 prefix in a BGP-free core running LDP [4] or segment routing over > 313 MPLS forwarding plane [5]. > > [nit] s/a simple multihomed/a multihomed > > What is "simple" to you may not be simple to others. Just state the > facts. > > > [minor] s/L3VPN [9]/L3VPN > The citation is not needed all the time. > > > [minor] "core running LDP [4] or segment routing over MPLS forwarding > plane [5]" > > This is an example. To simplify, pick one. > > > > 315 +--------------------------------+ > 316 | | > 317 | ePE2 (IGP-IP1 192.0.2.1, Loopback) > 318 | | \ > 319 | | \ > 320 | | \ > 321 iPE | CE....VRF "Blue", ASnum 65000 > 322 | | / (VPN-IP1 198.51.100.0/24) > > 323 | | / (VPN-IP2 203.0.113.0/24) > 324 | LDP/Segment-Routing Core | / > 325 | ePE1 (IGP-IP2 192.0.2.2, Loopback) > 326 | | > 327 +--------------------------------+ > 328 Figure 1 VPN prefix reachable via multiple PEs > > [nit] s/ASnum/ASN > > Expand on first use. > > > > 330 Referring to Figure 1, suppose the iPE (the ingress PE) receives > 331 NLRIs for the VPN prefixes VPN-IP1 and VPN-IP2 from two egress PEs, > 332 ePE1 and ePE2 with next-hop BGP-NH1 and BGP-NH2, respectively. > 333 Assume that ePE1 advertise the VPN labels VPN-L11 and VPN-L12 while > 334 ePE2 advertise the VPN labels VPN-L21 and VPN-L22 for VPN-IP1 and > 335 VPN-IP2, respectively. Suppose that BGP-NH1 and BGP-NH2 are resolved > 336 via the IGP prefixes IGP-IP1 and IGP-IP2, where each happen to have > 337 2 equal cost paths with IGP-NH1 and IGP-NH2 reachable via the > 338 interfaces I1 and I2, respectively. Suppose that local labels > 339 (whether LDP [4] or segment routing [5]) on the downstream LSRs[4] > 340 for IGP-IP1 are IGP-L11 and IGP-L12 while for IGP-IP2 are IGP-L21 > 341 and IGP-L22. As such, the routing table at iPE is as follows: > > [minor] "receives NLRIs" > > I other places you're talked about paths or routes...please be consistent! > > > [minor] "IGP-NH1 and IGP-NH2 reachable via the interfaces I1 and I2" > > I'm sure you're talking about interfaces on the iPE, but others may find > that the picture is lacking detail when compared to the description: the > core is just a box.. > > > > 343 65000: 198.51.100.0/24 > 344 via ePE1 (192.0.2.1), VPN Label: VPN-L11 > 345 via ePE2 (192.0.2.2), VPN Label: VPN-L21 > > [minor] You need to explain the notation and, for example, why some > entries have an ASN and others don't. > > > [minor] It might help if the description included the IP addresses. For > example, when mentioning BGP-NH1, indicate that it refers to 192.0.2.1. > > > > ... > 362 IP Leaf: Pathlist: IP Leaf: Pathlist: > 363 -------- +-------+ -------- +----------+ > 364 VPN-IP1-->|BGP-NH1|-->IGP-IP1(BGP NH1)--->|IGP-NH1,I1|--->Adjacency1 > 365 | |BGP-NH2|-->.... | |IGP-NH2,I2|--->Adjacency2 > 366 | +-------+ | +----------+ > 367 | | > 368 | | > 369 v v > 370 OutLabel-List: OutLabel-List: > 371 +----------------------+ +----------------------+ > 372 |VPN-L11 (VPN-IP1, NH1)| |IGP-L11 (IGP-IP1, NH1)| > 373 |VPN-L21 (VPN-IP1, NH2)| |IGP-L12 (IGP-IP1, NH2)| > 374 +----------------------+ +----------------------+ > > 376 Figure 2 Shared Hierarchical Forwarding Chain at iPE > > 378 The forwarding chain depicted in Figure 2 illustrates the first > 379 pillar, which is sharing and hierarchy. We can see that the BGP > 380 pathlist consisting of BGP-NH1 and BGP-NH2 is shared by all NLRIs > 381 reachable via ePE1 and ePE2. As such, it is possible to make changes > 382 to the pathlist without having to make changes to the NLRIs. For > 383 example, if BGP-NH2 becomes unreachable, there is no need to modify > 384 any of the possibly large number of NLRIs. Instead only the shared > 385 pathlist needs to be modified. Likewise, due to the hierarchical > 386 structure of the forwarding chain, it is possible to make > 387 modifications to the IGP routes without having to make any changes > 388 to the BGP NLRIs. For example, if the interface "I2" goes down, only > 389 the shared IGP pathlist needs to be updated, but none of the IGP > 390 prefixes sharing the IGP pathlist nor the BGP NLRIs using the IGP > 391 prefixes for resolution need to be modified. > > [minor] "We can see that the BGP pathlist consisting of BGP-NH1 and > BGP-NH2 is shared by all NLRIs reachable via ePE1 and ePE2." > > I only see VPN-IP1 using that pathlist. > > I know it is not easy to draw using ASCII, but you may want to provide a > conceptual figure showing how the leafs and Pathlists relate to each other > and then describe the contents. OR You can also integrate an SVI drawing > in the XML. > > > [minor] The representation of the Pathlists is not consistent: the first > one shows just BGP-NHs, while the second one has an extra element. It may > be obvious to you that the interface element points at the corresponding > Adjacencies, but please explain that too. > > > [major] The description is not complete -- it jumps from using one > Pathlist to the other, but it doesn't cover the recursive nature of the > resolution: resolving the BGP-NHs in the Pathlist through another leaf. > > > [major] The outlabel-list is not explained anywhere. > > > > 393 Figure 2 can also be used to illustrate the second BGP-PIC pillar. > 394 Having a deep forwarding chain such as the one illustrated in Figure > 395 2 requires a forwarding plane that is capable of accessing multiple > 396 levels of indirection in order to calculate the outgoing > 397 interface(s) and next-hops(s). While a deeper forwarding chain > 398 minimizes the re-convergence time on topology change, there will > 399 always exist platforms with limited capabilities and hence imposing > 400 a limit on the depth of the forwarding chain. Section 5 describes > 401 how to gracefully trade off convergence speed with the number of > 402 hierarchical levels to support platforms with different > 403 capabilities. > > [minor] "Figure 2 can also be used to illustrate the second BGP-PIC > pillar." > > Ok. But the rest of the paragraph doesn't explain how that is needed in > the example. > > > > 405 Another example using IPv6 addresses can be something like the > 406 following > > 408 65000: 2001:DB8:1::/48 > 409 via ePE1 (192::1), VPN Label: VPN6-L11 > 410 via ePE2 (192::2), VPN Label: VPN6-L21 > > 412 65000: 2001:DB8:2:/48 > 413 via ePE1 (192::1), VPN Label: VPN6-L12 > 414 via ePE2 (192::2), VPN Label: VPN6-L22 > > 416 192::1/128 > 417 via Core, Label: IGP6-L11 > 418 via Core, Label: IGP6-L12 > > 420 192::2/128 > 421 via Core, Label: IGP6-L21 > 422 via Core, Label: IGP6-L22 > > 424 The same hierarchical forwarding chain described can be constructed > 425 for IPv6 addresses/prefixes. > > [major] Please use the addresses from rfc3849. > > > [] BTW, sorry to be blunt, but this is a lazy attempt at using IPv6 in an > example. You should at least show a figure and maybe a different > scenario...not just similar numbers. > > > > 427 3. Constructing the Shared Hierarchical Forwarding Chain > > 429 Constructing the forwarding chain is an application of the two > 430 pillars described in Section 2. This section describes how to > 431 construct the forwarding chain in hierarchical shared manner. > > [nit] s/in hierarchical/in a hierarchical > > > > 433 3.1. Constructing the BGP-PIC forwarding Chain > > 435 The whole process starts when BGP downloads a prefix to FIB. The > 436 prefix contains one or more outgoing paths. For certain labeled > 437 prefixes, such as VPN [9] prefixes, each path may be associated with > 438 an outgoing label and the prefix itself may be assigned a local > 439 label. The list of outgoing paths defines a pathlist. If such > 440 pathlist does not already exist, then FIB manager (software or > 441 hardware entity responsible for managing the FIB) creates a new > 442 pathlist, otherwise the existing pathlist is used. The BGP prefix is > 443 added as a dependent of the pathlist. > > [major] "BGP downloads a prefix to FIB" > > rfc4271 talks about "routes...installed in the Routing Table". Please, as > much as possible, use the terminology from established RFCS. Note that > this document doesn't define what a FIB is. > > > [major] "The prefix contains one or more outgoing paths." > > This may be another terminology issue: a BGP route doesn't contain > multiple next-hops. If the routes came from ADD_PATH or ORR, etc., there > will be multiple BGP routes with a common NLRI and different next-hops. > IOW, it looks like the Pathlist would be built incrementally as routes for > the same NLRI are installed. > > The end result should be the same. Terminology matters! > > > [minor] "VPN [9]" > > The other references to rfc4364 use "L3VPN", should this one use it too? > > > [major] "If such pathlist does not already exist..." > > It should be mentioned somewhere the reason for this Pathlist to already > exist: there's probably another BGP route using it. This is where the > connection to the shared nature of the chain needs to be made. Otherwise, > the description sounds as if there might simply be a Pathlist structure > available... > > > [nit] s/FIB manager/the FIB manager/g > > > [] It might be easier/better/clearer to explain the process as a series of > steps. Note that the description above is really a series of steps: BGP > has a route to install, the route is accepted in the routing table (local > policy), a leaf for this NLRI exists (or not), the Pathlist exists (or > not), etc... > > > > 445 The previous step constructs the upper part of the hierarchical > 446 forwarding chain. The forwarding chain is completed by resolving the > 447 paths of the pathlist. A BGP path usually consists of a next-hop. > 448 The next-hop is resolved by finding a matching IGP prefix. > > [major] "The forwarding chain is completed by resolving the paths of the > pathlist." > > The example in the last section shows that this "lower part" (is that the > right term?) could also include a Pathlist. The explanation of how the > "upper part" is constructed didn't mention that the process was to > "resolve" the BGP NLRI. As a result, there's a disconnect between what was > done before and what is assumed here. > > Again, a series of steps would serve the purpose better. > > Also, be mindful of the use of "resolve" in rfc4271. > > > [major] "A BGP path usually consists of a next-hop." > > As opposed to what? I mean the NEXT_HOP attribute is mandatory, a route > includes multiple other Path Attributes... Not sure what the point it. > > > [minor] "The next-hop is resolved by finding a matching IGP prefix." > > Yes, as an extreme simplification. §9.1.2.1/rfc4271 offers a better > description. > > > > 450 The end result is a hierarchical shared forwarding chain where the > 451 BGP pathlist is shared by all BGP prefixes that use the same list of > 452 paths and the IGP prefix is shared by all pathlists that have a path > 453 resolving via that IGP prefix. It is noteworthy to mention that the > 454 forwarding chain is constructed without any operator intervention at > 455 all. > > [minor] "The end result is a hierarchical shared forwarding chain where > the BGP pathlist is shared..." > > As I mentioned above, more emphasis should be places on (when explaining > the steps) the shared nature of the Pathlists and how its dependents are > independent of each other too. > > > [minor] "without any operator intervention" > > Yes, this was mentioned as an advantage somewhere. It would be better if > the document was explicit from the start on the point that it is describing > a set of abstract structures and procedures to be internally instantiated. > Just like protocol structures (neighbors, routes, etc.) that are > transparent to the user. > > > > ... > 460 3.2. Example: Primary-Backup Path Scenario > > 462 Consider the egress PE ePE1 in the case of the multi-homed VPN > 463 prefixes in the BGP-free core depicted in Figure 1. Suppose ePE1 > 464 determines that the primary path is the external path but the backup > 465 path is the iBGP path to the other PE ePE2 with next-hop BGP-NH2. > 466 ePE2 constructs the forwarding chain depicted in Figure 3. We are > 467 only showing a single VPN prefix for simplicity. But all prefixes > 468 that are multihomed to ePE1 and ePE2 share the BGP pathlist. > > [minor] "BGP-free core depicted in Figure 1" > > It doesn't really make a difference, but it wasn't mentioned before that > Figure 1 has a BGP-free core. I'm mentioning this because there are > inconsistencies in the description. Also, this is a detail that is not > important to the description, but that can create confusion for other > readers and generate unnecessary questions and comments. > > > [minor] "Suppose ePE1 determines..." > > It would be great if there was either a pointer to how ePE1 does all this > (where are backups specified?), or a little bit more explanation of the > assumptions. > > > [nit] s/external path but the backup/external path, while the backup > > > > 470 BGP OutLabel-List > 471 VPN-L11 +---------+ > 472 (Label-leaf)---+---->|Unlabeled| > 473 | +---------+ > 474 v | VPN-L21 | > 475 | | (swap) | > 476 | +---------+ > 477 | > 478 | BGP Pathlist > 479 | +------------+ Connected route > 480 | | CE-NH |------>(to the CE) > 481 | |path-index=0| > 482 | +------------+ > 483 | | VPN-NH2 | > 484 VPN-IP1 -----+------------------>| (backup) |------>IGP Leaf > 485 (IP leaf) |path-index=1| (Towards ePE2) > 486 | +------------+ > 487 | > 488 | BGP OutLabel-List > 489 | +---------+ > 490 +------------->|Unlabeled| > 491 +---------+ > 492 | VPN-L21 | > 493 | (push) | > 494 +---------+ > > 496 Figure 3 : VPN Prefix Forwarding Chain with eiBGP paths on egress PE > > [minor] Figure 2 provided an easy-to-follow dependency order: Leaf to > Pathlist to Leaf to Pathlist. The dependencies in Figure 3 are not > straightforward, please provide a description of the figure before > explaining the differences with respect to the other example. > > > [major] The definition of Pathlist (§1.1) implies that all its entries can > be used for forwarding, but the Pathlist above (and this use case) clearly > indicate that not all the entries in a Pathlist are used in the same way, > or at the same time. > > > > 498 The example depicted in Figure 3 differs from the example in Figure > 499 2 in two main aspects. First, as long as the primary path towards > 500 the CE (external path) is useable, it will be the only path used for > 501 forwarding while the OutLabel-List contains both the unlabeled label > 502 (primary path) and the VPN label (backup path) advertised by the > 503 backup path ePE2. The second aspect is presence of the label leaf > 504 corresponding to the VPN prefix. This label leaf is used to match > 505 VPN traffic arriving from the core. Note that the label leaf shares > 506 the pathlist with the IP prefix. > > [major] "the...path...is useable" > > What does "useable" mean? Please describe this term in the terminology > section. > > > [major] "primary path towards the CE (external path) is useable, it will > be the only path used" > > The arrow from VPN-IP1 points to the VPN-NH2 entry in the Pathlist, so the > description doesn't seem to match. > > > [major] "the OutLabel-List" > > There are two boxes named "BGP OutLabel-List", and they contain different > backup actions (swap vs push). > > > [minor] "unlabeled label" > > rfc3031 talks about unlabeled packets, not labels. > > > [minor] A third difference is that this figure includes a "path-index" -- > but that is not explained. > > > > 508 4. Forwarding Behavior > ... > 513 When a packet arrives at a router, it matches a leaf. A labeled > 514 packet matches a label leaf while an IP packet matches an IP leaf. > 515 The forwarding engines walks the forwarding chain starting from the > 516 leaf until the walk terminates on an adjacency. Thus when a packet > 517 arrives, the chain is walked as follows: > > [minor] Instead of summarizing, simply start with the steps. > > > > 519 1. Lookup the leaf based on the destination address or the label at > 520 the top of the packet. > > [major] Lookup how? How do you know you found the right leaf? > > I think rfc3031 has some text related to the forwarding of labeled > packets. But I don't know of a reference for IP forwarding, so you should > at least mention something around a longest-prefix-match. > > > > 522 2. Retrieve the parent pathlist of the leaf. > > [minor] While it is related, the terminology section defines the > dependency relationship as the leaf being the child of the Pathlist -- > keeping consistency is important. > > > > 524 3. Pick the outgoing path "Pi" from the list of resolved paths in > 525 the pathlist. The method by which the outgoing path is picked is > 526 beyond the scope of this document (e.g. flow-preserving hash > 527 exploiting entropy within the MPLS stack and IP header). Let the > 528 "path-index" of the outgoing path "Pi" be "j". > > [minor] s/Pick the outgoing path "Pi"/Pick an outgoing path "Pi" > > > [] "Let the "path-index" of the outgoing path "Pi" be "j"." > > I'm not sure if you mean that "j is the path-index of Pi" or "assign a > path-index of j". If the former, how is the path-index assigned? > > > > 530 4. If the prefix is labeled, use the "path-index" "j" to retrieve > 531 the jth label "Lj" stored the jth entry in the OutLabel-List and > 532 apply the label action of the label on the packet (e.g. for VPN > 533 label on the ingress PE, the label action is "push"). As > 534 mentioned in Section 1.1 the value of the "path-index" stored > 535 in path may not necessarily be the same value of the location of > 536 the path in the pathlist. > > [minor] "retrieve the jth label "Lj" stored the jth entry in the > OutLabel-List" > > This sounds as if there were (at least) j labels in each of (at least) j > entries in the OutLabel-List. But I assume that there is really only one > label (or label stack) in that entry. ?? > > > > 538 5. Move to the parent of the chosen path "Pi". > > 540 6. If the chosen path "Pi" is recursive, move to its parent prefix > 541 and go to step 2. > > 543 7. If the chosen path is non-recursive move to its parent adjacency. > 544 Otherwise go to the next step. > > [minor] These 2 steps say the same thing: "move to the parent of..."Pi"", > "path "Pi"...move to its parent", and "move to its parent". I think you > need to better differentiate (?) the options. > > > > ... > 549 Let's apply the above forwarding steps to the forwarding chain > 550 depicted in Figure 2 in Section 2. Suppose a packet arrives at > 551 ingress PE iPE from an external neighbor. Assume the packet matches > 552 the VPN prefix VPN-IP1. While walking the forwarding chain, the > 553 forwarding engine applies a hashing algorithm to choose the path and > 554 the hashing at the BGP level yields path 0 while the hashing at the > 555 IGP level yields path 1. In that case, the packet will be sent out > 556 of interface I2 with the label stack "IGP-L12,VPN-L11". > > [minor] s/path 0...path 1/path-index 0...path-index 1 > > > [minor] The example would be better understood if it was presented as > steps matching the process described above (vs a narrative explanation). > Also, be specific in which path is the one with each path-index... > > > [major] Figure 2 doesn't contain any indication of a path-index. > > 558 5. Handling Platforms with Limited Levels of Hierarchy > > > [] This whole section, which is longer than the text describing the > "normal" mechanism (!), seems to be more appropriate for an Appendix than > the main body of the document. Please move it. > > > > ... > 564 o Being able to reduce the number of hierarchical levels from any > 565 arbitrary value to a smaller arbitrary value that can be > 566 supported by the forwarding engine. > > [?] What is the worst case in terms of number of levels? 3? Just > curious. I'm asking simply because an "arbitrary value" sounds like it > could be 10s o even 1000s, when in reality it is not more than a handful. > Are there many platforms out there that wouldn't support 2-3 levels (for > the number of routes/paths they normally would carry)? > > > [minor] "smaller arbitrary value" > > This smaller value is not really arbitrary (as in random), it is > determined by the platform. Maybe something along the lines of a "value > supported by the platform" would be better. > > > > ... > 571 5.1. Flattening the Forwarding Chain > ... > 584 1. The forwarding engine is now at leaf "R1". > ... > 605 9. Now the forwarding engine continues the walk to the parent of > 606 "Qj". > > [minor] This list of steps is just a restatement of the process in the > last section. IMO there's no need to repeat the steps here. Point at the > 2-level example, and jump directly to the next paragraph. > > To reference from the explanation below, "a picture is worth a thousand > words". ;-) > > > > ... > 614 1. FIB manager wants to reduce the number of levels used by "Pi" by > 615 1. > > [nit] This is not really a step... > > > > ... > 622 4. FIB manager also extracts the OutLabel-list(R2) associated with > 623 the leaf "R2". Remember that OutLabel-list(R2) = <L1, L2,..., > 624 Lm>. > > [] New syntax: "OutLabel-list(R2)". > > > > ... > 643 It is noteworthy to mention that the labels in the OutLabel-list > 644 associated with the "flattened" pathlist may be stored in the same > 645 memory location as the path itself to avoid additional memory > 646 access. But that is an implementation detail that is beyond the > 647 scope of this document. > > [] Why even mention it then? > > > > 649 The same steps can be applied to all paths in the pathlist <P1, > 650 P2,..., Pn> so that all paths are "flattened" thereby reducing the > 651 number of hierarchical levels by one. Note that that "flattening" a > 652 pathlist pulls in all paths of the parent paths, a desired feature > 653 to utilize all ECMP/UCMP paths at all levels. A platform that has a > 654 limit on the number of paths in a pathlist for any given leaf may > 655 choose to reduce the number paths using methods that are beyond the > 656 scope of this document. > > [minor] Expand UCMP. > > > > ... > 663 Because a flattened pathlist may have an associated OutLabel-list > 664 the forwarding behavior has to be slightly modified. The > 665 modification is done by adding the following step right after step 4 > 666 in Section 4. > > 668 5. If there is an OutLabel-list associated with the pathlist, then > 669 if the path "Pi" is chosen by the hashing algorithm, retrieve the > 670 label at location "i" in that OutLabel-list and apply the label > 671 action of that label on the packet. > > [major] Step 4 says this: > > 4. If the prefix is labeled, use the "path-index" "j" to retrieve > the jth label "Lj" stored the jth entry in the OutLabel-List and > apply the label action of the label on the packet... > > Besides using "j" instead of "i" as the path-index, both seem to say the > same thing. Am I missing something? > > > > ... > 676 5.2. Example: Flattening a forwarding chain. > > 678 This example uses a case of inter-AS option C [9] where there are 3 > 679 levels of hierarchy. Figure 4 illustrates the sample topology. To > 680 force 3 levels of hierarchy, the ASBRs[9] on the ingress domain > 681 (domain 1) advertise the core routers of the egress domain (domain > 682 2) to the ingress PE (iPE) via BGP-LU [3] instead of redistributing > 683 them into the IGP of domain 1. The end result is that the ingress PE > 684 (iPE) has 2 levels of recursion for the VPN prefixes VPN-IP1 and > 685 VPN-IP2. > > [] Suggestion: > > OLD> > To force 3 levels of hierarchy, the ASBRs[9] on the ingress > domain (domain 1) advertise the core routers of the egress > domain (domain 2) to the ingress PE (iPE) via BGP-LU [3] > instead of redistributing them into the IGP of domain 1. > > NEW> > The ASBRs on the ingress domain (Domain 1) use BGP to advertise > the core routers (ASBRs and ePEs) of the egress domain (Domain > 2) to the iPE. > > > [minor] Expand ASBR on first use. No need to add a reference. > > > > 687 Domain 1 Domain 2 > 688 +-------------+ +-------------+ > 689 | | | | > 690 | LDP/SR Core | | LDP/SR core | > 691 | | | | > 692 | (192.0.2.4) | | > 693 | ASBR11-------ASBR21........ePE1(192.0.2.1) > 694 | | \ / | . . |\ > 695 | | \ / | . . | \ > 696 | | \ / | . . | \ > 697 | | \/ | .. | \VPN-IP1(198.51.100.0/24) > > 698 | | /\ | . . | /VRF "Blue" ASn: 65000 > 699 | | / \ | . . | / > 700 | | / \ | . . | / > 701 | | / \ | . . |/ > 702 iPE ASBR12-------ASBR22........ePE2 (192.0.2.2) > 703 | (192.0.2.5) | |\ > 704 | | | | \ > 705 | | | | \ > 706 | | | | \VRF "Blue" ASn: 65000 > 707 | | | | /VPN-IP2(203.0.113.0/24) > 708 | | | | / > 709 | | | | / > 710 | | | |/ > 711 | ASBR13-------ASBR23........ePE3(192.0.2.3) > 712 | (192.0.2.6) | | > 713 | | | | > 714 | | | | > 715 +-------------+ +-------------+ > 716 <=========== <========= <============ > 717 Advertise ePEx Advertise Redistribute > 718 Using iBGP-LU ePEx Using IGP into > 719 eBGP-LU BGP > > 721 Figure 4 : Sample 3-level hierarchy topology > > [minor] s/ASn/ASN/g > > > [major] "Redistribute IGP into BGP" ?? > > I'm sure you don't mean that, but probably more something like > "redistribute the ePEx routes only"... > > > > 723 We will make the following assumptions about connectivity > > [nit] s/connectivity/connectivity: > > > > 725 o In "domain 2", both ASBR21 and ASBR22 can reach both ePE1 and > 726 ePE2 using the same distance. > > [nit] The figure capitalizes "Domain", but the text doesn't. Please be > consistent! > > > [minor] By "distance" I assume you mean "IGP metric". Please use that > term instead. Apply to other occurences. > > > > ... > 733 We will make the following assumptions about the labels > > [nit] s/labels/labels: > > > > ... > 740 o The labels advertised by ASBR11 to iPE using BGP-LU [3] for the > 741 egress PEs ePE1 and ePE2 are LASBR111(ePE1) and LASBR112(ePE2), > 742 respectively. > > [] Why change the notation of the labels? This new notation > ("LASBR111(ePE1)") just makes the explanation more complex. :-( > > > > ... > 764 65000: 198.51.100.0/24 > 765 via ePE1 (192.0.2.1), VPN Label: VPN-L11 > 766 via ePE2 (192.0.2.2), VPN Label: VPN-L21 > 767 65000: 203.0.113.0/24 > 768 via ePE2 (192.0.2.2), VPN Label: VPN-L22 > 769 via ePE3 (192.0.2.3), VPN Label: VPN-L32 > > 771 192.0.2.1/32 (ePE1) > 772 Via ASBR11, BGP-LU Label: LASBR111(ePE1) > 773 Via ASBR12, BGP-LU Label: LASBR121(ePE1) > 774 192.0.2.2/32 (ePE2) > 775 Via ASBR11, BGP-LU Label: LASBR112(ePE2) > 776 Via ASBR12, BGP-LU Label: LASBR122(ePE2) > 777 192.0.2.3/32 (ePE3) > 778 Via ASBR13, BGP-LU Label: LASBR13(ePE3) > > 780 192.0.2.4/32 (ASBR11) > 781 via Core, Label: IGP-L11 > 782 192.0.2.5/32 (ASBR12) > 783 via Core, Label: IGP-L12 > 784 192.0.2.6/32 (ASBR13) > 785 via Core, Label: IGP-L13 > > [nit] s/Via/via/g > > > [minor] Only some of the addresses are annotated ("192.0.2.1/32 > (ePE1)"). This is helpful, but inconsistent -- both within this example > and with previous ones. > > > > ... > 797 o The index inside the pathlist entry indicates the label that will > 798 be picked from the OutLabel-List associated with the child leaf > 799 if that path is chosen by the forwarding engine hashing function. > > [] s/index/path-index > > > > 801 OutLabel-List OutLabel-List > 802 For VPN-IP1 For VPN-IP2 > 803 +------------+ +--------+ +-------+ +------------+ > 804 | VPN-L11 |<---| VPN-IP1| |VPN-IP2|-->| VPN-L22 | > 805 +------------+ +---+----+ +---+---+ +------------+ > 806 | VPN-L21 | | | | VPN-L32 | > 807 +------------+ | | +------------+ > 808 | | > 809 V V > 810 +---+---+ +---+---+ > 811 | 0 | 1 | | 0 | 1 | > 812 +-|-+-\-+ +-/-+-\-+ > 813 | \ / \ > 814 | \ / \ > 815 | \ / \ > 816 | \ / \ > 817 v \ / \ > 818 +-----+ +-----+ +-----+ > 819 +----+ ePE1| |ePE2 +-----+ | ePE3+-----+ > 820 | +--+--+ +-----+ | +--+--+ | > 821 v | / v | v > 822 +--------------+ | / +--------------+ | +-------------+ > 823 |LASBR111(ePE1)| | / |LASBR112(ePE2)| | |LASBR13(ePE3)| > 824 +--------------+ | / +--------------+ | +-------------+ > 825 |LASBR121(ePE1)| | / |LASBR122(ePE2)| | OutLabel-List > 826 +--------------+ | / +--------------+ | For ePE3 > 827 OutLabel-List | / OutLabel-List | > 828 For ePE1 | / For ePE2 | > 829 | / | > 830 | / | > 831 | / | > 832 v / v > 833 +---+---+ Shared Pathlist +---+ Pathlist > 834 | 0 | 1 | For ePE1 and ePE2 | 0 | For ePE3 > 835 +-|-+-\-+ +-|-+ > 836 | \ | > 837 | \ | > 838 | \ | > 839 | \ | > 840 v \ v > 841 +------+ +------+ +------+ > 842 +---+ASBR11| |ASBR12+--+ |ASBR13+---+ > 843 | +------+ +------+ | +------+ | > 844 v v v > 845 +-------+ +-------+ +-------+ > 846 |IGP-L11| |IGP-L12| |IGP-L13| > 847 +-------+ +-------+ +-------+ > > 849 Figure 5 : Forwarding Chain for hardware supporting 3 Levels > > [] The OutLabel-Lists at the bottom are not tagged as such. > > > [] Some down arrows ("v") are missing -- on the non-vertical lines. > > > > 851 Now suppose the hardware on iPE (the ingress PE) supports 2 levels > 852 of hierarchy only. In that case, the 3-levels forwarding chain in > 853 Figure 5 needs to be "flattened" into 2 levels only. > > 855 OutLabel-List OutLabel-List > 856 For VPN-IP1 For VPN-IP2 > 857 +------------+ +-------+ +-------+ +------------+ > 858 | VPN-L11 |<---|VPN-IP1| | VPN-IP2|--->| VPN-L22 | > 859 +------------+ +---+---+ +---+---+ +------------+ > 860 | VPN-L21 | | | | VPN-L32 | > 861 +------------+ | | +------------+ > 862 | | > 863 | | > 864 | | > 865 Flattened | | Flattened > 866 pathlist V V pathlist > 867 +===+===+ +===+===+===+ +==============+ > 868 +--------+ 0 | 1 | | 0 | 0 | 1 +---->|LASBR112(ePE2)| > 869 | +=|=+=\=+ +=/=+=/=+=\=+ +==============+ > 870 v | \ / / \ |LASBR122(ePE2)| > 871 +==============+ | \ +-----+ / \ +==============+ > 872 |LASBR111(ePE1)| | \/ / \ |LASBR13(ePE3) | > 873 +==============+ | /\ / \ +==============+ > 874 |LASBR121(ePE1)| | / \ / \ > 875 +==============+ | / \ / \ > 876 | / \ / \ > 877 | / + + \ > 878 | + | | \ > 879 | | | | \ > 880 v v v v \ > 881 +------+ +------+ +------+ > 882 +----|ASBR11| |ASBR12+---+ |ASBR13+---+ > 883 | +------+ +------+ | +------+ | > 884 v v v > 885 +-------+ +-------+ +-------+ > 886 |IGP-L11| |IGP-L12| |IGP-L13| > 887 +-------+ +-------+ +-------+ > > [] Same comments as above. > > > > 889 Figure 6 : Flattening 3 levels to 2 levels of Hierarchy on iPE > > 891 Figure 6 represents one way to "flatten" a 3 levels hierarchy into > 892 two levels. There are few important points: > > [nit] s/are few/are a few > > > > ... > 908 o Let's take a look at the flattened pathlist used by the prefix > 909 "VPN-IP2", The pathlist associated with the prefix "VPN-IP2" has > 910 three entries. > > [nit] s/"VPN-IP2", The/"VPN-IP2". The > > > > 912 o The first and second entry have index "0". This is because > 913 both entries correspond to ePE2. Thus when hashing performed > 914 by the forwarding engine results in using first or the second > 915 entry in the pathlist, the forwarding engine will pick the > 916 correct VPN label "VPN-L22", which is the label advertised by > 917 ePE2 for the prefix "VPN-IP2". > > [nit] s/using first/using the first > > > > ... > 952 o So the packet is forwarded towards the ASBR "ASBR12" and the IGP > 953 label at the top will be "L12". > > [minor] The figure calls the label "IGP-L12", not just "L12". There are > other instances of this below. > > > > ... > 982 6. Forwarding Chain Adjustment at a Failure > > 984 The hierarchical and shared structure of the forwarding chain > 985 explained in the previous section allows modifying a small number of > 986 forwarding chain objects to re-route traffic to a pre-calculated > 987 equal-cost or backup path without the need to modify the possibly > 988 very large number of BGP prefixes. In this section, we go over > 989 various core and edge failure scenarios to illustrate how FIB > 990 manager can utilize the forwarding chain structure to achieve BGP > 991 prefix independent convergence. > > [nit] s/how FIB manager/how FIB the manager > > > > 993 6.1. BGP-PIC core > > [minor] The meaning of "core" and "edge" should be explained somwehre. > > > > ... > 1001 When a remote link or node fails, IGP on the ingress PE receives > 1002 advertisement indicating a topology change so IGP re-converges to > 1003 either find a new next-hop and/or outgoing interface or remove the > 1004 path completely from the IGP prefix used to resolve BGP next-hops. > 1005 IGP and/or LDP download the modified IGP leaves with modified > 1006 outgoing labels for labeled core. > > [nit] s/IGP/the IGP/g > > > [nit] s/receives advertisement/receives an advertisement > > > [nit] s/for labeled core/for the labeled core > > > [] "download" > > This is the second time that the "download" concept is used without > explanation. See my comment in §3.1. > > > > 1008 When a local link fails, FIB manager detects the failure almost > 1009 immediately. The FIB manager marks the impacted path(s) as unusable > 1010 so that only useable paths are used to forward packets. Hence only > 1011 IGP pathlists with paths using the failed local link need to be > 1012 modified. All other pathlists are not impacted. Note that in this > 1013 particular case there is actually no need even to backwalk to IGP > 1014 leaves to adjust the OutLabel-Lists because FIB can rely on the > 1015 path-index stored in the useable paths in the pathlist to pick the > 1016 right label. > > [nit] s/is actually no need even to/is no need to > > > [nit] s/to IGP leaves/to the IGP leaves > > > [major] The term "backwalk" is introduced here, but not explained. > > > > 1018 It is noteworthy to mention that because FIB manager modifies the > 1019 forwarding chain starting from the IGP leaves only. BGP pathlists > 1020 and leaves are not modified. Hence traffic restoration occurs > within > 1021 the time frame of IGP convergence, and, for local link failure, > 1022 assuming a backup path has been precomputed, within the timeframe > of > 1023 local detection (e.g. 50ms). Examples of solutions that pre- > 1024 computing backup paths are IP FRR [16] remote LFA [17], Ti-LFA [15] > 1025 and MRT [18] or eBGP path having a backup path [11]. > > [major] The case on the previous paragraph deals with a local link failure > that results in the routes being "unusable", but this paragraph mentions > restoration "assuming a backup path has been precomputed" -- the > restoration timeframe also applies to the case where there is no backup > path (as explained above). > > Note that the next paragraph also talks about a backup route and seems to > ignore the "unusable" case above. > > > [nit] s/solutions that pre- computing/solutions that can pre-compute > > > [nit] s/Ti-LFA/TI-LFA > > > [minor] Expand FRR, LFA, TI-LFA, and MRT on first use. > > > [nit] s/IP FRR [16] remote LFA...[18] or eBGP/IP FRR [16], remote > LFA...[18], or eBGP > > > [minor] "eBGP path having a backup path [11]" > > I don't think that's the right reference. > > > > 1027 Let's apply the procedure mentioned in this subsection to the > 1028 forwarding chain depicted in Figure 2. Suppose a remote link > failure > 1029 occurs and impacts the first ECMP IGP path to the remote BGP next- > 1030 hop. Upon IGP convergence, the IGP pathlist used by the BGP > next-hop > 1031 is updated to reflect the new topology (one path instead of two). > As > 1032 soon as the IGP convergence is effective for the BGP next-hop > entry, > 1033 the new forwarding state is immediately available to all dependent > 1034 BGP prefixes. The same behavior would occur if the failure was > local > 1035 such as an interface going down. As soon as the IGP convergence is > 1036 complete for the BGP next-hop IGP route, all its BGP depending > 1037 routes benefit from the new path. In fact, upon local failure, if > 1038 LFA protection is enabled for the IGP route to the BGP next-hop and > 1039 a backup path was pre-computed and installed in the pathlist, upon > 1040 the local interface failure, the LFA backup path is immediately > 1041 activated (e.g. sub-50msec) and thus protection benefits all the > 1042 depending BGP traffic through the hierarchical forwarding > dependency > 1043 between the routes. > > [minor] "Upon IGP convergence... As soon as the IGP convergence is > effective for the BGP next-hop entry," > > There's some redundancy in the text. Also, by using different text, > you're implying that there is a difference between "IGP convergence" and > "IGP convergence...effective for the BGP next-hop entry". > > > [major] "The same behavior would occur if the failure was local such as an > interface going down. As soon as the IGP convergence..." > > According to the text above (2 or 3 paragraphs before), IGP convergence is > not necessary in the local failure case. > > > > ... > 1050 6.2.1. Adjusting forwarding Chain in egress node failure > > [minor] Most of the text talks about an "egress node". I assume that is > the same as an "edge node" -- is it? If so, please be consistent and/or > explain somewhere that you're referring to the same thing. > > > 1052 When an edge node fails, IGP on neighboring core nodes send route > 1053 updates indicating that the edge node is no longer reachable. IGP > 1054 running on the iPE instructs FIB to remove the IP and label leaves > 1055 corresponding to the failed edge node from FIB. So FIB manager > 1056 performs the following steps: > > [major] "IGP on neighboring core nodes send route updates indicating that > the edge node is no longer reachable" > > §6.1 says that "Node failures are treated as link failures." Is that the > same in this section? > > The neighbor nodes can't detect that the node is down -- they can only > detect the link being down. The rest of the explanation assumes that there > was a single link connecting the egress node to the core, or that all the > links were reported as down. Neither case is explained. > > If there is more than one link there shouldn't be a failure in reaching > the next-hop -- except for the change in the local router(s), similar to a > core failure. > > > > ... > 1073 6.2.2. Adjusting Forwarding Chain on PE-CE link Failure > ... > 1082 In the first case, the rest of iBGP peers will remain unaware of > the > 1083 link failure and will continue to forward traffic to the edge node > 1084 until the edge node attached to the failed link withdraws the BGP > 1085 prefixes. If the destination prefixes are multi-homed to another > 1086 iBGP peer, say ePE2, then FIB manager on the edge router detecting > 1087 the link failure applies the following steps (see Figure 3): > > [minor] Figure 3 is an example -- it does show a sample chain. Figure 4 > may be used as a reference for the type of backup described here. > > > > ... > 1100 o The label entry in OutLabel-List corresponding to the > 1101 internal path to backup egress PE has swap action to the > 1102 label advertised by backup egress PE. > > [nit] s/has swap action/has a swap action > > > [nit] s/by backup/by the backup > > > > 1104 o For an arriving label packet (e.g. VPN), the top label is > 1105 swapped with the label advertised by backup egress PE and > the > 1106 packet is sent towards that backup egress PE. > > [nit] s/label packet/labeled packet > > > [major] Even though you're expanding on an example, it is important for > the text to be precise. > > - The label mentioned is "e.g. VPN", but you're been using more specific > labels -- from figure 3, "VPN-L21"... > > - The OutLabel-Lists in Figure 3 show both "VPN-L21 (swap)" and "VPN-L21 > (push)". The same label is used, only the swap action is mentioned here. > What am I missing? > > - As I mentioned before, the diagram in Figure 3 is not clear. > > > > 1108 o For unlabeled traffic, packets are simply redirected towards > 1109 backup egress PE. > > [nit] s/backup/the backups > > > [major] "ackets are simply redirected towards backup egress PE" > > How? The backup ePE is presumably not connected (which is why a label was > needed)...?? > > > > 1111 In the second case where the edge router uses the IP address of the > 1112 failed link as the BGP next-hop, the edge router will still perform > 1113 the previous steps. But, unlike the case of next-hop self, IGP on > 1114 failed edge node informs the rest of the iBGP peers that IP address > 1115 of the failed link is no longer reachable. Hence the FIB manager on > 1116 iBGP peers will delete the IGP leaf corresponding to the IP prefix > 1117 of the failed link. The behavior of the iBGP peers will be > identical > 1118 to the case of edge node failure outlined in Section 6.2.1. > > [nit] s/IGP on failed edge node/the IGP on the failed edge node > > > [nit] s/that IP address/that the IP address > > > > 1120 It is noteworthy to mention that because the edge link failure is > 1121 local to the edge router, sub-50 msec convergence can be achieved > as > 1122 described in [11]. > > [] But only at the edge router -- not in other places of the network where > the IGP has to converge. > > Please keep the numbers in the Appendix. IOW, there's no need to claim > this performance here. > > > > 1124 Let's try to apply the case of next-hop self to the forwarding > chain > 1125 depicted in Figure 3. After failure of the link between ePE1 and > CE, > 1126 the forwarding engine will route traffic arriving from the core > 1127 towards VPN-NH2 with path-index=1. A packet arriving from the core > 1128 will contain the label VPN-L11 at top. The label VPN-L11 is swapped > 1129 with the label VPN-L21 and the packet is forwarded towards ePE2. > > [minor] This paragraph is out of place: you already tried to explain this > case a few paragraphs above. Please move the text and deduplicate. > > > > 1131 6.3. Handling Failures for Flattened Forwarding Chains > > 1133 As explained in the in Section 5 if the number of hierarchy levels > 1134 of a platform cannot support the native number of hierarchy levels > 1135 of a recursive forwarding chain, the instantiated forwarding chain > 1136 is constructed by flattening two or more levels. Hence a 3 levels > 1137 chain in Figure 5 is flattened into the 2 levels chain in Figure 6. > > [nit] s/a 3 levels/the 3-level > > > [nit] s/2 levels/2-level > > > > 1139 While reducing the benefits of BGP-PIC, flattening one hierarchy > 1140 into a shallower hierarchy does not always result in a complete > loss > 1141 of the benefits of the BGP-PIC. To illustrate this fact suppose > 1142 ASBR12 is no longer reachable in domain 1. If the platform supports > 1143 the full hierarchy depth, the forwarding chain is the one depicted > 1144 in Figure 5 and hence the FIB manager needs to backwalk one level > to > 1145 the pathlist shared by "ePE1" and "ePE2" and adjust it. If the > 1146 platform supports 2 levels of hierarchy, then a useable forwarding > 1147 chain is the one depicted in Figure 6. In that case, if ASBR12 is > no > 1148 longer reachable, the FIB manager has to backwalk to the two > 1149 flattened pathlists and updates both of them. > > 1151 The main observation is that the loss of convergence speed due to > 1152 the loss of hierarchy depth depends on the structure of the > 1153 forwarding chain itself. To illustrate this fact, let's take two > 1154 extremes. Suppose the forwarding objects in level i+1 depend on the > 1155 forwarding objects in level i. If every object on level i+1 depends > 1156 on a separate object in level i, then flattening level i into level > 1157 i+1 will not result in loss of convergence speed. Now let's take > the > 1158 other extreme. Suppose "n" objects in level i+1 depend on 1 object > 1159 in level i. Now suppose FIB flattens level i into level i+1. If a > 1160 topology change results in modifying the single object in level i, > 1161 then FIB has to backwalk and modify "n" objects in the flattened > 1162 level, thereby losing all the benefit of BGP-PIC. Experience shows > 1163 that flattening forwarding chains usually results in moderate loss > 1164 of BGP-PIC benefits. Further analysis is needed to corroborate and > 1165 quantify this statement. > > [major] "The main observation is that the loss of convergence speed due to > the loss of hierarchy depth... Further analysis is needed to corroborate > and quantify this statement." > > The explanation above sounds plausible. However, if you really don't > know, and there is a dependency on the structure, and it depends on the > failure, and there's no alternative (if the platform supports less > levels)... Why even include this information? It feels like the > equivalent of "hand waving". :-( > > > > 1167 7. Properties > > [minor] This section is really a summary. I wonder if it would serve a > better purpose at the start of the document as a "road map". > > > > 1169 7.1. Coverage > > 1171 All the possible failures, except CE node failure, are covered, > 1172 whether they impact a local or remote IGP path or a local or remote > 1173 BGP next-hop as described in Section 6. This section provides > 1174 details for each failure and how the hierarchical and shared FIB > 1175 structure proposed in this document allows recovery that does not > 1176 depend on number of BGP prefixes. > > [major] "except CE node failure" > > Isn't that covered in §6.2.2? > > > > 1178 7.1.1. A remote failure on the path to a BGP next-hop > > 1180 Upon IGP convergence, the IGP leaf for the BGP next-hop is updated > 1181 upon IGP convergence and all the BGP depending routes leverage the > 1182 new IGP forwarding state immediately. Details of this behavior can > 1183 be found in Section 6.1. > > [minor] Redundant text: "Upon IGP convergence, the IGP leaf for the BGP > next-hop is updated upon IGP convergence..." > > > > 1185 This BGP resiliency property only depends on IGP convergence and is > 1186 independent of the number of BGP prefixes impacted. > > [] "BGP resiliency property" > > The rest of the text (and the name of the document!) refers to > convergence. It isn't until this section that you change the terminology. > Please be consistent. > > > > 1188 7.1.2. A local failure on the path to a BGP next-hop > > 1190 Upon LFA protection, the IGP leaf for the BGP next-hop is updated > to > 1191 use the precomputed LFA backup path and all the BGP depending > routes > 1192 leverage this LFA protection. Details of this behavior can be found > 1193 in Section 6.1. > > [minor] Redundant text: LFA is mentioned 3 times in the first sentence. > > > [major] What about the case where an LFA doesn't exist? > > > > ... > 1238 7.3. Automated > > 1240 The BGP-PIC solution does not require any operator involvement. The > 1241 process is entirely automated as part of the FIB implementation. > > [] Ok -- it is not as much an automation as it is an implementation > choice. > > > > 1243 The salient points enabling this automation are: > > [minor] I"m not sure if you're saying that the list below is what enables > the "automation", or if the "automation" results in that. > > > > 1245 o Extension of the BGP Best Path to compute more than one primary > 1246 ([12]and [13]) or backup BGP next-hop ([7] and [14]). > > [major] This functionality is available regardless of the structure of the > FIB. > > > > 1248 o Sharing of BGP Path-list across BGP destinations with same > 1249 primary and backup BGP next-hop. > > [nit] s/same/the same > > > > 1251 o Hierarchical indirection and dependency between BGP pathlist and > 1252 IGP pathlist. > > [] The last two bullets are a characteristic of the implementation. > > > > 1254 7.4. Incremental Deployment > > 1256 As soon as one router supports BGP-PIC solution, it benefits from > 1257 all its benefits (most notably convergence that does not depend in > 1258 the number of prefixes) without any requirement for other routers > to > 1259 support BGP-PIC. > > [nit] Reduncant text: "it benefits from all its benefits" > > > [major] The assertion above is true. However, (in the extreme) a single > router supporting BGP-PIC doesn't necessarily translate in to better > overall convergence in the network (it depends on where the router is, the > failure, etc.). Also, having a mix of routers could result in transient > micro loops as the speed of convergence is not the same. > > > > 1261 8. Security Considerations > > 1263 The behavior described in this document is internal functionality > 1264 to a router that result in significant improvement to convergence > 1265 time as well as reduction in CPU and memory used by FIB while not > 1266 showing change in basic routing and forwarding functionality. As > 1267 such no additional security risk is introduced by using the > 1268 mechanisms proposed in this document. > > [] I can't think of anything else to say here, but you did not show (nor > do I need you to) "significant...reduction in CPU and memory used". It > sounds more like a marketing statement. :-( > > > > 1270 9. IANA Considerations > > 1272 No requirements for IANA > > [major] s/.../This document has no IANA actions. > > > > 1274 10. Conclusions > > [] This section is not needed, it contains redundant information. Please > remove it. > > > > ... > 1283 11. References > > [major] The RFC Editor requires that the OID be listed for all references > [1]. Please update the references -- here's an example of what they should > look like [2]: > > [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A Border > Gateway Protocol 4 (BGP-4)", RFC 4271, DOI 10.17487/RFC4271, > January 2006, <https://www.rfc-editor.org/info/rfc4271>. > > [1] https://www.rfc-editor.org/styleguide/part2/ > [2] https://www.rfc-editor.org/refs/ref4271.txt > > > > 1285 11.1. Normative References > ... > 1293 [3] E. Rosen, " Carrying Label Information in BGP-4", RFC 8277, > 1294 October 2017 > > [minor] This reference can be Informative. > > > > 1296 [4] Andersson, L., Minei, I., and B. Thomas, "LDP Specification", > 1297 RFC 5036, October 2007 > > [minor] This reference can be Informative. > > > > 1299 [5] A. Bashandy, C. Filsfils, S. Previdi, B. Decraene, S. > 1300 Litkowski, M. Horneffer, R. Shakir, "Segment Routing with > MPLS > 1301 data plane", RFC 8660, December 2019 > > [minor] This reference can be Informative. > > > > ... > 1377 Appendix A. Perspective > ... > 1392 o BGP Convergence per BGP destination ~ 200usec conservative, > > 1394 ~ 100usec optimistic > > [minor] "BGP Convergence" > > The explanation below talks about recalculating the best route -- which is > different than convergence (which includes sending BGP Updates, etc.). > > [EoR -18] > >
- AD review of draft-ietf-rtgwg-bgp-pic-18 Alvaro Retana
- Re: AD review of draft-ietf-rtgwg-bgp-pic-18 Alvaro Retana
- Re: AD review of draft-ietf-rtgwg-bgp-pic-18 Alvaro Retana
- RE: AD review of draft-ietf-rtgwg-bgp-pic-18 Susan Hares
- RE: AD review of draft-ietf-rtgwg-bgp-pic-18 Alvaro Retana
- Re: AD review of draft-ietf-rtgwg-bgp-pic-18 Yingzhen Qu
- Re: AD review of draft-ietf-rtgwg-bgp-pic-18 Alvaro Retana
- Re: AD review of draft-ietf-rtgwg-bgp-pic-18 Ahmed Bashandy
- Re: AD review of draft-ietf-rtgwg-bgp-pic-18 Alvaro Retana
- Re: AD review of draft-ietf-rtgwg-bgp-pic-18 Ahmed Bashandy
- Re: AD review of draft-ietf-rtgwg-bgp-pic-18 Alvaro Retana
- Re: AD review of draft-ietf-rtgwg-bgp-pic-18 Yingzhen Qu
- Re: AD review of draft-ietf-rtgwg-bgp-pic-18 Ahmed Bashandy