Re: Genart last call review of draft-ietf-rtgwg-bgp-pic-12

Ahmed Bashandy <abashandy.ietf@gmail.com> Sat, 06 February 2021 18:53 UTC

Subject: Re: Genart last call review of draft-ietf-rtgwg-bgp-pic-12
To: Theresa Enghardt <ietf@tenghardt.net>, gen-art@ietf.org
Cc: draft-ietf-rtgwg-bgp-pic.all@ietf.org, last-call@ietf.org, rtgwg@ietf.org
References: <161031531421.11518.2058149376831594099@ietfa.amsl.com>
From: Ahmed Bashandy <abashandy.ietf@gmail.com>
Message-ID: <194e8375-fe2b-f7fe-ba1d-07e65f46a92f@gmail.com>
Date: Sat, 06 Feb 2021 10:53:09 -0800
In-Reply-To: <161031531421.11518.2058149376831594099@ietfa.amsl.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/5iaF9BclTcGdF-TkBxzYYpdtxT4>

Thanks a lot for the detailed comments.

I will address them shortly.


Ahmed


On 1/10/21 1:48 PM, Theresa Enghardt via Datatracker wrote:
> Reviewer: Theresa Enghardt
> Review result: Ready with Issues
>
> I am the assigned Gen-ART reviewer for this draft. The General Area
> Review Team (Gen-ART) reviews all IETF documents being processed
> by the IESG for the IETF Chair.  Please treat these comments just
> like any other review comments.
>
> For more information, please see the FAQ at
>
> <https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.
>
> Document: draft-ietf-rtgwg-bgp-pic-12
> Reviewer: Theresa Enghardt
> Review Date: 2021-01-10
> IETF LC End Date: None
> IESG Telechat date: Not scheduled for a telechat
>
> Summary: The draft is basically ready for publication as an Informational RFC,
> but it has some context, clarity, and editorial issues that need to be fixed
> before publication.
>
> Major issues: None.
>
> Minor issues:
>
> Abstract:
>
> "In the network comprising thousands of iBGP peers exchanging millions
> of routes, many routes are reachable via more than one next-hop.
> Given the large scaling targets, it is desirable to restore traffic
> after failure in a time period that does not depend on the number of
> BGP prefixes."
> This part is missing a logical step in the argumentation between these two
> sentences. Is the first statement a prerequisite for restoring traffic, and
> then the question is how to make it scalable? Is the first statement the reason
> for things not being scalable? Please rephrase to make the relationship between
> these statements and the overall argumentation clear. Is "depending on the
> number of BGP prefixes" an inherent feature of BGP, or are you making any
> implicit assumptions? If so, please state them.
>
> "In this document we proposed an architecture […]"
> What does architecture mean in this context? Without any further qualification,
> in a networking context, as a reader I assume that "architecture" means
> "network architecture", i.e., something that involves multiple nodes such as
> multiple BGP speakers. But it appears that the document is only about the
> internals of each individual BGP speaker, i.e., how information is organized
> within the router. So maybe it's "router architecture" or "software
> architecture" or such? Please rephrase to make this clear in the abstract.
>
> Please clarify your scope. As the abstract specifically mentions iBGP, is this
> solution only about iBGP? Or is it about eBGP as well?
>
> Introduction:
>
> The introduction is missing a clear problem statement. Perhaps it's implicitly
> stated by saying that "convergence speed is limited by the time taken to
> serially propagate reachability information from the point of failure to the
> device that must re-converge.", but please be specific. Is this convergence
> speed that depends on information propagation time considered "too long", and
> therefore it needs to be reduced? Is it "too long" specifically in certain
> contexts, e.g., networks of a certain size? As the document actually appears to
> focus on speeding up changes within a single node, it's not clear how this
> relates to propagation time. Does the node-internal speedup also speed up how
> fast propagated information converges? Why? As the statement about reachability
> information being exchanged is the first sentence of the introduction, this
> makes it seem like it's fundamental to your document. If this is not the case,
> please consider starting the introduction with a clear problem statement that
> is actually fundamental to your document, such as "The way that information is
> currently organized within a BGP speaker [under … circumstances] is inefficient
> [for … reason] and leads to long convergence times."
>
> In the next sentence, "BGP speakers exchange reachability information about
> prefixes […]", the relationship to the problem statement is still not clear. Is
> this reachability information insufficient? Is there already enough
> information to converge faster, and now your solution allows converging faster?
> Or something else?
>
> "[…] for labeled address families, namely AFI/SAFI 1/4, 2/4, 1/128, and 2/128
> […]" - Please expand these acronyms on first use and provide a reference.
>
> "[…] an edge router assigns local labels to prefixes and associates the
> local label with each advertised prefix […]"
> Does this apply to incoming advertisements, outgoing advertisements, or both?
> Please make the context clear here.
>
> "[…] such as L3VPN [7], 6PE
> [8], and Softwire [6] using BGP label unicast technique[3]."
> The "such as" is not entirely clear: If these are examples of the technique
> that the rest of the sentence describes, perhaps "using technologies such as"
> would be clearer. However, as the entire sentence is already very long,
> please consider splitting the sentence and making the relationship between the
> statements clear.
>
> Please expand NLRI on first use and perhaps provide a definition or reference.
>
> How does the proposal in this document relate to the techniques you mention,
> i.e., L3VPN, 6PE, and Softwire? Does it require them? Is their usage optional
> for your solution, but helps (and why)? Please make the relationship of your
> solution to these techniques explicit and state the prerequisites of your
> solution, if any.
>
> "This document proposes a hierarchical and shared forwarding chain
> organization […]"
> What is your solution an alternative to? How has information previously been
> organized? How does the concept of a forwarding chain relate to the details you
> already gave, which were about a BGP speaker exchanging reachability
> information and applying path selection - where does the forwarding chain come
> in? As this appears to be a fundamental concept to your solution, please
> introduce it in the first paragraph.
>
> "incrementally deployed and enabled with zero operator intervention"
> Well, deploying and enabling any solution does require operator intervention,
> e.g., a software update, correct? So perhaps that's zero other operator
> intervention? Minimal operator intervention? Or not requiring a specific type
> of operator intervention that would otherwise be needed? Later in Section 3.1,
> the draft says "It is noteworthy to mention that the forwarding chain is
> constructed without any operator intervention at all.", so perhaps it's
> possible to further qualify what kind of operator intervention would otherwise
> be necessary, but is not necessary with your solution - e.g., no operator
> intervention is required to reconfigure routes when a link fails.
>
> 1.1 Terminology
>
> Please expand on first usage and consider defining: AFI/SAFI, PE, CE, NLRI,
> forwarding plane, VPN RD's (probably VPN RDs), LSR, ASBRs, BGP-LU, FIB manager
> (is this a particular entity? A software component?). You don't have to define
> all BGP terms that you use, but please expand them once to make it easier to
> guess what they stand for or to look them up.
>
> For "Leaf", "IP leaf", "Label leaf": Why is it called leaf? In graph theory,
> isn't the leaf of a tree the node with no children and only one parent? In your
> figures, the "IP leaf" appears to have no parent and instead two children. So
> isn't it more of a root in the tree? Later, you mention the pathlist being "the
> parent" of the IP leaf, but in Figure 2, you have an arrow from the IP leaf
> pointing to the Pathlist, so to me that looks like the Pathlist is the child of
> the IP leaf. Is this a BGP convention? If so, perhaps a sentence stating that
> would help, and/or a reference.
>
> "OutLabel-List: Each labeled prefix is associated with an
>            OutLabel-List. The OutLabel-List is an array of one or more
>            outgoing labels and/or label actions where each label or label
>            action has 1-to-1 correspondence to a path in the pathlist.
>            Label actions are: push the label, pop the label, swap the
>            incoming label with the label in the Outlabel-Array entry, or
>            don't push anything at all in case of "unlabeled". The prefix
>            may be an IGP or BGP prefix"
> What are labels/label actions in this context? Are labels the same labels
> mentioned in the introduction, i.e., local labels that are assigned to
> prefixes? Are "outgoing labels" still local? Maybe here a brief explanation of
> how labels are defined and how they work would help.
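
A minimal sketch of the OutLabel-List definition quoted above, assuming the usual MPLS label-stack semantics for push, pop, and swap. The class and function names and the numeric label values are hypothetical; only the four label actions, the pathlist with BGP-NH1 and BGP-NH2, and the 1-to-1 correspondence between paths and OutLabel-List entries come from the quoted text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class LabelAction(Enum):
    PUSH = "push"            # add a label on top of the stack
    POP = "pop"              # remove the top label
    SWAP = "swap"            # replace the top (incoming) label
    UNLABELED = "unlabeled"  # leave the packet as it is


@dataclass
class OutLabelEntry:
    action: LabelAction
    label: Optional[int] = None  # unused for POP and UNLABELED


def apply_label_action(stack: List[int], entry: OutLabelEntry) -> List[int]:
    """Return a new label stack; index 0 is the top of the stack."""
    if entry.action is LabelAction.PUSH:
        return [entry.label] + stack
    if entry.action is LabelAction.POP:
        return stack[1:]
    if entry.action is LabelAction.SWAP:
        return [entry.label] + stack[1:]
    return list(stack)


# Entry i of the OutLabel-List belongs to path i of the pathlist.
# 1011 and 2021 are made-up values standing in for VPN-L11 and VPN-L21.
pathlist = ["BGP-NH1", "BGP-NH2"]
out_label_list = [
    OutLabelEntry(LabelAction.PUSH, 1011),
    OutLabelEntry(LabelAction.PUSH, 2021),
]


def labels_for_path(path_index: int, stack: List[int]) -> List[int]:
    """Apply the label action that corresponds to the chosen path."""
    return apply_label_action(stack, out_label_list[path_index])
```

Under these assumptions, choosing path 1 (BGP-NH2) for an unlabeled packet yields a one-label stack holding the value associated with that path, which is the sense in which the array has a 1-to-1 correspondence with the pathlist.
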
>
> 2. Overview:
>
> "A forwarding plane that supports multiple levels of indirection:
> A forwarding that starts with a destination and ends with an
> outgoing interface is not a simple flat structure."
> What is "A forwarding"? Do you mean a forwarding entry? Is this the same thing
> as a route? Please consider adding a definition to the terminology. Is a
> forwarding plane the same as a forwarding chain (mentioned in the abstract)? If
> so, please unify your terminology. If not, please define the terms and explain
> what the differences are.
>
> 2.1.2. Availability of more than one BGP next-hops
>
> "The existence of a secondary next-hop is clear for the following
> reason: a service caring for network availability will require two
> disjoint network connections hence two BGP next-hops."
>
> By "the existence is clear", do you mean "The existence is clearly required",
> "It is clear whether a secondary next-hop exists", or something else?
>
> 2.2 BGP-PIC Illustration
>
> "We can see that the BGP
> pathlist consisting of BGP-NH1 and BGP-NH2 is shared by all NLRIs
> reachable via ePE1 and ePE2."
> How can we see that? ePE1 and ePE2 do not show up in Figure 2. I assume they
> map to something that is shown, but it's not clear what.
>
> 3.2. Example: Primary-Backup Path Scenario
>
> Comparing Figure 3 to Figure 2, there are a couple of differences in terminology:
> Figure 2 has an "IP Leaf" and Figure 3 has an "IP prefix leaf" called VPN-IP1.
> Are "IP Leaf" and "IP prefix leaf" the same concept? If so, please unify your
> terminology. Same question for VPN-L11 being "OutLabel-List" (Figure 2) and
> "Label-leaf" (Figure 3), VPN-L21 being part of an "OutLabel-List" (Figure 2)
> and "BGP OutLabel Array" (Figure 3), and BGP-NH1 being part of a "Pathlist"
> (Figure 2) and "BGP Pathlist" (Figure 3). Figure 3 does not appear to show any
> Adjacency - why? Figure 2 does not appear to show any label actions - why? Furthermore,
> making the figures more similar stylistically (e.g., having "IP prefix leaf"
> being always underlined or always in brackets) would help for comparing the two
> figures.
>
> 4. Forwarding Behavior
>
> "apply the label action of the label on the packet"
> What does this mean? Does "push" mean that the forwarding engine will add the
> label to the packet? How will this label be used? Will it be removed from the
> packet later? Will it be sent in a BGP advertisement? Please make this clearer
> here, and/or please explain what labels and label actions are earlier, and how
> they are used.
>
> "the forwarding engine applies a hashing algorithm to choose the path and
> the hashing at the BGP level yields path 0 while the hashing at the
> IGP level yields path 1"
> This sounds like ECMP, i.e., there are multiple paths and each packet is hashed
> and then sent through a path based on the hash. But the earlier sections
> sounded like your solution was more about primary paths and secondary failover
> paths. Are these two general approaches and your solution works for either?
> Please make this explicit, possibly early in the document.
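
A minimal sketch of per-level hashing as described in the quoted sentence, assuming a CRC32 over a flow key with a per-level salt. The hash function, the salts, the flow-key format, and the IGP path names are all hypothetical, and the draft may select paths differently. It only shows that the BGP level and the IGP level each pick an index into their own pathlist.

```python
import zlib
from typing import Dict, Sequence, Tuple


def pick_path(flow_key: bytes, level_salt: bytes, n_paths: int) -> int:
    """Hash the flow onto one of n_paths entries of a pathlist."""
    return zlib.crc32(level_salt + flow_key) % n_paths


def resolve(
    flow_key: bytes,
    bgp_pathlist: Sequence[str],
    igp_pathlists: Dict[str, Sequence[str]],
) -> Tuple[str, str]:
    """Walk two levels of the chain: BGP next-hop first, then IGP path."""
    bgp_nh = bgp_pathlist[pick_path(flow_key, b"bgp", len(bgp_pathlist))]
    igp_paths = igp_pathlists[bgp_nh]
    igp_path = igp_paths[pick_path(flow_key, b"igp", len(igp_paths))]
    return bgp_nh, igp_path


# Hypothetical topology: each BGP next-hop resolves via its own IGP pathlist.
bgp_pathlist = ["BGP-NH1", "BGP-NH2"]
igp_pathlists = {
    "BGP-NH1": ["IGP-path-0", "IGP-path-1"],
    "BGP-NH2": ["IGP-path-0", "IGP-path-1"],
}

chosen = resolve(b"10.0.0.1->192.0.2.7:443", bgp_pathlist, igp_pathlists)
```

In this sketch a pathlist with a single usable entry always yields index 0, so a primary-backup arrangement and ECMP would share the same lookup code; whether the draft intends that is the question raised above.
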
>
> 5.1. Flattening the Forwarding Chain
>
> "Suppose the platform cannot support the number of hierarchy levels
> in the forwarding chain. FIB needs to reduce the number of hierarchy
> levels. […]"
> When in the process does this flattening happen? Only when a packet is
> forwarded, like in the above steps, or does it happen when the chain is first
> constructed? Does the flattening happen after a specific step in the above
> process, e.g., step 3, or is it independent? If it happens for each forwarded
> packet, this seems like a lot of steps. How is the overall efficiency still
> maintained?
>
> 6.1. BGP-PIC core
>
> "When a remote link or node fails, IGP on the ingress PE receives
> advertisement indicating a topology change so IGP re-converges to
> either find a new next-hop and/or outgoing interface or remove the
> path completely from the IGP prefix used to resolve BGP next-hops."
> Why IGP, when this document is about BGP?
> Is this implied by the scope "when a core link or node fails but the BGP next-hop
> remains reachable"? If so, please make this explicit.
>
> "As soon as the IGP convergence is
> complete for the BGP next-hop IGP route, all its BGP depending
> routes benefit from the new path."
> What would happen in a scenario where BGP-PIC is not used? Would it take longer
> until the BGP routes can use the new path, and why?
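
A minimal sketch of why a shared, hierarchical chain could make the repair independent of the number of BGP prefixes, assuming every prefix leaf points at one shared BGP pathlist object, which in turn points at shared IGP pathlists. The class names, the core link names, and the prefix count are hypothetical; BGP-NH1, BGP-NH2, and the VPN-IP naming follow the figures discussed above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class IgpPathlist:
    paths: List[str]  # outgoing core paths towards one BGP next-hop


@dataclass
class BgpPathlist:
    next_hops: List[str]
    resolved_via: Dict[str, IgpPathlist]  # BGP next-hop -> shared IGP pathlist


def build_fib(num_prefixes: int) -> Tuple[Dict[str, BgpPathlist], IgpPathlist]:
    """Every prefix leaf references the same BGP pathlist object."""
    igp_nh1 = IgpPathlist(["core-link-A", "core-link-B"])
    igp_nh2 = IgpPathlist(["core-link-C"])
    shared = BgpPathlist(
        ["BGP-NH1", "BGP-NH2"],
        {"BGP-NH1": igp_nh1, "BGP-NH2": igp_nh2},
    )
    fib = {f"VPN-IP{i}": shared for i in range(1, num_prefixes + 1)}
    return fib, igp_nh1


def core_link_failed(igp_pathlist: IgpPathlist, failed: str) -> int:
    """IGP re-convergence edits one shared object; returns objects rewritten."""
    igp_pathlist.paths = [p for p in igp_pathlist.paths if p != failed]
    return 1


fib, igp_nh1 = build_fib(100_000)
objects_rewritten = core_link_failed(igp_nh1, "core-link-A")
```

With a flat table, the same failure would require touching each of the 100,000 entries. Here one update is visible through every prefix leaf, which is one way to read "all its BGP depending routes benefit from the new path".
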
>
> 6.2.2
> "the edge node attached to the failed
> link performs next-hop self" - What does "perform next-hop self" mean? Is there
> a word missing here, e.g., "lookup"?
>
> "The main observation is that the loss of convergence speed due to
> the loss of hierarchy depth"
> Does convergence depend on the exchange of BGP messages between BGP peers, or
> is the concept of convergence defined differently here? It seems like here
> convergence means something related to how information is stored/updated
> locally on the router, which is not what I would think about when I read "BGP
> convergence". (Related to the comment at the beginning of the introduction:
> What is your problem statement, i.e., what is the type of convergence you are
> talking about and that your solution speeds up?)
>
> 8. Security Considerations
>
> Are you sure that there are no security considerations?
> For example, if there is a bug in the implementation of this technique, could
> this make BGP prefix hijacking easier given a specific use of BGP labels?
>
> Nits/editorial comments:
>
> Abstract:
>
> "In the network comprising thousands of iBGP peers" -> "In a network comprising
> thousands of iBGP peers"
>
> Please expand BGP-PIC on first use.
>
> 1.1 Terminology
>
> "A prefix P/m (of any AFI/SAFI) that is learnt via
> an Interior Gateway Protocol, such as OSPF and ISIS, has a path
> for." - Is this sentence missing a subject for the "has a path for"? If this is
> "A prefix that an IGP has a path for", then the "is learnt via" does not fit in
> the sentence.
>
> "one or more prefix" -> "one or more prefixes"
>
> "a IP prefix" -> "an IP prefix"
>
> There's a stray ") in the "Pathlist" item.
>
> "may not necessarily has" -> "may not necessarily have"
>
> "the forwarding engine must visits" -> "the forwarding engine must visit"
>
> Please make all your terminology items consistent, i.e., sentences ending with
> a full stop or not.
>
> "A pathlist may contain a mix of primary and backup paths" - why is this its
> own item? Isn't it about the previous item, "Pathlist", and should be part of
> the same bullet point item?
>
> 2.2.1 Hierarchical Hardware FIB
>
> "the number of memory lookup's" -> "the number of memory lookups"
>
> 5.1. Flattening the Forwarding Chain
>
> Please unify how you write your terms, e.g., "OutLabel-list" vs.
> "outlabel-list" (Section 5.1).
>
> Please unify whether you capitalize all words in your headings or just some.
>
>