Re: RtgDir review: draft-ietf-rtgwg-bgp-pic-00

"Ahmed Bashandy (bashandy)" <bashandy@cisco.com> Thu, 21 April 2016 23:27 UTC

Return-Path: <bashandy@cisco.com>
X-Original-To: rtgwg@ietfa.amsl.com
Delivered-To: rtgwg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 92A8212E737; Thu, 21 Apr 2016 16:27:56 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -15.516
X-Spam-Level:
X-Spam-Status: No, score=-15.516 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-0.996, SPF_PASS=-0.001, USER_IN_DEF_DKIM_WL=-7.5] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=cisco.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id LyC-WWhDYLgc; Thu, 21 Apr 2016 16:27:52 -0700 (PDT)
Received: from rcdn-iport-8.cisco.com (rcdn-iport-8.cisco.com [173.37.86.79]) (using TLSv1.2 with cipher DHE-RSA-SEED-SHA (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 6D2A212E147; Thu, 21 Apr 2016 16:27:52 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=cisco.com; i=@cisco.com; l=49284; q=dns/txt; s=iport; t=1461281272; x=1462490872; h=message-id:date:from:mime-version:to:cc:subject: references:in-reply-to; bh=Otp58elrjjdn5w3af1SGGyjhX/kwP4Tb0wIXdInstrI=; b=LxopeO7GtdIrGtwd4U/opXhL6fV6xV0T+Q/GQ73SSAXAqQr2D8fEM5Oy jkOO5MiYKN3VWzm6yDwgiQXr3U2dS4k31Ln5kCjxhDfLIq5opPxQ4mXgX EQRFNt/6dd3Z8e8YgAbnumcNqvgugCf67DCmHdqkmxMmnBI2vcMg88qwj A=;
X-IronPort-AV: E=Sophos; i="5.24,514,1454976000"; d="scan'208,217"; a="94538315"
Received: from rcdn-core-3.cisco.com ([173.37.93.154]) by rcdn-iport-8.cisco.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Apr 2016 23:27:51 +0000
Received: from [10.154.57.130] ([10.154.57.130]) by rcdn-core-3.cisco.com (8.14.5/8.14.5) with ESMTP id u3LNRpuP017698; Thu, 21 Apr 2016 23:27:51 GMT
Message-ID: <571961F6.9040608@cisco.com>
Date: Thu, 21 Apr 2016 16:27:50 -0700
From: "Ahmed Bashandy (bashandy)" <bashandy@cisco.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0
MIME-Version: 1.0
To: bruno.decraene@orange.com, "rtg-ads@ietf.org" <rtg-ads@ietf.org>
Subject: Re: RtgDir review: draft-ietf-rtgwg-bgp-pic-00
References: <18205_1461160419_571789E3_18205_2694_2_53C29892C857584299CBF5D05346208A0F8761E1@OPEXCLILM21.corporate.adroot.infra.ftgroup>
In-Reply-To: <18205_1461160419_571789E3_18205_2694_2_53C29892C857584299CBF5D05346208A0F8761E1@OPEXCLILM21.corporate.adroot.infra.ftgroup>
Content-Type: multipart/alternative; boundary="------------050700020907010209010406"
Archived-At: <http://mailarchive.ietf.org/arch/msg/rtgwg/GhKCUL1Iq4v6b2oaDJZ8K8MNX6E>
Cc: "rtg-dir@ietf.org" <rtg-dir@ietf.org>, "draft-ietf-rtgwg-bgp-pic.all@ietf.org" <draft-ietf-rtgwg-bgp-pic.all@ietf.org>, "rtgwg@ietf.org" <rtgwg@ietf.org>
X-BeenThere: rtgwg@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Routing Area Working Group <rtgwg.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/rtgwg>, <mailto:rtgwg-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/rtgwg/>
List-Post: <mailto:rtgwg@ietf.org>
List-Help: <mailto:rtgwg-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/rtgwg>, <mailto:rtgwg-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 21 Apr 2016 23:27:56 -0000

Thanks a lot for the valuable comments.

I will try the address the comments in the next spin

Ahmed

On 4/20/2016 6:53 AM, bruno.decraene@orange.com wrote:
>
> Hello,
>
> I have been selected as the Routing Directorate reviewer for this 
> draft. The Routing Directorate seeks to review all routing or 
> routing-related drafts as they pass through IETF last call and IESG 
> review, and sometimes on special request. The purpose of the review is 
> to provide assistance to the Routing ADs. For more information about 
> the Routing Directorate, please see ​ 
> http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir
>
> Although these comments are primarily for the use of the Routing ADs, 
> it would be helpful if you could consider them along with any other 
> IETF Last Call comments that you receive, and strive to resolve them 
> through discussion or by updating the draft.
>
> Document: draft-ietf-rtgwg-bgp-pic-00
> Reviewer: Bruno Decraene
> IETF LC End Date: “QA review” pre WG LC
> Intended Status: Informational
>
> *Summary:*
>
> I have some minor concerns about this document that I think should be 
> resolved before publication.
>
> *Comments:*
>
> - Document is interesting. Document is relatively clear but sometime 
> it feels like there is a little room for some reformulation/edition to 
> improve fluidity. In particular, the learning curve is a bit steep at 
> the beginning of the doc as most of the concepts are introduced in 3 
> pages (pages 4-6) in the form of a list of terminology and a pseudo 
> code. I would find useful to have an overview section just after the 
> introduction, with a high level view of the solution with a limited 
> number of new terms.
>
> - The text feels like authoritative, while probably many terms are 
> implementation specific. A priori, I would not expect all 
> implementation of BGP PIC to use the same terms, and possibly not the 
> same data structure. May be the text could be generalized to cover 
> multiple implementations; or modified to describe a generalized 
> concept (i.e. data-structure designed to share as much data as 
> possible between elements, at the cost of additional indirections); or 
> the document could state that it describes a specific implementation 
> with implementations specific terminology, data structure, and 
> specifics. Or a combination of both (e.g. adding a section being both 
> generalized and describing the concept, and then the existing sections 
> after stating that they are specific to one implementation).
>
> *Minor Issues:*
>
> - I find figure 2 very useful to understand the data-structure. I 
> would move it sooner in the doc, somewhere before §2.2. (with its 
> subsequent text below) e.g. a new §2.2 "FIB data-structure"
>
> It would need to be generalized i.e. example non-specific. I could 
> think of:
>
> IP Leaf:      Pathlist:       IP Leaf:                Pathlist:
>
> --------      --------- -------                 --------
>
> BGP NLRI ---> BGP NH1   ----> IGP IP1 (BGP NH1)  ---> IGP NH1, I1  
> ---> Adjacency1
>
> BGP NHi   --...                         IGP NHi, Ii  --..
>
> |                                    |
>
> |                                         |
>
> |                                         |
>
> v                                         v
>
>           OutLabel Array:                           OutLabel Array:
>
> --------------                            --------------
>
>           L (NLRI, NH1)                             L (IP1, NH1)
>
> L (NLRI, NHi)                             L (IP1, NHi)
>
> - Figure 1 could be enhanced with IGP-NH1, IGP-NH2, I1 and I2.
>
> - Example 3 does not use the same naming convention than examples 1 
> and 2, this make it harder to follow for a priori no reason. e.g. VPN 
> labels are named VPN-L11 in examples 1 and 2, but are named 
> VPN-PE21(P1) in exmaple 3; transport labels are named LDP-L12 in 
> exmaples 1 and 2, but LASBR11(PE22) and L11 in figure 3.
>
> - §2.3.3
>
> "The local labels of the next hops".
>
>  - All labels are locally assigned. So what do you mean by "local"
>
> - "next-hop" sometimes refers to IGP/connected next-hop (a priori the 
> case here) and sometimes to BGP next-hop. I find it hard to follow. I 
> rather use a different name (e.g; connected next-hop vs BGP next-hop)
>
> - §3
>
> "the hashing at the BGP level yields path 0 while the hashing at the 
> IGP level yields path 1. In that case, the packet will be sent out of 
> interface I1 with the label stack "LDP-L12,VPN-L21".
>
> Does not seem to match my understanding. For "LDP-L12,VPN-L21" I would 
> assume BGP used path index 1 and IGP used path index 0.
>
> IMHO:
>
> OLD: "Hence ASBR22 swaps "LASBR22(PE22)" with the LDP/SR label of 
> PE22, pushes the label of the next-hop towards PE22 in domain 2, and 
> sends the packet along the shortest path towards PE22."
>
> NEW: "Hence ASBR22 swaps "LASBR22(PE22)" with the LDP/SR label for 
> PE22 advertised by the next-hop towards PE22 in domain 2, and sends 
> the packet along the shortest path towards PE22."
>
> (in all cases "swaps" then "pushes" would increase the label stack by 
> 1, which is not the case.)
>
> §4.1
>
> "the useable paths in the loadinfo"
>
> loadinfo is a proprietary FIB datastructure which has not been 
> introduced/defined. You need to either remove that term (if possible) 
> or define it somewhere.
>
> "Hence traffic restoration occurs within the time frame of IGP 
> convergence,"
>
> agree.
>
> ..."and, for local link failure, within the timeframe of local 
> detection. Thus it is possible to achieve sub-50 msec convergence as 
> described in [10] for local link failure"
>
> IMO, this is restricted to specific cases. e.g. external (eBGP) link 
> failure, ECMP case, possibly IP FRR.  So possibly
>
> OLD: for local link failure, within the timeframe of local detection. 
> Thus it is possible to achieve sub-50 msec convergence as described in 
> [10] for local link failure
>
> NEW: for local link failure, assuming a backup path has been 
> precomputed, within the timeframe of local detection (e.g. 50ms). 
> Example of solutions precomputing a backup path are IP FRR [LFA], 
> [RLFA], [MRT], [TI-LFA] or eBGP path having a backup path [10].
>
> §4
>
> I would find useful to indicate, for each type of failure, the number 
> of data-structure that need to be updated.
>
> ---
>
> §4.2.2
>
> "To avoid loops, ePE2 MUST treat any core facing path as a backup
>
>       path, otherwise ePE2 may redirect traffic arriving from the core
>
>       back to ePE1 causing a loop."
>
> Looks a bit under-described to me. Could you please elaborate a bit? 
> In particular:
>
> - if 2 PE (PE1, PE2) are connected in U to 2 P (P1, P2)     
> (P1-PE1-PE2-P2), PE1 being nominal and PE2 only used in backup, in the 
> nominal situation, if the core network sends the trafic to PE1 via PE2 
> (used as a P/transit), how does PE2 know that it must send this 
> traffic to PE1? (rather than CE2)
>
> - this behavior looks like an additional specific feature. How doew 
> ePE1 knows that ePE2 have this feature?
>
> ---
>
> §4.3
>
> "  Hence if the platform supports the "unflattened" forwarding chain,
>
>    then a single pathlist needs to be updated while if the platform
>
>    supports a shallower forwarding chain, then two pathlists need to be
>
>    updated."
>
> IINM "single"  and "two" pathlist applies to the specific example. In 
> this last sentence/summary, I'd prefer a more general statement. A 
> priori, without digging too much in this most complex use case, it 
> seems like :s/single/o(1)  :s/two/o(PE) . The former looks close 
> (single vs o(1)) but IMHO there is a significant difference between 2 
> and o(PE) (i.e. 100s)
>
> ---
>
> §5.1
>
> Good paragraph. It's quite clear that the convergence time does not 
> depend on the number of BGP prefixes, which is good. For the benefit 
> of the reader, it would be even more interesting if, for each type of 
> failure, the text could indicate on what it depends. e.g.  o(1), 
> o(connected interfaces), o(PE), o(PEnominal*PEbackup)....
>
> --
>
> §7
>
> "No additional security risk is introduced by using the mechanisms 
> proposed in this document"
>
> In general, with such a sentence, it's difficult to evaluate whether 
> the authors have very quickly evaluated the risk or if this evaluation 
> has been performed in details. So in general, some more text detailing 
> which aspects have been evaluated is interesting for the reader (yet 
> painful for the authors).
>
> As the document describe an internal box behavior, this is difficult 
> to evaluate and discuss. But from a bad experience, I fear that there 
> may be an impact. Indeed, with such structure, the FIB 
> structure/memory is typically different between BGP prefixes and IGP 
> prefixes. In general, the implementation is designed to support the 
> "right" numbers of both. But assuming an accident or an attack, the 
> numbers may not be "right". e.g. one upon a time, someone has 
> redistributed the BGP table into the IGP. In this case, the total 
> number of IP prefixes in the FIB is exactly the same. But as the data 
> structure used in the FIB was different between BGP and IGP prefixes, 
> the FIB ran out of memory and the line card crashed (well actually 
> only the IP FIB, so IS-IS hello packet were still correctly sent and 
> forwarded. As a result, traffic was permanently black holed)
>
> ---
>
> § 9
>
> OLD: that allows achieving prefix independent convergence
>
> NEW: that allows achieving BGP prefixes independent convergence
>
> (it's still depend on the number of IGP prefixes and/or BGP pathlist)
>
> *Nits:*
>
> Abstract
>
> "via more than one path."
>
> In this 1rst sentence, it's not clear what path really means. (e.g cf 
> the terminology section where you have more than one). I guess that 
> you mean "BGP path". (as there are also typically multiple IGP path to 
> reach each BGP Next Hop)
>
> "The objective is achieved through organizing the forwarding chains"
>
> "chain" does not self self explicit to me. what about :s/chains/data 
> structure"
>
> "complete transparency"
>
> what do you mean? transparency to what / from who?
>
> §1
>
> OLD: to allow for more than one path for a given prefix
>
> NEW: to allow for BGP to advertise more than one path for a given prefix
>
> OLD: Another more common and widely deployed scenario is L3VPN with 
> multi-homed VPN sites
>
> NEW: Another more common and widely deployed scenario is L3VPN with 
> multi-homed VPN sites with unique Route Distinguisher.
>
> ---
>
> §1.2
>
> "Pathlist: It is an array of paths"
>
> "OutLabel-Array: The OutLabel-Array is a list of one or more outgoing 
> labels "
>
> So a list is defined as an array and the array is defined as a list :-).
>
> What about using the same term, e.g. a list?
>
> --
>
> The OutLabel-Array is a list of one or more
>
>       outgoing labels and/or label actions where each label or label
>
>       action has 1-to-1 correspondence to a path in the pathlist. It
>
>       is possible that the number of entries in the OutLabel-array is
>
>       different from the number of paths in the pathlist and the ith
>
>       Outlabel-Array entry is associated with the path whose path-
>
>       index is "i".
>
> - I don't see how one can have a 1-to-1 correspondance if the number 
> of elements is not the same.
>
> - Last sentence could be splitted in 2.
>
> -- 
>
> Since the term ingres PE is defined, you could also detine the term 
> egress PE. Possibly in the same sentence.
>
> OLD: "Ingress PE, "iPE": It is a BGP speaker that learns about a
>
>       prefix through another IBGP peer and chooses that IBGP peer as
>
>       the next-hop for the prefix
>
> NEW:      "Ingress PE, "iPE": It is a BGP speaker that learns about a
>
>       prefix through a IBGP peer and chooses an egress PE as the 
> next-hop for the prefix.
>
> As a side note, the previous definition assume that there were no 
> Route Relfector (the iBGP peer is the BGP Next Hop)
>
> -- 
>
> §2.3
>
> Figure 1 represents a VPN network with 3 PE and a CE. In this context, 
> "VPN-P1" sounds a bit like a P router. What about :s/VPN-P1/VPN-IP1 ? 
> Same comment for IGP-P1.
>
> --
>
> §2.3.2
>
> OLD: ePE2 constructs the forwarding chain depicted in Figure 1
>
> NEW: ePE2 constructs the forwarding chain depicted in Figure 3
>
> OLD: VPL-L11
>
> NEW: VPN-L11
>
> §2.3.3
>
> OLD: can reach ASBR1
>
> NEW: can reach ASBR11
>
> OLD: The label for advertised by ASBR11 to iPE
>
> NEW: The label advertised by ASBR11 to iPE
>
> OLD: The labels for advertised by ASBR12 to iPE
>
> NEW: The labels advertised by ASBR12 to iPE
>
> OLD: The labels for advertised to iPE by ASBR11 using BGP-LU
>
> NEW: The labels advertised  by ASBR11 to iPE using BGP-LU
>
> ---
>
> §3
>
> OLD: Let's applying the above forwarding steps to the example 
> described in Figure 1 Section 2.3.1.
>
> OLD: Let's applying the above forwarding steps to the example 
> described in Figure 2 Section 2.3.1.
>
> (somewhat guesssing. But in all cases, there is no figure 1 in section 
> 2.3.1)
>
> ---
>
> §4.1
>
> IMO
>
> OLD: As soon as the IGP convergence is effective for the BGP nhop 
> entry, the new forwarding state is immediately available to all 
> dependent BGP prefixes.
>
> NEW: As soon as the IGP convergence is effective for a BGP next-hop 
> entry, the new forwarding state is immediately available to all 
> dependent BGP prefixes.
>
> more generally
>
> :s/nhop/next-hop
>
> ---
>
> §4.3
>
> :s/PE222/PE22
>
> Best regards,
>
> Bruno
>
> **
>
> _________________________________________________________________________________________________________________________
>
> Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
> pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
> a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
> Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.
>
> This message and its attachments may contain confidential or privileged information that may be protected by law;
> they should not be distributed, used or copied without authorisation.
> If you have received this email in error, please notify the sender and delete this message and its attachments.
> As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
> Thank you.