Re: [Idr] I-D Action: draft-kaliraj-idr-multinexthop-attribute-01.txt

Gyan Mishra, Sat, 12 June 2021 02:27 UTC

To: Robert Raszuk
Cc: Kaliraj Vairavakkalai, idr@ietf.org

Hi Kaliraj

I read through the draft and had a few questions.

This feature parallels Add-Path but differs in how it is deployed: it is a
BGP optional transitive path attribute, whereas Add-Path is a capability
code exchanged point-to-point between PE-RR peers in a controlled fashion.
Add-Path is therefore generally used within an AS, both for BGP PIC
advertisement of backup paths and for BGP multipath load balancing.

The gap this draft fills is clear: it provides a more intelligent
Add-Path-style advertisement feature.  It also fills a very important gap
around unequal cost load balancing.  A topology may not have equal cost
metrics for the iBGP tie breaker, so one path ends up being the best path
even though multiple paths exist via Add-Path, even with unique RDs, which
makes it difficult for iBGP load balancing to occur.
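
To make the unequal cost case concrete, here is a minimal sketch of how an
ingress node could spread flows across next hops in proportion to a
relative weight.  The function name, the weights, and the addresses are
all hypothetical; this is not the encoding or the procedure defined in the
draft, only an illustration of weighted load balancing.

```python
import hashlib
from typing import Mapping


def pick_next_hop(flow_key: str, weighted_next_hops: Mapping[str, int]) -> str:
    """Map a flow to a next hop in proportion to its (hypothetical) weight."""
    total = sum(weighted_next_hops.values())
    if total <= 0:
        raise ValueError("at least one next hop needs a positive weight")
    # A stable hash keeps every packet of a flow on the same next hop.
    digest = hashlib.sha256(flow_key.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    # Walk the next hops in a fixed order; each one owns `weight` buckets.
    for next_hop, weight in sorted(weighted_next_hops.items()):
        if bucket < weight:
            return next_hop
        bucket -= weight
    raise AssertionError("unreachable: bucket is always below the total weight")


# Illustrative weights only: roughly 3 of every 4 flows use the first PE.
weights = {"192.0.2.1": 3, "192.0.2.2": 1}
chosen = pick_next_hop("10.0.0.1->10.0.1.1:443", weights)
```

With plain best-path selection only one of these next hops would carry
traffic; a sender-signalled weight is what lets the split be uneven.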

I support the development and progression of this draft as this is an
important solution to a real world problem of VPN overlay iBGP unequal cost
load balancing within the operator core as well as uniform predictable load

MultiNextHop is referred to as MNH below.

Is the intent to use this within an AS?  If so, I think making the
attribute non-transitive is a MUST: since the next hop is changed on
external peerings, propagating the MultiNexthop attribute beyond the AS is
irrelevant.  I think you mentioned that next-hop-self, which is set on the
PE-RR peering at the domain ingress node, can be used to stop the MNH from
being propagated.  However, next-hop-self is applied at the ingress, on
entering the domain, not on leaving it, so I think the MNH can still get
propagated outside the domain.
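
The practical difference between the two flag settings shows up at
speakers that do not recognize the attribute.  The sketch below follows
the RFC 4271 rules for unrecognized optional attributes; the flag values
are the standard attribute flag bits, while the function name and return
shape are only illustrative.

```python
# Attribute flag bits from the BGP path attribute flags octet (RFC 4271).
OPTIONAL = 0x80
TRANSITIVE = 0x40
PARTIAL = 0x20


def handle_unrecognized(flags: int):
    """Return (propagate, new_flags) for an attribute this speaker does not recognize.

    Illustrative helper, not taken from any implementation.
    """
    if not flags & OPTIONAL:
        # Unrecognized well-known attributes are an error, not a pass-through.
        raise ValueError("well-known attributes must be recognized")
    if flags & TRANSITIVE:
        # Optional transitive: keep it, set Partial, pass it along.
        return True, flags | PARTIAL
    # Optional non-transitive: quietly ignore it and do not pass it along.
    return False, None


propagated, new_flags = handle_unrecognized(OPTIONAL | TRANSITIVE)
dropped, _ = handle_unrecognized(OPTIONAL)
```

So with the non-transitive setting, the first speaker that does not
understand MNH stops it, without relying on next-hop-self at all.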

Section 3.1.1 describes the interaction with the top-level mandatory
transitive NEXT_HOP attribute (code 3) and the next hop carried in the
MP_REACH_NLRI attribute (code 14).

Can you help explain the interaction described in Section 3.1.1?

   “When adding a MultiNexthop attribute to an advertised BGP route, the
   speaker MUST put the same next-hop address in the Advertising PNH
   field as it put in the Nexthop field inside NEXT_HOP attribute or
   MP_REACH_NLRI attribute.  Any speaker that recognizes this attribute
   and changes the PNH while re-advertising the route MUST remove the
   MultiNexthop-Attribute in the re-advertisement.  The speaker MAY
   however add a new MultiNexthop-Attribute to the re-advertisement;
   while doing so the speaker MUST record in the "Advertising-PNH" field
   the same next-hop address as used in NEXT_HOP field or MP_REACH_NLRI
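
One possible reading of the quoted rule, as a minimal sketch.  It assumes
that the PNH is simply the next-hop address carried in NEXT_HOP or
MP_REACH_NLRI, which the draft does not spell out; the class, field, and
function names are hypothetical and do not reflect the draft's wire
encoding.

```python
from dataclasses import dataclass, replace
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class MultiNexthop:
    advertising_pnh: str          # assumed: must equal the route's next hop
    next_hops: Tuple[str, ...]    # the advertised set of next hops


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str                 # value in NEXT_HOP or MP_REACH_NLRI
    mnh: Optional[MultiNexthop] = None


def readvertise(route: Route, new_next_hop: str,
                new_mnh_hops: Optional[Sequence[str]] = None) -> Route:
    """Re-advertise a route on a speaker that recognizes the attribute."""
    if new_next_hop == route.next_hop:
        # Next hop unchanged: the received attribute is still consistent.
        return route
    # Next hop changed: the received attribute MUST be removed ...
    mnh = None
    if new_mnh_hops is not None:
        # ... and a new one MAY be added, recording the new next hop
        # in the Advertising-PNH field.
        mnh = MultiNexthop(advertising_pnh=new_next_hop,
                           next_hops=tuple(new_mnh_hops))
    return replace(route, next_hop=new_next_hop, mnh=mnh)
```

Under this reading, a receiver can compare the Advertising-PNH field with
the route's next hop to detect an attribute that passed through a speaker
which changed the next hop without understanding MNH.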

What is PNH?  Primary next hop?

So the way I read this, the optional transitive MNH attribute is copied
into codes 3 and 14.  So if there are 10 next hops in the PNH, they all get
written into the top-level code 3 and the BGP Update code 14.

Since this is for a “BGP free core”, the P routers, where the label swap
forwarding semantics happen, are not applicable, as we don’t run BGP there.
The forwarding semantics are likewise not applicable after the VPN label
“push” (label imposition on the CE to PE eBGP peering entering the core) or
the “pop and forward” (label disposition on the PE to CE eBGP peering
leaving the core).
All the next hops have a VPN label imposed, with next-hop-self from each
PE, so I think only the “forwarding” semantics would be applicable, and
that is the only one that would apply for MNH iBGP load balancing.

Kind Regards


On Fri, Jun 11, 2021 at 9:10 AM Robert Raszuk <> wrote:

> HI
> Two more questions ...
> 8. I am not clear what are trying to describe in Section 8.
>    Like any other optional transitive BGP attribute, it is possible that
>    this attribute gets propagated thru speakers that don't understand
>    this attribute and an error detected by a speaker multiple hops away.
>    This is mitigated by requiring the receiving speaker to remove this
>    attribute when doing nexthop-self.
> First indeed, the attribute may be propagated by BGP speakers which do not
> understand it, but that in itself is not an error. In those cases partial
> bit is set but attribute is still valid.
> This is also completely orthogonal to setting the next hop self on the
> path when propagating for example across EBGP.
> 9. You are providing a lot of analogy to Add-Paths. But in Add-Paths thx
> to capability negotiation it is mandatory for receiving speaker indicating
> support to act on it. Here you have chosen for some reason blind
> propagation as optional transitive which perhaps may be ok for networks
> which do end to end encapsulation, but I don't think it is going to work in
> pure IPv4/IPv6 hop by hop lookup - especially in the cases of mixed network
> elements some supporting the attribute and some not.
> Thx,
> R.
> On Fri, Jun 11, 2021 at 12:09 PM Robert Raszuk <> wrote:
>> Dear authors,
>> I have read yr draft with interest as some perhaps recall we have
>> discussed this topic in the past number of times.
>> Cosmetics:
>> 1. First nit - why do you say label must be 3107bis label ? (3.4.2.
>> Labeled IP nexthop) MPLS label is a label and I am not sure how one method
>> of label distribution matter here.
>> 2a. What is PNH ? It first occurs in section 3 as "PNH-Len" or
>> "PNH-address", but it is never explained in the draft. Is this Path Next
>> Hop ?
>> 2b. Would it be better to call this new attribute MNH MultiNextHop ?
>> Tech:
>> 3. In section 3.1.1 you describe how to assure that NH in MP_REACH is
>> also part of MultiNext Hop Attribute. But I do not see any discussion on
>> how to treat or ignore next hops in MP_REACH when a new attribute is
>> present and is valid.
>> 4. In section 3.1.2 you define behaviour of RR advertising paths from non
>> MultiNexthop paths and those which carry new attributes ... But you should
>> make it clear that this is only about step in best path selection
>> (or candidate selection). There can be other criteria before we even get to
>> that step.
>> 5. Now the most important question - how do you plan to handle atomic
>> withdraws ? I assume the plan is to readvertise the path with MNH - the
>> removed next hop. So by implicit withdraw this next hop will be removed.
>> The draft is silent on this. Now if the removed next hop was selected by
>> some receivers as best (due to and it was not arriving as part of
>> MP_REACH this proposal require significant implementation changes on how
>> BGP best path selection is triggered, how it runs, how it populates results
>> to the RIB/FIB. I think a new section is needed in detailed discussing the
>> withdraws.
>> 6. How do you envision max-prefix safety knobs to work here ? On the
>> surface it may seem orthogonal - but it is not. Today folks use this to
>> protect infrastructure from for example operator's mistakes. Here one
>> received path may fill the MNH attribute with 100s of next hops and as
>> being optional and transitive will be distributed to all routers all over
>> the world.
>> 7. Observe that metric to next hops is dynamic. So some implementations
>> capable of next hop tracking register with RIB all next hops and each time
>> metric changes they get a call back. Here we are effectively talking about
>> exploding this 10x or 100x or more ...
>> Many thx,
>> Robert
>> ---------- Forwarded message ---------
>> From: <>
>> Date: Fri, Jun 11, 2021 at 10:56 AM
>> Subject: I-D Action: draft-kaliraj-idr-multinexthop-attribute-01.txt
>> To: <>
>> A New Internet-Draft is available from the on-line Internet-Drafts
>> directories.
>>         Title           : BGP MultiNexthop attribute
>>         Authors         : Kaliraj Vairavakkalai
>>                           Minto Jeyananth
>>         Filename        : draft-kaliraj-idr-multinexthop-attribute-01.txt
>>         Pages           : 12
>>         Date            : 2021-06-11
>> Abstract:
>>    Today, a BGP speaker can advertise one nexthop for a set of NLRIs in
>>    an Update.  This nexthop can be encoded in either the BGP-Nexthop
>>    attribute (code 3), or inside the MP_REACH attribute (code 14).
>>    For cases where multiple nexthops need to be advertised, BGP-Addpath
>>    is used.  Though Addpath allows basic ability to advertise multiple-
>>    nexthops, it does not allow the sender to specify desired
>>    relationship between the multiple nexthops being advertised e.g.,
>>    relative-preference, type of load-balancing.  These are local
>>    decisions at the receiving speaker based on path-selection between
>>    the various additional-paths, which may tie-break on some arbitrary
>>    step like Router-Id.
>>    Some scenarios with a BGP-free core may benefit from having a
>>    mechanism, where egress-node can signal multiple-nexthops along with
>>    their relationship to ingress nodes.  This document defines a new BGP
>>    attribute "MultiNexthop" that can be used for this purpose.
>>    This attribute can be used for both labeled and unlabled BGP
>>    families.  For labeled-families, it is used for a different purpose
>>    in "downstream allocation" case than "upstream allocation" scenarios.


*Gyan Mishra*

*Network Solutions Architect*

*M 301 502-1347*