Re: [Lsr] Prefix Unreachable Announcement Use Cases

Gyan Mishra <hayabusagsm@gmail.com> Fri, 20 November 2020 08:43 UTC

From: Gyan Mishra <hayabusagsm@gmail.com>
Date: Fri, 20 Nov 2020 03:42:54 -0500
Message-ID: <CABNhwV3BO3By4Y=ArpgM0coGtDbNvvysdw5f8q_xab1M0LsgLw@mail.gmail.com>
To: Robert Raszuk <robert@raszuk.net>
Cc: "Acee Lindem (acee)" <acee@cisco.com>, Aijun Wang <wangaijun@tsinghua.org.cn>, lsr <lsr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/r9gYMImTBeNM3bTPM9FEFEeqfVs>
Subject: Re: [Lsr] Prefix Unreachable Announcement Use Cases
List-Id: Link State Routing Working Group <lsr.ietf.org>

Robert

Let me send over a graphical depiction; a picture is worth a thousand
words, so the concept should be easy to see.

Trying to keep it as simple as possible.

Thanks

Gyan

On Fri, Nov 20, 2020 at 3:03 AM Robert Raszuk <robert@raszuk.net> wrote:

> Gyan,
>
> I like and support your use case #1
>
> For use case #2 I have doubts. Imagine indeed an area which gets
> partitioned such that an ABR will lose connectivity to a bunch of nodes. Does
> this mean that such an ABR will now be blasting perhaps 1000s of PUAs
> globally? How would the ABR be able to send a PUA summary, provided some
> disconnected POPs within the area could be nicely summarized? Isn't it better
> to design your area such that ABRs are interconnected with redundancy?
>
> Thx,
> R.
>
>
>
> On Fri, Nov 20, 2020 at 3:28 AM Gyan Mishra <hayabusagsm@gmail.com> wrote:
>
>>
>> Aijun
>>
>> I am thinking, per the feedback received, let’s keep it really simple.  We
>> need a solid use case to move the ball forward, where the PUA data plane
>> convergence capabilities can fill a gap that exists today.
>>
>> I have two simple solutions to start with. I will update the presentation
>> with them, present to the WG, and we can go from there.
>>
>> 1.  BGP NH tracking via PUA of NH component prefix to converge the data
>> plane
>>
>> 2.  Area partitioned scenario where PUA is used for data plane
>> convergence to ABR that has reachability to component prefixes.
>>
>> I will send it out tomorrow.
>>
>> Thanks
>>
>> Gyan
>>
>> On Wed, Nov 18, 2020 at 9:37 PM Aijun Wang <wangaijun@tsinghua.org.cn>
>> wrote:
>>
>>> Hi, Acee:
>>>
>>>
>>>
>>> OK, we will try to improve this document to meet these criteria.
>>>
>>> And, as this topic has been discussed intensely on the mailing list, we are
>>> also eager to invite more interested experts to join us as co-authors to
>>> refine the solutions for more scenarios.
>>>
>>>
>>>
>>> Thanks in advance.
>>>
>>>
>>>
>>> Best Regards
>>>
>>>
>>>
>>> Aijun Wang
>>>
>>> China Telecom
>>>
>>>
>>>
>>>
>>>
>>> *From:* Acee Lindem (acee) <acee@cisco.com>
>>> *Sent:* Thursday, November 19, 2020 12:42 AM
>>> *To:* Aijun Wang <wangaijun@tsinghua.org.cn>; 'Robert Raszuk' <
>>> robert@raszuk.net>; 'Jeff Tantsura' <jefftant.ietf@gmail.com>
>>> *Cc:* 'Gyan Mishra' <hayabusagsm@gmail.com>; 'lsr' <lsr@ietf.org>;
>>> 'Acee Lindem (acee)' <acee=40cisco.com@dmarc.ietf.org>
>>> *Subject:* Re: [Lsr] Prefix Unreachable Announcement Use Cases
>>>
>>>
>>>
>>> Speaking as WG Co-Chair:
>>>
>>>
>>>
>>> *From: *Aijun Wang <wangaijun@tsinghua.org.cn>
>>> *Date: *Wednesday, November 18, 2020 at 3:05 AM
>>> *To: *Robert Raszuk <robert@raszuk.net>, Jeff Tantsura <
>>> jefftant.ietf@gmail.com>
>>> *Cc: *'Gyan Mishra' <hayabusagsm@gmail.com>, Acee Lindem <acee@cisco.com>,
>>> 'lsr' <lsr@ietf.org>, "'Acee Lindem (acee)'" <
>>> acee=40cisco.com@dmarc.ietf.org>
>>> *Subject: *RE: [Lsr] Prefix Unreachable Announcement Use Cases
>>>
>>>
>>>
>>> Hi, Robert:
>>>
>>>
>>>
>>> The trigger and propagation of PUA info can be standardized; the actions
>>> based on the PUA can be different in different situations.
>>>
>>> We can discuss and describe the actions based on different scenarios
>>> after its WG adoption?
>>>
>>>
>>>
>>> There will be no adoption call until there is a coherent draft with use
>>> case(s) and viable actions.
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Acee
>>>
>>>
>>>
>>>
>>>
>>> Best Regards
>>>
>>>
>>>
>>> Aijun Wang
>>>
>>> China Telecom
>>>
>>>
>>>
>>> *From:* Robert Raszuk <robert@raszuk.net>
>>> *Sent:* Wednesday, November 18, 2020 3:49 PM
>>> *To:* Jeff Tantsura <jefftant.ietf@gmail.com>
>>> *Cc:* Gyan Mishra <hayabusagsm@gmail.com>; Acee Lindem (acee) <
>>> acee@cisco.com>; lsr <lsr@ietf.org>; Aijun Wang <
>>> wangaijun@tsinghua.org.cn>; Acee Lindem (acee) <
>>> acee=40cisco.com@dmarc.ietf.org>
>>> *Subject:* Re: [Lsr] Prefix Unreachable Announcement Use Cases
>>>
>>>
>>>
>>> Jeff,
>>>
>>>
>>>
>>> Please notice that WAN is not an IX.
>>>
>>>
>>>
>>> While you can have a full mesh of BFD sessions among all IXP participants,
>>> each bombarding each other over a TB fabric every 100 ms or so, mapping the
>>> same onto a global WAN is a different game. If nothing else, RTT between IXP
>>> participants in a healthy IX is around 1 ms, while RTT between PEs distributed
>>> globally is often 100-200 ms.
>>>
>>>
>>>
>>> Just imagine 1000 PEs in 10 areas distributed all over the world. That
>>> means that in the worst-case scenario (say the same mgmt VPN present on each
>>> PE) you will establish 1000*999 BFD sessions. Now for this to make sense the
>>> timer needs to be 100 ms or so with a 3x or 5x multiplier. Anything slower
>>> will defeat the purpose as BGP withdraw will be faster.
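
To put rough numbers on the scenario above, here is a back-of-the-envelope
sketch. The PE count, the 100 ms interval and the 3x/5x multipliers come from
the message; the function name, the returned keys and the assumption that
every PE sends one packet per interval to every other PE are illustrative
only. The unordered-pair count is shown next to the 1000*999 figure for
comparison.

```python
def bfd_full_mesh(num_pes, tx_interval_ms, multiplier):
    """Rough scaling figures for a full mesh of BFD sessions between PEs."""
    # One session per ordered PE pair, as counted in the message (1000*999).
    ordered_pairs = num_pes * (num_pes - 1)
    # The same mesh counted once per pair of PEs.
    unordered_pairs = ordered_pairs // 2
    # Assumption: each PE transmits one packet per interval to every other PE.
    packets_per_sec_per_pe = (num_pes - 1) * (1000.0 / tx_interval_ms)
    # Failure detection time is the interval times the multiplier.
    detection_time_ms = tx_interval_ms * multiplier
    return {
        "ordered_pairs": ordered_pairs,
        "unordered_pairs": unordered_pairs,
        "packets_per_sec_per_pe": packets_per_sec_per_pe,
        "detection_time_ms": detection_time_ms,
    }


if __name__ == "__main__":
    for mult in (3, 5):
        print(mult, bfd_full_mesh(1000, 100, mult))
```

Under these assumptions each PE sources roughly ten thousand BFD packets per
second before any queuing effects are considered.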
>>>
>>>
>>>
>>> Then we go into queuing issues. If BFD packets are queued at any
>>> interface, meltdowns may occur which can be far worse in consequences than
>>> waiting for BGP service route removal. Such meltdowns often result in
>>> cascading effects to the applications themselves.
>>>
>>>
>>>
>>> So this is not at all about autodiscovery of which address to
>>> set up the BFD session with. It is much more about the operational aspects
>>> of going in that direction.
>>>
>>>
>>>
>>> With that I am supportive of this work even if we label it as
>>> experimental for some time. As each network is different, what is an optimal
>>> solution for one design and deployment may not be optimal for another.
>>>
>>>
>>>
>>> Many thx,
>>>
>>> Robert
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Nov 18, 2020 at 4:34 AM Jeff Tantsura <jefftant.ietf@gmail.com>
>>> wrote:
>>>
>>> We have been discussing for quite some time, and in different WGs (there's
>>> the IX with RS use case), BFD verification based on next-hop extraction.
>>> Robert - you should know. (I also built a well-working prototype in a
>>> previous life.)
>>>
>>> Very simple logic:
>>>
>>> Upon route import (BGP update received and imported), extract next-hop,
>>> walk BFD session table, if no match (no existing session) - establish
>>> (S)BFD session (Discriminators distribution is a solved problem) to the
>>> next-hop, associate fate of all routes received from it, keep timers
>>> reasonable to prevent false positives.
>>>
>>> State is limited to PEs importing each other's routes (sharing a
>>> service) only
>>> High degree of automation
>>> No IGP pollution
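
The logic described above can be sketched roughly as follows. This is a toy
in-memory model, not the prototype mentioned in the message: the class and
method names are hypothetical, and "establishing" an (S)BFD session is just
creating a table entry, since no real BFD implementation is involved.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class BfdSession:
    """Hypothetical stand-in for an (S)BFD session towards one next-hop."""
    next_hop: str
    up: bool = True
    routes: Set[str] = field(default_factory=set)


class NextHopTracker:
    """Keeps one session per BGP next-hop and ties route fate to it."""

    def __init__(self) -> None:
        self.sessions: Dict[str, BfdSession] = {}

    def on_route_import(self, prefix: str, next_hop: str) -> BfdSession:
        # Walk the session table; create a session only if none exists.
        session = self.sessions.get(next_hop)
        if session is None:
            session = BfdSession(next_hop)
            self.sessions[next_hop] = session
        # Associate the fate of the imported route with that session.
        session.routes.add(prefix)
        return session

    def on_session_down(self, next_hop: str) -> Set[str]:
        # Return the routes that should be invalidated for this next-hop.
        session = self.sessions.get(next_hop)
        if session is None:
            return set()
        session.up = False
        return set(session.routes)


if __name__ == "__main__":
    tracker = NextHopTracker()
    tracker.on_route_import("192.0.2.0/24", "10.0.0.1")
    tracker.on_route_import("198.51.100.0/24", "10.0.0.1")
    print(sorted(tracker.on_session_down("10.0.0.1")))
```

State grows only with the number of distinct next-hops a PE actually imports
routes from, which matches the "state is limited" point in the list above.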
>>>
>>>
>>>
>>> Cheers,
>>>
>>> Jeff
>>>
>>> On Nov 17, 2020, 6:43 AM -0800, Acee Lindem (acee) <acee@cisco.com>,
>>> wrote:
>>>
>>> Speaking as WG member:
>>>
>>>
>>>
>>> I think it would be good to hone in on the BGP PE failure convergence
>>> use case as suggested by Robert. It seems there is some interest here
>>> although I’m not convinced the IGP is the right place to solve this
>>> problem.
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Acee
>>>
>>>
>>>
>>> *From:* Lsr <lsr-bounces@ietf.org> on behalf of Gyan Mishra <
>>> hayabusagsm@gmail.com>
>>> *Date:* Tuesday, November 17, 2020 at 4:02 AM
>>> *To:* Robert Raszuk <robert@raszuk.net>
>>> *Cc:* lsr <lsr@ietf.org>, Jeff Tantsura <jefftant.ietf@gmail.com>,
>>> Aijun Wang <wangaijun@tsinghua.org.cn>, "Acee Lindem (acee)" <acee=
>>> 40cisco.com@dmarc.ietf.org>
>>> *Subject:* Re: [Lsr] Prefix Unreachable Announcement Use Cases
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Nov 17, 2020 at 3:36 AM Robert Raszuk <robert@raszuk.net> wrote:
>>>
>>>
>>>
>>>
>>>
>>>    Robert, I believe the original intention was related to having the
>>> data plane converge quickly when summarization is used and flip so traffic
>>> converges from the Active ABR to the Backup ABR.
>>>
>>>
>>>
>>> I do not buy this use case. Flooding within the area is fast such that
>>> both ABRs will get the same info. As mentioned before, there is no practical
>>> use of PUA for making any routing or fwd decision on which ABR to use. If
>>> your ABRs are not connected with min redundancy, this draft is the worst
>>> patch ever to work around such a design.
>>>
>>>
>>>
>>>    Gyan> Agreed.  The point of PUA in the ABR use case is the ability to
>>> track the component prefixes, for the case where a component is down and
>>> traffic is still forwarded to the ABR and dropped.  The other, more
>>> important use case is when links are down within the area and the area is
>>> partitioned, so one ABR has all component prefixes while the other ABR is
>>> missing half the component prefixes.  Since an ABR will by default
>>> advertise the summary as long as there is one component UP, the summary is
>>> still advertised by both ABRs.  This has a severe impact, as you now have
>>> an ECMP path to the other area for the summary via the two ABRs and you
>>> drop half your traffic.  With PUA the problem is fixed: the PUA
>>> is sent and traffic is only sent to the ABR that has the component
>>> prefixes.
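
A minimal sketch of the partitioned-area case described above, under stated
assumptions: the prefixes, ABR names and function names are made up for
illustration, each ABR is assumed to know the full set of component prefixes
it is expected to reach, and PUA handling is reduced to "skip any ABR that
announced the destination's component as unreachable". It is not the
behaviour specified by any draft.

```python
import ipaddress

# Hypothetical summary and component prefixes, for illustration only.
SUMMARY = ipaddress.ip_network("10.1.0.0/16")
ALL_COMPONENTS = {
    ipaddress.ip_network("10.1.1.0/24"),
    ipaddress.ip_network("10.1.2.0/24"),
}


def advertises_summary(reachable):
    # Default ABR behaviour: advertise the summary while any component is up.
    return any(c.subnet_of(SUMMARY) for c in reachable)


def puas(reachable):
    # Assumption: the ABR announces a PUA for every component it has lost.
    return ALL_COMPONENTS - reachable


def next_hops(dest, abrs, use_pua):
    """Return the ABRs a remote router would use to reach dest."""
    addr = ipaddress.ip_address(dest)
    chosen = []
    for name, reachable in sorted(abrs.items()):
        if not advertises_summary(reachable):
            continue
        if use_pua and any(addr in p for p in puas(reachable)):
            continue  # this ABR said it cannot reach the component
        chosen.append(name)
    return chosen


if __name__ == "__main__":
    abrs = {
        "ABR1": set(ALL_COMPONENTS),                        # sees everything
        "ABR2": {ipaddress.ip_network("10.1.1.0/24")},      # partitioned
    }
    print(next_hops("10.1.2.5", abrs, use_pua=False))  # ECMP over both ABRs
    print(next_hops("10.1.2.5", abrs, use_pua=True))   # only the ABR with reachability
```

Without PUA both ABRs remain ECMP next-hops for the summary, so traffic for
the lost component that lands on ABR2 is dropped; with PUA only ABR1 is used
for that destination.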
>>>
>>>
>>>
>>> Please present us a picture indicating before and after ABRs behaviour.
>>>
>>>
>>>
>>>      Gyan> will do
>>>
>>>
>>>
>>>    However PUA can be used in the absence of area segmentation within a
>>> single area when a link or node fails to converge the data plane quickly by
>>> sending PUA for the backup path so the active path.
>>>
>>>
>>>
>>> If there is no area segmentation then there are no summaries. So what are
>>> we missing in the first place?
>>>
>>>
>>>
>>>     Gyan> Sorry, I am stating that the PUA feature can also be used
>>> intra-area, when a link or node goes down, to improve data plane convergence.
>>>
>>>
>>>
>>>
>>>
>>> With the IGP tuned with BFD fast detection on ISIS or OSPF links and LFA
>>> & RLFA for MPLS or TI-LFA for SR local protection - with those tweaks the
>>> convergence is well into sub second.  So for Intra area convergence with
>>> all the optimizations mentioned I am not sure how much faster the data
>>> plane will converge with PUA.
>>>
>>>
>>>
>>> Even without any of the above-listed chain of acronyms, things will
>>> generally work well intra-area without PUAs.
>>>
>>>
>>>
>>>     Gyan> Agreed, which is why I mentioned the BGP next-hop-self use case.
>>> If I could figure out how PUA could help there, that would be a major
>>> benefit of PUA.
>>>
>>>
>>>
>>> Thx,
>>> R.
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> <http://www.verizon.com/>
>>>
>>> *Gyan Mishra*
>>>
>>> *Network Solutions Architect*
>>>
>>> *M 301 502-1347*
>>> 13101 Columbia Pike
>>> Silver Spring, MD
>>>
>>>
>>>