Re: [Lsr] "Prefix Unreachable Announcement" and "IS-IS and OSPF Extension for Event Notification"

Robert Raszuk <robert@raszuk.net> Wed, 13 October 2021 17:37 UTC

From: Robert Raszuk <robert@raszuk.net>
Date: Wed, 13 Oct 2021 19:37:25 +0200
Message-ID: <CAOj+MMGunvc1orx-=Ck1WxsYPGrkfiyWZWDCXbRnrc-MTeWw+g@mail.gmail.com>
To: Jeff Tantsura <jefftant.ietf@gmail.com>
Cc: "Acee Lindem (acee)" <acee=40cisco.com@dmarc.ietf.org>, lsr <lsr@ietf.org>, Peter Psenak <ppsenak=40cisco.com@dmarc.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/NWhCvQ5376k0l9kae3fbBMAB69o>
Subject: Re: [Lsr] "Prefix Unreachable Announcement" and "IS-IS and OSPF Extension for Event Notification"

Jeff,

The point is about the right architecture, not a brute force method.

Do you see it as elegant to bombard 1000s of PEs with a probe every second? I
do not, especially considering that PEs rarely go down.
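
To put a rough number on the probing load, here is a back-of-envelope
sketch. It assumes a worst case in which every PE tracks every other PE
with one multi-hop BFD session, and it counts one transmitted probe per
session per interval. The figure of 1000 PEs is illustrative only, and the
function name is hypothetical.

```python
def probe_load(num_pes: int, tracked_next_hops: int, interval_s: float) -> dict:
    """Steady-state probe packets per second.

    Counts one transmitted probe per session per interval; received
    probes and session setup traffic are ignored (assumption).
    """
    per_pe_tx = tracked_next_hops / interval_s
    network_total = num_pes * per_pe_tx
    return {"per_pe_tx_pps": per_pe_tx, "network_total_pps": network_total}


# ASSUMPTION: 1000 PEs, each tracking the other 999, probing once per second.
load = probe_load(num_pes=1000, tracked_next_hops=999, interval_s=1.0)
print(load)
```

Under these assumptions the network carries close to a million probes per
second in steady state, all to detect an event that rarely happens.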

And that should be compared with event-driven notification.

BGP: - Local BFD detects the failure in, say, 100-200 ms
     - Propagation via RR takes milliseconds at most (I recall that at Cisco
       we measured this to actually be much faster on dedicated
       control-plane-only RRs)
     - The interested PEs (RT filter) can then be notified about the
       failure in less than 300 ms - to be conservative, 500 ms.

IGP: - Local BFD detects the failure in, say, 100-200 ms
     - The failure gets flooded to the local ABR(s)
     - An LSA/LSP gets generated and flooded hop by hop, via say 20 hops,
       all over the network, via say the core and an additional area.
       Maybe the entire process also takes 500 ms ... I do not have any
       estimate here, but I am sure folks on LSR do.
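
The two budgets above can be written down as a small sketch. Only the BFD
detection range and the 20-hop count come from this message. The RR
propagation value and the per-hop flooding delay are assumptions picked
purely for illustration, and real numbers would have to come from
measurements.

```python
BFD_DETECT_MS = (100, 200)  # from the message: local BFD detection range
IGP_HOPS = 20               # from the message: hop-by-hop flooding distance
RR_PROPAGATION_MS = 10      # ASSUMPTION: the message only says "ms"
PER_HOP_FLOOD_MS = 15       # ASSUMPTION: hypothetical per-hop flooding delay


def bgp_notification_ms(bfd_ms: int) -> int:
    """BFD detection plus propagation through the route reflector."""
    return bfd_ms + RR_PROPAGATION_MS


def igp_notification_ms(bfd_ms: int) -> int:
    """BFD detection plus hop-by-hop flooding across the domain."""
    return bfd_ms + IGP_HOPS * PER_HOP_FLOOD_MS


bgp_range = tuple(bgp_notification_ms(ms) for ms in BFD_DETECT_MS)
igp_range = tuple(igp_notification_ms(ms) for ms in BFD_DETECT_MS)
print("BGP notification (ms):", bgp_range)
print("IGP notification (ms):", igp_range)
```

With these assumed values both paths stay within the 500 ms ballpark, which
is well under a 1 second probe interval.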

In any case, event-driven notification is a much better choice.

Kind regards,
R.


On Wed, Oct 13, 2021 at 7:28 PM Jeff Tantsura <jefftant.ietf@gmail.com>
wrote:

> Number of BGP peers isn’t representative here, classical deployments would
> have a number of RR’s to circumvent full mesh. What counts is the total
> number of PEs (next-hops) that originate the prefix that is locally
> imported (needs to be tracked). For further optimization, only multihomed
> prefixes are of interest (if the PE that a CE is single-homed to
> goes away, there’s no convergence).
> A possible solution (discussed a number of times here) is to extract
> next-hop from an UPDATE, compare to the list of next-hops already learnt
> and establish multi-hop BFD session according to the business logic.
> Modern mid-high end platforms can easily run 1000 MH BFD sessions @ 1 sec.
>
> Cheers,
> Jeff
>
> On Oct 13, 2021, at 10:16, Robert Raszuk <robert@raszuk.net> wrote:
>
> 
>
> > How many other PEs does a BGP edge PE maximally peer with?
>
> Typically on the iBGP side you will see 2-4 peers. Those are RRs.
>
> Since there is no autodiscovery of BGP sessions, not many people run an
> iBGP full mesh between PEs.
>
> Best,
> R.
>
> On Wed, Oct 13, 2021 at 6:48 PM Acee Lindem (acee) <acee=
> 40cisco.com@dmarc.ietf.org> wrote:
>
>> Hi Peter,
>>
>> See inline.
>>
>> On 10/13/21, 4:42 AM, "Peter Psenak" <ppsenak=40cisco.com@dmarc.ietf.org>
>> wrote:
>>
>>     Hi Acee,
>>
>>     On 12/10/2021 21:05, Acee Lindem (acee) wrote:
>>     > Speaking as WG Chairs:
>>     >
>>     > The authors of “Prefix Unreachable Announcement” have requested an
>>     > adoption. The crux of the draft is to signal unreachability of a
>> prefix
>>     > across OSPF or IS-IS areas when area summarization is employed and
>>     > prefix is summarised. We also have “IS-IS and OSPF Extension for
>> Event
>>     > Notification” which can be used to address the same use case. The
>> drafts
>>     > take radically different approaches to the problem and the authors
>> of
>>     > both drafts do not wish to converge on the other draft’s method so
>> it is
>>     > understandable that merging the drafts really isn’t an option.
>>
>>     just for the record, I offered the authors of "Prefix Unreachable
>>     Announcement" co-authorship on the "Event notification" draft, arguing
>>     that the event-based solution addresses their use case in a more
>>     elegant and scalable way. They decided to push their idea regardless.
>>
>> One solution to this problem would have definitely been better.
>>
>>     > Before an adoption call for either draft, I’d like to ask the WG:
>>     >
>>     >  1. Is this a problem that needs to be solved in the IGPs? The use case
>>     >     offered in both drafts is signaling unreachability of a BGP peer.
>>     >     Could this be better solved with a different mechanism (e.g., BFD)
>>     >     rather than flooding this negative reachability information across
>>     >     the entire IGP domain?
>>
>>     we have looked at the various options. None of the existing ones
>> would
>>     fit the large scale deployment with summarization in place. Using BFD
>>     end to end to track reachability between PEs simply does not scale.
>>
>> It seems to me that scaling of BFD should be "roughly" proportional to
>> BGP session scaling but I seem to be in the minority. My opinion is based
>> on SDWAN tunnel scaling, where BFD is used implicitly in our solution. How
>> many other PEs does a BGP edge PE maximally peer with?
>> Thanks,
>> Acee
>>
>>
>>     Some people believe this should be solved by BGP, but it is important to
>>     realize that while the problem statement at the moment is primarily
>>     targeted for egress PE reachability loss detection for BGP, the
>>     mechanism proposed is generic enough and can be used to track the peer
>>     reachability loss for other cases (e.g., GRE endpoint, etc.) that do not
>>     involve BGP.
>>
>>     We went even further and explored the option of using a completely
>>     out-of-band mechanism that does not involve any existing protocols.
>>
>>     Simply, the advantage of using IGP is that it follows the existing
>> MPLS
>>     model, where the endpoint reachability is provided by IGPs. Operators
>>     are familiar with IGPs and know how to operate them.
>>
>>     On top of the above, IGP event notification can find other use cases in
>>     the future; the mechanism defined in the draft is generic enough.
>>
>>
>>     >  2. Assuming we do want to take on negative advertisement in the
>> IGP,
>>     >     what are the technical merits and/or detriments of the two
>> approaches?
>>
>>     we have listed some requirements at:
>>
>>
>> https://datatracker.ietf.org/doc/html/draft-ppsenak-lsr-igp-event-notification-00#section-3
>>
>>     From my perspective the solution should be optimal in terms of the amount
>>     of data and state that needs to be maintained, ideally separated from
>>     the traditional LS data. I also believe that having a generic mechanism
>>     to distribute events has its own merits.
>>
>>     thanks,
>>     Peter
>>
>>     >
>>     > We’ll reserve any further discussion to “WG member” comments on the
>> two
>>     > approaches.
>>     >
>>     > Thanks,
>>     > Acee and Chris
>>     >
>>
>>
>> _______________________________________________
>> Lsr mailing list
>> Lsr@ietf.org
>> https://www.ietf.org/mailman/listinfo/lsr
>>
>
>