Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

Peter Psenak <ppsenak@cisco.com> Tue, 09 March 2021 11:12 UTC

To: Robert Raszuk <robert@raszuk.net>
Cc: Tony Li <tony.li@tony.li>, Gyan Mishra <hayabusagsm@gmail.com>, Aijun Wang <wangaijun@tsinghua.org.cn>, Aijun Wang <wangaj3@chinatelecom.cn>, lsr <lsr@ietf.org>, "Acee Lindem (acee)" <acee@cisco.com>, draft-wang-lsr-prefix-unreachable-annoucement <draft-wang-lsr-prefix-unreachable-annoucement@ietf.org>
References: <22FDE3EA-B5D1-4E4D-B698-1D79173E8637@tony.li> <6E0281D2-7755-499A-B084-CA8472949683@chinatelecom.cn> <D6B0D95F-68AD-4A18-B98C-69835E8B149B@tony.li> <018801d71499$9890feb0$c9b2fc10$@tsinghua.org.cn> <CABNhwV2SpcDcm-s-WkWPpnVLpYB2nZGz2Yv0SfZah+-k=bGx4A@mail.gmail.com> <BFB3CE24-446A-4ADA-96ED-9CF876EA6A00@tony.li> <CAOj+MMGeR4bodbgpPqDCtLZD6XmX6fkjyxLWZAKa4LC2R1tBzg@mail.gmail.com> <ecf2e8b4-fdae-def6-1a29-ec1ae37f5811@cisco.com> <CAOj+MMFSEqVkM62TDAc6yn19Hup+v-9w=kiq_q6dVn39LcOkqQ@mail.gmail.com>
From: Peter Psenak <ppsenak@cisco.com>
Message-ID: <fdf0e62a-21fa-67e9-811d-5aa8749bb077@cisco.com>
Date: Tue, 9 Mar 2021 12:12:55 +0100
In-Reply-To: <CAOj+MMFSEqVkM62TDAc6yn19Hup+v-9w=kiq_q6dVn39LcOkqQ@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/lxQuddsKjYL2lv1KQ0yfxNZ2rok>
Subject: Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

Hi Robert,

On 09/03/2021 12:02, Robert Raszuk wrote:
> Hey Peter,
> 
> Well ok so let's forget about LDP - cool !
> 
> So the IGP sends the summary around and that is all that is needed.
> 
> So the question is: why not propagate the information that a PE went down
> in service signalling - today mainly BGP?

Because BGP signalling is prefix-based and, as a result, slow.

> 
>  >   And forget BFD, does not scale with 10k PEs.
> 
> You missed the point. No one is proposing full mesh of BFD sessions 
> between all PEs. I hope so at least.
> 
> PE is connected to RRs so you need as many BFD sessions as RR to PE BGP 
> sessions. 

That can still be too many.
In addition, you may have hierarchical RRs, which would still involve
BGP signalling.
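
To put rough numbers on the two session models being argued here, a minimal
back-of-the-envelope sketch follows. The function names and the choice of
two RRs per PE are illustrative assumptions, not figures from this thread;
only the 10k PE count comes from the discussion.

```python
def full_mesh_sessions(num_pes: int) -> int:
    """BFD sessions if every PE monitored every other PE directly."""
    return num_pes * (num_pes - 1) // 2


def rr_to_pe_sessions(num_pes: int, rrs_per_pe: int) -> int:
    """BFD sessions if there is one per RR-to-PE BGP session."""
    return num_pes * rrs_per_pe


if __name__ == "__main__":
    pes = 10_000          # the PE count mentioned in the thread
    rrs_per_pe = 2        # assumption: each PE peers with two RRs
    print("full mesh:", full_mesh_sessions(pes))
    print("RR to PE: ", rr_to_pe_sessions(pes, rrs_per_pe))
    # With a flat (non-hierarchical) design where every PE peers with
    # every RR, each RR terminates one session per PE.
    print("per RR:   ", pes)
```

Under these assumptions the RR model is far smaller than a full mesh, but
each RR still terminates on the order of 10k sessions, which is the scale
concern raised above.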

> Once that session is brought down RR has all it needs to
> trigger a message (withdraw or implicit withdraw) to remove the 
> broken service routes in a scalable way.

That is the whole point: you need something that is prefix-independent.
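
As a toy illustration of what "prefix independent" means here, the sketch
below contrasts removing a failed PE from every service route one by one
with flipping a single shared next-hop entry. It is a simplified model
under stated assumptions, not any router's implementation; the class and
method names (ServiceTable, withdraw_per_prefix, invalidate_next_hop) are
hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class NextHop:
    address: str
    reachable: bool = True


class ServiceTable:
    """Service prefixes resolving over shared next-hop (egress PE) entries."""

    def __init__(self) -> None:
        self.next_hops: Dict[str, NextHop] = {}
        self.routes: Dict[str, List[NextHop]] = {}

    def add_route(self, prefix: str, pe_addresses: List[str]) -> None:
        # Every route pointing at the same PE shares one NextHop object.
        self.routes[prefix] = [
            self.next_hops.setdefault(pe, NextHop(pe)) for pe in pe_addresses
        ]

    def withdraw_per_prefix(self, pe_address: str) -> int:
        """Prefix-based removal: work grows with the number of prefixes."""
        touched = 0
        for prefix, paths in list(self.routes.items()):
            kept = [nh for nh in paths if nh.address != pe_address]
            if len(kept) != len(paths):
                self.routes[prefix] = kept
                touched += 1
        return touched

    def invalidate_next_hop(self, pe_address: str) -> int:
        """Prefix-independent removal: one update covers every prefix."""
        self.next_hops[pe_address].reachable = False
        return 1

    def best_path(self, prefix: str) -> Optional[str]:
        # First reachable next hop wins in this simplified model.
        for nh in self.routes[prefix]:
            if nh.reachable:
                return nh.address
        return None
```

In this model, a table with N prefixes dual-homed to PE1 and PE2 needs N
route updates when PE1's paths are removed per prefix, but a single update
when PE1 is marked unreachable; in both cases best_path then returns PE2.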

thanks,
Peter

> 
> Thx,
> R.
> 
> PS. Yes, we still need to start supporting signalling of unreachability in 
> BGP itself when BGP is used for the underlay, but this is a slightly
> different use case and outside the scope of LSR.
> 
> 
> On Tue, Mar 9, 2021 at 11:55 AM Peter Psenak <ppsenak@cisco.com> wrote:
> 
>     Robert,
> 
>     On 09/03/2021 11:47, Robert Raszuk wrote:
>      >  > You’re trying to fix a problem in the overlay by morphing the
>      > underlay.  How can that seem like a good idea?
>      >
>      > I think this really nails this discussion.
>      >
>      > We have discussed this before and while the concept of signalling
>      > unreachability does seem useful, such signalling should be done
>      > where it belongs.
>      >
>      > Here clearly we are talking about faster connectivity restoration
>      > for overlay services, so it naturally belongs in the overlay.
>      >
>      > It could be a bit misleading, as today it is the underlay which
>      > propagates reachability of PEs and the overlay relies on it. And to
>      > scale, summarization is used, hence in the underlay failing remote
>      > PEs remain reachable. That, however, in spite of many efforts, is
>      > really not a practical problem in lots of networks, as those
>      > networks still rely on exact match of IGP to LDP FEC when MPLS is
>      > used. So removal of /32 can and does happen.
> 
>     think SRv6, forget /32 or /128 removal. Think summarization.
> 
>     I'm not necessarily advocating the solution proposed in this particular
>     draft, but the problem is valid. We need fast detection of the PE loss.
> 
>     And forget BFD, does not scale with 10k PEs.
> 
>     thanks,
>     Peter
> 
> 
> 
>      >
>      > At the same time BGP can pretty quickly (milliseconds) remove
>      > affected service routes (or rather paths), hence connectivity can
>      > be restored to redundantly connected endpoints in sub-second time.
>      > Such removal can be in the form of an atomic withdraw (or
>      > readvertisement), removal of recursive routes (next hop going
>      > down) or withdraw of a few RD/64 prefixes.
>      >
>      > I am not convinced and I have not seen any evidence that if we put
>      > this into the IGP it will be any faster across areas or domains
>      > (case of redistribution over ASBRs to and from IGP to BGP). One
>      > thing for sure - it will be much more complex to troubleshoot.
>      >
>      > Thx,
>      > R.
>      >
>      > On Tue, Mar 9, 2021 at 5:39 AM Tony Li <tony.li@tony.li> wrote:
>      >
>      >
>      >     Hi Gyan,
>      >
>      >      >     Gyan> In previous threads BFD multi-hop has been
>      >      >     mentioned to track IGP liveness, but that gets overly
>      >      >     complicated, especially with large domains, and is not
>      >      >     viable.
>      >
>      >
>      >     This is not tracking IGP liveness, this is to track BGP endpoint
>      >     liveness.
>      >
>      >     Here in 2021, we seem to have (finally) discovered that we can
>      >     automate our management plane. This ameliorates a great deal of
>      >     complexity.
>      >
>      >
>      >      >     Gyan> As we are trying to signal the IGP to trigger the
>      >      >     control plane convergence, the flooding machinery in the
>      >      >     IGP already exists, as well as the prefix originator
>      >      >     sub-TLV from the link or node failure.  IGP seems to be
>      >      >     the perfect mechanism for the control plane signaling
>      >      >     switchover.
>      >
>      >
>      >     You’re trying to fix a problem in the overlay by morphing the
>      >     underlay.  How can that seem like a good idea?
>      >
>      >
>      >      >       Gyan> As I mentioned advertising flooding of the longer
>      >      >     prefix defeats the purpose of summarization.
>      >
>      >
>      >     PUA also defeats summarization.  If you really insist on faster
>      >     convergence and not building a sufficiently redundant topology,
>      >     then yes, your area will partition and you will have to pay the
>      >     price of additional state for your longer prefixes.
>      >
>      >
>      >      > In order to do what you are stating you have to remove the
>      >      > summarization and go back to domain wide flooding
>      >
>      >
>      >     No, I’m suggesting you maintain the summary and ALSO advertise
>      >     the longer prefix that you feel is essential to reroute
>      >     immediately.
>      >
>      >
>      >      > which completely defeats the goal of the draft which is to
>      >      > make host route summarization viable for operators.  We know
>      >      > the prefix that went down and that is why with the PUA
>      >      > negative advertisement we can easily flood a null0 to block
>      >      > the control plane from installing the route.
>      >
>      >
>      >     So you can also advertise the more specific from the connected
>      >     ABR…
>      >
>      >
>      >      > We don’t have any prior knowledge of the alternate for the
>      >      > egress PE BGP next hop attribute for the customer VPN
>      >      > overlay.  So the only way to accomplish what you are asking
>      >      > is to not do any summarization and flood all host routes.
>      >      > Of course, as I stated, that defeats the purpose of the draft.
>      >
>      >
>      >     Please read again.
>      >
>      >     Tony
>      >
>      >     _______________________________________________
>      >     Lsr mailing list
>      > Lsr@ietf.org
>      > https://www.ietf.org/mailman/listinfo/lsr
>      >
>