Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

Aijun Wang <wangaj3@chinatelecom.cn> Fri, 12 March 2021 04:20 UTC

From: "Aijun Wang" <wangaj3@chinatelecom.cn>
To: <acee@cisco.com>, "'Gyan Mishra'" <hayabusagsm@gmail.com>
Cc: "'draft-wang-lsr-prefix-unreachable-annoucement'" <draft-wang-lsr-prefix-unreachable-annoucement@ietf.org>, "'lsr'" <lsr@ietf.org>
References: <CABNhwV2=HTRYG5PTgiGYanm8LeT1HYcrPKQ4R4TQHZ-GO9ecJQ@mail.gmail.com> <439DD1F9-9924-405F-9FBD-6704D85B05D6@cisco.com> <CABNhwV3ky4nbYxo7qLowu0GyHpSYRKWdTQ3y-uQpvU-p0bskww@mail.gmail.com> <0E39A8D4-547D-482C-BD01-4A8CDE48324A@cisco.com> <CABNhwV0FhouRWmNKn4xjf=Hz7D0gP=px41506_1+JKmHKKz_xA@mail.gmail.com> <8D6FA7ED-E055-4EA5-9E78-00DBB82D5FB6@cisco.com>
In-Reply-To: <8D6FA7ED-E055-4EA5-9E78-00DBB82D5FB6@cisco.com>
Date: Fri, 12 Mar 2021 12:20:29 +0800
Message-ID: <00fe01d716f7$0b5ebbd0$221c3370$@chinatelecom.cn>
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsr/AM6i6bO3Fx9wfRy7-mGYX5Dbcrw>
Subject: Re: [Lsr] https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

Hi, Acee:

Thanks for considering how the PUA mechanism would be applied. Some points should be clarified based on the discussion between you and Gyan.

Please review whether they address your concerns:

1.     PUA generation should be supported by all of the ABRs when deployed. Otherwise, traffic will be attracted to the ABR that does not support it, which may or may not be able to forward the traffic.

When other non-ABR nodes support the PUA message, they can take some action based on it (service switchover, etc.).

If they don’t support it, nothing will happen; the problems described in this draft simply remain unsolved (service traffic is sent to a black hole or via a non-optimal path).

2.     The ABRs should only monitor the absence of prefixes within the summary range. Currently, we do not consider explicitly configuring the protected prefixes, but I think configuring them on the ABRs may reduce some unnecessary PUA messages. We will consider such an approach in the next update.

3.     For prefixes that do not exist after the ABR starts up, the ABR will not send PUA messages, because no services can run on such non-existent addresses.

4.     Theoretically, the PUA messages should be advertised once or for a short configurable period, to aid the convergence of the services that run on those prefixes. After that, the ABR need not advertise them again.
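
A minimal sketch of points 2 to 4, to make the intended ABR behaviour concrete. The class `PuaTracker`, its method names, and the `hold_seconds` parameter are hypothetical and do not come from the draft; the sketch only assumes that the ABR remembers which prefixes inside the summary range it has seen since startup and keeps a PUA alive for a short configurable period.

```python
import ipaddress


class PuaTracker:
    """Hypothetical ABR-side bookkeeping for points 2-4 (not from the draft)."""

    def __init__(self, summary, hold_seconds=30.0):
        self.summary = ipaddress.ip_network(summary)
        # Assumed name for the "short configurable period" of point 4.
        self.hold_seconds = hold_seconds
        self.seen = set()        # prefixes that were reachable since startup
        self.pua_expiry = {}     # prefix -> time at which the PUA stops

    def _in_range(self, net):
        # Point 2: only prefixes inside the summary range are monitored.
        return net.version == self.summary.version and net.subnet_of(self.summary)

    def prefix_up(self, prefix):
        net = ipaddress.ip_network(prefix)
        if self._in_range(net):
            self.seen.add(net)
            self.pua_expiry.pop(net, None)   # reachable again, stop any PUA

    def prefix_down(self, prefix, now):
        """Return True if a PUA should be originated for this prefix."""
        net = ipaddress.ip_network(prefix)
        # Point 3: a prefix never seen after startup gets no PUA.
        if net not in self.seen:
            return False
        self.pua_expiry[net] = now + self.hold_seconds
        return True

    def active_puas(self, now):
        # Point 4: a PUA is advertised only until its hold period ends.
        return sorted(str(n) for n, t in self.pua_expiry.items() if now < t)
```

With a summary of 10.0.0.0/24 (an illustrative value), a /32 inside the range that was once reachable triggers a PUA when it goes down, while a prefix outside the range or one never seen since startup does not.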

 

 

Best Regards

 

Aijun Wang

China Telecom

 

From: acee@cisco.com <acee@cisco.com> 
Sent: Friday, March 12, 2021 3:48 AM
To: Gyan Mishra <hayabusagsm@gmail.com>
Cc: Aijun Wang <wangaj3@chinatelecom.cn>; draft-wang-lsr-prefix-unreachable-annoucement <draft-wang-lsr-prefix-unreachable-annoucement@ietf.org>; lsr <lsr@ietf.org>
Subject: Re: https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

 

Hi Gyan, 

 

I think we are starting to communicate but there are still some problems. 

 

From: Gyan Mishra <hayabusagsm@gmail.com>
Date: Thursday, March 11, 2021 at 1:27 PM
To: Acee Lindem <acee@cisco.com>
Cc: Aijun Wang <wangaj3@chinatelecom.cn>, draft-wang-lsr-prefix-unreachable-annoucement <draft-wang-lsr-prefix-unreachable-annoucement@ietf.org>, lsr <lsr@ietf.org>
Subject: Re: https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

 

Hi Acee

 

Thank you for your comments.

 

Answers in-line.

 

Thank you

 

Gyan

 

On Thu, Mar 11, 2021 at 11:01 AM Acee Lindem (acee) <acee@cisco.com> wrote:

Hi Gyan, 

 

I guess you didn’t understand my first PUA question. See inline. 

 

From: Gyan Mishra <hayabusagsm@gmail.com>
Date: Monday, March 8, 2021 at 8:11 PM
To: Acee Lindem <acee@cisco.com>
Cc: Aijun Wang <wangaj3@chinatelecom.cn>, draft-wang-lsr-prefix-unreachable-annoucement <draft-wang-lsr-prefix-unreachable-annoucement@ietf.org>, lsr <lsr@ietf.org>
Subject: Re: https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

 

 

 

On Mon, Mar 8, 2021 at 7:37 PM Acee Lindem (acee) <acee@cisco.com> wrote:

Speaking as a WG member: 

 

Hi Gyan, 

 

The first question is how do you know which prefixes within the summary range to protect? Are these configured? Is this half-assed best-effort protection where you protect prefixes within the range that you’ve installed recently? Just how does this work? It is clearly not specified in the draft. 

 Gyan>  All prefixes within the summary range are protected; see Section 4.

 

   [RFC7794] and [I-D.ietf-lsr-ospf-prefix-originator] draft both define
   one sub-tlv to announce the originator information of the one prefix
   from a specified node.  This draft utilizes such TLV for both OSPF
   and ISIS to signal the negative prefix in the perspective PUA when a
   link or node goes down.
 
   ABR detects link or node down and floods PUA negative prefix
   advertisement along with the summary advertisement according to the
   prefix-originator specification.  The ABR or ISIS L1-L2 border node
   has the responsibility to add the prefix originator information when
   it receives the Router LSA from other routers in the same area or
   level.
 
Acee> So, the ABR will only know about missing prefixes that it has recently received? What if the prefix is already missing when the ABR establishes adjacencies on the path to the PE? What if the prefix is being permanently taken out of service – then this negative advertisement will persist permanently. What if there is an unintentional advertisement in the summary range and it is withdrawn? How do you decide whether or not to protect a prefix within the range? 
 
 Gyan>  In Section 6 of the draft, under implementation considerations, we have a configurable MAA (Max Address Announcement) threshold value.  
    1.  If the number of unreachable prefixes is less than MAA, the ABR
   should advertise the summary address and the PUA.
 
   2.  If the number of reachable address is less than MAA, the ABR
   should advertise the detail reachable address only.
 
   3.  If the number of reachable prefixes and unreachable prefixes
   exceed MAA, then advertise the summary address with MAX metric.
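
The three MAA rules quoted above could be sketched roughly as follows. This is only an illustration: the function and enum names are hypothetical, and the quoted text does not say which rule wins when both counts are below MAA or how equality with MAA is handled, so the ordering and the `<` comparisons below are assumptions.

```python
from enum import Enum


class Advertisement(Enum):
    SUMMARY_AND_PUA = "summary address plus PUA"
    DETAILED_REACHABLE_ONLY = "detailed reachable addresses only"
    SUMMARY_MAX_METRIC = "summary address with MAX metric"


def choose_advertisement(reachable, unreachable, maa):
    """Pick what the ABR advertises for one summary range.

    reachable / unreachable are prefix counts inside the range;
    maa is the configured Max Address Announcement threshold.
    """
    # Rule 1: few unreachable prefixes -> keep the summary and add PUAs.
    # Assumption: rule 1 is checked first, so it also wins when both
    # counts are below MAA.
    if unreachable < maa:
        return Advertisement.SUMMARY_AND_PUA
    # Rule 2: few reachable prefixes -> advertise only the detailed ones.
    if reachable < maa:
        return Advertisement.DETAILED_REACHABLE_ONLY
    # Rule 3: neither count is below MAA -> summary with MAX metric.
    # Assumption: a count equal to MAA is treated like one that exceeds it.
    return Advertisement.SUMMARY_MAX_METRIC
```

For example, with `maa=10`, 100 reachable and 2 unreachable prefixes give the summary plus PUAs, while 50 of each give the summary with MAX metric.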

 

       If a prefix is already missing when the ABR establishes adjacency on the path, the prefix is not in the LSDB, so a PUA would not be sent for a prefix that is unknown to the ABR.  Any traffic sent to that BGP next hop would still be impacted, as this is an adjacency-forming timing issue.  We have investigated this issue with the authors and are thinking of adding a delay timer, started after the adjacency is formed, that controls when the PUA mechanism becomes active; we can add this to the considerations section.  

 

So once an intra-area prefix within the range is reachable, you’ll maintain semi-permanent state that it is going to be protected by the PUA mechanism? Prior to that, it will not be protected? That is what you are implying. All these details need to be specified.

 

 

 

 
When the ABR or ISIS L1-L2 border node generates the summary
   advertisement based on component prefixes, the ABR will announce one
   new summary LSA or LSP which includes the information about this down
   prefix, with the prefix originator set to NULL.  The number of PUAs
   is equivalent to the number of links down or nodes down.  The LSA or
   LSP will be propagated with standard flooding procedures.
 
   If the nodes in the area receive the PUA flood from all of its ABR
   routers, they will start BGP convergence process if there exist BGP
   session on this PUA prefix.  The PUA creates a forced fail over
   action to initiate immediate control plane convergence switchover to
   alternate egress PE.  Without the PUA forced convergence the down
   prefix will yield black hole routing resulting in loss of
   connectivity.
 
   When only some of the ABRs can't reach the failure node/link, as that
   described in Section 3.2, the ABR that can reach the PUA prefix
   should advertise one specific route to this PUA prefix.  The internal
   routers within another area can then bypass the ABRs that can't reach
   the PUA prefix, to reach the PUA prefix.
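
The receiver-side behaviour in the quoted text (converge only when every ABR has flooded the PUA, otherwise rely on a more specific route from an ABR that can still reach the prefix) might be sketched like this. The function name and the returned strings are hypothetical labels, not anything defined in the draft.

```python
def react_to_pua(area_abrs, pua_senders):
    """Decide what an internal router does for one PUA prefix.

    area_abrs   -- identifiers of all ABRs of the area
    pua_senders -- identifiers of the ABRs that flooded a PUA for the prefix
    """
    abrs = set(area_abrs)
    if not abrs:
        raise ValueError("at least one ABR is required")
    # Ignore PUAs attributed to routers that are not ABRs of this area.
    senders = set(pua_senders) & abrs
    if senders == abrs:
        # Every ABR reports the prefix unreachable: fail over at the BGP level.
        return "start-bgp-convergence"
    if senders:
        # Some ABR can still reach the prefix and is expected to advertise a
        # specific route, so traffic can bypass the ABRs that sent the PUA.
        return "bypass-unreachable-abrs"
    return "no-action"
```

Calling `react_to_pua({"ABR1", "ABR2"}, {"ABR1"})` yields the bypass case, and adding `"ABR2"` to the senders yields BGP convergence.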

 

The second comment is that using the prefix-originator TLV is a terrible choice of encoding. Note that if there is any router in the domain that doesn’t support the extension, you’ll actually attract traffic towards the ABR blackholing it. 

 Gyan> I will work with the authors to see if there is any alternative PUA process to signal and detect the failure in case the prefix originator TLV is not supported.

Acee> Note that in the case of OSPFv3, the prefix originator TLV is a Sub-TLV of the Inter-Area Prefix TLV advertised in the E-Inter-Area-Prefix-LSA. If there are any OSPFv3 routers in the domain that don’t support this functionality and receive traffic for the protected prefix, they will actually route it towards the blackhole.  

        Gyan>> If the OSPFv3 router does not support the prefix originator TLV (a Sub-TLV of the Inter-Area Prefix TLV advertised in the E-Inter-Area-Prefix-LSA), then it would use the summary address advertised by the ABR and black hole the traffic, as the control plane on that non-supporting OSPFv3 router would not converge away from the summary.  Agreed.  We will add that support dependency to the considerations in Section 6.  

 

It is worse than that. For OSPFv3 routers that don’t support this extension, the E-Inter-Area-LSA will attract blackhole traffic since it will be rightly treated as a reachable more-specific advertisement. 
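
To illustrate the point about more-specific advertisements: a router that does not understand the extension installs the PUA as an ordinary reachable route, and longest-prefix match then prefers it over the summary. The sketch below uses made-up prefixes and next-hop labels, and a plain dictionary as a stand-in for the routing table.

```python
import ipaddress


def longest_prefix_match(routes, destination):
    """Return the next hop of the most specific route covering destination.

    routes maps prefix strings to next-hop labels; None means no route.
    """
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, next_hop in routes.items():
        net = ipaddress.ip_network(prefix)
        if dest in net:
            matches.append((net.prefixlen, next_hop))
    if not matches:
        return None
    # The longest prefix wins, regardless of what the sender intended.
    return max(matches, key=lambda m: m[0])[1]


# Hypothetical table on a router that does NOT support the extension:
# the /32 carried in the PUA was installed as if it were reachable.
routes = {
    "10.1.0.0/16": "ABR2 (summary)",
    "10.1.1.1/32": "ABR1 (PUA treated as reachable)",
}
```

Here a lookup for 10.1.1.1 selects the ABR1 entry, so traffic is drawn to the ABR that cannot reach the prefix, while other addresses in the range still follow the summary.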

 

 

Further, I think your example is a bit contrived. I’d hope that an OSPF area with “thousands” of summarized PE addresses wouldn’t be partitioned by a single failure as in figure 1 in the draft and your slides. I also note that the option of a backbone tunnel between the ABRs was removed from the draft since it diminished the requirement for this functionality.

 Gyan> This is a real-world Metro access edge example; the impact is on customers that have an LSP built to the down egress PE and have not failed over.  In this scenario there is a Primary and Backup PE per Metro edge, which is typical for an operator.

 

The workaround used today is to flood all /32 next hop prefixes and not take advantage of summarization.  This draft makes RFC 5283 inter area FEC binding now viable for operators.

Acee> Or add a reliable intra-area link between your ABRs. Or, as a backup, a tunnel through the backbone area (as was previously in the draft).

        Gyan>  The issue is not redundancy.  The issue is that when summarization is used, the component prefixes (the BGP next hops that recurse to the IGP) are hidden, and in the MPLS RFC 5283 inter-area LSP use case, failover is broken.  It’s not just faster convergence, it’s any convergence, as the traffic is black holed on the ABR, which cannot build the LSP to the egress PE.  Please see the diagram below in the slide deck; it details this special use case. The LSP ends up black holed on the ABR once the FEC LPM goes away for the next hop when the PE has a link or node failure.

 



 

I don’t see why the ABR couldn’t stitch the LSPs through a different path as long as that path is part of the non-backbone area. I simply suggested providing more redundancy in your non-backbone area. Though not shown in your picture 😉 I see no reason why it wouldn’t work. 

 

Thanks,

Acee

 


 

From: Gyan Mishra <hayabusagsm@gmail.com>
Date: Monday, March 8, 2021 at 6:57 PM
To: Acee Lindem <acee@cisco.com>, Aijun Wang <wangaj3@chinatelecom.cn>, draft-wang-lsr-prefix-unreachable-annoucement <draft-wang-lsr-prefix-unreachable-annoucement@ietf.org>, lsr <lsr@ietf.org>
Subject: https://tools.ietf.org/html/draft-wang-lsr-prefix-unreachable-annoucement-05

 

 

Acee. 

 

Please ask the two questions you raised about the PUA draft so we can address your concerns.

 

If anyone else has outstanding questions or concerns, we would like to address and resolve those as well.

 

Once all questions and concerns are satisfied, we would like to ask for WG adoption.

 

Kind Regards 

 

Gyan

-- 

Gyan Mishra

Network Solutions Architect 

M 301 502-1347
13101 Columbia Pike
Silver Spring, MD

 
