Re: [Lsr] Prefix Unreachable Announcement Use Cases

Hi, Acee:
I think Robert has given a good explanation of the purpose of this draft.
The aim of this draft is to improve service convergence time by quickly notifying routers of the failure of an underlying network link or node.
BFD is another possible solution, but it requires massive configuration and has other costs, as Robert pointed out.

With PUA, the failure information of a link or node is advertised automatically and quickly. It also keeps the summary behaviour on ABRs, which limits the number of reachable prefixes advertised.

More detailed responses are inline below.

Thanks in advance.
Aijun Wang
China Telecom

From: Robert Raszuk <robert@raszuk.net>
Date: 2020-11-15 06:30:20
To: "Acee Lindem (acee)" <acee=40cisco.com@dmarc.ietf.org>
Cc: "lsr@ietf.org" <lsr@ietf.org>
Subject: Re: [Lsr] Prefix Unreachable Announcement Use Cases
Hi Acee,

> 3.1 Inter-Area Node Failure Scenario – With respect to this use case, the node 
> in question is actually unreachable. In this case, the ABRs will normally install a 
> reject route for the advertised summary and will send an ICMP unreachable when 
> the packets are received for the unreachable prefix. 

And what will the network do with such an ICMP unreachable? Is there some draft I missed where the encapsulating PE will choose a path with a different tunnel endpoint upon reception of an ICMP unreachable message?

See the entire idea behind this draft is to trigger faster switchover to other PEs in the case of a multihomed attached site. 

Option 1 - withdraw a service route in BGP. Use aggregate withdraw to speed this up (say withdraw just RDs) 

Option 2 - signal next hop unreachability (aka negative route) of a more specific prefix than the aggregate itself.

While I think just option 1 is ok for the vast majority of services, your answer seems to be talking about ICMP unreachable, which IMO would not help much with the issue. The proposal is not about failing ... the proposal is about faster connectivity restoration.

> If faster detection is required, BFD or other mechanisms are available.  

Now running a full mesh of BFD multihop sessions from each PE to each other PE may be ok in theory but is rather a no-op in practice. Just think of 1000 PEs with 100 ms BFD timers in a full mesh of BFD sessions. Then rethink the same with BFD packets maxed out at 1500 or 4K bytes, as per some proposals floating around.
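
To put rough numbers on the full-mesh case above, here is a back-of-the-envelope sketch. It assumes each PE runs one multihop BFD session to every other PE and transmits one control packet per session per interval. The function name and the byte sizes passed in are illustrative only, and the bandwidth figure counts just the stated packet bytes, with no lower-layer overhead.

```python
def bfd_full_mesh_load(num_pes, tx_interval_ms, packet_bytes):
    """Estimate BFD load for a full mesh of sessions (illustrative model only)."""
    # Each PE peers with every other PE.
    sessions_per_pe = num_pes - 1
    # Each session is shared by two PEs, so count each pair once.
    total_sessions = num_pes * (num_pes - 1) // 2
    # Assumption: one control packet per session per transmit interval.
    pps_per_session = 1000 / tx_interval_ms
    tx_pps_per_pe = sessions_per_pe * pps_per_session
    tx_bps_per_pe = tx_pps_per_pe * packet_bytes * 8
    return {
        "sessions_per_pe": sessions_per_pe,
        "total_sessions": total_sessions,
        "tx_pps_per_pe": tx_pps_per_pe,
        "tx_bps_per_pe": tx_bps_per_pe,
    }


if __name__ == "__main__":
    for size in (1500, 4096):
        load = bfd_full_mesh_load(num_pes=1000, tx_interval_ms=100, packet_bytes=size)
        print(size, load)
```

Under these assumptions, 1000 PEs at 100 ms means 999 sessions and 9990 transmitted packets per second on every PE (and as many received), across 499500 sessions network-wide.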

If we want to move that way I would rather suggest we define local BFD anchor explorers (one per area) which will probe all "interesting" next hops in a given area. Upon failure, an explorer would signal the failure event to those remote PEs which indicated interest in such tracking.

Again using BFD for this in any form or shape needs to be weighed for cost/benefit against option 1 and option 2 above. 

Thx,
Robert. 

PS. Now option 1 can easily be sub-second if BFD is enabled on IBGP sessions between RRs and PEs. However, I think there were some concerns expressed in the past by a vendor about this type of deployment of BFD between loopbacks. Maybe it would be beneficial for this discussion to better understand this concern. If it is valid, I think option 2, which IMO is the objective of this draft, does present a valid problem to be solved. Today, practically speaking, networks flood 1000s of /32 prefixes globally in IGPs instead of summarizing them, as this is the only way they can signal liveness of the remote PEs.

On Sat, Nov 14, 2020 at 10:34 PM Acee Lindem (acee) <acee=40cisco.com@dmarc.ietf.org> wrote:

Speaking as WG member…

With respect to the use cases in section 3:

  3.1 Inter-Area Node Failure Scenario – With respect to this use case, the node in question is actually unreachable. In this case, the ABRs will normally install a reject route for the advertised summary and will send an ICMP unreachable when the packets are received for the unreachable prefix. This is the expected behavior and there really isn’t that much of an advantage to moving the point of unreachable detection a couple hops closer. If faster detection is required, BFD or other mechanisms are available.
[WAJ] Please see the explanations above from Robert.

  3.3 Intra-Area Node Failure Scenario – In the first place, multiple areas with overlapping summaries is just a bad network design. If the prefix is unreachable, the case degenerates to getting the ICMP unreachable from the ABR with the invalid overlapping summary.
[WAJ] It is common. For example, an ISIS level 1-2 router will announce the default route into the level 1 area. The same applies to an OSPF totally stubby area.

3.2 Inter-Area Links Failure Scenario – This is the case where the prefix is reachable but only through a subset of the area ABRs. This is really the only valid use case. IMO, it is better to solve this case with intra-area tunnels through the backbone as described in section 6.1. I think this is preferable to the complexity proposed in this draft and especially section 6. It is “interesting” when non-implementors specify implementation details.
[WAJ] The tunnel solutions described in section 6 are the last resort of the path switch procedure. If we deploy the PUA mechanism, the routers within another area will normally switch automatically to another ABR when they receive the PUA from one ABR. Only in the critical scenario described at the beginning of section 6 will the solution described in section 6.1 or 6.2 be used.
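
The switch described above can be pictured with a toy model. This is only a sketch of the idea, not the procedure specified in the draft: the function name `pick_abr`, the ABR names and the cost values are all hypothetical, and a real router makes this decision inside its SPF and RIB logic.

```python
def pick_abr(summary_costs, pua_from):
    """Pick the lowest-cost ABR that has not sent a PUA for the prefix.

    summary_costs: dict mapping ABR name -> cost of the summary via that ABR.
    pua_from: set of ABR names that have advertised a PUA for the prefix.
    Returns the chosen ABR name, or None if every ABR has sent a PUA.
    """
    # Drop any ABR that has announced the prefix as unreachable.
    candidates = {abr: cost for abr, cost in summary_costs.items()
                  if abr not in pua_from}
    if not candidates:
        return None
    # Lowest cost wins; the name breaks ties deterministically.
    return min(candidates, key=lambda abr: (candidates[abr], abr))


if __name__ == "__main__":
    costs = {"ABR1": 10, "ABR2": 20}       # hypothetical summary costs
    print(pick_abr(costs, set()))          # no PUA received yet
    print(pick_abr(costs, {"ABR1"}))       # ABR1 has sent a PUA
    print(pick_abr(costs, {"ABR1", "ABR2"}))
```

When every ABR has sent a PUA the sketch returns None; what a router should do in that situation is outside this toy model.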

Thanks,
 Acee

_______________________________________________
 Lsr mailing list
 Lsr@ietf.org
 https://www.ietf.org/mailman/listinfo/lsr