Re: [Lsr] Thoughts about PUAs - are we not over-engineering?

Hi, Gunter:

Let me try to answer some of your concerns.

We prefer the Summary+PUA/UPA solution because node failure (the main scenario we focus on now) is a rare event in the network. A mechanism triggered by the unreachability event is therefore more efficient than advertising all of the nodes' reachable addresses. This point has been discussed on the mailing list in the past.

In https://datatracker.ietf.org/doc/html/draft-wang-lsr-prefix-unreachable-annoucement-09#section-8, we describe how to control the advertisement of PUA messages on the ABR. If this does not address your concerns, we can consider additional policy on the ABR.

Regarding the tracking and representation of PUA in the RTM, an earlier version of this draft proposed installing a black hole route for the specific detailed prefix.
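
To make the black hole idea concrete, here is a minimal Python sketch; it is not text from the draft. It assumes a toy RTM modelled as a dictionary from prefix to next hop, with plain longest-prefix-match lookup. The names lookup, BLACKHOLE and ABR1, and the 10.1.0.0/16 addresses, are hypothetical and chosen only for illustration.

```python
import ipaddress

# Hypothetical marker for a black hole (discard) route in this toy RTM.
BLACKHOLE = "blackhole"


def lookup(rtm, address):
    """Longest-prefix match: return (prefix, next_hop), or None if nothing covers the address."""
    addr = ipaddress.ip_address(address)
    matches = [prefix for prefix in rtm if addr in prefix]
    if not matches:
        return None
    best = max(matches, key=lambda prefix: prefix.prefixlen)
    return best, rtm[best]


# The ABR advertises only the summary; 10.1.0.0/16 and ABR1 are assumed values.
rtm = {ipaddress.ip_network("10.1.0.0/16"): "ABR1"}

# On receiving a PUA for a failed node's address, install a more specific
# black hole route that overrides the summary for that one prefix.
rtm[ipaddress.ip_network("10.1.1.1/32")] = BLACKHOLE

failed_node = lookup(rtm, "10.1.1.1")   # matches the /32 black hole entry
other_node = lookup(rtm, "10.1.2.1")    # still matches the /16 summary
```

In this sketch the lookup for the failed node resolves to the black hole entry, while every other address under the summary still resolves via the ABR.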

PUA requires the routers within an area to be upgraded in order to avoid situations in which a router does not recognize the PUA message and misbehaves. We are considering converging the PUA and UPA solutions, which may relax this requirement during deployment.

Aijun Wang
China Telecom

> On Jun 14, 2022, at 16:59, Van De Velde, Gunter (Nokia - BE/Antwerp) <gunter.van_de_velde@nokia.com> wrote:
> Hi All,
> 
> When reading both proposals about PUA's:
> * draft-ppsenak-lsr-igp-ureach-prefix-announce-00
> * draft-wang-lsr-prefix-unreachable-annoucement-09
> 
> The identified problem space seems a correct observation, and indeed summaries hide remote area network instabilities. It is one of the perceived benefits of using summaries. The place in the network where this hiding has the most impact upon convergence is at service nodes (PEs for L3/L2/transport) where, due to the summarization, it's difficult to detect that the transport tunnel end-point suddenly becomes unreachable. My concern, however, is whether it really is a problem that is worthy for the LSR WG to solve.
> 
> To me, draft-wang-lsr-prefix-unreachable-annoucement-09 is not a preferred solution due to the expectation that all nodes in an area must be upgraded to support the IGP capability. From this operational perspective, draft-ppsenak-lsr-igp-ureach-prefix-announce-00 is more elegant, as only the A(S)BRs and particular PEs must be upgraded to support PUAs. I do have concerns about the number of PUA advertisements in hierarchically summarized networks (/24 (site) -> /20 (region) -> /16 (core)). More specifically, in the /16 backbone area, how many of these PUAs will be floating around creating LSP/LSDB update churn? How do we keep the potentially exponential number of observed PUAs from floating everywhere? (Will this lead to OSPF NSSA-type areas, where areas will be purged of these PUAs for scaling stability?)
> 
> Long story short, should we not take a step back and re-think this identified problem space? Is the proposed solution space not worse than the problem space? We do summarization because it brings stability and reduces the number of link state updates within an area. And now with PUA we re-introduce additional link state updates (PUAs), and we blow up the LSDB with information opaque to SPF best-path calculation. In addition, there is a suggestion of new state machinery to track the IGP reachability of 'protected' prefixes, and there is perhaps a desire to contain or filter updates across inter-area boundaries. And finally, how will we represent and track PUA in the RTM?
> 
> What is wrong with simply not doing summaries and forgetting about these PUAs that punch holes in the summary prefixes? This worked very well during the last two decades. Are we not over-engineering with PUAs?
> 
> G/