RE: Draft minutes of November 18, 2008 rtgwg meeting

Hi all,

Please find below my corrections and new additions to clarify these questions:

We position rLFAP as an extension to local LFAP (RFC 5286). Our claim is that by introducing an X-hop neighborhood and a fast failure notification mechanism, one can extend the coverage up to 100% (depending on X and the network topology). A suitable X can easily be pre-calculated for a given network topology. All LFAPs (local + remote) are pre-computed and do not affect convergence.
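
To make the X-hop neighborhood concrete, here is a minimal sketch that collects every router within X hops of a given router using a breadth-first search. It is an illustration only and is not taken from the draft or our implementation; the function name, the adjacency-list format, and the 6-node ring are assumptions made for the example, and the neighborhood is assumed to be defined by hop count.

```python
from collections import deque

def x_hop_neighborhood(adj, source, x):
    """Return {router: hop_count} for every router within x hops of source.

    adj is a hypothetical adjacency list: {router: [neighbor, ...]}.
    """
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if hops[node] == x:
            continue  # do not expand beyond the X-hop boundary
        for nbr in adj[node]:
            if nbr not in hops:
                hops[nbr] = hops[node] + 1
                queue.append(nbr)
    return hops

# Hypothetical 6-node ring: A-B-C-D-E-F-A
ring = {
    "A": ["B", "F"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D", "F"], "F": ["E", "A"],
}

for x in (0, 1, 2):
    print(x, sorted(x_hop_neighborhood(ring, "A", x)))
```

With X=0 the neighborhood is only the router itself (the local LFAP case); each increment of X adds the next ring of routers whose failures the router would be notified about.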

Comments:

Mike Shand: Are you saying that for unicast you won't get micro-loops?

Ibrahim: Yes. Micro-loops can occur in two phases. The first phase is when you install temporary LFAPs after a failure: you may still observe loops for certain destinations if your coverage is not 100%. rLFAP can completely prevent these loops given the right value of X for a given network topology.

Mike: You may get remote loops outside the diameter of
      the repair area
Ibrahim: This phase (the second phase) relates to switching back to OSPF routes after a failure. rLFAP can minimize these loops inside the repair area and can work with oFIB (as suggested by the IPFRR WG for local LFAP) to prevent loops outside the repair area. Again, nothing here differs from local LFAP.

Alia: Inconsistencies in draft regarding how LFAs are computed;
      some examples in draft would cause forwarding loops
Ibrahim: The draft provides a high-level description of the algorithm. If there are cases where you think rLFAP can cause forwarding loops, please let us know. We implemented these algorithms and verified that there are no loops for realistic BRITE topologies (please see the table I provided in the presentation). To verify consistency, I constructed the combined paths (unaffected OSPF paths + LFAPs) from the routing tables and printed them. Any inconsistency would cause an endless stream of print statements, which I never observed. As I said during my presentation, I will update the high-level algorithm with more detail, given that the implementation exists. If any part is unclear, please let us know so that we can clarify it.
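
The consistency check described above can be sketched as follows. This is not the code from our implementation, only a minimal illustration of the idea: follow the per-router next hops (OSPF next hop where unaffected, LFAP where affected) toward a destination and flag a revisit instead of printing forever. The function name and the `next_hop` table layout are assumptions made for the example.

```python
def trace_path(next_hop, source, dest):
    """Follow per-router next hops from source toward dest.

    next_hop[router][dest] is the next hop that router currently uses for
    dest (hypothetical layout). Returns (path, status), where status is
    "delivered", "loop" or "blackhole".
    """
    path = [source]
    visited = {source}
    node = source
    while node != dest:
        nxt = next_hop.get(node, {}).get(dest)
        if nxt is None:
            return path, "blackhole"  # no route installed at this router
        path.append(nxt)
        if nxt in visited:
            return path, "loop"       # a router was visited twice
        visited.add(nxt)
        node = nxt
    return path, "delivered"

# Hypothetical tables after a failure: consistent vs. inconsistent alternates.
consistent = {"A": {"D": "B"}, "B": {"D": "D"}}
inconsistent = {"A": {"D": "B"}, "B": {"D": "A"}}

print(trace_path(consistent, "A", "D"))
print(trace_path(inconsistent, "A", "D"))
```

Running such a check for every (source, destination) pair and every protected failure would report any forwarding loop explicitly rather than relying on a non-terminating printout.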

Alia: Why is the notification not applicable to the IGP?
      I.e., why not just tune the IGP instead of shorting it?

Ibrahim: Everything is pre-computed, so you can just trigger the install. I answered this question by pointing out why local LFAPs are used. The only difference in convergence time between local LFAP and rLFAP is the time needed to send a failure notification (not to the entire network, just to the neighborhood). The IPFRR framework document states that the time needed to send a failure notification over one hop is between 10 ms and 100 ms. By making the failure notification packet a control packet, this time can be minimized. The results show that you achieve ~99% coverage for X=1 by sending the failure notification to your one-hop neighbors.

Regarding your question again: you cannot pre-compute all routes for the entire network if you tune the IGP timers just for a failure notification. This is why we defined the neighborhood. The results show that you achieve ~99% coverage for X=1, so you only need to compute the SPTs of your neighbors.

My question to Alia, just for my understanding: how do local LFAPs work with the FIB? If you have a router with 5 links, are you installing the local LFAPs protecting these five links in the FIB? I would appreciate your answer.

George Swallow: do you need to precompute all the failures for your
                neighbors' links as well?
Ibrahim: Yes. To provide full coverage you have to increase the complexity. rLFAP's path-calculation complexity is minimal compared to the other schemes, since you only need to compute within a certain neighborhood instead of the entire network. As I pointed out during my presentation, the algorithm calculates these LFAPs only for the affected prefixes, and the number of affected prefixes is smaller for a remote failure. The results show that you achieve ~99% coverage for X=1, so you only need to compute the SPTs of your neighbors.
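
To illustrate why one SPT per neighbor is the basic unit of work, here is a minimal sketch of the local loop-free condition from RFC 5286 (Inequality 1): a neighbor N of S is a loop-free alternate for destination D if dist(N, D) < dist(N, S) + dist(S, D). The sketch covers only the local, link-protecting case and does not show the rLFAP computation itself; the function names, the graph format, and the example topologies are assumptions made for the example, and link costs are assumed symmetric.

```python
import heapq

INF = float("inf")

def spt_distances(graph, root):
    """Dijkstra: shortest-path cost from root to every reachable node.

    graph is a hypothetical cost map: {node: {neighbor: cost}}.
    """
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, INF):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return dist

def loop_free_alternates(graph, s, primary, dest):
    """Neighbors N of s, other than the primary next hop, that satisfy
    dist(N, dest) < dist(N, s) + dist(s, dest)."""
    dist_s = spt_distances(graph, s)
    alternates = []
    for n in graph[s]:
        if n == primary:
            continue
        dist_n = spt_distances(graph, n)  # one SPT rooted at each neighbor
        if dist_n.get(dest, INF) < dist_n.get(s, INF) + dist_s.get(dest, INF):
            alternates.append(n)
    return alternates

# Hypothetical 4-node ring S-N-M-D-S with unit costs: no local alternate.
ring4 = {
    "S": {"D": 1, "N": 1}, "N": {"S": 1, "M": 1},
    "M": {"N": 1, "D": 1}, "D": {"M": 1, "S": 1},
}
# Same ring with an extra N-D link: N becomes a loop-free alternate.
chord = {
    "S": {"D": 1, "N": 1}, "N": {"S": 1, "M": 1, "D": 1},
    "M": {"N": 1, "D": 1}, "D": {"M": 1, "S": 1, "N": 1},
}

print(loop_free_alternates(ring4, "S", "D", "D"))
print(loop_free_alternates(chord, "S", "D", "D"))
```

In the plain ring, S has no local alternate toward D, which is the kind of coverage gap that an X-hop neighborhood with failure notification is meant to close.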

George: so that's lots of state I have to maintain, right? I have to have a
        strategy for all failures
Ibrahim: To provide full coverage, you need to maintain extra state, but this is minimized by defining the neighborhood.

Stewart: in a realistic topology, is this order k neighbors to the power
         x hops the number of strategies we need to precompute
         and store?
Ibrahim: I should mention an interesting finding at this point: some ISPs design their networks such that local LFAP provides full coverage. As I showed in the coverage example, local LFAP coverage depends on the number of neighbors rather than the network size. If an ISP designs its network carefully so that each router has high connectivity, it may get full coverage. However, the number of links to be protected will increase, and I believe the FIB size issue raised as a disadvantage of rLFAP applies equally in this case. As I stated during my presentation, rLFAP needs fewer prefixes to be installed in the FIB for a remote failure. I think more analysis is needed to show which method results in fewer FIB entries.

I would claim at this point that if an ISP designs its network according to rLFAP with X=1, then they probably need to add a few more links compared to local LFAP (X=0).

Stewart: what about competing solutions
         also should look at some of the work with frr tunnels
         what is wrong with an encapsulation based solution

Ibrahim: The disadvantage is the overhead of the additional header and processing. For some topologies (this is clear in the ring topology that is always given as an example for rLFAP), the amount of traffic will at least double on some links, since all affected traffic is forwarded to the same prefix (not-via). This is a serious bottleneck not only for bandwidth-limited wireless networks but also for other networks. I looked at the draft regarding multiple failures, and full coverage is not guaranteed. In terms of path calculation, it will be explosive for multiple failures, since it needs to be done for the entire network.

jgs: perhaps we should take this discussion offline.

jgs:  2nd or 3rd time this draft has been presented what do you want
      to do with it

jgs: based on room poll we will pass on this work

Regards,
Ibrahim