thoughts on draft-bryant-shand-ipfrr-notvia-addresses-00.txt
Alia Atlas <aatlas@avici.com> Fri, 25 March 2005 20:10 UTC
Date: Fri, 25 Mar 2005 15:07:51 -0500
To: rtgwg@ietf.org
From: Alia Atlas <aatlas@avici.com>
Cc: thart@avici.com
Subject: thoughts on draft-bryant-shand-ipfrr-notvia-addresses-00.txt

I've been thinking about this approach and have a number of comments on it (sorry for the length), gathered from various discussions. In general, this is a straightforward approach that seems quite promising, but it is still very preliminary. I am concerned about a number of issues that are either not sufficiently addressed in the draft or where I feel the complexity of the approach in the draft is a problem. If all such issues can be adequately resolved such that IPFRR with Notvia Addresses can be relatively simple to configure, manage and understand while handling all the problem cases of interest, then this approach could be a likely candidate for an advanced method.

As we've been discussing, the key question is what the correct trade-off is between mechanism complexity and network coverage. While I think that this approach has the possibility of being a reasonable trade-off, whether that is actually the case will depend on whether and how the various issues can be resolved.

At a high level, conceptually the Notvia addresses approach gives very similar forwarding paths to TE FRR. The difference is that rather than the head-end doing the computation for each tunnel and signaling the ERO, the computation is distributed to all nodes in the network.

First, I'm going to go through what I believe the benefits are.

1. The idea is conceptually simple; it is easy to understand the path the alternate will take. This is useful for network operations.

2. No more than a single level of encapsulation is ever required. Although it does suffer from requiring an explicit tunnel, at least the forwarding complexity overhead is constant and well understood.

3. For node and link failure, if the topology isn't disjoint after the failure, then an alternate will be found. There are definitely issues still to handle in the broadcast link case, but I'll address those later.

4. Although computationally expensive, the necessary computation does appear to be feasible by using incremental SPFs and early termination. These bring the required computation for point-to-point link and node failure into a reasonable range. My assumption is that this time can be further improved with some work.

5. There is no possibility of looping via the alternates in the event that a worse failure, which the alternate can't protect against, has occurred. This is both because the alternate in the Notvia-addressed tunnel is not repaired and because the Notvia-addressed tunnel rejoins the SPT at a point that is always downstream of S.

6. There are no issues with multi-area OSPF because traffic is sent through tunnels to Notvia addresses that are always intra-area for at most one area. Of course, the multi-area OSPF applicability restrictions are not very restrictive.

7. It is clear that SRLG failures and broadcast link failures can be handled. The complexity required depends on the desired/required coverage. I'll address this later in the concerns section.

8. Because a tunnel is used, it is possible to use the same mechanism for multicast traffic, when we determine how to provide IPFRR for multicast. There is also the advantage that the router advertising the Notvia address can know the SPT with reference to that Notvia address; this allows an RPF check for traffic considering the alternate.

Second is the list of downsides with the approach. The main concern is that the mechanism becomes so complex that the trade-off between its complexity and the full coverage is not desirable.

1. This requires a large number of additional IP addresses in the IGP. The same number of additional FECs is required to support LDP.

2. Explicit tunnels are needed, which means that targeted LDP sessions are necessary to have this support LDP traffic. This is a particular concern for multi-homed prefixes; I'll describe my concerns on this later.

3. Substantial IGP changes are required to handle the additional Notvia addresses.

4. A more complex algorithm is required to make the computation feasible.

5. The management of the Notvia addresses & of the tunnels can create longer time periods where protection isn't available for a part of the network (the new link or node, etc.).

Third, there are a number of issues that I feel need considerable discussion to try and resolve. I will try to go through each in turn and explain what I think the various aspects of each are. Each of these issues has the possibility to resolve in such a way that the Notvia Addresses approach becomes overly complex.

1. Notvia Addresses: The first issue is how the Notvia addresses are allocated, distributed and withdrawn. An initial idea of Stewart & Mike is that these addresses are not global addresses (i.e. are 10.x.x.x or such) and are configured in blocks on each router so that the router can manage the bindings itself.

   a. The routing extensions to the IGP will have to associate each address with the network resource (node or link) that it is Notvia. This is probably straightforward.

   b. It is desirable to have some dampening on the withdrawal of Notvia addresses to minimize thrashing.

   c. If configured in blocks, it would be extremely desirable to have the same Notvia address mean the same thing through multiple reboots, etc. It'd be good to have some means of consistent association. This is for easy manageability.

   d. When a new link or neighbor comes up, there will be a longer period of time when an alternate isn't available because the Notvia address hasn't been advertised yet. These periods without protection need to be clearly understood and minimized.

   e. There may be scalability concerns based on the number of Notvia addresses and LDP FECs required. For instance, as described in the draft, it is basically the number of uni-directional links in the topology. This is ignoring the extras for broadcast links.
      To fully & certainly provide SRLG protection if at all feasible, would require that each router advertise a Notvia address for every uni-directional link into every neighbor of that router. This would result in K*L additional addresses, where K is the average number of neighbors & L is the number of uni-directional links in the topology.

2. Insufficiently diverse topology: It is possible that a network topology cannot provide an alternate that suffices for link, node and SRLG protection. It isn't clear to me how to compute a "best-available" alternate using this approach. For instance, if one can get link protection, but not node protection, how would that be determined, computed and assigned? This becomes much more of a concern for SRLG protection & for topologies where failures have already occurred and the network has converged for those & needs protection in the event of an additional failure.

3. Failure Diagnosis versus Pessimism: As written, the draft discusses the idea of doing failure diagnosis using BFD. As Stewart, Mike & I have discussed, this isn't possible for SRLG failures, although it is possible for broadcast links.

   a. I am concerned about adding the failure diagnosis. This is yet another level of complexity for implementation. It also has ramifications for the forwarding plane, because of the need to store multiple alternates to use & have multiple states to check to decide what to use.

   b. An example of a concern with the BFD diagnosis is that all interfaces on a node that has failed are not certain to fail exactly simultaneously or even within a sub-50ms bounded window. It is entirely possible that BFD sessions are terminated on different line-cards, that detect the router failure at slightly different times and stop forwarding traffic, therefore, at slightly different times.

   c. The other approach is to pessimistically eliminate all routers connected to the broadcast link as well as the broadcast link; this may not provide an alternate.
      It also needs to be thought through what issues might exist if the topologies used for the SPF vary slightly for each router that is on the broadcast link, since each will, as described, not prune itself out when doing the computation; of course, there could be an approach where the same topology can be used everywhere. It isn't clear to me what Notvia addresses would be needed to express "don't go through this pseudo-node or any nodes attached to it"; I don't think that it is simply the Notvia address for avoiding a particular node.

4. Multi-homed Prefixes: I am quite concerned about the mechanisms suggested in the draft.

   a. First, I really do not like the idea of having separate forwarding for "local" prefixes that come out of a tunnel. What is a local prefix? For instance, does this mean that an ABR has to forward traffic differently depending on which area traffic from the tunnel has come from? I am concerned about how this would scale; maybe only 2 FIBs are needed (one for backbone & one for other), but it may be worse to handle AS external routes. I know that Stewart, Mike, Joel, Albert and I had discussed/agreed to put this idea out of scope at least for the moment.

   b. I am quite concerned about having tunnels to the advertisers of the prefixes.

      i. There needs to be a mechanism to determine whether the advertiser of a prefix will forward the packet in a loop-free fashion to avoid the failure point. The separate forwarding for "local" prefixes avoided the need for this determination, but at more substantial cost.

      ii. To support LDP, every tunnel requires a targeted LDP session. If multi-homed prefixes are common, then this becomes a full mesh for LDP. That isn't acceptable. Of course, multi-homed prefixes may be much more infrequent for LDP than for IP; for example, there is no reason to advertise a separate FEC for the subnet of a link. However, multi-homed prefixes are a concern for LDP for at least the inter-area, AS External, and BGP routes.

      iii. If traffic is encapsulated to a node's regular address, because that traffic is destined to a prefix advertised by the node, how does the receiving node know to remove the encapsulation and forward the packet inside, all in the fast path? Is this just a question of different handling based on the header type inside the outer encapsulation (for GRE)?

      iv. Perhaps these issues could be handled by determining a next-next-hop that avoids the failure to reach an appropriate advertiser. Of course, this is a different set/type of computation.

5. SRLGs and Broadcast Links: There seem to be a number of possible ways to handle SRLGs and broadcast links, each of which provides a different trade-off in terms of coverage, computation, and extra Notvia addresses. There are basically 4 approaches at this point.

   a. First, in order to compute a Notvia alternate that avoids a link, the primary neighbor, and all SRLGs that the link is part of, it is necessary to have a separate topology and associated SPF computation for each link that is a member of an SRLG or a broadcast link. This also requires a substantially larger number of Notvia addresses and the corresponding mechanisms to determine how and when to allocate and de-allocate them.

   b. Second, one could use a topology that removed the primary neighbor and see whether SRLG protection can be obtained either along S's path or along any path of a neighbor of S that is also loop-free.

   c. Third, when a Notvia address indicates to avoid a node, one could remove not merely the node & the uni-directional links to and from that node, but also any other links that are in a common SRLG with any of the links to or from the removed node. This is pessimistic but allows some SRLG protection without increased computation or Notvia addresses.

   d. Fourth, one could simply track the SRLGs encountered along the Notvia path; this just reports whether the alternate provides SRLG protection without any effort to obtain it.

6. Implementability: Clearly, the draft describes the basic idea for Notvia addresses, but there are a fair number of implementation/protocol decisions that need to be made before this can become anything more than an interesting idea.

7. There is a definite need to describe the convergence case better. This is how the transition from using the alternate to the network being converged happens, such that the alternate remains functional.

   a. For instance, if the node E fails, then the Notvia address E_!S will no longer be advertised. If S was getting link protection (because that was all that was possible, for instance) by tunneling traffic to E_!S, it is important that this traffic be properly discarded when E's addresses go away. This implies that there needs to be a default blackhole for Notvia addresses.

   b. Another example is when node E fails, the next-next-hop B must continue to advertise the Notvia address B_!E until the network converges so that S can continue to tunnel traffic to B_!E as the alternate.

   c. It is possible to get a micro-forwarding loop affecting a Notvia address as a result of a less severe failure than anticipated. For instance, consider the following topology.

          [D]
           |
         1 |
          [E]-----[F]-\
           |       |   \ 10
         1 |R    1 |R   \
           |   5   |     \
          [S]-----[H]----[I]
                       2

          Link S->E and Link H->F are in SRLG R

      When node E fails, if I converges before H, there will be a loop affecting the Notvia address being used to reach F without going through any of Link S->E, E or SRLG R.

   d. How do exceptions work? Particularly in regards to an IP-in-IP encapsulation such as GRE, it doesn't seem like MTU exceeded cases can be handled cleanly either by use of DF or by doing IP fragmentation and then the reassembly at the end of the tunnel. This seems like a problem for all ICMP packets; how could a source understand the header inside for a TTL expired, for instance?

   e. For IP-in-IP tunnels, another concern is flow diversity.
      The IP source and destination addresses are used to determine a flow; this flow identification may then be used for a variety of purposes, including ECMP. By putting all the traffic to a variety of destinations inside the same header, the ability to take advantage of flow diversity appears to have disappeared. This could possibly be solved by putting the original source address into the encapsulating header? Are there other approaches?

Hopefully, this will spark some discussion on the issues.

Alia
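
The pessimistic pruning described in 5.c can be sketched roughly as below, using the example topology from 7.c. This is only an illustration under stated assumptions, not the draft's algorithm: links are treated as bidirectional with symmetric costs, the E-F cost is not labelled in the diagram so a value of 1 is assumed, and the names prune_for_node and shortest_path are hypothetical helpers. A plain Dijkstra run stands in for the incremental SPF mentioned earlier.

```python
import heapq

# Links from the example topology in 7.c, treated as bidirectional (assumption).
LINKS = {
    ("D", "E"): 1,
    ("S", "E"): 1,
    ("E", "F"): 1,   # cost not labelled in the diagram; 1 is an assumed value
    ("H", "F"): 1,
    ("S", "H"): 5,
    ("H", "I"): 2,
    ("F", "I"): 10,
}
# SRLG R contains Link S->E and Link H->F, as stated in 7.c.
SRLGS = {"R": {("S", "E"), ("H", "F")}}


def prune_for_node(links, srlgs, node):
    """Remove `node`, its links, and any link sharing an SRLG with one of those links."""
    node_links = {link for link in links if node in link}
    removed = set(node_links)
    for members in srlgs.values():
        if members & node_links:
            removed |= members
    return {link: cost for link, cost in links.items() if link not in removed}


def shortest_path(links, src, dst):
    """Dijkstra over the given links; returns (cost, path) or None if unreachable."""
    adj = {}
    for (a, b), cost in links.items():
        adj.setdefault(a, []).append((b, cost))
        adj.setdefault(b, []).append((a, cost))
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return dist[dst], path[::-1]


# Topology used for a Notvia address that avoids node E (and, pessimistically, SRLG R).
pruned = prune_for_node(LINKS, SRLGS, "E")
print(shortest_path(pruned, "S", "F"))  # (17, ['S', 'H', 'I', 'F'])
print(shortest_path(pruned, "S", "D"))  # None: D is only reachable through E
```

With these assumed costs, S reaches F over S-H-I-F once E and SRLG R are pruned, which is the Notvia path referred to in 7.c; D has no alternate at all, which is the "insufficiently diverse topology" situation from issue 2.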