Re: Fwd: I-D ACTION:draft-atlas-ip-local-protect-00.txt

Alia Atlas <aatlas@avici.com> Fri, 13 February 2004 08:34 UTC

Date: Fri, 13 Feb 2004 03:34:06 -0500
To: Raj Mani <rajmanibt@hotmail.com>
From: Alia Atlas <aatlas@avici.com>
Subject: Re: Fwd: I-D ACTION:draft-atlas-ip-local-protect-00.txt
Cc: routing-discussion@ietf.org

At 01:43 AM 2/13/2004, Raj Mani wrote:

>Hi Atlas,
>
>          thanks, see some comments inline.
>>>
>>>      Good research. A few comments on this draft of local protection:
>>>
>>>      It's rather confusing to talk about characterizing the neighbors,
>>>      since each prefix can have its own primary path, loop-free
>>>      node-protecting alternate, loop-free link-protecting alternate,
>>>      U-turn..., etc. If one has lots of prefixes and neighbors, the
>>>      combination can be rather large to keep track of.
>>
>>The characterization is more conceptual.  In general, one determines the 
>>appropriate alternate and primary next-hop for each node in the topology.
>>A prefix can be ECMP between two or more nodes, in which case it falls 
>>into the multiple primary neighbors case.
>
>          Yes, it's true. This should be on a per-prefix/per-next-hop basis,
>          which is also why I'm puzzled that the paper in most places treats
>          those concepts on a per-node basis. For example, the OSPF and IS-IS
>          signaling for the U-turn capability just announces whether a link
>          is capable of performing a U-turn or not. But since those concepts
>          are really prefix/destination related, some prefixes can have
>          alternate paths while others don't. So I'm not sure how OSPF/IS-IS
>          can flood the capability only for the links.

First, the SPF computation is done on the topology.  Then the results are 
inherited by the prefixes, based on which routers those prefixes are 
attached to.
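
To make the two steps concrete, here is a minimal Python sketch.  It is an
illustration under assumptions, not the algorithm from the draft: the
function names, the topology, and the prefix table are all hypothetical,
link costs are assumed positive, and prefix metrics are ignored.

```python
import heapq

def spf_next_hops(graph, source):
    """Dijkstra from `source` over {node: {neighbor: cost}}.

    Returns {node: (cost, set_of_first_hop_neighbors)}; the set has more
    than one entry when the node is reached over equal-cost paths (ECMP).
    """
    best = {source: (0, set())}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale heap entry, a shorter path was found later
        for nbr, weight in graph[node].items():
            new_cost = cost + weight
            # Leaving the source, the first hop is the neighbor itself;
            # further out, it is inherited from the node being expanded.
            hops = {nbr} if node == source else best[node][1]
            if nbr not in best or new_cost < best[nbr][0]:
                best[nbr] = (new_cost, set(hops))
                heapq.heappush(heap, (new_cost, nbr))
            elif new_cost == best[nbr][0]:
                best[nbr][1].update(hops)  # equal-cost path: merge next-hops
    return best

def inherit_to_prefixes(node_routes, prefix_attachments):
    """Give each prefix the next-hops of its closest attached router(s)."""
    routes = {}
    for prefix, routers in prefix_attachments.items():
        reachable = [r for r in routers if r in node_routes]
        if not reachable:
            continue
        min_cost = min(node_routes[r][0] for r in reachable)
        hops = set()
        for r in reachable:
            if node_routes[r][0] == min_cost:
                hops |= node_routes[r][1]
        routes[prefix] = (min_cost, hops)
    return routes

# Hypothetical square topology: S reaches C equally well via A or B.
graph = {
    "S": {"A": 1, "B": 1},
    "A": {"S": 1, "C": 1},
    "B": {"S": 1, "C": 1},
    "C": {"A": 1, "B": 1},
}
node_routes = spf_next_hops(graph, "S")
prefix_routes = inherit_to_prefixes(
    node_routes,
    {"192.0.2.0/24": ["C"], "198.51.100.0/24": ["A"]},
)
```

The SPF runs once per topology change, independent of the number of
prefixes; the per-prefix step is only a table lookup.  Alternates could be
inherited the same way as the primary next-hops are here.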

Breaking U-Turns is done on a per-link basis.  The decision about which 
traffic is in a loop, and therefore needs the U-Turn broken, depends upon 
the traffic's prefix and destination.  The ability to break it does not.
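
A rough Python sketch of that split follows.  Everything in it is
hypothetical (the Route fields, the link names, the function name), and the
fallback to normal forwarding when a U-Turn cannot be broken is my
assumption for the sketch rather than something the draft text above
states.  The point it shows is that the capability is a property of the
receiving link, while the decision uses the route selected by the packet's
destination.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Route:
    """Per-destination forwarding state (hypothetical layout)."""
    primary_neighbor: str
    primary_link: str
    alternate_link: Optional[str]

def select_out_link(route, in_link, link_neighbor, uturn_capable_links):
    """Pick the outgoing link for a packet that arrived on `in_link`.

    route               -- Route chosen by the packet's destination
    link_neighbor       -- {link name: neighbor reached over that link}
    uturn_capable_links -- links on which this router can break U-Turns
    """
    came_from = link_neighbor[in_link]
    # Compare neighbors, not links: with parallel links the packet may
    # arrive on one link and still be headed back to the same neighbor.
    is_uturn = came_from == route.primary_neighbor
    if is_uturn and in_link in uturn_capable_links and route.alternate_link:
        return route.alternate_link
    return route.primary_link  # normal forwarding (assumed fallback)

# Two parallel links to S, one link to C.
link_neighbor = {"link1": "S", "link2": "S", "link3": "C"}
route = Route(primary_neighbor="S", primary_link="link1", alternate_link="link3")

broken = select_out_link(route, "link2", link_neighbor, {"link1", "link2"})
normal = select_out_link(route, "link3", link_neighbor, {"link1", "link2"})
no_cap = select_out_link(route, "link2", link_neighbor, set())
```

Here `broken` is "link3" even though the packet arrived on link2 and the
primary is link1, because both of those links lead to S.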

>>>      I didn't fully understand all the U-turn cases described, since
>>>      they are quite involved. I'm not sure you have covered the case where
>>>      two nodes have multiple parallel links: even if you receive on one
>>>      link and forward to another, the traffic can still be looping. There
>>>      is also the multi-node looping case due to asymmetrical routing,
>>>      S-->B-->A-->S, which cannot be detected by the inbound/outbound
>>>      interface lookups.
>>
>>Yes, this is why the check is whether the traffic was received from the 
>>same neighbor that the traffic would be sent out to and NOT whether the 
>>traffic was received on the same link on which it would be forwarded.
>
>          Since this is described as the "default" behavior, we are talking
>          about the forwarding plane, or fast path. What will be the
>          performance impact of detecting which neighbor each packet is from?
>          Even just to match the inbound interface with the outbound
>          interface, I'm pretty sure that most high-end routers cannot do
>          that (without re-spinning an Engine-12, for example); even for
>          routers that are able to perform this, there can be a significant
>          performance hit.

Well, I'm rather intimately familiar with a high-end router.  Avici makes 
core routers.  I would not be proposing this if we weren't capable of doing 
it without a performance issue.  Naturally, I can't speak to the 
capabilities of other routers to support this today.


>>If there were a routing loop of S->B->A->S, then B would not be a U-Turn 
>>neighbor of S and S would not use B as a U-Turn alternate.  If you look 
>>at the definition of a U-Turn neighbor, a neighbor B is a U-Turn neighbor 
>>of S if and only if S is the primary neighbor for all optimal paths from 
>>B to the destination that go through S.
>>
>>
>>>      The computation of this can be intensive (depending on the number
>>>      of routes and number of neighbors), so there is more chance that
>>>      nodes converge at different times. But this paper depends on every
>>>      node being in a stable condition; otherwise, some of the calculations
>>>      may not be reliable.
>>
>>Just like the primary SPF, the computation primarily depends on the 
>>number of nodes in the network, though the number of neighbors which can 
>>provide alternates is also important.
>>
>>The alternates are computed to protect against a single failure.   The 
>>calculations for the alternate next-hops are as reliable as the primary 
>>next-hops are; i.e. once the network has converged, there will be 
>>alternates that can be used to protect against a single failure.
>
>          Sometimes there can be link/route flapping; sometimes there is a
>          fiber cut, which may bring down more than one IP link; a node
>          failure may also trigger multiple link failure detections by
>          multiple nodes.

Absolutely.  As proposed, this can protect against a single link or node 
failure.  There are other cases it can be expanded to.


>>>      The paper mentioned LDP and node protection, I don't see how the
>>>      LDP traffic can be protected using node protection with this. Can
>>>      you give an example?
>>
>>The issue with providing LDP with node protection is knowing the new 
>>neighbor's label.  If the LSRs use downstream unsolicited and liberal 
>>label retention, then an LSR will have the label bindings for each FEC 
>>from each of its neighbors that support that FEC.
>>
>>As long as the label binding is known for the alternate neighbor, then 
>>the appropriate out-segment can be created beforehand based on the 
>>selected alternate next-hop.
>>
>>Does that make sense?  It does assume that there are LDP sessions to each 
>>neighbor, but that is generally the case.
>
>          yes, this makes sense.
>
>>
>>
>>>      This paper mentioned the capability need only be flooded to a node's
>>>      neighbor, I don't recall there is a special mechanism defined to
>>>      only flood to IGP neighbors but not further.
>>
>>For OSPF, there is the ability to have a link-scope opaque LSA (type 9, I 
>>believe).  The same mechanism isn't there for IS-IS.
>>
>>We actually decided not to use a link-scope TLV, because we wanted 
>>similar capabilities for OSPF and IS-IS, and because we wanted to define 
>>a link capabilities sub-TLV, where the other bits could be used for 
>>other purposes.
>>
>>Take a look at draft-atlas-ospf-local-protect-cap-00.txt and 
>>draft-martin-isis-local-protect-cap-00.txt.
>
>          Yes, I just did; hence the question above. I would think the
>          capability is not just a node/link capability, but is also related
>          to the prefixes, which may or may not use the same alternate paths
>          or even have an alternate path.

The alternate capability is saying that a router can use that particular 
interface as an alternate.
The U-Turn recipient capability is saying that the router can break U-Turns 
on that particular interface.
If the router can break U-Turns, then it can do so for all prefixes or FECs 
for which that interface connects to the primary neighbor.


>>>      If the paper claims that MPLS-TE FRR is complicated, this scheme
>>>      is no less complicated ;) Both can be accomplished by router or
>>>      some server software, but at least MPLS-TE FRR is guaranteed to
>>>      deliver the switch-over without depending on chance.
>>
>>The question is where the complexity burden is handled.  This scheme is 
>>complicated to understand and non-trivial to implement, but the 
>>complexity is primarily in the algorithm.  The management for it is more 
>>straightforward; on some level, it is the difference between managing a 
>>connectionless network and a connection-oriented network.
>
>          Yes, and we all know that managing a connectionless network can
>          involve uncertainty, while a connection-oriented network gives
>          certainty. In some sense, it's also an implementation issue:
>          MPLS-TE FRR, leaving aside the TE portion, can also be set up
>          automatically without much configuration.

That does of course depend upon the particular implementation of FRR.
If FRR is used with TE as a mesh overlay, true, but it suffers from the 
standard N^2 overlay scaling problem.  This means that it may cover a TE 
core, but not all of the network that it is desirable to protect.

If FRR is used to protect a particular node, one requires targeted LDP 
sessions to protect LDP traffic as well.

If FRR is used to protect a particular link, then all traffic on that link 
goes to the backup path on a failure.  With IP/LDP Local Protection, the 
traffic is spread to the various alternate next-hops, which may be more 
diverse.

Finally, operationalizing FRR and trouble-shooting it can be complicated.


>>This scheme doesn't depend on chance; it depends upon the topology being 
>>appropriately engineered.
>
>          When I said "by chance", I meant that with this paper you may
>          or may not find an alternate path, or a U-turn alternate; you may
>          face two network events close enough together that you don't have
>          enough time to compute all the alternate paths before a local
>          link/node failure; while with MPLS-TE FRR, it's almost a certainty
>          that you get switched over correctly.

In MPLS-TE FRR, you still need to handle network changes and run a separate 
CSPF for every primary and every backup LSP.   I would not say that it is a 
certainty to get switched-over correctly if there are multiple network 
events close enough together.

As for potentially not finding an alternate path, that is part of what 
network engineering can help with.  In the analyses I've done of coverage, 
I do find that some source-destination pairs are not covered in most 
networks, but the coverage % goes from about 79% with just loop-free 
alternates to over 98% with U-Turn alternates as well.  Of course, that's 
based on source-destination coverage; when it is weighted by traffic 
matrices, the improvement can be more dramatic.
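
For a sense of what the unweighted source-destination number measures, here
is a small Python sketch.  It is an illustration only: it checks just the
basic loop-free condition, dist(N, D) < dist(N, S) + dist(S, D), for each
neighbor N of S, does not model U-Turn alternates or node protection, and
does not reproduce the percentages above.  The function names and the two
toy topologies are hypothetical, and links are assumed symmetric with no
parallel links.

```python
from itertools import permutations

INF = float("inf")

def all_pairs(graph):
    """Floyd-Warshall shortest distances over {node: {neighbor: cost}}."""
    nodes = list(graph)
    dist = {u: {v: (0 if u == v else graph[u].get(v, INF)) for v in nodes}
            for u in nodes}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def loop_free_neighbors(graph, dist, s, d):
    """Neighbors of s whose shortest path to d does not come back via s."""
    return [n for n in graph[s] if dist[n][d] < dist[n][s] + dist[s][d]]

def lfa_coverage(graph):
    """Fraction of (source, destination) pairs with a loop-free alternate.

    A primary next-hop always passes the loop-free test, so a pair counts
    as covered when at least two distinct neighbors pass it: one serves
    as the primary and another as the alternate.
    """
    dist = all_pairs(graph)
    pairs = [(s, d) for s, d in permutations(graph, 2) if dist[s][d] < INF]
    if not pairs:
        return 0.0
    covered = sum(
        1 for s, d in pairs
        if len(loop_free_neighbors(graph, dist, s, d)) >= 2
    )
    return covered / len(pairs)

triangle = {
    "S": {"A": 1, "B": 1},
    "A": {"S": 1, "B": 1},
    "B": {"S": 1, "A": 1},
}
square = {
    "S": {"A": 1, "B": 1},
    "A": {"S": 1, "C": 1},
    "B": {"S": 1, "C": 1},
    "C": {"A": 1, "B": 1},
}
triangle_coverage = lfa_coverage(triangle)
square_coverage = lfa_coverage(square)
```

In the equal-cost square, only the diagonally opposite pairs are covered by
loop-free alternates alone; the adjacent pairs are the kind of case where
the other neighbor would send the traffic straight back.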

>>>      The idea of alternate next-hops is not completely new; the draft
>>>      should probably mention draft-kini-traf-restore-nsp-00.txt, which I
>>>      think has expired, but a web search found this link:
>>>
>>>http://yen.cse.kyutech.ac.jp/~ike/I-D/OLD/draft-kini-traf-restore-nsp-00.txt
>>>      That draft had a similar idea of computing primary and alternate
>>>      next-hops.
>>
>>Absolutely; it's hard to reference an expired draft.  I know that the 
>>concept of alternate next-hops has been discussed before, in reference to 
>>loop-free alternates.  The conclusion seems to have been that it didn't 
>>provide enough coverage.  That's what the U-Turn alternates are trying to 
>>solve.
>
>          this makes sense. thanks.

Thanks for the interest and questions.

Alia


>- r
>
>>
>>Thanks,
>>Alia
>>
>>
>>
>>_______________________________________________
>>routing-discussion mailing list
>>routing-discussion@ietf.org
>>https://www1.ietf.org/mailman/listinfo/routing-discussion