Re: draft-aruns-ccamp-rsvp-restart-ext-00
Reshad Rahman <rrahman@cisco.com> Fri, 12 March 2004 14:16 UTC
Message-ID: <4051C21E.61753004@cisco.com>
Date: Fri, 12 Mar 2004 08:58:54 -0500
From: Reshad Rahman <rrahman@cisco.com>
Organization: Cisco Systems
To: Lou Berger <lberger@movaz.com>
CC: Nic Neate <Nic.Neate@dataconnection.com>, Adrian Farrel <adrian@olddog.co.uk>, "Satyanarayana, Arun" <aruns@movaz.com>, dimitri.papadimitriou@alcatel.be, ccamp@ops.ietf.org, Anca Zamfir <ancaz@cisco.com>, Junaid Israr <jisrar@cisco.com>, Zafar Ali <zali@cisco.com>
Subject: Re: draft-aruns-ccamp-rsvp-restart-ext-00
References: <53F74F5A7B94D511841C00B0D0AB16F8028708DE@baker.datcon.co.uk> <6.0.3.0.2.20040309105433.04e3fcb8@mo-ex1>
We'll have the part for simultaneous adjacent restarts in ~2 weeks.

Regards,
Reshad.

Lou Berger wrote:
>
> Nic,
>         In one-on-one discussions at the IETF the authors agreed to do
> just these two things!  I know we're hoping to get the first part done
> late this week/early next week.  I can't speak for the other authors
> (of the other half of the to-be-merged draft) on the second part.
>
> Lou
>
> At 07:41 AM 3/9/2004 -0500, Nic Neate wrote:
>
> > Hi Adrian (and draft-aruns authors),
> >
> > Responses below.  In summary, I agree
> >  - with the suggestion of being able to request RecoveryPath messages
> >  - that it would be very helpful if the procedures for recovering
> >    from simultaneous adjacent restarts could be clarified.
> >
> > Thanks,
> >
> > Nic
> >
> > > -----Original Message-----
> > > From: Adrian Farrel [mailto:adrian@olddog.co.uk]
> > > Sent: Saturday, March 06, 2004 12:47 PM
> > > To: Nic Neate; aruns@movaz.com; Movaz Networks - Louis Berger;
> > > dimitri.papadimitriou@alcatel.be
> > > Cc: ccamp@ops.ietf.org
> > > Subject: Re: draft-aruns-ccamp-rsvp-restart-ext-00
> > >
> > >
> > > Hi Nic,
> > >
> > > > I've just read your draft-aruns-ccamp-rsvp-restart-ext-00 and it
> > > > looks good.  In particular, we've been looking at using Restart
> > > > for Fast Reroute LSPs for some time and this draft provides
> > > > everything that is needed (like recovering the FAST_REROUTE,
> > > > DETOUR, SENDER_TEMPLATE and ERO objects from the downstream node
> > > > when they are not available from upstream).
> > >
> > > Good. This concern was also raised in Seoul, and I am pleased to
> > > hear that the draft addresses these requirements.
> > >
> > > > However, I have a couple of concerns (not related to Fast
> > > > Reroute).
> > > >
> > > > - Your draft doesn't tackle, and won't work for, simultaneous
> > > > restart of adjacent nodes.  This is a problem that is tackled by
> > > > draft-rahman-ccamp-rsvp-restart-extensions, so merging the two
> > > > drafts in some way may be the best way to resolve that.  I
> > > > realize that the Aruns draft aims to make Restart possible for
> > > > nodes which cannot retrieve state from the data plane, and in
> > > > that case recovering from simultaneous restart of adjacent nodes
> > > > isn't easy.  I think including some further extensions for nodes
> > > > which can retrieve some state from the data plane would be
> > > > appropriate.
> > >
> > > Retrieving state from the data plane only answers half of the
> > > problem. However, it is certainly important to audit the recovered
> > > control plane information against the known data plane state.
> > >
> >
> > Indeed.  My point was that if you can't retrieve even the outgoing
> > signaling interface from your data plane following a "nodal fault",
> > you haven't got much hope of reconstructing protocol state in between
> > two nodes which restarted at the same time (without some serious
> > protocol enhancement anyway).  Hence the suggestion of additional
> > extensions to recover from adjacent restarts for nodes which can
> > retrieve the outgoing signaling interface.
> >
> > > With regard to adjacent node failures and restarts, I believe there
> > > are actually sufficient capabilities here. Perhaps the authors
> > > would like to include text to clarify the procedures.
> > >
> >
> > If this is the case, then no problem.  I agree that some text
> > clarifying that in the draft would be very helpful.
> >
> > > > - The back compatibility with RFC 3473 restart looks risky.
> > > > Draft Aruns mandates that restarted nodes don't send Path
> > > > Refreshes until either the recovery period expires or a
> > > > RecoveryPath is received from downstream.  In the case that the
> > > > downstream node only supports RFC 3473 restart (and so doesn't
> > > > send RecoveryPaths), it may well timeout Path state at the same
> > > > time as or very soon after the recovery period expires.  Hence a
> > > > dangerous timing window is created.
> > >
> > > You have something here.
> > > However, section 9.5.3 of RFC3473 does not say that the neighbor
> > > MUST discard state that is not restored in the recovery time
> > > interval. Presumably it would simply recommence waiting for state
> > > refresh and so would time out after a 3.5 refresh intervals from
> > > the end of the recovery interval.
> > >
> >
> > That would be sensible behavior, yes.  My concern (as I'm sure you
> > realize) is that it won't happen like that in all cases in the real
> > world.
> >
> > > Some compromise may be introduced here by noting that 3473 says
> > > that Path state SHOULD be restored within 1/2 of the recovery time.
> > > So we could follow this logic and use the first half of the time
> > > interval for the RecoveryPath message and the second half for
> > > backwards compatible recovery.
> > >
> > > On the other hand, I would prefer that this new capability (support
> > > for RecoveryPath message) was signaled in the Restart_Capabilities
> > > object so that the restarting node can know whether to expect to
> > > receive a RecoveryPath or not.
> > >
> > > > As a potential solution to both problems I'd suggest that a
> > > > restarting node receiving a Path message with a recovery label
> > > > should always forward it immediately as well as it can, and
> > > > include both a recovery label and (for back compatibility) a
> > > > suggested label.  Similarly, it should forward RecoveryPath
> > > > messages immediately as well as it can.  I'd be happy to discuss
> > > > any of this further.
> > >
> > > This sounds very dangerous.
> > > "As well as it can" may include path computation which may pick a
> > > path other than the one previously in use. Hence the new Path
> > > message will be sent to a new neighbor. This disaster is no better
> > > than the problem we are trying to solve.
> > >
> >
> > Fine.  I had in mind that a node should only forward a Path message
> > before receiving a RecoveryPath if it was sure that it could send it
> > (as per RFC3473) to the right place and without a dangerous ERO.  In
> > any case, I prefer the idea of being able to request RecoveryPath
> > messages and it sounds like that will make recovery possible in more
> > situations.
> >
> > > Cheers,
> > > Adrian
> > >
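
The timing window discussed in the quoted thread can be made concrete with a small sketch. This is only an illustration under stated assumptions, not anything taken from either draft: the class `RestartTimers`, its method names, the use of seconds, and the "strict" neighbor model (one that discards state exactly when the recovery period ends) are all hypothetical. The 3.5 refresh-interval figure and the half-of-recovery-time split are the ones mentioned in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RestartTimers:
    """Hypothetical helper; names and units are assumptions for illustration."""
    recovery_time_s: float     # recovery period, in seconds
    refresh_interval_s: float  # RSVP refresh interval, in seconds

    def recovery_path_deadline(self) -> float:
        # Compromise from the thread: wait for a RecoveryPath only during
        # the first half of the recovery period, then fall back to
        # RFC 3473 style recovery in the second half.
        return self.recovery_time_s / 2

    def strict_timeout(self) -> float:
        # Assumed worst case: the downstream neighbor drops Path state
        # as soon as the recovery period expires.
        return self.recovery_time_s

    def lenient_timeout(self) -> float:
        # Behavior presumed in the thread: after the recovery period the
        # neighbor resumes normal refresh timing and times out
        # 3.5 refresh intervals later.
        return self.recovery_time_s + 3.5 * self.refresh_interval_s

    def margin(self, first_path_sent_at: float, strict: bool) -> float:
        # Time left before the neighbor would discard state, measured
        # from the moment the restarted node first sends a Path refresh.
        deadline = self.strict_timeout() if strict else self.lenient_timeout()
        return deadline - first_path_sent_at


timers = RestartTimers(recovery_time_s=60.0, refresh_interval_s=30.0)
# Waiting the whole recovery period leaves no margin against a strict neighbor.
print(timers.margin(timers.recovery_time_s, strict=True))
# Falling back at the halfway point leaves half the recovery period.
print(timers.margin(timers.recovery_path_deadline(), strict=True))
```

With a strict neighbor, the margin is zero if the restarted node holds back Path refreshes until the recovery period expires, which is the dangerous window described above. Falling back at the halfway point, or knowing in advance from a capability flag that no RecoveryPath will arrive, restores a positive margin.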