Re: Restart signaling, DR/BDR, and RouterDeadInterval

Liem Nguyen <lhnguyen@CISCO.COM> Sat, 11 May 2002 21:14 UTC

References: <E7E13AAF2F3ED41197C100508BD6A328291DB7@india_exch.hyderabad.mindspeed.com>
Message-ID: <20020511171441.A10597@rtp-iosxdm1.cisco.com>
Date: Sat, 11 May 2002 17:14:41 -0400
Reply-To: Mailing List <OSPF@DISCUSS.MICROSOFT.COM>
Sender: Mailing List <OSPF@DISCUSS.MICROSOFT.COM>
From: Liem Nguyen <lhnguyen@CISCO.COM>
Subject: Re: Restart signaling, DR/BDR, and RouterDeadInterval
To: OSPF@DISCUSS.MICROSOFT.COM
In-Reply-To: <E7E13AAF2F3ED41197C100508BD6A328291DB7@india_exch.hyderabad.mindspeed.com>; from VishwasM@NETPLANE.COM on Sat, May 11, 2002 at 07:47:12AM -0400

On Sat, May 11, 2002 at 07:47:12AM -0400, Manral, Vishwas wrote:
> Folks,
>
> I had further thoughts on this ("restart signaling") and "hitless restart"
> in general.
>
> 1. I guess in the case of "restart signaling" it could help to actually
> maintain DR state. This could easily be done by checking the DR value in
> the neighbors' Hellos on broadcast/NBMA interfaces and using that value in
> our own Hellos, i.e. listening before sending our first Hello.

If the neighbors are aware that you are restarting (via restart signaling),
then DR/BDR can certainly be retained as you suggested.  In fact, we do
that in our implementation.
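
A minimal Python sketch of the idea above, under stated assumptions; it is
not taken from any implementation. The `Hello` type, the `learn_dr_bdr`
name, and the majority-vote rule are all hypothetical: the thread only says
that the restarting router checks the DR value in its neighbors' Hellos
before sending its own.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple

NONE = "0.0.0.0"  # value of the DR/BDR field when no router is listed


@dataclass(frozen=True)
class Hello:
    """Hypothetical, trimmed-down view of a received Hello packet."""
    router_id: str  # sender of the Hello
    dr: str         # DR field as advertised by the sender
    bdr: str        # BDR field as advertised by the sender


def learn_dr_bdr(hellos: Iterable[Hello]) -> Tuple[str, str]:
    """Return the (DR, BDR) pair most often advertised by the neighbors.

    Assumption: a simple majority vote is used to pick the value; a real
    implementation may use a different rule.
    """
    def most_common(values: Iterable[str]) -> str:
        counts = Counter(v for v in values if v != NONE)
        return counts.most_common(1)[0][0] if counts else NONE

    received = list(hellos)
    return (most_common(h.dr for h in received),
            most_common(h.bdr for h in received))
```

The restarting router would place the returned pair in its first Hello on
that interface instead of starting with empty DR/BDR fields.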


> 2. I have been wondering about this for some time.
>
> If a router could tell that the other end is hitless-restart capable and
> is undergoing an unplanned restart, we could wait for some more
> time (HitlessRestartInterval > RouterDeadInterval), which could be signalled
> when the adjacencies were brought up at init time. One way to detect this
> could be to check whether pings to the neighbor's interface still work while
> the OSPF process itself is not responding.
>
> What I mean to say is that if the helper could be informed (by any method)
> that the adjacent router is undergoing an unplanned outage, it could wait
> for a longer period than the RouterDeadInterval in that case too.
> So couldn't we keep a provision for this?

IMO, if the hitless restart procedure hasn't been initiated (not necessarily
finished) by RouterDeadInterval expiry, then hitless restart should be aborted.
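
The rule above can be sketched as a small helper-side check. This is an
illustration only; `HelperAction`, `helper_decision`, and the
`restart_initiated` flag are hypothetical names, and how a helper learns
that a restart has been initiated is left out.

```python
from enum import Enum


class HelperAction(Enum):
    KEEP_ADJACENCY = "keep"        # keep treating the neighbor as adjacent
    ABORT_RESTART = "abort"        # give up and handle it as a dead neighbor


def helper_decision(seconds_since_last_hello: float,
                    router_dead_interval: float,
                    restart_initiated: bool) -> HelperAction:
    """Decide what a helper does with a silent neighbor.

    Before RouterDeadInterval expires nothing changes. At or after expiry,
    the adjacency is kept only if the restart procedure has at least been
    initiated (it does not need to have finished).
    """
    if seconds_since_last_hello < router_dead_interval:
        return HelperAction.KEEP_ADJACENCY
    if restart_initiated:
        return HelperAction.KEEP_ADJACENCY
    return HelperAction.ABORT_RESTART
```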

Liem

>
> 3. Regarding avoiding the exit from hitless restart in cases where we know
> that routing loops would not occur. The conclusion I came to was that
>    a) We could either have a helper H exit hitless restart whenever a route
> to D in the helper H changed (as agreed earlier on the list), i.e.
> the route on at least one of the neighbors would have to change for a route
> on the restarting router R to change,
>    b) Or, if a changed path on helper H to the destination D
> does not go through the restarting router R, we need not exit hitless restart.
>
> However, we would require all routers H to have the same behaviour (a or
> b). My observation on the above is that it would be easier to do a).
>
> Thanks,
> Vishwas
>
> -----Original Message-----
> From: Manral, Vishwas [mailto:VishwasM@NETPLANE.COM]
> Sent: Friday, May 10, 2002 11:13 PM
> To: OSPF@DISCUSS.MICROSOFT.COM
> Subject: Re: Restart signaling, DR/BDR, and RouterDeadInterval
>
>
> Mitchell,
>
> I guess the "restart signaling" draft is now located at
> http://www.ietf.org/internet-drafts/draft-nguyen-ospf-restart-00.txt
>
> Though Alex, or some other draft author, could give the exact details, I
> think this draft does not bother about keeping DR state, so if a DR router
> goes down and comes back up, it may no longer be the DR after the DR election.
> Things, I guess, work as per the base RFC 2328.
>
> Acee,
>
> Thanks for the information. (However I don't remember asking a question
> related to this. ;-))
>
> Thanks,
> Vishwas
>
> -----Original Message-----
> From: Acee Lindem [mailto:acee@REDBACK.COM]
> Sent: Friday, May 10, 2002 9:07 PM
> To: OSPF@DISCUSS.MICROSOFT.COM
> Subject: Re: Restart signaling, DR/BDR, and RouterDeadInterval
>
>
> Mitchell, Manral,
>
> There are two hitless restart drafts. Although I was not at
> the specific meeting, the one below is the one that was accepted
> by the OSPF work group.
>
> http://www.ietf.org/internet-drafts/draft-ietf-ospf-hitless-restart-02.txt
>
> IMHO, it does address your questions although there is
> currently no provision for a restart that doesn't proceed within
> the router dead interval. We are considering this for our
> implementation.
>
>
> Erblichs wrote:
> > OSPF Restart Signaling.. Feb01 to Sept01
> >
> > Hi Group,
> >
> >         I didn't see if this became a standard, so
> >         this clarification / question could be moot..
> >
> >         Just thinking of the proper way to code
> >         this..
> >
> >         I assume that if a hello is received from
> >         a restarting router within the RouterDeadInterval
> >         timeframe, then there is significantly less
> >         work... There was no intervening teardown..
> >
> >         However, what is not mentioned, and I want to
> >         clarify this..
> >
> >         Assume both routers support Restart..
> >         If the restarting router was the DR or the BDR,
> >         there is no assumption that if the restart was
> >         done WITHIN the RouterDeadInterval timeframe,
> >         the adjacency is still up, that a restarting
> >         router would stay the BDR or DR...
> >
> >         Assume both routers support restart..
> >         Else, NOT WITHIN the RouterDeadInterval timeframe.
> >         We have to assume that there was an intervening
> >         election. Would we do another election?
> >
> >         There is no mention of election effects / side-effects
> >         with respect to the restarting router, when all or
> >         some routers support restart? The non-restart supported
> >         routers would force a re-election.
> >
> >         Thanks,
> >                 Mitchell Erblich
> >
>
>
> --
> Acee

--

Liem Nguyen