Re: Hitless restart

"Manral, Vishwas" <VishwasM@NETPLANE.COM> Sun, 27 October 2002 16:28 UTC

Message-ID: <E7E13AAF2F3ED41197C100508BD6A3287918EA@india_exch.hyderabad.mindspeed.com>
Date: Sun, 27 Oct 2002 11:32:50 -0500
Reply-To: Mailing List <OSPF@DISCUSS.MICROSOFT.COM>
Sender: Mailing List <OSPF@DISCUSS.MICROSOFT.COM>
From: "Manral, Vishwas" <VishwasM@NETPLANE.COM>
Subject: Re: Hitless restart
To: OSPF@DISCUSS.MICROSOFT.COM
Precedence: list

Hi Acee,

> I agree - LSRefreshTime seems like a good upper bound. An
> implementation could refresh its pre-start LSAs in process
> of graceful restart but I don't think this makes sense. 30
> minutes should be more than enough time for a graceful
> restart.

Right. I don't think so either; however, if you are documenting it, 30 minutes
would be a more appropriate period.
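
For reference, the two candidate upper bounds mentioned in this thread can be
compared with a small sketch. The constants are the RFC 2328 architectural
values (LSRefreshTime is 30 minutes, MaxAge is 1 hour). The function name
clamp_grace_period is hypothetical and is not taken from the draft; this is
only an illustration of capping a requested grace period, not a statement of
what the draft requires.

```python
# Architectural constants from RFC 2328, expressed in seconds.
LS_REFRESH_TIME = 1800  # 30 minutes
MAX_AGE = 3600          # 1 hour


def clamp_grace_period(requested_seconds, upper_bound=LS_REFRESH_TIME):
    """Hypothetical helper: cap a requested grace period at upper_bound."""
    if requested_seconds < 0:
        raise ValueError("grace period must be non-negative")
    return min(requested_seconds, upper_bound)
```

With LSRefreshTime as the bound, a 2-hour request is cut to 1800 seconds; with
MaxAge as the bound, the same request is cut to 3600 seconds.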

> I'd like to remain true to John's original specification which
> minimizes the likelihood of a graceful restart resulting in a routing
> loop or blackholed traffic. We could possibly document alternatives
> in a separate section.
I think it is OK to put it in a separate section.

> If the mechanism you describe achieves the same goals with respect
> to minimizing routing loops/blackholing using less conservative
> criteria for terminating graceful restart then maybe it warrants a
> separate draft.

Acee, I had given similar comments earlier regarding minimizing the cases
where we exit hitless restart (without completing it), and that was put in
version 02 of the draft. This was just a follow-up comment. If/when
the working group feels it is important to document it, I will do that.

Thanks,
Vishwas

> -----Original Message-----
> From: Acee Lindem [mailto:acee@REDBACK.COM]
> Sent: Thursday, October 24, 2002 9:31 AM
> To: OSPF@DISCUSS.MICROSOFT.COM
> Subject: Re: Hitless restart
>
>
> Hi Don,
>
> Don Goodspeed wrote:
>
>
>>Acee,
>>
>>On the issue of updating the grace period, I would only
>>update in case the reason code changed, and changed to
>>a value of unknown (and/or we could add a new "unplanned
>>restart" reason code).
>>
>
>
> I'm not sure there is any difference between unknown
> and unplanned.
>
>
>>Also, I remember a while back a discussion on the list
>>between yourself, Padma, and John regarding the IP address
>>field in the Grace LSA.  I cannot remember the specific issue,
>>but I assume there's been a resolution (or at least an
>>understanding)?
>>
>
>
> This was the comment that Padma made that the IP interface
> address TLV is not necessary to identify the restarting
> neighbor on multi-access (broadcast and NBMA) links. I pointed out
> that it wasn't really necessary since the grace LSA was already
> scoped to an OSPF interface and there could only be one
> neighbor adjacency corresponding to the router ID (even if
> one was running OSPF on more than one subnet). John pointed
> out that OSPFv2 always uses the IP interface address to identify
> neighboring routers on multi-access networks so it should be
> included to be consistent with RFC 2328 (numerous sections).
> I tend to agree with this and would vote to leave it in.
>
> However, it is not needed for OSPFv3 since here neighboring
> routers are always identified by router ID.
>
>
>
>>Finally, there was also a previous discussion I was involved
>>in regarding constraining the grace period to a max value of
>>MaxAge (3600).  Is this being included in the new draft?
>>
>
>
> That could be documented.
>
>
>
>>Thanks,
>>Don
>>
>> --- On Tue 10/22, Acee Lindem  wrote:
>>From: Acee Lindem [mailto: acee@REDBACK.COM]
>>To: OSPF@DISCUSS.MICROSOFT.COM
>>Date: Tue, 22 Oct 2002 14:49:37 -0400
>>Subject: Re: Hitless restart
>>
>>
>>
>>>Manral, Vishwas wrote:
>>>
>>>
>>>
>>>>Hi Acee,
>>>>
>>>>I agree to Padma's statement that we are using link-local LSA to signal
>>>>hitless-restart, because we would not want the LSA to be flooded by the
>>>>neighbors into the area. We are signalling router parameters in the link
>>>>local LSA.
>>>>
>>>>Regarding 2.
>>>>- I think we should update the grace-period/reason only when we have a
>>>>change in content of the Grace LSA from the neighbor to the restarting
>>>>router on any interface. So if we get a different value of
>>>>grace-period/restart reason we update and restart the grace period on the
>>>>helper else we do not. This would be in keeping with Section 3 of the
>>>>draft.
>>>
>>>
>>>Vishwas,
>>>
>>>I agree that if we do agree to change the existing specification then we
>>>should, in fact, honor changes to the grace-period/restart-reason. One
>>>of my concerns with this approach is the added complexity (consider
>>>determining which of multiple link local LSAs received on different
>>>links is more recent).
>>>
>>>Thanks,
>>>
>>>Acee
>>>
>>>
>>>
>>>
>>>>- I think the first grace LSA withdrawn would cause the router to exit
>>>>hitless restart mode for that neighbor.
>>>>
>>>>Thanks,
>>>>Vishwas
>>>>
>>>>-----Original Message-----
>>>>From: Acee Lindem [mailto:acee@REDBACK.COM]
>>>>Sent: Tuesday, October 22, 2002 8:22 PM
>>>>To: OSPF@DISCUSS.MICROSOFT.COM
>>>>Subject: Re: Hitless restart
>>>>
>>>>
>>>>All,
>>>>
>>>>I think we have a "rough consensus" that the restarting router
>>>>should NOT try to optimize flooding of grace LSAs. I've
>>>>only had one vote for this and I believe the vote was more
>>>>for flooding optimizations in general than this particular
>>>>scenario.
>>>>
>>>>
>>>>Padma,
>>>>
>>>>I don't think it simplifies the helper code (or specification)
>>>>to apply the grace LSA to all neighbor instances of the restarting
>>>>router. Here are my reasons:
>>>>
>>>>  1. Conceptually, I don't think it is right to apply the link
>>>>     local grace LSA to neighbors that are not on the link.
>>>>
>>>>  2. There are simply more corner cases -
>>>>
>>>>      - When you receive a second grace LSA from the restarting
>>>>        router, do you update the reason and grace period for
>>>>        all neighbor instances or only ones on that segment?
>>>>
>>>>      - When do you exit helper mode? Is it when the first grace
>>>>        LSA is withdrawn or all the grace LSAs? I certainly don't
>>>>        think there should be a disconnect between entering and
>>>>        exiting helper mode.
>>>>
>>>>I do agree that it is somewhat more robust for the case where
>>>>you are adjacent with the same router on more than one link.
>>>>However, I don't see this as a real big gain. Has anyone
>>>>else implemented this draft?
>>>>
>>>>Thanks,
>>>>Acee
>>>>
>>>>
>>>>
>>>>Padma Pillay-Esnault wrote:
>>>>
>>>>
>>>>
>>>>
>>>>>Rajesh
>>>>>
>>>>>
>>>>>Rajesh Varadarajan wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>Acee, Padma,
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>>>>I do think it simplifies things if a restarting router originates the
>>>>>>>>>>grace LSAs on all its interfaces (or at least all with full neighbor
>>>>>^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>>>>>>adjacencies if it is a planned restart).
>>>>>>>>>>
>>>>>>The above statement seems to imply (my reading) that a restarting router
>>>>>>can only send Grace LSAs out of some (not all) of its interfaces. The
>>>>>
>>>>>It doesn't imply that - clearly we send to all our neighbors.
>>>>>
>>>>>Padma
>>>>>
>>>>>>spec however requires the router to send the grace-lsa out of all
>>>>>>interfaces on which it has full neighbors. Failure to do so would always
>>>>>>result in its helper mode being terminated by others.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>>>I also agree with you .. that's what I do - the restarting router sends
>>>>>>>>>grace lsa over all interfaces. But from a Helping router perspective
>>>>>>>>>we should do as in 00-txt. It simplifies the helper code.
>>>>>>>>>It also prevents corner cases where we might not help all adjacencies
>>>>>>>>>(when we should).
>>>>>>>>>
>>>>>>>>>Padma
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>I haven't heard anyone object to why the 00-txt.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>I will - I specifically agree with
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>      Note that Router Y only needs to receive a single grace-LSA from
>>>>>>>      X, even if X and Y attach to multiple common segments.
>>>>>>>
>>>>>>>I don't like using an LSA with link local scope to enter/terminate
>>>>>>>helper mode for neighbors on different interfaces. It is better if the
>>>>>>>link local grace LSA only applies to that link.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>Given the behavior above, I think each neighbor should be considered
>>>>>>independently on a segment by segment basis.
>>>>>>
>>>>>>
>>>>>>thanks,
>>>>>>rajesh
>>>>>>
>>>>--
>>>>
>


--
Acee
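
The per-segment helper behavior argued for in the quoted thread above (a
link-local grace LSA applies only to the link it arrived on, the grace period
is restarted only when the LSA content changes, and withdrawal ends helper
mode for that segment only) could be modelled roughly as below. This is a
minimal sketch under those assumptions, not the draft's specified procedure.
The class HelperTable, its method names, and the interface and router-ID
strings are all hypothetical.

```python
class HelperTable:
    """Hypothetical per-segment helper state, keyed by (interface, router_id)."""

    def __init__(self):
        # (interface, router_id) -> (grace_period_seconds, restart_reason)
        self._entries = {}

    def receive_grace_lsa(self, interface, router_id, grace_period, reason):
        """Return True if the grace timer should be (re)started on this segment."""
        key = (interface, router_id)
        content = (grace_period, reason)
        if self._entries.get(key) == content:
            return False  # unchanged content: leave the running timer alone
        self._entries[key] = content
        return True

    def withdraw_grace_lsa(self, interface, router_id):
        """Exit helper mode on this segment only; True if an entry was removed."""
        return self._entries.pop((interface, router_id), None) is not None

    def is_helping(self, interface, router_id):
        """Report whether helper mode is active for this neighbor on this segment."""
        return (interface, router_id) in self._entries


# Usage: the same neighbor is adjacent on two segments.
table = HelperTable()
table.receive_grace_lsa("eth0", "10.0.0.1", 120, 1)
table.receive_grace_lsa("eth1", "10.0.0.1", 120, 1)
table.withdraw_grace_lsa("eth0", "10.0.0.1")  # eth1 is still being helped
```

Keying the table on the (interface, router ID) pair keeps entering and exiting
helper mode symmetric, which is the concern raised about applying one
link-local LSA to every adjacency with the restarting router.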