RE: a draft about how BFD notifying the state change to applications

richard.spencer@bt.com Wed, 24 August 2005 11:22 UTC

From: richard.spencer@bt.com
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Date: Wed, 24 Aug 2005 12:22:06 +0100
Message-ID: <B5E87B043D4C514389141E2661D255EC0A835B85@i2km41-ukdy.domain1.systemhost.net>
Thread-Topic: a draft about how BFD notifying the state change to applications
Thread-Index: AcWoUz9xPgAaSgrbQtKJNF24yQsInQAOt/BQ
To: yangpingan@huawei.com, internet-drafts@ietf.org
Cc: rtg-bfd@ietf.org, jhaas@nexthop.com, dward@cisco.com, d.katz@juniper.com
Subject: RE: a draft about how BFD notifying the state change to applications
Precedence: list
Sender: rtg-bfd-bounces@ietf.org
Errors-To: rtg-bfd-bounces@ietf.org

Pingan,

If I understand correctly, the problem you are trying to solve is a flapping BFD session (i.e. a session that goes up and down repeatedly), e.g. due to an intermittent link failure. Your solution is to use hold timers to delay any action an application takes when a session goes up or down.

IMO your solution does not solve the problem. If a session is flapping, then the flapping needs to be detected and the appropriate action taken. Your solution does not reliably detect flapping and, I assume, is based on ignoring flaps that occur within the configured hold times.

The current text in section 4 states: "When BFD session goes up, if HOLD timer is running, then stop HOLD timer and does not notify application the state change of the session. Otherwise start WTR timer." The effect of this proposal is that when a BFD session goes up and down repeatedly, BFD will detect each transition, but the application will never take any action if the flap interval is less than the hold timer. For example, let's say the flap interval is 5 secs and the hold timer is set to 10 secs. When the session goes down the hold timer will start, but after 5 secs the session will come back up, the hold timer will be stopped, and no action will be taken. This cycle will continue indefinitely, i.e. traffic will continue to be dropped intermittently and the flapping will not be detected.
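
The 5 sec / 10 sec example above can be checked with a small simulation. This is only a sketch of my reading of the quoted section 4 text, not code from the draft: the function name and the event format are made up for illustration, and the WTR timer is not modelled.

```python
def down_notifications(events, hold_time):
    """Return the times at which the application is told the session is down.

    events: list of (time_secs, state) tuples sorted by time, where state is
    "down" or "up". Hypothetical model: a HOLD timer starts when the session
    goes down and is stopped if the session comes back up before it expires.
    """
    notified = []
    hold_expiry = None  # expiry time of the running HOLD timer, if any
    for t, state in events:
        # Fire the HOLD timer if it expired at or before this event.
        if hold_expiry is not None and hold_expiry <= t:
            notified.append(hold_expiry)
            hold_expiry = None
        if state == "down":
            if hold_expiry is None:
                hold_expiry = t + hold_time
        else:
            # Session came back up: stop the HOLD timer, notify nobody.
            hold_expiry = None
    # A timer still running after the last event will eventually expire.
    if hold_expiry is not None:
        notified.append(hold_expiry)
    return notified


# Session flaps every 5 secs for a minute: down at 0, up at 5, down at 10, ...
flaps = [(t, "down" if (t // 5) % 2 == 0 else "up") for t in range(0, 60, 5)]

print(down_notifications(flaps, hold_time=10))  # hold timer longer than the flap interval
print(down_notifications(flaps, hold_time=3))   # hold timer shorter than the flap interval
```

With the 10 sec hold timer the application is never notified, even though the session is down for half of the minute.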

Even if you don't stop the hold timer when the session comes back up, you still won't reliably detect flapping. Let's say a session goes down and you start the hold timer (set to, say, 10 secs). The session might go up and down 5 times during the 10 sec hold time, but by chance the session may be up when the hold timer expires. In that case, the application won't take any action despite traffic being lost intermittently over the 10 sec period.
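
The same point can be sketched for the variant where the hold timer is left running and the session state is only sampled at expiry. Again this is an illustration under my own assumptions (hypothetical function names, sessions assumed up before the first event), not anything taken from the draft.

```python
def state_at(events, t):
    """Session state at time t, given sorted (time_secs, state) events."""
    current = "up"  # assumption: the session is up before the first event
    for when, state in events:
        if when > t:
            break
        current = state
    return current


def notified_at_expiry(events, hold_time):
    """True if the application is told "down" when the HOLD timer expires.

    The timer starts at the first down transition and is never stopped;
    only the session state at the moment of expiry is looked at.
    """
    first_down = next(when for when, state in events if state == "down")
    return state_at(events, first_down + hold_time) == "down"


# Down at 0, up at 1, down at 2, ... up at 9: five down/up cycles in 10 secs.
flaps = [(t, "down" if t % 2 == 0 else "up") for t in range(10)]

print(notified_at_expiry(flaps, hold_time=10))          # session happens to be up at t=10
print(notified_at_expiry([(0, "down")], hold_time=10))  # solid outage
```

The flapping session produces no notification because it happens to be up at t=10, whereas a solid outage is reported.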

Also, the main purpose of using BFD is to provide fast fault detection. If hold timers are introduced to delay any actions to be taken following fault detection (e.g. failover to a backup route), then what's the point of having fast detection in the first place?

An alternative solution would be to use a BFD session up/down state transition alarm threshold. If the number of times a session goes up/down within a configured time period exceeds the threshold, then i) the session can be disabled or administratively shut down (under the control of the OSS or the BFD system itself), ii) an alarm can be raised to instigate fault diagnosis/correction activities, or iii) both. This solution does not require any changes to the BFD protocol and could be implemented today.
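
A minimal sketch of such a threshold, assuming a sliding time window; the class name, parameter names and the example values are hypothetical and only meant to show the idea. Each transition would still be passed to the application immediately, so fast detection is not affected.

```python
from collections import deque


class FlapDetector:
    """Counts up/down transitions of one session within a sliding window."""

    def __init__(self, threshold, window_secs):
        self.threshold = threshold      # transitions tolerated per window
        self.window_secs = window_secs  # configured time period
        self._transitions = deque()     # timestamps of recent transitions

    def record_transition(self, now):
        """Record a transition at time `now` (secs).

        Returns True if the number of transitions inside the window now
        exceeds the threshold, i.e. the session should be shut down
        and/or an alarm raised.
        """
        self._transitions.append(now)
        # Forget transitions that have fallen out of the window.
        while self._transitions[0] <= now - self.window_secs:
            self._transitions.popleft()
        return len(self._transitions) > self.threshold


detector = FlapDetector(threshold=3, window_secs=30)
for t in (0, 5, 10, 15):
    if detector.record_transition(t):
        print(f"t={t}: flapping detected, raise alarm / disable session")
```

With these example values the fourth transition inside 30 secs trips the threshold, while transitions spaced further apart than the window never do.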

Regards,
Richard

> -----Original Message-----
> From: rtg-bfd-bounces@ietf.org [mailto:rtg-bfd-bounces@ietf.org]On
> Behalf Of yangpingan 30338
> Sent: 24 August 2005 03:26
> To: internet-drafts@ietf.org
> Cc: rtg-bfd@ietf.org; jhaas@nexthop.com; dward@cisco.com;
> d.katz@juniper.com
> Subject: a draft about how BFD notifying the state change to
> applications
> 
> 
> 
> Dear all,
> 
> I have written a draft about how BFD notifies applications of
> state changes. The purpose is to reduce the effect on the system
> when a BFD session goes up and down frequently, and to reduce
> packet loss when a defect is recovered. It is 4 pages long.
> Comments are very welcome.
> 
> Please help post this draft to the IETF website; thank you
> for that.
> 
> 
> Thank you and Best regards.
