Re: a draft about how BFD notifying the state change to applications

"Thomas D. Nadeau" <> Wed, 24 August 2005 16:46 UTC

From: "Thomas D. Nadeau" <>
Date: Wed, 24 Aug 2005 12:45:44 -0400
Subject: Re: a draft about how BFD notifying the state change to applications
List-Id: "RTG Area: Bidirectional Forwarding Detection DT" <>

     Good analysis, Richard. One comment below.

> Pingan,
> If I understand correctly, the problem you are trying to solve is  
> the case where we get a flapping BFD session (i.e. the session is  
> going up and down repeatedly), e.g. due to an intermittent link  
> failure. Your solution is to use hold timers to delay any actions  
> to be taken by applications as a result of a session going up or down.
> IMO your solution does not solve the problem. If a session is  
> flapping then the flapping needs to be detected and the appropriate  
> action taken. Your solution does not reliably detect flapping and I  
> assume is based on ignoring flaps that occur within the configured  
> hold times.
> The current text in section 4 states: "When BFD session goes up, if  
> HOLD timer is running, then stop HOLD timer and does not notify  
> application the state change of the session. Otherwise start WTR  
> timer." The effect of this proposal is that when you get a BFD  
> session going up and down, although it will be detected by BFD, the  
> application will never take any action if the flap interval is less  
> than the hold timer. For example, let's say the flap interval is  
> 5 secs and the hold timer is set to 10 secs. When the session goes  
> down the hold timer will start but after 5 secs the session will  
> come back up and the hold timer will be stopped and no action  
> taken. This cycle will continue indefinitely, i.e. traffic will  
> continue to be dropped intermittently and the flapping will not be  
> detected.
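
The 5 sec / 10 sec example above can be checked with a small simulation. This is only a sketch under stated assumptions: the function name and the event representation are hypothetical, the session is assumed to be down for 5 secs and then up for 5 secs in each cycle, and the WTR timer from the draft is left out.

```python
def notifications_with_hold_timer(events, hold_time):
    """Return the times at which the application is told the session is down.

    events: time-sorted list of (time_seconds, state) BFD transitions,
            with state either "up" or "down" (hypothetical representation).
    Rule being modelled (from the quoted section 4 text): a down event starts
    the HOLD timer; an up event while HOLD is running stops the timer and
    nothing is reported. The WTR timer is deliberately not modelled.
    """
    notified = []
    hold_expiry = None  # absolute time at which the running HOLD timer fires
    for t, state in events:
        # Fire a pending HOLD timer that expired at or before this event.
        if hold_expiry is not None and hold_expiry <= t:
            notified.append(hold_expiry)
            hold_expiry = None
        if state == "down":
            hold_expiry = t + hold_time
        elif state == "up" and hold_expiry is not None:
            hold_expiry = None  # session came back in time: the flap is hidden
    if hold_expiry is not None:
        notified.append(hold_expiry)
    return notified


# One minute of flapping: down at 0, 10, 20, ... and back up 5 secs later.
flapping = []
for start in range(0, 60, 10):
    flapping.append((start, "down"))
    flapping.append((start + 5, "up"))

print(notifications_with_hold_timer(flapping, hold_time=10))  # []
print(notifications_with_hold_timer(flapping, hold_time=3))   # [3, 13, 23, 33, 43, 53]
```

With the hold timer longer than the outage, the application is never notified, no matter how long the flapping lasts.
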
> Even if you don't stop the hold timer when the session comes back  
> up, you still won't reliably detect flapping. Let's say a session  
> goes down and you start the hold timer (set to say 10 secs). The  
> session might go up and down 5 times during the 10 sec hold time,  
> but by chance when the hold timer expires the session may be up. In  
> which case, the application won't take any action despite traffic  
> being lost intermittently over the 10 sec period.
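
The "by chance" case above can be illustrated the same way. Again a sketch only: the helper names are hypothetical, the session is assumed to start up, and the variant modelled is the one where the hold timer is not stopped and the application acts only if the session is down at the moment the timer expires.

```python
from bisect import bisect_right


def state_at(events, t, initial="up"):
    """Session state at time t, given time-sorted (time, state) transitions."""
    times = [time for time, _ in events]
    i = bisect_right(times, t)  # number of transitions at or before t
    return events[i - 1][1] if i else initial


def action_taken_at_expiry(events, hold_time):
    """HOLD starts at the first down event and is never cancelled.

    The application acts only if the session is down when HOLD expires.
    """
    first_down = next(t for t, state in events if state == "down")
    return state_at(events, first_down + hold_time) == "down"


# Five down/up cycles inside a 10 sec hold window: down 1 sec, up 1 sec.
events = []
for k in range(5):
    events.append((2 * k, "down"))
    events.append((2 * k + 1, "up"))

flaps = sum(1 for _, state in events if state == "down")
print(flaps, action_taken_at_expiry(events, hold_time=10))   # 5 False
print(flaps, action_taken_at_expiry(events, hold_time=8.5))  # 5 True
```

The same five flaps lead to opposite outcomes depending only on where the expiry happens to land.
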
> Also, the main purpose of using BFD is to provide fast fault  
> detection. If hold timers are introduced to delay any actions to be  
> taken following fault detection (e.g. failover to a backup route),  
> then what's the point of having fast detection in the first place?
> An alternative solution would be to use a BFD session up/down state  
> transition alarm threshold. If the number of times a session goes  
> up/down within a configured time period exceeds the threshold, then  
> i) the session can be disabled or administratively shut down (under  
> control of the OSS or the BFD system itself), ii) an alarm can be  
> raised to instigate fault diagnosis/correction activities, or iii) both. This  
> solution does not require any changes to the BFD protocol and could  
> be implemented today.
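
A minimal sketch of such a transition threshold, assuming a sliding time window. The class name, method name, and parameter values are hypothetical and do not come from any BFD implementation; what to do when the alarm fires (shut the session down, raise an alarm, or both) is left to the caller.

```python
from collections import deque


class FlapDetector:
    """Counts session up/down transitions inside a sliding time window."""

    def __init__(self, threshold, window):
        self.threshold = threshold  # maximum transitions tolerated per window
        self.window = window        # window length in seconds
        self._times = deque()       # timestamps of recent transitions

    def record_transition(self, now):
        """Record a transition at time `now`.

        Returns True when the number of transitions within the window
        exceeds the threshold, i.e. the session should be treated as flapping.
        """
        self._times.append(now)
        # Drop transitions that have aged out of the window.
        while self._times and self._times[0] <= now - self.window:
            self._times.popleft()
        return len(self._times) > self.threshold


# A transition every 5 secs, at most 4 transitions allowed per 30 secs.
detector = FlapDetector(threshold=4, window=30)
alarms = [t for t in range(0, 60, 5) if detector.record_transition(t)]
print(alarms[0])  # 20
```

Every transition is still seen as it happens, so nothing here delays the normal reaction to a session going down.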

     Based on the premise that BFD should allow for fast detection,
I think you are right that all of the raw alarms should be
forwarded to the applications that are interested in them.
This is indeed how the various implementations that I am familiar
with work. How those alarms are processed (i.e.,
squelched, forwarded, or aggregated) is a local decision, so it does
not seem to fall under the purview of the BFD protocol to stipulate
it. Another important point is that different applications
may need different processing algorithms, which also makes
this too application-specific to be stipulated
as part of the BFD protocol.
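
One way to picture this split is a sketch in which every raw state change is delivered to every interested application, and each application applies its own local policy. All class names and the session label are hypothetical; this is not how any particular implementation is structured.

```python
class BfdEventBus:
    """Delivers every raw session state change to all subscribers."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, time, session, state):
        for handler in self._subscribers:
            handler(time, session, state)


class ForwardAll:
    """Policy: act on every event (e.g. a client that must react fast)."""

    def __init__(self):
        self.actions = []

    def __call__(self, time, session, state):
        self.actions.append((time, session, state))


class Aggregate:
    """Policy: only count transitions per session."""

    def __init__(self):
        self.counts = {}

    def __call__(self, time, session, state):
        self.counts[session] = self.counts.get(session, 0) + 1


class SquelchAfter:
    """Policy: forward the first `limit` events per session, drop the rest."""

    def __init__(self, limit):
        self.limit = limit
        self.seen = {}
        self.forwarded = []

    def __call__(self, time, session, state):
        n = self.seen.get(session, 0) + 1
        self.seen[session] = n
        if n <= self.limit:
            self.forwarded.append((time, session, state))


bus = BfdEventBus()
reroute, stats, log = ForwardAll(), Aggregate(), SquelchAfter(limit=2)
for handler in (reroute, stats, log):
    bus.subscribe(handler)

for t, state in [(0, "down"), (5, "up"), (10, "down"), (15, "up")]:
    bus.publish(t, "session-1", state)

print(len(reroute.actions), stats.counts, len(log.forwarded))
# 4 {'session-1': 4} 2
```

The bus applies no policy of its own; squelching, forwarding, and aggregation all live in the subscribers.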


> Regards,
> Richard
>> -----Original Message-----
>> From: []On
>> Behalf Of yangpingan 30338
>> Sent: 24 August 2005 03:26
>> To:
>> Cc:;;;
>> Subject: a draft about how BFD notifying the state change to
>> applications
>> Dear all,
>> I have written a draft about how BFD notifies applications of
>> state changes. The purpose is to reduce the effect on the system
>> when a BFD session goes up and down frequently, and to reduce
>> packet loss when a defect clears. It is 4 pages long.
>> Comments are very welcome.
>> Please help post this draft to the IETF website, and thank
>> you for that.
>> Thank you and Best regards.