Re: a draft about how BFD notifying the state change to applications

Yang Pingan <yangpingan@huawei.com> Wed, 24 August 2005 13:52 UTC

Date: Wed, 24 Aug 2005 21:57:24 +0800
From: Yang Pingan <yangpingan@huawei.com>
To: richard.spencer@bt.com, internet-drafts@ietf.org
Cc: rtg-bfd@ietf.org, jhaas@nexthop.com, dward@cisco.com, d.katz@juniper.com
Subject: Re: a draft about how BFD notifying the state change to applications

Richard,

What you describe is just one possible situation in the WTR and HOLD process. Normally the HOLD time is shorter than the WTR time (the recommended
value of the HOLD timer is several seconds or less than one second, and that of the WTR timer is several minutes), so this cannot be achieved by an alarm threshold.
Another purpose of WTR is this: before BFD notifies the forwarding engine, it gives the system a chance to prepare the forwarding table (especially for 1-hop BFD),
which avoids traffic loss when the defect is recovered.




----- Original Message ----- 
From: <richard.spencer@bt.com>
To: <yangpingan@huawei.com>; <internet-drafts@ietf.org>
Cc: <rtg-bfd@ietf.org>; <jhaas@nexthop.com>; <dward@cisco.com>; <d.katz@juniper.com>
Sent: Wednesday, August 24, 2005 7:22 PM
Subject: RE: a draft about how BFD notifying the state change to applications


Pingan,

If I understand correctly, the problem you are trying to solve is the case where we get a flapping BFD session (i.e. the session is going up and down repeatedly), e.g. due to an intermittent link failure. Your solution is to use hold timers to delay any actions to be taken by applications as a result of a session going up or down.

IMO your solution does not solve the problem. If a session is flapping then the flapping needs to be detected and the appropriate action taken. Your solution does not reliably detect flapping and I assume is based on ignoring flaps that occur within the configured hold times.

The current text in section 4 states: "When BFD session goes up, if HOLD timer is running, then stop HOLD timer and does not notify application the state change of the session. Otherwise start WTR timer." The effect of this proposal is that when a BFD session goes up and down, although this will be detected by BFD, the application will never take any action if the flap interval is less than the hold timer. For example, let's say the flap interval is 5 secs and the hold timer is set to 10 secs. When the session goes down the hold timer will start, but after 5 secs the session will come back up, the hold timer will be stopped, and no action will be taken. This cycle will continue indefinitely, i.e. traffic will continue to be dropped intermittently and the flapping will not be detected.
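
The timer behaviour quoted above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the draft's specified procedure: the function name `simulate`, the event format, and the rules that HOLD expiry notifies "down", that WTR expiry notifies "up", and that a new down event cancels a running WTR timer are all assumptions made for illustration. Only the "stop HOLD and do not notify, otherwise start WTR" rule comes from the quoted text.

```python
def simulate(events, hold, wtr, end_time):
    """Replay BFD session transitions through assumed HOLD/WTR timers.

    events: time-ordered list of (time, "up" | "down") session transitions.
    Returns the list of (time, state) notifications the application sees.
    """
    notes = []
    app_state = "up"  # what the application currently believes
    timer = None      # pending (expiry_time, state_to_notify), or None
    for t, state in events + [(end_time, None)]:
        # Fire the pending timer if it expires no later than this event.
        if timer is not None and timer[0] <= t:
            notes.append(timer)
            app_state = timer[1]
            timer = None
        if state is None:  # end-of-simulation marker
            break
        if state == "down":
            # Assumption: a down event cancels any running WTR timer and
            # starts HOLD only if the application still believes "up".
            timer = (t + hold, "down") if app_state == "up" else None
        else:
            if timer is not None and timer[1] == "down":
                timer = None  # HOLD running: stop it, no notification
            elif app_state == "down":
                timer = (t + wtr, "up")  # otherwise start WTR


    return notes


# Session flaps every 5 seconds: down at 0, up at 5, down at 10, ...
flaps = [(t, "down" if (t // 5) % 2 == 0 else "up") for t in range(0, 40, 5)]

print(simulate(flaps, hold=10, wtr=60, end_time=100))  # []
print(simulate(flaps, hold=3, wtr=60, end_time=100))   # [(3, 'down'), (95, 'up')]
```

With a 10 sec HOLD timer the application receives no notification at all for the 5 sec flap, which is the case described above. With a 3 sec HOLD timer the application is told once that the session is down and is only told it is up after the flapping stops and the assumed 60 sec WTR timer runs out.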

Even if you don't stop the hold timer when the session comes back up, you still won't reliably detect flapping. Let's say a session goes down and you start the hold timer (set to, say, 10 secs). The session might go up and down 5 times during the 10 sec hold time, but by chance the session may be up when the hold timer expires. In that case, the application won't take any action despite traffic being lost intermittently over the 10 sec period.
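
A short sketch of this second variant, where the hold timer keeps running and the session state is simply sampled when it expires. The helper names `state_at` and `notified_after_hold` and the event format are hypothetical, chosen only to make the example concrete.

```python
def state_at(events, t, initial="up"):
    """Return the session state at time t given time-ordered transitions."""
    state = initial
    for when, new_state in events:
        if when > t:
            break
        state = new_state
    return state


def notified_after_hold(events, hold):
    """Start HOLD at the first down event and never stop it early.

    The application is notified only if the session is down at expiry.
    """
    start = next(t for t, s in events if s == "down")
    return state_at(events, start + hold) == "down"


# Five down/up cycles within a 10 sec hold time, ending in the up state.
flaps = [(t, "down" if t % 2 == 0 else "up") for t in range(10)]
downs = sum(1 for _, s in flaps if s == "down")

print(downs, notified_after_hold(flaps, hold=10))  # 5 False
```

The session went down five times in the interval, yet the sample taken at expiry sees "up", so the application is never told.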

Also, the main purpose of using BFD is to provide fast fault detection. If hold timers are introduced to delay any actions to be taken following fault detection (e.g. failover to a backup route), then what's the point of having fast detection in the first place?

An alternative solution would be to use a BFD session up/down state transition alarm threshold. If the number of times a session goes up/down within a configured time period exceeds the threshold, then i) the session can be disabled or administratively shut down (under control of the OSS or the BFD system itself), ii) an alarm can be raised to instigate fault diagnosis/correction activities, or iii) both. This solution does not require any changes to the BFD protocol and could be implemented today.
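
One way such a threshold could be implemented is a sliding-window counter of state transitions. The following is a minimal sketch, assuming a hypothetical `FlapDetector` class with illustrative `threshold` and `window` parameters; what happens once the threshold is exceeded (shutting the session down, raising an alarm, or both) is left to the caller.

```python
from collections import deque


class FlapDetector:
    """Count session state transitions inside a sliding time window."""

    def __init__(self, threshold, window):
        self.threshold = threshold  # transitions tolerated per window
        self.window = window        # window length in seconds
        self.transitions = deque()  # timestamps of recent transitions

    def record(self, t):
        """Record a transition at time t.

        Returns True if the number of transitions within the last
        `window` seconds now exceeds the threshold.
        """
        self.transitions.append(t)
        # Drop transitions that have fallen out of the window.
        while self.transitions[0] <= t - self.window:
            self.transitions.popleft()
        return len(self.transitions) > self.threshold


detector = FlapDetector(threshold=4, window=30)
results = [detector.record(t) for t in (0, 5, 10, 15, 20)]
print(results)               # [False, False, False, False, True]
print(detector.record(100))  # False: the earlier flaps have aged out
```

Unlike the hold-timer approach, every transition is counted, so a 5 sec flap trips the alarm on the fifth transition regardless of which state the session happens to be in at any sampling instant.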

Regards,
Richard

> -----Original Message-----
> From: rtg-bfd-bounces@ietf.org [mailto:rtg-bfd-bounces@ietf.org]On
> Behalf Of yangpingan 30338
> Sent: 24 August 2005 03:26
> To: internet-drafts@ietf.org
> Cc: rtg-bfd@ietf.org; jhaas@nexthop.com; dward@cisco.com;
> d.katz@juniper.com
> Subject: a draft about how BFD notifying the state change to
> applications
> 
> 
> 
> Dear all,
> 
> I have written a draft about how BFD notifies applications of 
> state changes. The purpose is to reduce the effect on the system
> when a BFD session goes up and down frequently, and to reduce 
> packet loss when a defect is recovered. It is 4 pages long; 
> comments are very welcome.
> 
> Please help to post this draft to the IETF website, and thank 
> you for that.
> 
> 
> Thank you and Best regards.