Re: BFD stability follow-up from IETF-91

Marc Binderberger <marc@sniff.de> Mon, 08 December 2014 04:03 UTC

Return-Path: <marc@sniff.de>
X-Original-To: rtg-bfd@ietfa.amsl.com
Delivered-To: rtg-bfd@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3361F1A1BF6 for <rtg-bfd@ietfa.amsl.com>; Sun, 7 Dec 2014 20:03:38 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.56
X-Spam-Level:
X-Spam-Status: No, score=-1.56 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HELO_EQ_DE=0.35, T_RP_MATCHES_RCVD=-0.01] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id uJ4ktlwFa6EP for <rtg-bfd@ietfa.amsl.com>; Sun, 7 Dec 2014 20:03:36 -0800 (PST)
Received: from door.sniff.de (door.sniff.de [IPv6:2001:6f8:94f:1::1]) by ietfa.amsl.com (Postfix) with ESMTP id 4C6B61A1BFA for <rtg-bfd@ietf.org>; Sun, 7 Dec 2014 20:03:35 -0800 (PST)
Received: from [IPv6:::1] (localhost.sniff.de [127.0.0.1]) by door.sniff.de (Postfix) with ESMTP id A76552AA0F; Mon, 8 Dec 2014 04:03:32 +0000 (GMT)
Date: Sun, 07 Dec 2014 20:07:16 -0800
From: Marc Binderberger <marc@sniff.de>
To: Manav Bhatia <manavbhatia@gmail.com>
Message-ID: <20141207200716673593.1de52229@sniff.de>
In-Reply-To: <CAG1kdojvQbtHuB7dDUPzmvT-mVPhgsGTX+MOC7AB_j6t3rEuXg@mail.gmail.com>
References: <007701d00af9$28719050$7954b0f0$@chinamobile.com> <D09E5FAC.27C51%mmudigon@cisco.com> <007e01d00b07$9c02cc10$d4086430$@chinamobile.com> <7347100B5761DC41A166AC17F22DF1121B8998E7@eusaamb103.ericsson.se> <00a001d00d64$7735ce50$65a16af0$@chinamobile.com> <7347100B5761DC41A166AC17F22DF1121B8A87E6@eusaamb103.ericsson.se> <730769BB-D021-4E22-878A-2C289822A156@gmail.com> <7347100B5761DC41A166AC17F22DF1121B8AA754@eusaamb103.ericsson.se> <09CD6B2F-4DCC-429F-848B-223C72A0F171@gmail.com> <7347100B5761DC41A166AC17F22DF1121B8AAA24@eusaamb103.ericsson.se> <CO2PR0501MB8231A4913DEB31323847CA8B3780@CO2PR0501MB823.namprd05.prod.outlook.com> <7347100B5761DC41A166AC17F22DF1121B8AAC0D@eusaamb103.ericsson.se> <CAG1kdoiquWYaAz5ti14VrmiqXmph-SpjgYs=m8AuQGdKGo2xXQ@mail.gmail.com> <7347100B5761DC41A166AC17F22DF1121B8AACDB@eusaamb103.ericsson.se> <CAG1kdojvQbtHuB7dDUPzmvT-mVPhgsGTX+MOC7AB_j6t3rEuXg@mail.gmail.com>
Subject: Re: BFD stability follow-up from IETF-91
MIME-Version: 1.0
Content-Type: text/plain; charset="big5"
Content-Transfer-Encoding: base64
X-Mailer: GyazMail version 1.5.15
Archived-At: http://mailarchive.ietf.org/arch/msg/rtg-bfd/NpBTsmXls96Ly8Cax6mKac4FxJ4
Cc: "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
X-BeenThere: rtg-bfd@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "RTG Area: Bidirectional Forwarding Detection DT" <rtg-bfd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/rtg-bfd>, <mailto:rtg-bfd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/rtg-bfd/>
List-Post: <mailto:rtg-bfd@ietf.org>
List-Help: <mailto:rtg-bfd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/rtg-bfd>, <mailto:rtg-bfd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 08 Dec 2014 04:03:38 -0000

Hello Manav,

> I am sure the WG would appreciate such a lecture since that would obviate 
> the need for such an ID. Are you suggesting that i turn on logging and 
> packet tracing for *each* incoming BFD packet for all the sessions that i 
> have? Trying doing that for 25 BFD sessions where few are running at 50ms 
> and 100ms TX intervals. Now trying combing through the logs when 1 BFD 
> session flaps to understand where the issue was.

the problem I have here is we move deep into the implementation area. 
"Problem" for 2 reasons:

* it's telling my competitors what tricks my company has developed for better 
debugging. Wait ... I want to sell this advantage! :-)

* if we try to "standardize" this too far we choke implementor's phantasy. At 
least I fear so.


Manav, if you think about some kind of light, integrated log/tracing, then 
yes, you could do this for thousands of sessions with even faster timers. I'm 
of course speaking purely hypothetical here *cough* :-)

This is why I'm a fan of some minimal agreement plus room for local (i.e. 
per-vendor) data to be collected. For debugging you want both sides/vendors 
being involved anyway as data needs to be interpreted.


Regards, Marc



On Fri, 5 Dec 2014 08:31:33 +0530, Manav Bhatia wrote:
> Hi Greg,
> 
> I am sure the WG would appreciate such a lecture since that would obviate 
> the need for such an ID. Are you suggesting that i turn on logging and 
> packet tracing for *each* incoming BFD packet for all the sessions that i 
> have? Trying doing that for 25 BFD sessions where few are running at 50ms 
> and 100ms TX intervals. Now trying combing through the logs when 1 BFD 
> session flaps to understand where the issue was.
> 
> Cheers, Manav
> 
> On Thu, Dec 4, 2014 at 10:03 PM, Gregory Mirsky 
> <gregory.mirsky@ericsson.com> wrote:
>> Hi Manav,
>> I hope you don’t expect me to give a lecture on how to design and 
>> implement debugable implementation using logging and packet tracing.
>>  
>>                 Regards,
>>                                 Greg
>>  
>> From: Manav Bhatia [mailto:manavbhatia@gmail.com] 
>> Sent: Thursday, December 04, 2014 8:16 AM
>> To: Gregory Mirsky
>> Cc: Santosh P K; Mahesh Jethanandani; rtg-bfd@ietf.org
>> 
>> Subject: Re: BFD stability follow-up from IETF-91
>> 
>>  
>> I am not sure what the confusion is Greg.
>>  
>> Assume i have a BFD session thats up. At some point in time it flaps. Now 
>> i need to know whether the issue was at the TX or the RX.
>>  
>> Please tell me how TWAMP can help me here. Also tell me how what tool i 
>> can use if its a uBFD session that flapped.
>>  
>> Cheers, Manav
>>  
>> On Thu, Dec 4, 2014 at 9:09 PM, Gregory Mirsky 
>> <gregory.mirsky@ericsson.com> wrote:
>> Hi Santosh,
>> but that is what can be called “feature creep”. BFD is continuity check 
>> mechanism and for active performance measurement, even occasional, there 
>> are TWAMP in IP and RFC 6374/6375 in MPLS/MPLS-TP. It may be tempting to 
>> expand scope of BFD but, I believe, it is successful exactly because it 
>> was simple, light-weight and designed to do exactly one thing – 
>> continuity check.
>>  
>>                 Regards,
>>                                 Greg
>>  
>> From: Santosh P K [mailto:santoshpk@juniper.net] 
>> Sent: Thursday, December 04, 2014 7:02 AM
>> To: Gregory Mirsky; Mahesh Jethanandani
>> 
>> Cc: rtg-bfd@ietf.org
>> Subject: RE: BFD stability follow-up from IETF-91
>>  
>> Hello Greg,
>>   Debugging BFD is one of the use case. I also want to bring up one of the 
>> use case that Jeff suggested in his earlier  mail. Operator might NOT want 
>> to run OAM which does loss and delay measurement all the time due to its 
>> overhead. With the extension to BFD (sequence number) we can detect if 
>> there is any loss but BFD still stays up. This loss detection can be used 
>> as a trigger for loss and delay measurement. Echo can be used only in case 
>> of singlehop and in one direction only.  
>>  
>> Thanks
>> Santosh P K  
>>  
>> From: Rtg-bfd [mailto:rtg-bfd-bounces@ietf.org] On Behalf Of Gregory Mirsky
>> Sent: Thursday, December 04, 2014 12:12 PM
>> To: Mahesh Jethanandani
>> Cc: rtg-bfd@ietf.org
>> Subject: RE: BFD stability follow-up from IETF-91
>>  
>> Hi Mahesh,
>> indeed, LSP Ping is part of MPLS OAM tool set as BFD itself that intended 
>> to monitor operational state of the network, path continuity between two 
>> nodes. And LSP Ping, as primarily on-demand troubleshooting tool, helps 
>> localize and, to certain degree, diagnose the problem. But the ultimate 
>> debugging is proprietary. This proposal, in my view, helps not monitor 
>> behavior of the network but BFD itself, quality of BFD implementation. I’
>> m not saying that it is not useful for implementers and operators, one can 
>> find that too many BFD sessions or at too short intervals being  ran. I don
>> ’t agree to loading this as extension of the widely used standard. 
>> Perhaps we can look into using BFD Echo as self-debugging instrument.
>>  
>>                 Regards,
>>                                 Greg
>>  
>> From: Mahesh Jethanandani [mailto:mjethanandani@gmail.com] 
>> Sent: Wednesday, December 03, 2014 10:23 PM
>> To: Gregory Mirsky
>> Cc: Fan, Peng; MALLIK MUDIGONDA (mmudigon); rtg-bfd@ietf.org
>> Subject: Re: BFD stability follow-up from IETF-91
>>  
>> Greg,
>>  
>> I believe we have a disagreement here. I do not believe that issue of 
>> debug ability are outside the scope of a standardized protocol.
>>  
>> Look at MPLS ping and traceroute (RFC 4379) . They are ultimately debug 
>> tools used to establish viability of a path and they are very much part of 
>> the standardized protocol.
>>  
>>> On Dec 3, 2014, at 3:25 PM, Gregory Mirsky <gregory.mirsky@ericsson.com> 
>>> wrote:
>>>  
>>> Hi Mahesh,
>>> I consider issues of debugability, not of just BFD but any other 
>>> standardized protocol, to be outside of Standard track, at most to be 
>>> suitable for Informational or Experimental track. If we agree on that, 
>>> then we can discuss scenarios that present problem and investigate 
>>> whether anything in the protocol requires clarification to help vendors 
>>> in building well-performing, scalable and interoperable implementations 
>>> and provide operational guidelines for operators.
>>>  
>>>                 Regards,
>>>                                 Greg
>>>  
>>> From: Mahesh Jethanandani [mailto:mjethanandani@gmail.com] 
>>> Sent: Tuesday, December 02, 2014 8:46 PM
>>> To: Gregory Mirsky
>>> Cc: Fan, Peng; MALLIK MUDIGONDA (mmudigon); rtg-bfd@ietf.org
>>> Subject: Re: BFD stability follow-up from IETF-91
>>>  
>>> Greg,
>>>  
>>> What is Peng referring to is a way to figure out why a particular BFD 
>>> session flapped, particularly if the packet(s) for that session arrive 
>>> late. I do not see how that can be performance measurement. It is basic 
>>> BFD debug ability. Running a separate DM does tell you why a particular 
>>> BFD session flapped.
>>>  
>>> Now we can debate what methods can be employed to measure that delay and 
>>> I am open to ways to doing it, including local loopback to measure 
>>> transmit delays or time stamping of packets in hardware. But in cases, 
>>> where there is no support for either of the capabilities, one of the 
>>> suggested solutions is to use the time stamps carried in the BFD payload. 
>>>  
>>> Cheers.
>>>  
>>>> On Dec 1, 2014, at 9:38 AM, Gregory Mirsky <gregory.mirsky@ericsson.com> 
>>>> wrote:
>>>>  
>>>> Hi Peng,
>>>> and still, you’re looking for a tool to measure BFD performance. Then 
>>>> you’ll be looking for a tool to verify the BFD performance measurement, 
>>>> and on, and on. Operators do need complete set of FCAPS tools, including 
>>>> performance measurement. Note that passive performance measurement 
>>>> through marking method that Mach Chen referred to can monitor BFD 
>>>> flow(s) and be used to do Loss and/or Delay Measurement. And active 
>>>> Synthetic Loss Measurement may simulate flow of small packets as well as 
>>>> relatively large packets. And the same goes for active measurement 
>>>> method of Delay Measurement. I like Swiss Army knives but let us not 
>>>> turn BFD into one.
>>>>  
>>>>                 Regards,
>>>>                                 Greg
>>>>  
>>>> From: Fan, Peng [mailto:fanpeng@chinamobile.com] 
>>>> Sent: Monday, December 01, 2014 4:44 AM
>>>> To: Gregory Mirsky; 'MALLIK MUDIGONDA (mmudigon)'; rtg-bfd@ietf.org
>>>> Subject: RE: BFD stability follow-up from IETF-91
>>>>  
>>>> Hi Gregory,
>>>>  
>>>> I was just giving an example :) Application traffic usually cannot stand 
>>>> small packet loss, not to say 30% loss.
>>>>  
>>>> I am actually asking for a debug function that could give us some useful 
>>>> hints of poor connection with small protocol change, besides the basic 
>>>> connectivity information. If it measures something, it measures packets 
>>>> of BFD itself. So I don’t expect it to be considered as a performance 
>>>> measurement tool.
>>>>  
>>>> Best regards,
>>>> Peng
>>>>  
>>>> From: Gregory Mirsky [mailto:gregory.mirsky@ericsson.com] 
>>>> Sent: Saturday, November 29, 2014 3:37 AM
>>>> To: Fan, Peng; 'MALLIK MUDIGONDA (mmudigon)'; rtg-bfd@ietf.org
>>>> Subject: RE: BFD stability follow-up from IETF-91
>>>>  
>>>> Hi Peng,
>>>> this is very interesting scenario. I think that if BFD experiences ~30% 
>>>> packet loss, then highly likely so are affected other applications. Then 
>>>> it is not just BFD issue but condition that should be detected  by 
>>>> performance measurement method, whether active or passive packet loss 
>>>> measurement.
>>>> I’m convinced that overloading BFD with performance measurement 
>>>> provisions is counter-productive and is inappropriate.
>>>>  
>>>>                 Regards,
>>>>                                 Greg
>>>>  
>>>> From: Rtg-bfd [mailto:rtg-bfd-bounces@ietf.org] On Behalf Of Fan, Peng
>>>> Sent: Friday, November 28, 2014 4:34 AM
>>>> To: 'MALLIK MUDIGONDA (mmudigon)'; rtg-bfd@ietf.org
>>>> Subject: RE: BFD stability follow-up from IETF-91
>>>>  
>>>> Hi Mallik,
>>>>  
>>>> Exactly. Packets may be experiencing slight loss, but the link can 
>>>> hardly be regarded as connected. More importantly, the experience of 
>>>> upper-level applications can be degraded severely (e.g. TCP traffic is 
>>>> not able to go fast in face of even small continuous loss). But what if 
>>>> one BFD frame is lost every three frames? Then the loss rate is 30% on 
>>>> average, which is already a very severe value.
>>>>  
>>>> Best regards,
>>>> Peng
>>>>  
>>>> From: MALLIK MUDIGONDA (mmudigon) [mailto:mmudigon@cisco.com] 
>>>> Sent: Friday, November 28, 2014 7:53 PM
>>>> To: Fan, Peng; rtg-bfd@ietf.org
>>>> Subject: Re: BFD stability follow-up from IETF-91
>>>>  
>>>> Hi Peng,
>>>>  
>>>> If the BFD packets are lost, doesn’t the BFD session go DOWN? Are you 
>>>> saying that packet loss is not big enough to make BFD session go DOWN?
>>>>  
>>>> Thanks
>>>>  
>>>> Regards
>>>> Mallik
>>>>  
>>>> From: <Fan>, Peng <fanpeng@chinamobile.com>
>>>> Date: Friday, 28 November 2014 4:20 pm
>>>> To: "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
>>>> Subject: RE: BFD stability follow-up from IETF-91
>>>>  
>>>> Hi Jeff, all,
>>>>  
>>>> I have been following this stability extension from the beginning, and 
>>>> as an
>>>> operator I would like to express that this draft enables the "advanced
>>>> feature" we desire for BFD to provide additional useful information that
>>>> helps operators understand network issues. A relevant use case is 
>>>> detecting
>>>> lossy or "quasi-disconnected" links or member LAG links. An example of 
>>>> such
>>>> situation we experienced was a loosely connected fiber link resulting in
>>>> continuous, small amount of packet loss. BFD could get the information of
>>>> lost BFD frames on such unstable link, and probably report when a target
>>>> level is reached, say a certain number of frames are lost over a period 
>>>> or
>>>> among a total number of frames.
>>>>  
>>>> Best regards,
>>>> Peng
>>> 
>>>  
>>> Mahesh Jethanandani
>>> Co-chair, NETCONF WG
>>> mjethanandani@gmail.com
>> 
>>  
>> Mahesh Jethanandani
>> Co-chair, NETCONF WG
>> mjethanandani@gmail.com
>>  
>>  
>>  
>>  
>