RE: BFD stability follow-up from IETF-91

Marc Binderberger <marc@sniff.de> Wed, 26 November 2014 17:39 UTC

Date: Wed, 26 Nov 2014 09:42:42 -0800
From: Marc Binderberger <marc@sniff.de>
To: Mach Chen <mach.chen@huawei.com>
Message-ID: <20141126094242449051.c8abfe39@sniff.de>
In-Reply-To: <F73A3CB31E8BE34FA1BBE3C8F0CB2AE28B2D9A97@SZXEMA510-MBX.china.huawei.com>
References: <20141126001931.GJ20330@pfrc> <CAG1kdoghcA=xSaXmkr68qduH2t8oC=-ZazoQztj8JK12SazKsw@mail.gmail.com> <20141126005023981392.0c488535@sniff.de> <F73A3CB31E8BE34FA1BBE3C8F0CB2AE28B2D9A97@SZXEMA510-MBX.china.huawei.com>
Subject: RE: BFD stability follow-up from IETF-91
Archived-At: http://mailarchive.ietf.org/arch/msg/rtg-bfd/KW9SrnXjCxmlbQducELJQ_qyE78
Cc: "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>

Hello Mach,

> This makes me think there should be another way of getting the
> Tx and Rx timestamps without encoding them in the BFD packets.
> For example, the Tx and Rx systems could just save timestamps locally or
> send them to a centralized entity, and then use the sequence numbers to
> correlate them for further analysis.
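
A minimal sketch of that correlation idea, not taken from the draft: it
assumes each system keeps a local log mapping BFD sequence number to a local
timestamp in milliseconds. The names classify_gaps, tx_log, rx_log and
max_gap_ms are hypothetical. Only intervals between consecutive packets are
compared, so the two clocks do not need to be synchronized. Sequence number
wrap-around is ignored for brevity.

```python
from typing import Dict, List, Tuple


def classify_gaps(
    tx_log: Dict[int, int],
    rx_log: Dict[int, int],
    max_gap_ms: int,
) -> Tuple[List[Tuple[int, str]], List[int]]:
    """Correlate Tx and Rx logs by sequence number.

    tx_log / rx_log map sequence number -> local timestamp (ms).
    Returns (findings, missing): findings lists (seq, verdict) for packets
    whose spacing exceeded max_gap_ms; missing lists sequence numbers the
    Tx side logged but the Rx side never did.
    """
    findings: List[Tuple[int, str]] = []
    common = sorted(set(tx_log) & set(rx_log))
    for prev, cur in zip(common, common[1:]):
        if cur != prev + 1:
            # A packet in between is missing on one side; skip the interval.
            continue
        tx_gap = tx_log[cur] - tx_log[prev]
        rx_gap = rx_log[cur] - rx_log[prev]
        if tx_gap > max_gap_ms:
            # The sender itself was late with this packet.
            findings.append((cur, "tx-late"))
        elif rx_gap > max_gap_ms:
            # Sent on time, so the delay is on the path or in the receiver.
            findings.append((cur, "rx-or-path-delay"))
    missing = sorted(set(tx_log) - set(rx_log))
    return findings, missing


# Illustrative, made-up logs: 50 ms Tx interval, clocks offset by 10 s.
tx = {1: 0, 2: 50, 3: 100, 4: 300, 5: 350, 6: 400}
rx = {1: 10000, 2: 10050, 3: 10250, 4: 10450, 5: 10500}
findings, missing = classify_gaps(tx, rx, max_gap_ms=100)
print(findings)  # [(3, 'rx-or-path-delay'), (4, 'tx-late')]
print(missing)   # [6]
```

Note that this cannot separate a path delay from a delay inside the Rx system
unless the receiver also logs a second timestamp closer to the wire.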

I remember some discussion on NVO3 about how many bits it takes ;-) - could 
you send the links/draft names you are working on to this list? May be useful 
for further discussions.


Thanks & Regards,
Marc



On Wed, 26 Nov 2014 09:17:32 +0000, Mach Chen wrote:
> Hi Marc and Manav,
> 
>> -----Original Message-----
>> From: Rtg-bfd [mailto:rtg-bfd-bounces@ietf.org] On Behalf Of Marc
>> Binderberger
>> Sent: Wednesday, November 26, 2014 4:50 PM
>> To: Manav Bhatia
>> Cc: rtg-bfd@ietf.org
>> Subject: Re: BFD stability follow-up from IETF-91
>> 
>> Hello Manav,
>> 
>>> I believe the work is important and addresses something that's really
>>> required (spent too much time debugging why BFD flapped!).
>> 
>> agree :-) we should keep the discussion alive.
>> 
>> 
>>> Time stamping would have helped in debugging whether the BFD
>>> packet was sent late, or whether the packet was sent on time and also
>>> arrived on time but was delayed when passing it up the BFD
>>> stack/processor (lay in the RX buffer for a tad too long)
>> 
>> well, I can see a point in having the Tx timestamps in the packet mainly
>> for the purpose of knowing "this" packet was okay/not okay on the Tx side
>> and to correlate it with your local Rx measurement.
> 
> Yes, this is one solution if people think BFD delay measurement is needed.
> If Tx timestamps are allowed to be carried in the packets, it seems there
> should be no problem leaving room for the Rx timestamps as well :-). After
> all, having both Tx and Rx timestamps may simplify the implementation.
> 
>> 
>> And even this point is less relevant with sequence numbers as this number
>> allows the identification of packets and thus the correlation of
>> information from the Tx and Rx system.
> 
> Indeed, the sequence number helps a lot for the correlation between the Tx 
> and Rx system. 
> 
> This makes me think there should be another way of getting the
> Tx and Rx timestamps without encoding them in the BFD packets.
> For example, the Tx and Rx systems could just save timestamps locally or
> send them to a centralized entity, and then use the sequence numbers to
> correlate them for further analysis.
> 
> Best regards,
> Mach
> 
>> 
>> 
>> Regards, Marc
>> 
>> 
>> 
>> 
>> 
>> 
>> On Wed, 26 Nov 2014 12:26:41 +0530, Manav Bhatia wrote:
>>> Hi Jeff,
>>> 
>>> I vividly remember the original intent of the stability draft was to
>>> help debug BFD failures -- to isolate the issue at the RX or the TX
>>> side. Time stamping would have helped in debugging whether the BFD
>>> packet was sent late, or whether the packet was sent on time and also
>>> arrived on time but was delayed when passing it up the BFD
>>> stack/processor (lay in the RX buffer for a tad too long), etc. But then
>>> time stamping came with its own set of issues, and was hence dropped
>>> from the original draft.
>>> 
>>> Can the authors send a summary to the list on why time stamping was
>>> dropped, so that we're all clear on that one?
>>> 
>>> The current proposal does help but is not complete.
>>> 
>>> Assume that the RX end loses a BFD session and learns later that it
>>> did eventually receive the missing BFD packets (based on the seq #).
>>> How would it know which end was misbehaving? Was it a delay at the TX
>>> side, or was it the RX that delayed passing the packets to the BFD
>>> process(or)? This is usually what we want to debug and I want to
>>> understand how this draft with sequence numbers can unequivocally tell
>>> me that.
>>> 
>>> I believe the work is important and addresses something that's really
>>> required (spent too much time debugging why BFD flapped!). Clearly
>>> what would help is putting a small section that describes how we can
>>> use the sequence numbers to debug what and where things went wrong.
>>> 
>>> Cheers, Manav
>>> 
>>> 
>>> On Wed, Nov 26, 2014 at 5:49 AM, Jeffrey Haas <jhaas@pfrc.org> wrote:
>>>> draft-ashesh-bfd-stability-01 was presented again during IETF-91 in
>>>> Honolulu.  The slides can be viewed here:
>>>> 
>>>> http://www.ietf.org/proceedings/91/slides/slides-91-bfd-4.pptx
>>>> 
>>>> To attempt to simplify the presentation, the contentious portion of
>>>> the timers was removed from the proposal, leaving only the sequence
>>>> numbering for detecting loss of BFD async packets.
>>>> 
>>>> When the room was polled to see whether the draft should be adopted
>>>> as a WG item, the sense of the room was very quiet.  As promised,
>>>> this is to inquire for support for this draft on the WG mailing list
>>>> to make sure the whole group has a voice.
>>>> 
>>>> It should be noted that post-meeting discussion on the fate of this
>>>> draft noted that BFD authentication code points are plentiful and are
>>>> available with expert review.  Should the draft authors wish to
>>>> continue this work as Experimental, that is an option.
>>>> 
>>>> -- Jeff
>>>> 
>>> 
>