Re: BFD stability follow-up from IETF-91

Manav Bhatia <manavbhatia@gmail.com> Wed, 26 November 2014 23:42 UTC

Hi Marc,

(and we meet again! :-))

My claim is that just using sequence numbers may NOT help.

A BFD session with an interval of 33ms on router A will flap if A
does not see any BFD packet for 100ms (roughly three missed intervals).

Assume that the last seq number that A sees from the remote end is,
say, 100. A will bring down the BFD session if it does not see 101,
102 and 103 in the next 100ms.
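
To make the numbers concrete, here is a minimal Python sketch of A's
detection timer. It assumes a detect multiplier of 3 (implied by the
33ms/100ms figures, but not stated anywhere in this thread), and the
function names are made up for illustration; it is not taken from any
BFD implementation.

```python
def detection_time_ms(interval_ms, detect_mult):
    # Detection time is the interval times the detect multiplier.
    return interval_ms * detect_mult


def missing_at_expiry(last_seq, last_seen_ms, now_ms,
                      interval_ms=33, detect_mult=3):
    """Return the seq numbers A expected but never saw once the
    detection timer has expired, or [] if it has not expired yet.

    interval_ms=33 and detect_mult=3 are assumptions for this example.
    """
    elapsed = now_ms - last_seen_ms
    if elapsed < detection_time_ms(interval_ms, detect_mult):
        return []  # timer still running, session stays up
    return [last_seq + i for i in range(1, detect_mult + 1)]


if __name__ == "__main__":
    # Seq 100 was seen at t=0; nothing has arrived by t=100ms.
    print(missing_at_expiry(last_seq=100, last_seen_ms=0, now_ms=100))
```

With seq 100 seen at t=0 and nothing else by t=100ms, the timer
(3 x 33 = 99ms) has expired and the missing packets are 101, 102, 103.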

Further assume that these packets were not seen by A and the session
flaps. However, we get these 3 BFD packets immediately after this flap
-- at 100ms + some_delta.

Now, given just the sequence numbers, it's almost impossible for A to
know whether the issue was at the RX or the TX side.
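
A toy Python sketch of the ambiguity, under stated assumptions: seq 100
left the sender at t=0, some_delta is 5ms, and each packet carries a
hypothetical TX timestamp field (tx_ms) that the current proposal does
not have. All names here are invented for illustration.

```python
from dataclasses import dataclass

INTERVAL_MS = 33   # interval from the example
DETECT_MS = 100    # detection time from the example
DELTA_MS = 5       # hypothetical value for "some_delta"


@dataclass(frozen=True)
class Packet:
    seq: int
    tx_ms: int    # hypothetical: when the sender put it on the wire
    seen_ms: int  # when A's BFD process actually saw it


def tx_late():
    # Sender stalls: 101..103 all leave late, RX path adds no delay.
    t = DETECT_MS + DELTA_MS
    return [Packet(seq, t, t) for seq in (101, 102, 103)]


def rx_late():
    # Sender is on time: packets sit in A's RX buffer past the flap.
    t = DETECT_MS + DELTA_MS
    return [Packet(101 + i, (i + 1) * INTERVAL_MS, t) for i in range(3)]


def seq_only_view(packets):
    # Everything A can log when it has sequence numbers alone.
    return [(p.seq, p.seen_ms) for p in packets]


def tx_gaps(packets, last_tx_ms=0):
    # Gaps between consecutive TX timestamps (sender clock only, so
    # no clock sync between the two ends is needed for this check).
    gaps, prev = [], last_tx_ms
    for p in packets:
        gaps.append(p.tx_ms - prev)
        prev = p.tx_ms
    return gaps


if __name__ == "__main__":
    print(seq_only_view(tx_late()) == seq_only_view(rx_late()))
    print(tx_gaps(tx_late()), tx_gaps(rx_late()))
```

Both scenarios give A the same (seq, seen_ms) log, so the sequence
numbers alone cannot separate them. Only the hypothetical TX timestamps
differ: one 105ms gap for the late sender versus steady 33ms gaps for
the delayed receiver.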

Am I missing something?

Cheers, Manav


On Wed, Nov 26, 2014 at 2:20 PM, Marc Binderberger <marc@sniff.de> wrote:
> Hello Manav,
>
>> I believe the work is important and addresses something that's really
>> required (spent too much time debugging why BFD flapped!).
>
> agree :-) we should keep the discussion alive.
>
>
>> side Time stamping would have helped in debugging whether the BFD
>> packet was sent late, or whether the packet was sent on time and also
>> arrived on time but was delayed when passing it up the BFD
>> stack/processor (lay in the RX buffer for a tad too long)
>
> well, I can see a point in having the Tx timestamps in the packet mainly for
> the purpose of knowing "this" packet was okay/not okay on the Tx side and to
> correlate it with your local Rx measurement.
>
> And even this point is less relevant with sequence numbers as this number
> allows the identification of packets and thus the correlation of information
> from the Tx and Rx system.
>
>
> Regards, Marc
>
>
>
>
>
>
> On Wed, 26 Nov 2014 12:26:41 +0530, Manav Bhatia wrote:
>> Hi Jeff,
>>
>> I vividly remember the original intent of the stability draft was to
>> help debug BFD failures -- to isolate the issue at the RX or the TX
>> side. Time stamping would have helped in debugging whether the BFD
>> packet was sent late, or whether the packet was sent on time and also
>> arrived on time but was delayed when passing it up the BFD
>> stack/processor (lay in the RX buffer for a tad too long), etc. But then
>> time stamping came with its own set of issues, and was hence dropped
>> from the original draft.
>>
>> Can the authors send a summary to the list on why time stamping was
>> dropped, so that we're all clear on that one?
>>
>> The current proposal does help but is not complete.
>>
>> Assume that the RX end loses a BFD session and learns later that it
>> did eventually receive the missing BFD packets (based on the seq #).
>> How would it know which end was misbehaving? Was it a delay at the TX
>> side, or was it the RX that delayed passing the packets to the BFD
>> process(or)? This is usually what we want to debug and I want to
>> understand how this draft with sequence numbers can unequivocally tell
>> me that.
>>
>> I believe the work is important and addresses something that's really
>> required (spent too much time debugging why BFD flapped!). Clearly
>> what would help is adding a small section that describes how we can
>> use the sequence numbers to debug what went wrong and where.
>>
>> Cheers, Manav
>>
>>
>> On Wed, Nov 26, 2014 at 5:49 AM, Jeffrey Haas <jhaas@pfrc.org> wrote:
>>> draft-ashesh-bfd-stability-01 was presented again during IETF-91 in
>>> Honolulu.  The slides can be viewed here:
>>>
>>> http://www.ietf.org/proceedings/91/slides/slides-91-bfd-4.pptx
>>>
>>> To attempt to simplify the presentation, the contentious portion of the
>>> timers was removed from the proposal, leaving only the sequence numbering
>>> for detecting loss of BFD async packets.
>>>
>>> When the room was polled to see whether the draft should be adopted as a WG
>>> item, the sense of the room was very quiet.  As promised, this is to inquire
>>> for support for this draft on the WG mailing list to make sure the whole
>>> group has a voice.
>>>
>>> It should be noted that post-meeting discussion on the fate of this draft
>>> noted that BFD authentication code points are plentiful and are available
>>> with expert review.  Should the draft authors wish to continue this work as
>>> Experimental, that is an option.
>>>
>>> -- Jeff
>>>
>>