RE: BFD stability follow-up from IETF-91

Gregory Mirsky <gregory.mirsky@ericsson.com> Wed, 03 December 2014 20:52 UTC

From: Gregory Mirsky <gregory.mirsky@ericsson.com>
To: Santosh P K <santoshpk@juniper.net>, Marc Binderberger <marc@sniff.de>, Manav Bhatia <manavbhatia@gmail.com>
Subject: RE: BFD stability follow-up from IETF-91
Date: Wed, 03 Dec 2014 20:52:25 +0000
Archived-At: http://mailarchive.ietf.org/arch/msg/rtg-bfd/wT6yGNL03biI2KdGESkLvJ0Ki00
Cc: "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>

Dear All,
Have the authors of the proposal, or have we, already dismissed the use of BFD Echo? I've scanned the thread and couldn't find any trace of a discussion of BFD Echo mode. I think it is more suitable for experimentation and unorthodox use.

	Regards,
		Greg

-----Original Message-----
From: Rtg-bfd [mailto:rtg-bfd-bounces@ietf.org] On Behalf Of Santosh P K
Sent: Wednesday, December 03, 2014 5:39 AM
To: Marc Binderberger; Manav Bhatia
Cc: rtg-bfd@ietf.org
Subject: RE: BFD stability follow-up from IETF-91

Hello Manav and Marc,
       

> > One way to solve this problem is by attaching a debug trailer that 
> > only carries the seq numbers at the *end* of the BFD packet. This 
> > would not be covered in the Length field carried in the BFD header 
> > but would be accounted for in the length carried in the IP header.
> 
> BFD itself is not related to IP, i.e. there is not always an IP header.
> Sure, the encapsulating "frame" may provide a length but actually, why
> not cover the debug trailer with the BFD length?
>
> If this is solely for debug purposes then this may work. For simple
> copying-out into e.g. a packet trace buffer it would be even simpler 
> to have the BFD length covering the trailer.
> If hardware is supposed to process the trailer information (other than 
> copying out) then it's getting ugly - having fixed position, fixed 
> length extension headers would be preferable for simple access.
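
The trailer idea quoted above can be sketched roughly as follows. This is only an illustration, not anything from the draft: it assumes the receiver knows the total payload length from the encapsulating layer, and the trailer layout (a single 32-bit sequence number) and the function names are hypothetical. The only standard detail relied on is that the BFD Length field is the fourth octet of the control packet and that the mandatory section is 24 octets.

```python
import struct

BFD_MIN_LEN = 24  # mandatory section of a BFD control packet


def split_debug_trailer(payload: bytes):
    """Split an encapsulated payload into (bfd_packet, trailer).

    The BFD Length field is the fourth octet of the control packet.
    Anything beyond that length is treated as the debug trailer.
    """
    if len(payload) < BFD_MIN_LEN:
        raise ValueError("payload shorter than a BFD control packet")
    bfd_len = payload[3]
    if bfd_len < BFD_MIN_LEN or bfd_len > len(payload):
        raise ValueError("invalid BFD Length field")
    return payload[:bfd_len], payload[bfd_len:]


def parse_seq_trailer(trailer: bytes):
    """Hypothetical trailer layout: one 32-bit sequence number, network order."""
    if len(trailer) < 4:
        return None  # no trailer present
    (seq,) = struct.unpack("!I", trailer[:4])
    return seq


# A 24-octet control packet (version 1, state Up, Length 24) plus a trailer.
header = bytes([0x20, 0xC0, 3, 24]) + bytes(20)
payload = header + struct.pack("!I", 1001)
bfd, trailer = split_debug_trailer(payload)
seq = parse_seq_trailer(trailer)
```

Because the trailer sits outside the BFD Length, a receiver that does not understand it can simply ignore the extra octets, which is also why it would not be covered by BFD authentication.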

Fixed length would be easy to process in hardware. The problem arises when we have many extensions in the future. If we want to use only one extension, and it is the last one, will I be forced to pad all the other extensions ahead of it? This might not be a problem with a few extensions, but it could become one when there are too many.


> 
> Another idea is to use the 0x80 bit of the auth type to distinguish 
> between a "normal" authentication header and a "sequence + authentication".
> 

I think this is good. In the BFD extension TLV we still have many reserved bits; could those be used as well?
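
A minimal sketch of how the 0x80 idea might look on the receive side, assuming the high bit of the Auth Type octet is used purely as a "sequence number present" flag and the low seven bits keep the existing type value. The flag name and function are hypothetical; nothing here is standardized.

```python
SEQ_FLAG = 0x80  # proposed use of the high bit of Auth Type (assumption)

# Auth Type values currently defined for BFD.
AUTH_TYPES = {
    1: "Simple Password",
    2: "Keyed MD5",
    3: "Meticulous Keyed MD5",
    4: "Keyed SHA1",
    5: "Meticulous Keyed SHA1",
}


def decode_auth_type(auth_type: int):
    """Return (base_type, has_sequence) for one Auth Type octet."""
    if not 0 <= auth_type <= 0xFF:
        raise ValueError("Auth Type must fit in one octet")
    return auth_type & 0x7F, bool(auth_type & SEQ_FLAG)


base, has_seq = decode_auth_type(0x84)
name = AUTH_TYPES.get(base, "unknown")
```

With this encoding, 0x04 would be plain Keyed SHA1 and 0x84 would be "sequence + Keyed SHA1", so an implementation that does not know the flag would see an unknown Auth Type rather than misparse the section.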

Thanks
Santosh P K 



> 
> On Thu, 27 Nov 2014 21:12:00 +0530, Manav Bhatia wrote:
> > Hi Santosh,
> >
> > You could use the crypto sequence numbers carried in the meticulous 
> > cryptographic auth for detecting packet losses. However, this breaks 
> > when you use non-meticulous crypto authentication since the sequence 
> > number is only incremented occasionally there. This, I believe, is a
> > deal breaker since I really envision non-meticulous mode to be the
> > one being widely deployed. In fact we were supposed to write a draft
> > on that and I guess it just fell through the cracks (lemme ping my
> > co-author on that!)
> >
> > One way to solve this problem is by attaching a debug trailer that 
> > only carries the seq numbers at the *end* of the BFD packet. This 
> > would not be covered in the Length field carried in the BFD header 
> > but would be accounted for in the length carried in the IP header. 
> > The concept of attaching a trailer is documented well and is used in 
> > the IGPs. RFC 6506 describes one such trailer for OSPFv3. The catch 
> > however is that this debug trailer will NOT be covered by the BFD 
> > authentication. Is this acceptable to the WG?
> >
> > I think the problem of diagnosing a BFD flap becomes all the more 
> > important with BFD authentication turned on since then we have more 
> > points where a delay can be inserted.
> >
> > Cheers, Manav
> >
> >
> > On Thu, Nov 27, 2014 at 8:32 PM, Santosh P K <santoshpk@juniper.net> wrote:
> >> Manav,
> >>     This is a good question.
> >>
> >>> Can the authors add some text on how this debugging mechanism 
> >>> would work if somebody employs BFD authentication?
> >>
> >> Right now we have considered without authentication (we are setting
> >> A bit). We should add some text on how we can use both Auth and debug
> >> TLV.
> >> Is there any suggestion you have? I will get back to you on this.
> >>
> >>
> >> Thanks
> >> Santosh P K
> >>
> >>>> -----Original Message-----
> >>>> From: Rtg-bfd [mailto:rtg-bfd-bounces@ietf.org] On Behalf Of Mach Chen
> >>>> Sent: Thursday, November 27, 2014 2:13 PM
> >>>> To: Marc Binderberger
> >>>> Cc: rtg-bfd@ietf.org
> >>>> Subject: RE: BFD stability follow-up from IETF-91
> >>>>
> >>>> Hi Marc,
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Marc Binderberger [mailto:marc@sniff.de]
> >>>>> Sent: Thursday, November 27, 2014 1:43 AM
> >>>>> To: Mach Chen
> >>>>> Cc: Manav Bhatia; rtg-bfd@ietf.org
> >>>>> Subject: RE: BFD stability follow-up from IETF-91
> >>>>>
> >>>>> Hello Mach,
> >>>>>
> >>>>>> This makes me think there should be another solution for
> >>>>>> getting the Tx and Rx timestamps without encoding the
> >>>>>> timestamps in the BFD packets.
> >>>>>> For example, the Tx and Rx systems could just save timestamps 
> >>>>>> locally or send them to a centralized entity and then use the 
> >>>>>> sequence numbers to correlate them for further analyzing.
> >>>>>
> >>>>> I remember some discussion on NVO3 about how many bits it takes ;-) -
> >>>>> could you send the links/draft names you are working on to this list?
> >>>>> May be useful for further discussions.
> >>>>
> >>>> Sure, here is the link for reference:
> >>>> http://tools.ietf.org/html/draft-chen-ippm-coloring-based-ipfpm-framework-02
> >>>>
> >>>> But what I want to say here is that since we have a sequence number, we
> >>>> may not need the marking-based solution. Suppose that someone wants to
> >>>> monitor the delay of a BFD packet: just record and save the timestamp at
> >>>> the Tx side, indexed by the sequence number. Similarly, do the same at
> >>>> the Rx side. Then, based on the timestamps from both Tx and Rx, and
> >>>> using the sequence number to correlate them, it is possible to monitor
> >>>> the delay of the BFD packet.
> >>>>
> >>>> That means that as long as there is a sequence number, BFD packet delay
> >>>> can be measured even without carrying the timestamp in the BFD packet.
> >>>>
> >>>> Best regards,
> >>>> Mach
> >>>>
> >>>>>
> >>>>>
> >>>>> Thanks & Regards,
> >>>>> Marc
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Wed, 26 Nov 2014 09:17:32 +0000, Mach Chen wrote:
> >>>>>> Hi Marc and Manav,
> >>>>>>
> >>>>>>> -----Original Message-----
> >>>>>>> From: Rtg-bfd [mailto:rtg-bfd-bounces@ietf.org] On Behalf Of Marc Binderberger
> >>>>>>> Sent: Wednesday, November 26, 2014 4:50 PM
> >>>>>>> To: Manav Bhatia
> >>>>>>> Cc: rtg-bfd@ietf.org
> >>>>>>> Subject: Re: BFD stability follow-up from IETF-91
> >>>>>>>
> >>>>>>> Hello Manav,
> >>>>>>>
> >>>>>>>> I believe the work is important and addresses something that's
> >>>>>>>> really required (spent too much time debugging why BFD flapped!).
> >>>>>>>
> >>>>>>> agree :-) we should keep the discussion alive.
> >>>>>>>
> >>>>>>>
> >>>>>>>> side Time stamping would have helped in debugging whether the
> >>>>>>>> BFD packet was sent late, or whether the packet was sent on time
> >>>>>>>> and also arrived on time but was delayed when passing it up
> >>>>>>>> the BFD stack/processor (lay in the RX buffer for a tad too
> >>>>>>>> long)
> >>>>>>>
> >>>>>>> well, I can see a point in having the Tx timestamps in the 
> >>>>>>> packet mainly for the purpose of knowing "this" packet was 
> >>>>>>> okay/not okay on the Tx side and to correlate it with your local Rx measurement.
> >>>>>>
> >>>>>> Yes, this is one solution if people think BFD delay is needed.
> >>>>>> If Tx timestamps are allowed to be carried in the packets, it
> >>>>>> seems there should be no problem leaving a seat for the Rx
> >>>>>> timestamps as well :-). After all, having both Tx and Rx
> >>>>>> timestamps may simplify the implementation.
> >>>>>>
> >>>>>>>
> >>>>>>> And even this point is less relevant with sequence numbers as 
> >>>>>>> this number allows the identification of packets and thus the 
> >>>>>>> correlation of information from the Tx and Rx system.
> >>>>>>
> >>>>>> Indeed, the sequence number helps a lot for the correlation between
> >>>>>> the Tx and Rx system.
> >>>>>>
> >>>>>> This makes me think there should be another solution for
> >>>>>> getting the Tx and Rx timestamps without encoding the
> >>>>>> timestamps in the BFD packets.
> >>>>>> For example, the Tx and Rx systems could just save timestamps 
> >>>>>> locally or send them to a centralized entity and then use the 
> >>>>>> sequence numbers to correlate them for further analyzing.
> >>>>>>
> >>>>>> Best regards,
> >>>>>> Mach
> >>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Regards, Marc
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, 26 Nov 2014 12:26:41 +0530, Manav Bhatia wrote:
> >>>>>>>> Hi Jeff,
> >>>>>>>>
> >>>>>>>> I vividly remember the original intent of the stability draft
> >>>>>>>> was to help debug BFD failures -- to isolate the issue at the
> >>>>>>>> RX or the TX side. Time stamping would have helped in
> >>>>>>>> debugging whether the BFD packet was sent late, or whether
> >>>>>>>> the packet was sent on time and also arrived on time but was
> >>>>>>>> delayed when passing it up the BFD stack/processor (lay in
> >>>>>>>> the RX buffer for a tad too long), etc. But then time stamping
> >>>>>>>> came with its own set of issues, and was hence dropped from
> >>>>>>>> the original draft.
> >>>>>>>>
> >>>>>>>> Can the authors send a summary on the list on why time 
> >>>>>>>> stamping was dropped so that we're all clear on that one.
> >>>>>>>>
> >>>>>>>> The current proposal does help but is not complete.
> >>>>>>>>
> >>>>>>>> Assume that the RX end loses a BFD session and learns later
> >>>>>>>> that it did eventually receive the missing BFD packets (based
> >>>>>>>> on the seq #).
> >>>>>>>> How would it know which end was misbehaving? Was it a delay
> >>>>>>>> at the TX side, or was it the RX that delayed passing the
> >>>>>>>> packets to the BFD process(or)? This is usually what we want
> >>>>>>>> to debug and I want to understand how this draft with
> >>>>>>>> sequence numbers can unequivocally tell me that.
> >>>>>>>>
> >>>>>>>> I believe the work is important and addresses something that's
> >>>>>>>> really required (spent too much time debugging why BFD flapped!).
> >>>>>>>> Clearly what would help is putting a small section that
> >>>>>>>> describes how we can use the sequence numbers to debug what
> >>>>>>>> and where things went wrong.
> >>>>>>>>
> >>>>>>>> Cheers, Manav
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Nov 26, 2014 at 5:49 AM, Jeffrey Haas <jhaas@pfrc.org> wrote:
> >>>>>>>>> draft-ashesh-bfd-stability-01 was presented again during 
> >>>>>>>>> IETF-91 in Honolulu.  The slides can be viewed here:
> >>>>>>>>>
> >>>>>>>>> http://www.ietf.org/proceedings/91/slides/slides-91-bfd-4.pptx
> >>>>>>>>>
> >>>>>>>>> To attempt to simplify the presentation, the contentious
> >>>>>>>>> portion of the timers was removed from the proposal,
> >>>>>>>>> leaving only the sequence numbering for detecting loss of BFD async packets.
> >>>>>>>>>
> >>>>>>>>> When the room was polled to see whether the draft should be 
> >>>>>>>>> adopted as a WG item, the sense of the room was very quiet.  
> >>>>>>>>> As promised, this is to inquire for support for this draft 
> >>>>>>>>> on the WG mailing list to make sure the whole group has a voice.
> >>>>>>>>>
> >>>>>>>>> It should be noted that post-meeting discussion on the fate
> >>>>>>>>> of this draft noted that BFD authentication code points are
> >>>>>>>>> plentiful and are available with expert review.  Should the
> >>>>>>>>> draft authors wish to continue this work as Experimental,
> >>>>>>>>> that is an option.
> >>>>>>>>>
> >>>>>>>>> -- Jeff
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>
> >>
> >
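
Mach's suggestion in the quoted thread (record timestamps locally at Tx and Rx, then correlate them by sequence number) could look roughly like the sketch below. It assumes each side exports a log mapping sequence number to a timestamp in microseconds and that the two clocks are synchronized well enough for the comparison to be meaningful; the log format and function name are hypothetical.

```python
def correlate(tx_log: dict, rx_log: dict):
    """Correlate Tx and Rx timestamp logs keyed by sequence number.

    Returns (delays, lost): per-packet one-way delay for packets seen
    on both sides, and the sorted sequence numbers seen only at Tx.
    """
    delays = {
        seq: rx_log[seq] - tx_time
        for seq, tx_time in tx_log.items()
        if seq in rx_log
    }
    lost = sorted(seq for seq in tx_log if seq not in rx_log)
    return delays, lost


# Timestamps in microseconds; packet 2 never arrived, packet 3 was late.
tx_log = {1: 0, 2: 300_000, 3: 600_000}
rx_log = {1: 2_000, 3: 950_000}
delays, lost = correlate(tx_log, rx_log)
```

In this example the gap in sequence numbers identifies packet 2 as lost, while the 350 ms delay on packet 3 points at where to look next. Without a second timestamp taken when the packet reaches the BFD process, this alone cannot separate network delay from delay in the Rx buffer, which is the question Manav raises.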