Re: TWAMP analysis for assisting BFD debuggin (was Re: BFD stability follow-up from IETF-91)

Jeffrey Haas <jhaas@pfrc.org> Mon, 22 December 2014 15:27 UTC

Date: Mon, 22 Dec 2014 10:27:52 -0500
From: Jeffrey Haas <jhaas@pfrc.org>
To: Gregory Mirsky <gregory.mirsky@ericsson.com>
Subject: Re: TWAMP analysis for assisting BFD debuggin (was Re: BFD stability follow-up from IETF-91)
Archived-At: http://mailarchive.ietf.org/arch/msg/rtg-bfd/Zvj9pGaoQXb2Tv13PFC5tEw2nWI
Cc: "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>

On Fri, Dec 19, 2014 at 10:52:10PM +0000, Gregory Mirsky wrote:
> thank you for your interest in this discussion. If you agree, we can add IPPM WG to it.

I think that would be excessively overloading the conversation at this time.
My primary point was that a performance measuring tech wasn't a fully
appropriate tool for this situation.  At some point if IPPM would like to
have BFD devs visit their session and provide feedback about where
implementation layering issues bias measurement, I suspect we can find
volunteers.

> I agree with your analysis of TWAMP and I would not suggest that
> TWAMP-Test is well suited to debug BFD specific issues. But TWAMP-Test, as
> other active performance measurement mechanisms, i.e. Y.1731 and MPLS PM
> based on RFC 6374, may be used to measure latency and jitter of a network
> or its segment.

But such measurements will depend on where in the implementation the
injection and measurement are done.  The closer you get to IEEE work, the
more the measurements are likely to be implemented in HW as part of L2.  The
closer to IETF, the more likely it will be a layer just above HW L2.  Each
kind of measurement is valuable, and the results are ideally congruent.

>  And even more than active, passive measurement methods may
> provide helpful information to troubleshoot network or BFD. That may be
> done using IPFIX on certain nodes in the network with the data analysis by
> a data collector.

IPFIX typically will give you biased statistics either due to binning
effects or the fact that it's statistically sampled.
(Admittedly, I don't follow current work, only work I've done in the last 5
years with flow.)

To some extent, that observation highlights a difference in what is being
observed.  When you are concerned about flow-level impacts in your network,
the aggregates are the item of concern.  For BFD, and to a lesser extent
things like the other control plane protocols, a small number of lost
packets at inopportune times has significantly negative impacts.  The
granularities become much tighter.
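
As a rough illustration of that difference, here is a minimal sketch under a
deliberately simplified model: one BFD packet per interval, and the session
is declared down once detect_mult consecutive packets are missed (timer
jitter is ignored).  The function names and the traces are hypothetical and
exist only for this example.

```python
def aggregate_loss(received):
    """Fraction of packets lost over the whole trace."""
    return 1 - sum(received) / len(received)


def longest_loss_run(received):
    """Length of the longest run of consecutive lost packets."""
    longest = run = 0
    for ok in received:
        run = 0 if ok else run + 1
        longest = max(longest, run)
    return longest


def bfd_would_go_down(received, detect_mult=3):
    """Simplified model: down once detect_mult packets in a row are missed."""
    return longest_loss_run(received) >= detect_mult


def make_trace(length, lost_indices):
    """Build a trace of booleans: True = received, False = lost."""
    lost = set(lost_indices)
    return [i not in lost for i in range(length)]


# Two hypothetical traces with the same aggregate loss (3 of 10000 packets).
spread = make_trace(10000, [1000, 5000, 9000])
burst = make_trace(10000, [5000, 5001, 5002])

for name, trace in (("spread", spread), ("burst", burst)):
    print(name, aggregate_loss(trace), bfd_would_go_down(trace))
```

Both traces look identical in an aggregate loss figure, but only the burst
trips the session in this model.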

> I agree that debugging and troubleshooting is often an art and operators
> and vendors  need to have and use all tools that are available. We
> probably can do better job of instrumenting BFD debug tools but, I
> believe, not by changing, not by overloading its main functions of
> monitoring continuity and fast detection of the Loss of Continuity defect.
> Whether Down state is indication of the real defect or false negative,
> that is the question to be answered through analysis of available
> information and follow-up with debugging and troubleshooting the BFD
> itself rather than the network if there are concerns that it was not a
> real defect.

That is the fundamental point: What tools do we need in BFD to troubleshoot
BFD especially given the layering issues that are somewhat distinct for its
implementations?  Sequencing information, minimally.  I am becoming more
convinced that the timing information is helpful - if tricky to implement in
a cross-vendor fashion.  I don't believe that BFD with such extensions
should replace mechanisms such as CFM.
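
To make the sequencing point concrete, here is a small sketch of what a
receiver could infer if every packet carried a monotonically increasing
sequence number.  This is an assumption for illustration, not a description
of any existing BFD extension; the function name and report fields are
hypothetical, and sequence number wraparound is not handled.

```python
def summarize_sequence(seqs):
    """Classify arrivals as missing, late (reordered), or duplicate.

    seqs is the list of sequence numbers in the order they were received.
    Assumes the sender increments by one per packet and never wraps.
    """
    missing = set()   # numbers skipped over and not (yet) seen
    seen = set()      # every number received so far
    late = 0          # arrivals that filled an earlier gap
    duplicates = 0    # arrivals of a number already received
    highest = None

    for s in seqs:
        if s in seen:
            duplicates += 1
            continue
        seen.add(s)
        if highest is None or s > highest:
            if highest is not None:
                # Everything between the old and new high-water mark
                # is provisionally missing.
                missing.update(range(highest + 1, s))
            highest = s
        elif s in missing:
            # A gap was filled after the fact: reordering, not loss.
            missing.discard(s)
            late += 1

    return {"missing": sorted(missing), "late": late, "duplicates": duplicates}


# Hypothetical arrival order: 4 never arrives, 5 arrives late, 7 twice.
report = summarize_sequence([1, 2, 3, 6, 7, 5, 7, 8])
print(report)
```

Even this much separates "the packet was lost" from "the packet arrived but
out of order", which a bare detection timeout cannot distinguish.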

-- Jeff