Re: I-D ACTION:draft-shen-bfd-intf-p2p-nbr-00.txt

Dave Katz <dkatz@juniper.net> Fri, 30 March 2007 21:45 UTC

From: Dave Katz <dkatz@juniper.net>
Date: Fri, 30 Mar 2007 14:44:56 -0700
To: Naiming Shen <naiming@cisco.com>
Cc: rtg-bfd@ietf.org
Subject: Re: I-D ACTION:draft-shen-bfd-intf-p2p-nbr-00.txt

Hi Naiming, thanks for submitting this.

I've pointed out privately to Naiming that I think this is a rather  
ugly architectural hack, albeit a pragmatic one.  The Right Way™ to  
do this would be to run BFD directly over the datalink layer and take  
down all data protocols (but not the datalink) in sympathy with the  
BFD session state.  However, there are (at least) two problems with  
this.  Firstly, it means specifying how to run BFD over a plethora of  
datalink layers.  Secondly, the IETF cannot standardize how to run  
BFD over datalink layers it does not control (so we could standardize  
something directly over PPP, for example, but not IEEE or Cisco HDLC.)

The architectural ugliness, aside from being a layer violation, rears  
its head in a couple of ways related to fate-sharing.  Firstly, it  
means that everything shares fate with the IP over which BFD is being  
run, so if, say, IPv4 forwarding fails, non-v4 applications will see  
the failure.  This isn't *too* terrible;  it means that there may be  
false negatives, causing the application to temporarily flap.

But secondly, there is a self-referential problem.  What you'd  
*really* like to do on a BFD failure is to take down the failing  
protocol and keep it down until the BFD session resumes.  But you  
can't do this if BFD is running on top of that failing protocol  
(which it will be), which is what forces the "special flag" hack and  
taking the protocol down only temporarily.  The fate-sharing  
implications of this are more serious, because it is possible to get  
false positives in some cases when the data protocol has failed.   
Imagine, for example, running ISIS with this method.  If IPv4 fails,  
the BFD session goes down, the interface flaps for five seconds, and  
then is reenabled.  At that point ISIS will happily come back up and  
report IP reachability even though it doesn't exist.  You can bet  
that folks who have not yet implemented BFD will be tempted to do this.
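
The ISIS scenario above can be sketched as a toy model.  This is only
an illustration under stated assumptions, not anything taken from the
draft: the Link class, the function names, and the
explicit_bfd_interaction switch are all hypothetical, and the flap is
reduced to two phases (interface held down, then reenabled).

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical link model: layer 2 and IPv4 can fail independently."""
    datalink_up: bool = True
    ipv4_forwarding: bool = True


def bfd_session_up(link: Link) -> bool:
    # In this scheme BFD is carried over IPv4, so it shares fate with it.
    return link.datalink_up and link.ipv4_forwarding


def isis_reports_ip_reachable(link: Link, interface_enabled: bool,
                              explicit_bfd_interaction: bool) -> bool:
    # ISIS runs directly over the datalink and never sees IPv4 state.
    adjacency_up = interface_enabled and link.datalink_up
    if explicit_bfd_interaction:
        # Assumed explicit interaction: ISIS also consults BFD state.
        return adjacency_up and bfd_session_up(link)
    return adjacency_up


def simulate(explicit_bfd_interaction: bool):
    link = Link()
    link.ipv4_forwarding = False  # IPv4 fails, the datalink stays healthy
    timeline = []
    # Phase 1: BFD went down, so the interface is flapped (held down).
    timeline.append(("flap", isis_reports_ip_reachable(
        link, False, explicit_bfd_interaction)))
    # Phase 2: the flap ends and the interface is reenabled,
    # although the BFD session is still down.
    timeline.append(("reenabled", isis_reports_ip_reachable(
        link, True, explicit_bfd_interaction)))
    return timeline


if __name__ == "__main__":
    print(simulate(explicit_bfd_interaction=False))
    print(simulate(explicit_bfd_interaction=True))
```

Without the explicit interaction, the model reports IP reachability
again as soon as the interface is reenabled, which is the false
positive described above.  With it, ISIS stays down for as long as the
BFD session does.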

My other objection to the "special flag" is that it is arguably  
overspecified.  As far as I can tell, this is isomorphic with simply  
having static routes interact with BFD "directly" and is already  
covered by the generic spec (which carefully says only what the  
effect of the interaction should be, not the implementation of it.)   
It is an explicit signaling mechanism from BFD, albeit a primitive one.

In light of this, my preference would be for all of the verbiage  
about static routes and dynamic protocols and special flags to be  
removed.  In place of this, add text that is very specific about the  
fate-sharing implications of this mechanism as outlined above, and  
point out that any application of BFD that does not automatically  
share fate with the data protocol over which BFD is running (such as  
ISIS or static routes) MUST have some form of explicit interaction  
with BFD in order to avoid false positives, and leave it at that.   
The "special bit" hack is orthogonal to this mechanism;  it could  
just as well have been specified in the generic spec (and would have  
been just as inappropriate there.)

It would be nice to point out that the only function of flapping the  
interface is to provide an Up->Down edge for protocols to see, and  
that the only requirement for duration is for it to be long enough so  
that it isn't absorbed by any hysteresis mechanism that might be  
sitting on top of it, and short enough so that the protocol isn't  
held down for "too long" (another ugly interaction.)
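
The two-sided constraint on flap duration can be written down as a
simple check.  This is a rough sketch only: the function names and
parameters are hypothetical, and the numeric values used with it are
placeholders rather than recommendations (the five seconds echoes the
example earlier in this message).

```python
def sees_down_edge(flap_s: float, hysteresis_s: float) -> bool:
    """True if a damping layer that ignores down events shorter than
    hysteresis_s would still pass this flap up to the protocol."""
    return flap_s > hysteresis_s


def flap_duration_ok(flap_s: float, hysteresis_s: float,
                     max_hold_down_s: float) -> bool:
    """Long enough to survive hysteresis, short enough not to hold
    the protocol down for 'too long' (max_hold_down_s is an assumed,
    operator-chosen bound)."""
    return sees_down_edge(flap_s, hysteresis_s) and flap_s <= max_hold_down_s


if __name__ == "__main__":
    for flap in (1.0, 5.0, 15.0):
        print(flap, flap_duration_ok(flap, hysteresis_s=2.0,
                                     max_hold_down_s=10.0))
```

With these placeholder bounds, a one-second flap would be absorbed by
the hysteresis, a fifteen-second flap would hold the protocol down
longer than the assumed limit, and five seconds falls in between.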

--Dave

On Mar 30, 2007, at 10:21 AM, Naiming Shen wrote:

>
> hi bfd-wg,
>
> comments are welcome.
>
> thanks.
> - Naiming