Re: I-D ACTION:draft-shen-bfd-intf-p2p-nbr-00.txt

Dave Katz <dkatz@juniper.net> Sat, 31 March 2007 00:53 UTC

From: Dave Katz <dkatz@juniper.net>
Date: Fri, 30 Mar 2007 17:52:56 -0700
To: Naiming Shen <naiming@cisco.com>
Cc: rtg-bfd@ietf.org
Subject: Re: I-D ACTION:draft-shen-bfd-intf-p2p-nbr-00.txt

On Mar 30, 2007, at 4:01 PM, Naiming Shen wrote:

>>
>> But secondly, there is a self-referential problem.  What you'd  
>> *really* like to do on a BFD failure is to take down the failing  
>> protocol and keep it down until the BFD session resumes.  But you  
>> can't do this if BFD is running on top of that failing protocol  
>> (which it will be) which thus requires the "special flag" hack and  
>> only temporarily taking down the protocol.  The fate-sharing  
>> implications of this are more serious, because it is possible to  
>> get false positives in some cases when the data protocol has  
>> failed.  Imagine, for example, running ISIS with this method.  If  
>> IPv4 fails, the BFD session goes down, the interface flaps for  
>> five seconds, and then is reenabled.  At that point ISIS will  
>> happily come back up and report IP reachability even though it  
>> doesn't exist.  You can bet that folks who have not yet  
>> implemented BFD will be tempted to do this.
>
>
> If you are saying that sharing fate between IPv4 BFD and ISIS is  
> a problem, then I agree.
> This scheme is only meant for sharing fate among protocols; it  
> should not be used
> if this assumption is not true. But using ISIS to carry IP  
> information itself violates this
> principle ;-) I don't know who to blame, although it's not a BFD issue.

I guess my point is that you need to be very explicit about what  
works and what doesn't.  It's certainly the case that, without the  
presence of BFD, ISIS will do Bad Things if IP forwarding breaks,  
since it will continue to advertise IP connectivity.  But the whole  
point of BFD is to sense data protocol connectivity and provide that  
(presumably useful) information to other parts of a system.  If BFD  
is providing information directly to ISIS, it can withdraw IP  
connectivity (or tear down the adjacency if need be) and keep it that  
way until connectivity is restored.  If ISIS relied on this scheme,  
and IP connectivity failed (but datalink connectivity remained), the  
ISIS adjacency would flap, and then ISIS would proceed to black hole  
traffic.
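
To make the failure mode concrete, here is a minimal sketch in Python of the two behaviors being compared. It is a toy model, not anything from the draft: the function names, the "flap" and "explicit" mode labels, and the assumption that the ISIS adjacency re-forms instantly once the interface is re-enabled are all illustrative. The five-second window is taken from the example earlier in the thread.

```python
FLAP_SECS = 5  # length of the interface flap, per the example in the thread


def ipv4_forwarding_ok(t, fail_at):
    """IPv4 forwarding works until it fails at time fail_at and never heals."""
    return t < fail_at


def isis_advertises_ip(t, fail_at, mode):
    """Does ISIS advertise IP reachability over this link at time t?

    mode "flap":     a BFD failure only bounces the interface for FLAP_SECS.
                     The data link still works, so the adjacency comes back
                     (assumed instantly here) and IP is advertised again.
    mode "explicit": BFD reports directly to ISIS, which withholds IP
                     reachability for as long as IPv4 forwarding is broken.
    """
    if mode == "explicit":
        return ipv4_forwarding_ok(t, fail_at)
    if mode == "flap":
        in_flap_window = fail_at <= t < fail_at + FLAP_SECS
        return not in_flap_window
    raise ValueError(f"unknown mode: {mode}")


def black_holing(t, fail_at, mode):
    """Traffic is black-holed when IP is advertised but cannot be forwarded."""
    return isis_advertises_ip(t, fail_at, mode) and not ipv4_forwarding_ok(t, fail_at)
```

With IPv4 failing at t=10, `black_holing(20, 10, "flap")` is true while `black_holing(20, 10, "explicit")` is false: once the flap window ends, the flap-only behavior advertises reachability that does not exist.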

I think you need to specifically disallow this mechanism for cases  
like this, namely, applications that will continue to run even with  
the failure of a data protocol, but whose correct operation requires  
that data protocol.  (Note that this sentence describes both ISIS and  
static routes.)
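
The criterion in the paragraph above can be written down as a simple predicate. This is only a sketch under stated assumptions: the `Client` record, its two field names, and the classification of the sample clients are illustrative and do not come from the draft or the generic spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    name: str
    # Keeps running on its own when the data protocol (e.g. IPv4) fails.
    survives_data_protocol_failure: bool
    # Needs that data protocol to work in order to operate correctly.
    requires_data_protocol: bool


def needs_explicit_bfd_interaction(client):
    """True for clients for which the flap-only mechanism should be disallowed."""
    return client.survives_data_protocol_failure and client.requires_data_protocol


# Illustrative classification, assuming BFD runs over IPv4.
ISIS = Client("ISIS carrying IP", True, True)         # runs over the data link
STATIC = Client("static route", True, True)           # never times out by itself
OSPFV2 = Client("OSPFv2", False, True)                # hellos die along with IPv4
```

Under these assumptions the predicate is true for `ISIS` and `STATIC` and false for `OSPFV2`, which goes down on its own and stays down until the path is healed.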

OSPFv3 is another interesting example if you're running BFD in this  
configuration only over IPv4.  If there is a v4 failure, OSPFv3 will  
flap unnecessarily.  This gets back to the IPv4 == everything fate  
sharing that is at the heart of the way you've specified it, and  
which I think is an unnecessary restriction.  A number of systems  
(including the one I'm most familiar with these days) have an  
interface hierarchy that includes the data protocol.  Such systems  
are likely better served by having, say, separate v4 and v6 BFD  
sessions and flapping the appropriate data protocol up/down status in  
the face of a BFD session failure.  This would allow the OSPFv3 case  
to run unmolested when v4 died.  I would suggest at least offering  
this up as a MAY when you discuss the fate sharing implications of  
this mechanism, since it should be essentially no more work to  
implement if the system is already built this way.
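
A rough sketch of the per-data-protocol alternative, in Python, assuming a system whose interface hierarchy already tracks up/down status per address family. The `Interface` class, the `CLIENT_FAMILY` table, and the `per_family` switch are hypothetical names used only to contrast the two behaviors.

```python
from dataclasses import dataclass, field


@dataclass
class Interface:
    # Up/down status of each data protocol in the interface hierarchy.
    family_up: dict = field(default_factory=lambda: {"ipv4": True, "ipv6": True})


# Which data protocol each client depends on (illustrative).
CLIENT_FAMILY = {"OSPFv2": "ipv4", "OSPFv3": "ipv6"}


def on_bfd_state_change(intf, bfd_family, session_up, per_family=True):
    """Apply a BFD session state change to the interface.

    per_family=True:  only the data protocol the session runs over is affected.
    per_family=False: every data protocol shares fate with that one session
                      (the "IPv4 == everything" behavior).
    """
    if per_family:
        intf.family_up[bfd_family] = session_up
    else:
        for family in intf.family_up:
            intf.family_up[family] = session_up


def affected_clients(intf):
    """Clients whose data protocol is currently marked down."""
    return sorted(c for c, fam in CLIENT_FAMILY.items() if not intf.family_up[fam])
```

When the IPv4 session fails, the per-family variant reports only OSPFv2 as affected, while the shared-fate variant takes OSPFv3 down as well.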

>>
>> In light of this, my preference would be for all of the verbiage  
>> about static routes and dynamic protocols and special flags to be  
>> removed.  In place of this, add text that is very specific about  
>> the fate-sharing implications of this mechanism as outlined above,  
>> and point out that any application of BFD that does not  
>> automatically share fate with the data protocol over which BFD is  
>> running (such as ISIS or static routes) MUST have some form of  
>> explicit interaction with BFD in order to avoid false positives,  
>> and leave it at that.  The "special bit" hack is orthogonal to  
>> this mechanism;  it could just as well have been specified in the  
>> generic spec (and would have been just as inappropriate there.)
>
> I think the dynamic and static difference still needs to be  
> mentioned, although it should not
> be directly linked with a 'special flag' for the static routing.

But I think the "difference" here is fundamental--as soon as you have  
any special case communication between BFD and a part of the system,  
you've basically discarded the point of the draft (if I understand  
it) which is to be able to leverage BFD without changing your  
"clients" to specifically use it.  What you're specifying here is  
functionally *exactly* the same as what the generic spec talks about  
for static routes and other non-protocol applications, and only  
muddies the spec, IMHO.

Why not just say that this mechanism only provides a way of more  
rapidly taking down applications that would otherwise go down  
eventually and which will stay down on their own until the path is  
healed (namely, control protocols), and leave statics out of it  
altogether?


--Dave