Re: I-D ACTION:draft-shen-bfd-intf-p2p-nbr-00.txt

Naiming Shen <naiming@cisco.com> Mon, 02 April 2007 17:14 UTC

To: "Thomas D. Nadeau" <tnadeau@cisco.com>
Cc: rtg-bfd@ietf.org, Dave Katz <dkatz@juniper.net>

Thomas,

On Apr 2, 2007, at 7:01 AM, Thomas D. Nadeau wrote:

>
> On Mar 30, 2007, at 9:29 PM, Naiming Shen wrote:
>
>>
>> On Mar 30, 2007, at 5:52 PM, Dave Katz wrote:
>>
>>>
>>> On Mar 30, 2007, at 4:01 PM, Naiming Shen wrote:
>>>
>>>>>
>>>>> But secondly, there is a self-referential problem.  What you'd  
>>>>> *really* like to do on a BFD failure is to take down the  
>>>>> failing protocol and keep it down until the BFD session  
>>>>> resumes.  But you can't do this if BFD is running on top of  
>>>>> that failing protocol (which it will be) which thus requires  
>>>>> the "special flag" hack and only temporarily taking down the  
>>>>> protocol.  The fate-sharing implications of this are more  
>>>>> serious, because it is possible to get false positives in some  
>>>>> cases when the data protocol has failed.  Imagine, for example,  
>>>>> running ISIS with this method.  If IPv4 fails, the BFD session  
>>>>> goes down, the interface flaps for five seconds, and then is  
>>>>> reenabled.  At that point ISIS will happily come back up and  
>>>>> report IP reachability even though it doesn't exist.  You can  
>>>>> bet that folks who have not yet implemented BFD will be tempted  
>>>>> to do this.
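
Just to make that failure sequence concrete, here is a rough sketch of
the interface-flap behavior being described, assuming a five-second
hold-down and an ISIS adjacency that only needs datalink connectivity
(the class and method names are made up for illustration):

  class Interface:
      def __init__(self):
          self.admin_up = True
          self.ipv4_forwarding_ok = True

  class IsIs:
      # ISIS hellos ride directly on the datalink, so the adjacency only
      # tracks interface state, not IPv4 forwarding health.
      def adjacency_up(self, intf):
          return intf.admin_up

  def on_bfd_ipv4_session_down(intf, hold_down_secs=5):
      # The mechanism under discussion: a BFD (IPv4) failure flaps the
      # interface, then re-enables it after the hold-down, with no
      # memory of why it went down.
      intf.admin_up = False
      # ... hold_down_secs elapse ...
      intf.admin_up = True

  intf = Interface()
  intf.ipv4_forwarding_ok = False      # IPv4 breaks; BFD notices
  on_bfd_ipv4_session_down(intf)
  print(IsIs().adjacency_up(intf))     # True: ISIS reports IP
                                       # reachability that does not exist
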
>>>>
>>>>
>>>> If you are saying that sharing fate between IPv4 BFD and ISIS is a
>>>> problem, then I agree: this scheme is only meant for sharing fate
>>>> among protocols, and it should not be used if that assumption does
>>>> not hold. But using ISIS to carry IP information itself violates
>>>> this principle;-) don't know who to blame, although it's not a BFD
>>>> issue.
>>>
>>> I guess my point is that you need to be very explicit about what  
>>> works and what doesn't.  It's certainly the case that, without  
>>> the presence of BFD, ISIS will do Bad Things if IP forwarding  
>>> breaks, since it will continue to advertise IP connectivity.  But  
>>> the whole point of BFD is to sense data protocol connectivity and  
>>> provide that (presumably useful) information to other parts of a  
>>> system.  If BFD is providing information directly to ISIS, it can  
>>> withdraw IP connectivity (or tear down the adjacency if need be)  
>>> and keep it that way until connectivity is restored.  If ISIS  
>>> relied on this scheme, and IP connectivity failed (but datalink  
>>> connectivity remained), the ISIS adjacency would flap, and then  
>>> ISIS would proceed to black hole traffic.
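
For contrast, a minimal sketch of the direct interaction described
there, where ISIS registers as a BFD client and keeps IP reachability
withdrawn until the session actually recovers (hypothetical names
again):

  class IsIsWithBfdClient:
      def __init__(self):
          self.ipv4_reachability_advertised = True

      def on_bfd_state_change(self, session_up):
          # Registered-client model: ISIS withdraws IPv4 reachability
          # (or tears down the adjacency) on BFD failure, and restores
          # it only when BFD itself reports the path healed.
          self.ipv4_reachability_advertised = session_up

  isis = IsIsWithBfdClient()
  isis.on_bfd_state_change(False)   # IPv4 path failed
  isis.on_bfd_state_change(True)    # path healed again
  print(isis.ipv4_reachability_advertised)   # True only after recovery
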
>>
>> Even with the normal BFD interaction with ISIS (for IPv4), I would
>> think it can also black-hole like this. The IPv4 BFD session is
>> flapping while the datalink layer is fine, ISIS packets go through OK,
>> and the hellos are happy. ISIS will bring up the adjacency, then
>> register with BFD, and when BFD later fails again it will bring down
>> the ISIS session. I fail to see the difference between the two
>> schemes.
>
> 	A simple example of how this can break: consider a distributed
> router where you run ISIS on the routing processor and BFD on
> specialized hardware down on a line card. While the line card's
> hardware could continue to send/receive BFD packets, the ISIS process
> on the RP could crash or get wedged and stop processing routing
> updates. If the routing protocol timers are sufficiently long (usually
> on the order of minutes or even hours), no one will know until the
> timer goes off.
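
A toy illustration of that split, with the BFD session answered by
line-card hardware while the routing process on the RP is wedged (the
structure here is hypothetical, not any particular platform):

  class LineCardBfd:
      # BFD offloaded to line-card hardware: the session stays up as
      # long as the LC's own packet path can echo BFD packets.
      def session_up(self):
          return True

  class RouteProcessorIsis:
      # The ISIS process on the RP is a separate software path; BFD run
      # on the LC never exercises it, so a wedged process is only caught
      # when a long protocol timer finally expires.
      def __init__(self, wedged):
          self.wedged = wedged
      def processing_updates(self):
          return not self.wedged

  print(LineCardBfd().session_up())                            # True
  print(RouteProcessorIsis(wedged=True).processing_updates())  # False
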
>
>>
>>> I think you need to specifically disallow this mechanism for  
>>> cases like this, namely, applications that will continue to run  
>>> even with the failure of a data protocol, but whose correct  
>>> operation requires that data protocol.  (Note that this sentence  
>>> describes both ISIS and static routes.)
>>>
>>> OSPFv3 is another interesting example if you're running BFD in  
>>> this configuration only over IPv4.  If there is a v4 failure,  
>>> OSPFv3 will flap unnecessarily.  This gets back to the IPv4 ==  
>>> everything fate sharing that is at the heart of the way you've  
>>> specified it, and which I think is an unnecessary restriction.  A  
>>> number of systems (including the one I'm most familiar with these  
>>> days) have an interface hierarchy that includes the data  
>>> protocol.  Such systems are likely better served by having, say,  
>>> separate v4 and v6 BFD sessions and flapping the appropriate data  
>>> protocol up/down status in the face of a BFD session failure.   
>>> This would allow the OSPFv3 case to run unmolested when v4 died.   
>>> I would suggest to at least offer this up as a MAY when you  
>>> discuss the fate sharing implications of this mechanism, since it  
>>> should be essentially no more work to implement if the system is  
>>> already built this way.
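
That per-data-protocol variant might look roughly like this on a system
whose interface model already carries separate v4/v6 status flags (the
API here is invented for illustration):

  class Interface:
      def __init__(self):
          # separate up/down status per data protocol, below the
          # interface itself
          self.proto_up = {"ipv4": True, "ipv6": True}

  def on_bfd_state_change(intf, data_proto, session_up):
      # One BFD session per data protocol; each session only flaps the
      # protocol it actually verified, so OSPFv3 keeps running over
      # IPv6 when only the IPv4 session dies.
      intf.proto_up[data_proto] = session_up

  intf = Interface()
  on_bfd_state_change(intf, "ipv4", False)   # IPv4 BFD failure
  print(intf.proto_up)                  # {'ipv4': False, 'ipv6': True}
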
>>
>> Sure. There can be a configuration option for bringing down the whole
>> thing, or bringing down just the data protocol part if the platform
>> supports that.
>>
>> Even though, architecture-wise, the separation of BFD sessions is
>> clean (IPv4 controls the IPv4 protocols and IPv6 controls the IPv6
>> protocols), there is still much to be desired from an implementation
>> point of view. On many routers the BFD packets going out do not really
>> follow the exact forwarding path of the data packets, or the packets
>> are sent out from the same software process, be it v4 or v6. So the
>> argument for data-protocol separation is rather moot. And I have yet
>> to see a case where a BFD session going down is actually caused by the
>> layer 3 lookup engine that is only responsible for IPv4;-) I would
>> rather do the re-route altogether if we know one of the data planes is
>> already in trouble.
>
> 	Right. Just look at the example I gave above. Even if you run the  
> ISIS
> process on the line card's CPU, if BFD is run in special LC  
> hardware outside
> of the CPU, that constitutes two disjoint forwarding IP stacks; BFD  
> will only
> be testing its own.
>
>>
>>>
>>>>>
>>>>> In light of this, my preference would be for all of the  
>>>>> verbiage about static routes and dynamic protocols and special  
>>>>> flags to be removed.  In place of this, add text that is very  
>>>>> specific about the fate-sharing implications of this mechanism  
>>>>> as outlined above, and point out that any application of BFD  
>>>>> that does not automatically share fate with the data protocol  
>>>>> over which BFD is running (such as ISIS or static routes) MUST  
>>>>> have some form of explicit interaction with BFD in order to  
>>>>> avoid false positives, and leave it at that.  The "special bit"  
>>>>> hack is orthogonal to this mechanism;  it could just as well  
>>>>> have been specified in the generic spec (and would have been  
>>>>> just as inappropriate there.)
>>>>
>>>> I think the dynamic vs. static difference still needs to be
>>>> mentioned, although it should not be directly linked with a
>>>> 'special flag' for static routing.
>>>
>>> But I think the "difference" here is fundamental--as soon as you  
>>> have any special case communication between BFD and a part of the  
>>> system, you've basically discarded the point of the draft (if I  
>>> understand it) which is to be able to leverage BFD without  
>>> changing your "clients" to specifically use it.  What you're  
>>> specifying here is functionally *exactly* the same as what the  
>>> generic spec talks about for static routes and other non-protocol  
>>> applications, and only muddies the spec, IMHO.
>>
>> Only the Up->Down portion differs between the two schemes; the rest
>> is the same. But the bring-down itself is different depending on
>> whether the client is dynamic or static.
>>
>>>
>>> Why not just say that this mechanism only provides a way of more  
>>> rapidly taking down applications that would otherwise go down  
>>> eventually and which will stay down on their own until the path  
>>> is healed (namely, control protocols), and leave statics out of  
>>> it altogether?
>>
>> Maybe you have a point. I'll think about that.
>
> 	I agree with Katz here. The point of this draft should be to  
> suggest that
> this approach AUGMENT the normal routing hellos, as well as  
> existing ifUp/Down
> mechanisms inside your box.

I agree the intf Up->Down signal is the key of this draft; but as I
pointed out in the previous email, there is also an important difference
in the bring-up stage for non-protocol services: the BFD session state
is condensed into an interface-based indication, which is very easy to
implement. Since those services look at the interface state anyway,
augmenting them with this intf-p2p BFD is just a one-line diff to each
service.
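
For illustration, assuming a service that already gates on interface
state, the change amounts to something like this (the helper names are
made up, not from the draft):

  class Intf:
      # hypothetical interface object the service already consults
      def is_up(self):
          return True
      def bfd_intf_session_up(self):
          # the condensed intf-p2p BFD indication described above
          return True

  def static_route_path_usable(intf):
      # before:  return intf.is_up()
      # after: the BFD indication folds into the same check the service
      # already makes, so each service changes exactly one line
      return intf.is_up() and intf.bfd_intf_session_up()

  print(static_route_path_usable(Intf()))   # True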

thanks.
- Naiming

>
> 	--Tom
>
>
>
>>
>> thanks.
>> - Naiming
>>
>>>
>>>
>>> --Dave
>