Re: I-D ACTION:draft-shen-bfd-intf-p2p-nbr-00.txt
Naiming Shen <naiming@cisco.com> Mon, 02 April 2007 17:14 UTC
In-Reply-To: <5B807F72-EEB9-4D11-91E3-4798187CAABB@cisco.com>
References: <E1HWe9i-0004Zp-AR@stiedprstage1.ietf.org> <3D93D8A8-F2CD-4F5D-BA37-5A2489E2C3DA@cisco.com> <29C50044-05B4-412E-B0D8-4B1B6F38672F@juniper.net> <E69131CB-20D2-4B5C-8485-831D6F038AC9@cisco.com> <448453AC-AAC4-4924-8BF2-87AC85907252@juniper.net> <F9A4058F-65FD-46DB-A3B7-681AB089A3EB@cisco.com> <5B807F72-EEB9-4D11-91E3-4798187CAABB@cisco.com>
Mime-Version: 1.0 (Apple Message framework v752.3)
Content-Type: text/plain; charset="US-ASCII"; delsp="yes"; format="flowed"
Message-Id: <078305F4-C44D-4853-8C96-23FF3E2338E2@cisco.com>
Content-Transfer-Encoding: 7bit
From: Naiming Shen <naiming@cisco.com>
Date: Mon, 02 Apr 2007 10:14:27 -0700
To: "Thomas D. Nadeau" <tnadeau@cisco.com>
Cc: rtg-bfd@ietf.org, Dave Katz <dkatz@juniper.net>
Subject: Re: I-D ACTION:draft-shen-bfd-intf-p2p-nbr-00.txt
Thomas,

On Apr 2, 2007, at 7:01 AM, Thomas D. Nadeau wrote:

>
> On Mar 30, 2007, at 9:29 PM, Naiming Shen wrote:
>
>>
>> On Mar 30, 2007, at 5:52 PM, Dave Katz wrote:
>>
>>>
>>> On Mar 30, 2007, at 4:01 PM, Naiming Shen wrote:
>>>
>>>>>
>>>>> But secondly, there is a self-referential problem. What you'd
>>>>> *really* like to do on a BFD failure is to take down the
>>>>> failing protocol and keep it down until the BFD session
>>>>> resumes. But you can't do this if BFD is running on top of
>>>>> that failing protocol (which it will be) which thus requires
>>>>> the "special flag" hack and only temporarily taking down the
>>>>> protocol. The fate-sharing implications of this are more
>>>>> serious, because it is possible to get false positives in some
>>>>> cases when the data protocol has failed. Imagine, for example,
>>>>> running ISIS with this method. If IPv4 fails, the BFD session
>>>>> goes down, the interface flaps for five seconds, and then is
>>>>> reenabled. At that point ISIS will happily come back up and
>>>>> report IP reachability even though it doesn't exist. You can
>>>>> bet that folks who have not yet implemented BFD will be tempted
>>>>> to do this.
>>>>
>>>> If you are saying that sharing fate between ipv4 bfd and ISIS
>>>> is a problem, then I agree. This scheme is only meant for
>>>> sharing fate among protocols; it should not be used if this
>>>> assumption is not true. But using ISIS to carry IP information
>>>> is itself violating this principle ;-) I don't know who to
>>>> blame, although it's not a BFD issue.
>>>
>>> I guess my point is that you need to be very explicit about what
>>> works and what doesn't. It's certainly the case that, without
>>> the presence of BFD, ISIS will do Bad Things if IP forwarding
>>> breaks, since it will continue to advertise IP connectivity. But
>>> the whole point of BFD is to sense data protocol connectivity and
>>> provide that (presumably useful) information to other parts of a
>>> system. If BFD is providing information directly to ISIS, it can
>>> withdraw IP connectivity (or tear down the adjacency if need be)
>>> and keep it that way until connectivity is restored. If ISIS
>>> relied on this scheme, and IP connectivity failed (but datalink
>>> connectivity remained), the ISIS adjacency would flap, and then
>>> ISIS would proceed to black hole traffic.
>>
>> Even with normal BFD interacting with ISIS (for ipv4), I would
>> think this black holing can also happen. Since the ipv4 bfd
>> session is flapping, the datalink layer is fine, and ISIS packets
>> are going through ok, hellos are happy. ISIS will bring up the
>> adjacency and then register with bfd; bfd later fails again,
>> which will bring down the ISIS session. I fail to see the
>> difference between the two schemes.
>
> A simple example of how this can be broken: consider a
> distributed router, where you run ISIS on the routing processor
> and BFD on specialized hardware down on a line card. While the
> line card's hardware could continue to send/receive BFD packets,
> the ISIS process on the RP could crash or get wedged, and cause
> it to ignore routing updates. If the routing protocol timers are
> sufficiently high (usually on the order of minutes or even
> hours), then no one will know until this timer goes off.
>
>>
>>> I think you need to specifically disallow this mechanism for
>>> cases like this, namely, applications that will continue to run
>>> even with the failure of a data protocol, but whose correct
>>> operation requires that data protocol. (Note that this sentence
>>> describes both ISIS and static routes.)
>>>
>>> OSPFv3 is another interesting example if you're running BFD in
>>> this configuration only over IPv4. If there is a v4 failure,
>>> OSPFv3 will flap unnecessarily. This gets back to the IPv4 ==
>>> everything fate sharing that is at the heart of the way you've
>>> specified it, and which I think is an unnecessary restriction. A
>>> number of systems (including the one I'm most familiar with these
>>> days) have an interface hierarchy that includes the data
>>> protocol. Such systems are likely better served by having, say,
>>> separate v4 and v6 BFD sessions and flapping the appropriate data
>>> protocol up/down status in the face of a BFD session failure.
>>> This would allow the OSPFv3 case to run unmolested when v4 died.
>>> I would suggest at least offering this up as a MAY when you
>>> discuss the fate sharing implications of this mechanism, since it
>>> should be essentially no more work to implement if the system is
>>> already built this way.
>>
>> Sure. There can be a configuration option to bring down the
>> whole thing or to bring down only the data protocol part, if the
>> platform supports that.
>>
>> Even though, architecture-wise, the separation of bfds is clean
>> (ipv4 controls the ipv4 protocols and ipv6 controls the ipv6
>> protocols), there is still much to be desired from the
>> implementation point of view. On many routers BFD packets do not
>> really go out through the exact data packet forwarding path, or
>> the packets are sent out from the same software process, be it
>> v4 or v6. So the argument of data separation is rather moot. And
>> I have yet to see a case where a BFD session going down is
>> actually caused by a layer 3 lookup engine that is only
>> responsible for ipv4 ;-) I would rather re-route altogether if
>> we know one of the data planes is already in trouble.
>
> Right. Just look at the example I gave above. Even if you run the
> ISIS process on the line card's CPU, if BFD is run in special LC
> hardware outside of the CPU, that constitutes two disjoint
> forwarding IP stacks; BFD will only be testing its own.
>
>>
>>>
>>>>>
>>>>> In light of this, my preference would be for all of the
>>>>> verbiage about static routes and dynamic protocols and special
>>>>> flags to be removed. In place of this, add text that is very
>>>>> specific about the fate-sharing implications of this mechanism
>>>>> as outlined above, and point out that any application of BFD
>>>>> that does not automatically share fate with the data protocol
>>>>> over which BFD is running (such as ISIS or static routes) MUST
>>>>> have some form of explicit interaction with BFD in order to
>>>>> avoid false positives, and leave it at that. The "special bit"
>>>>> hack is orthogonal to this mechanism; it could just as well
>>>>> have been specified in the generic spec (and would have been
>>>>> just as inappropriate there.)
>>>>
>>>> I think the dynamic and static difference still needs to be
>>>> mentioned, although it should not be directly linked with a
>>>> 'special flag' for the static routing.
>>>
>>> But I think the "difference" here is fundamental--as soon as you
>>> have any special case communication between BFD and a part of the
>>> system, you've basically discarded the point of the draft (if I
>>> understand it) which is to be able to leverage BFD without
>>> changing your "clients" to specifically use it. What you're
>>> specifying here is functionally *exactly* the same as what the
>>> generic spec talks about for static routes and other non-protocol
>>> applications, and only muddies the spec, IMHO.
>>
>> Only the UP->Down portion is different between the two schemes;
>> the rest is the same. But the bring-down itself is different
>> from the point of view of dynamic or static.
>>
>>>
>>> Why not just say that this mechanism only provides a way of more
>>> rapidly taking down applications that would otherwise go down
>>> eventually and which will stay down on their own until the path
>>> is healed (namely, control protocols), and leave statics out of
>>> it altogether?
>>
>> Maybe you have a point. I'll think about that.
>
> I agree with Katz here. The point of this draft should be to
> suggest that this approach AUGMENT the normal routing hellos, as
> well as existing ifUp/Down mechanisms inside your box.

I agree the intf UP->Down signal is the key of this draft; but as I
pointed out in the previous email, there is also an important
difference in the bring-up stage for non-protocol services: the bfd
session state is condensed into an intf-based indication, which is
very easy to implement. Since those services are looking at the intf
state anyway, augmenting them with this intf-p2p bfd is simply a
one-line diff to those services.

thanks.
- Naiming

>
> --Tom
>
>>
>> thanks.
>> - Naiming
>>
>>>
>>> --Dave
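
For readers who want the "intf-based indication" idea in concrete form, here is a minimal sketch in Python. It is an illustration only, not code from the draft or from any router: P2pInterface, the per_protocol switch and static_route_usable are hypothetical names. It assumes the interface's up/down answer folds in the BFD session state, so a service that already checks interface state needs no other change. It also shows the optional per-data-protocol variant raised in the thread.

```python
from dataclasses import dataclass, field
from enum import Enum


class BfdState(Enum):
    DOWN = "down"
    UP = "up"


@dataclass
class P2pInterface:
    """Hypothetical point-to-point interface whose state folds in BFD."""
    name: str
    link_up: bool = True
    # BFD session state keyed by data protocol, e.g. "ipv4" / "ipv6".
    bfd: dict = field(default_factory=dict)
    # False: any BFD failure takes the whole interface down (shared fate).
    # True: a BFD failure only affects its own data protocol.
    per_protocol: bool = False

    def is_up(self, proto=None):
        if not self.link_up:
            return False
        if self.per_protocol and proto is not None:
            # A protocol with no BFD session is not influenced by BFD.
            return self.bfd.get(proto, BfdState.UP) is BfdState.UP
        # Shared fate: every configured BFD session must be up.
        return all(state is BfdState.UP for state in self.bfd.values())


def static_route_usable(intf, proto):
    # The "one-line diff": the service keeps asking the interface for
    # its state; BFD is hidden behind that answer.
    return intf.is_up(proto)


if __name__ == "__main__":
    intf = P2pInterface("pos0", bfd={"ipv4": BfdState.DOWN})
    print(static_route_usable(intf, "ipv6"))  # shared fate
    intf.per_protocol = True
    print(static_route_usable(intf, "ipv6"))  # per-protocol fate
```

With BFD running only over IPv4 and per_protocol left off, an IPv4 BFD failure also takes IPv6 clients down, which is the OSPFv3 flap described in the thread; with per_protocol on, IPv6 clients are left alone. The sketch deliberately leaves out the bring-up handling and the "special flag" question debated above.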