BFD State Machine
Dave Katz <dkatz@juniper.net> Wed, 02 March 2005 20:43 UTC
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA24636; Wed, 2 Mar 2005 15:43:23 -0500 (EST)
Received: from megatron.ietf.org ([132.151.6.71]) by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1D6ai1-0002CB-GP; Wed, 02 Mar 2005 15:44:41 -0500
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1D6agY-0007J8-3e; Wed, 02 Mar 2005 15:43:10 -0500
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1D6agW-0007J3-5F for rtg-bfd@megatron.ietf.org; Wed, 02 Mar 2005 15:43:08 -0500
Received: from ietf-mx.ietf.org (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA24611 for <rtg-bfd@ietf.org>; Wed, 2 Mar 2005 15:43:06 -0500 (EST)
Received: from colo-dns-ext1.juniper.net ([207.17.137.57]) by ietf-mx.ietf.org with esmtp (Exim 4.33) id 1D6ahk-0002BZ-8Z for rtg-bfd@ietf.org; Wed, 02 Mar 2005 15:44:24 -0500
Received: from merlot.juniper.net (merlot.juniper.net [172.17.27.10]) by colo-dns-ext1.juniper.net (8.11.3/8.9.3) with ESMTP id j22Kgx960556 for <rtg-bfd@ietf.org>; Wed, 2 Mar 2005 12:42:59 -0800 (PST) (envelope-from dkatz@juniper.net)
Received: from [172.16.12.139] (nimbus-sf.juniper.net [172.16.12.139]) by merlot.juniper.net (8.11.3/8.11.3) with ESMTP id j22Kgre42691 for <rtg-bfd@ietf.org>; Wed, 2 Mar 2005 12:42:53 -0800 (PST) (envelope-from dkatz@juniper.net)
Mime-Version: 1.0 (Apple Message framework v619.2)
Content-Transfer-Encoding: 7bit
Message-Id: <0036571b21c6f5db0c588591b0426735@juniper.net>
Content-Type: text/plain; charset="US-ASCII"; format="flowed"
To: "'rtg-bfd@ietf.org'" <rtg-bfd@ietf.org>
From: Dave Katz <dkatz@juniper.net>
Date: Wed, 02 Mar 2005 13:42:53 -0700
X-Mailer: Apple Mail (2.619.2)
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 73734d43604d52d23b3eba644a169745
Content-Transfer-Encoding: 7bit
Subject: BFD State Machine
X-BeenThere: rtg-bfd@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "RTG Area: Bidirectional Forwarding Detection DT" <rtg-bfd.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/rtg-bfd>, <mailto:rtg-bfd-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:rtg-bfd@ietf.org>
List-Help: <mailto:rtg-bfd-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/rtg-bfd>, <mailto:rtg-bfd-request@ietf.org?subject=subscribe>
Sender: rtg-bfd-bounces@ietf.org
Errors-To: rtg-bfd-bounces@ietf.org
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 10d3e4e3c32e363f129e380e644649be
Content-Transfer-Encoding: 7bit
Richard Spencer's comments made me put on my thinking cap for awhile, and I've come to the conclusion that the BFD state machine as specified is flawed. It has (at least) the following issues: Sessions could come up faster Unidirectional failures require a timeout of INIT state to restore service Service restoration is probabilistic Dissimilar timer values during session establishment can cause permanent deadlock The first is a little unhappy, but not too bad. The second and third are ugly, but not fatal; the session will come up eventually. The fourth is fatal as currently spec'ed--if one side has a transmit interval of one second and the other has a transmit interval of five seconds, the session will enter the deadlock loop that Richard pointed out. This is fixable by requiring that the same value be sent for Desired TX and Required RX while not in Up state, but that's ugly too. The heart of the problem is this--there are two symmetric wait states in the state machine, but there is insufficient state passed in the protocol (only one bit) to communicate what's happening. The IHU bit tells you only that the other system is in one of two states. The symmetry in the protocol, combined with the state ambiguity, causes the deadlock opportunity. After some hemming and hawing, Dave2 and I have decided that we should bite the bullet and change the protocol. This will require an incompatible change and a bump of the version number. The change is fairly simple; rather than communicate a single bit of state information (IHU), instead communicate the current state. As we move an extra bit of state into the protocol, this allows us to remove a state from the state machine. The new scheme fixes all of the above problems; as a matter of fact the unidrectional failure case ends up taking one less packet to come up than the bidirectional failure case. Since the state is carried in the protocol, every state change causes a change in packet contents, which in turn triggers an "extra" packet to be transmitted, so the sessions can come up rapidly, but safely, and in as deterministic a manner as is possible in networking. It also takes one less packet in each direction to come up. Given that this document hasn't even made proposed standard yet, and given that, as far as I know, I'm the only one with deployed code on which customers rely, I am not going to attempt to document any kind of interoperability or version negotiation scheme. Version 0 should simply be discarded and replaced with Version 1. I'm going to spin the document, which will appear after the IETF meeting (as the I-D sluice is blocked until after the meeting.) Dave2 will make a presentation at the meeting describing the changes. Below is the new state machine. It is pretty straightforward; the notations on the arcs are the neighbor's state as signalled in received BFD packets (plus the detection timer expiration.) I think this is much more transparent than the last one in terms of being able to intuit the mechanism and its side effects. This should be sufficient for the motivated to analyze and pick apart. Comments and brickbats to the list, please. All this goes to show that protocol design is a subtle art... --Dave +--+ | | UP | V DOWN +------+ INIT +------------| |------------+ | | DOWN | | | +-------->| |<--------+ | | | +------+ | | | | | | | | | | | | DOWN,| | | |TIMER TIMER| | V | | V +------+ +------+ +----| | | |----+ DOWN| | INIT |--------------------->| UP | |INIT, UP +--->| | INIT, UP | |<---+ +------+ +------+
- BFD State Machine Dave Katz
- Re: BFD State Machine Chris Nogradi
- Re: BFD State Machine Dave Katz
- Re: BFD State Machine Chris Nogradi
- Re: BFD State Machine Reshad Rahman
- Re: BFD State Machine Dave Katz
- Re: BFD State Machine Dave Katz
- Re: BFD State Machine Dave Katz