RE: Resetting the sequence number in an authenticated BFD session

Alexander Vainshtein <Alexander.Vainshtein@ecitele.com> Sun, 13 January 2008 05:44 UTC

Date: Sun, 13 Jan 2008 07:44:05 +0200
Message-ID: <64122293A6365B4A9794DC5636F9ACFD0252D70E@ILPTEX02.ecitele.com>
References: <64122293A6365B4A9794DC5636F9ACFD0252D70A@ILPTEX02.ecitele.com> <1A38C490-BC35-4ACA-A138-A93A03A99BE6@juniper.net>
From: Alexander Vainshtein <Alexander.Vainshtein@ecitele.com>
To: Dave Katz <dkatz@juniper.net>
Cc: Ronen Sommer <Ronen.Sommer@ecitele.com>, BFD WG <rtg-bfd@ietf.org>, Igor Danilovich <Igor.Danilovich@ecitele.com>, David Ward <dward@cisco.com>
Subject: RE: Resetting the sequence number in an authenticated BFD session

Dave,
Many thanks for the detailed response. Please see my comments inline below, marked [Sasha] (blue italics in the original message).
 
Regards,
                   Sasha

________________________________

From: Dave Katz [mailto:dkatz@juniper.net]
Sent: Fri 1/11/2008 9:09 PM
To: Alexander Vainshtein
Cc: David Ward; BFD WG; Igor Danilovich; Ronen Sommer
Subject: Re: Resetting the sequence number in an authenticated BFD session


I am not a security expert, nor do I play one on TV, but the whole point of the sequence number scheme is to protect against replay attacks, and any scheme that allows for the arbitrary resetting of the sequence number space opens up a giant hole.
[Sasha] I am not a security expert either, but I understand that allowing arbitrary resetting of the sequence number would be a serious security issue. This was the main reason behind the statement (in the original message) that "I do not see a simple solution for the problem".

If the authentication section were to carry an additional field with "next sequence number expected" then the sender who had lost track of the sequence space could recover without the receiver being vulnerable to a replay attack (making this work properly with multiple packets in flight seems possible with sufficient signaling, but the details are beyond my ability to extemporize in this email).  Note that I believe it is impossible to avoid session flapping in the case where the round-trip time between systems is greater than the detection time of the session, so it's not clear that any such solution is possible in the general case.
[Sasha] I agree that this is at least very complicated and probably impossible.

If people feel strongly enough about this issue and cannot solve it any other way, I would suggest an extension to the base spec using a new authentication type field, as this is going to take some time and careful thought, and could be done without affecting the base spec.
[Sasha] This would be nice. I am not sure, though, how strongly people feel about it. One piece of information that could help is an understanding of how BFD authentication is actually used in real-life deployments.


It's worth noting, however, that this is mostly just a particular instance of the more general problem of recovering from lost BFD state.  Another interesting example is trying to handle various graceful-restart-like scenarios, including processor failover. 

The generic solution to these problems is to add a layer between the BFD state machine and the applications that does some intelligent hysteresis around BFD state changes and hides the flap from the applications.  This can easily be done without impacting the detection time of the session for cases other than the sequence number issue.  The long-overdue reissue of the generic spec will talk about this more fully, Real Soon Now.

It's a little bit touchier to pull off with the sequence number stuff because it's hard to reestablish session state in less than a detection time.  One straightforward approach would be to simply wait for the old session to time out (since you'll be receiving packets that don't authenticate.)  This complicates the heuristics of the flap suppression a bit, but not terribly, and it also means that signaling session failure to applications when the far end key stops working will take longer than a detection time.  This doesn't sound like a bad tradeoff to me, since it's a deep-end case and wouldn't impact the detection time for generic failures.  The security implications are exactly what they are today for session establishment (or slightly better, since any bad-guy third party would have to block the legitimate session as well as replaying the establishment of a new one.)
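
The hysteresis layer and the "wait for the old session to time out" tradeoff described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the class name FlapDamper, the hold_time parameter, and the "recoverable" flag (how the layer decides that a Down event stems from lost state rather than a real failure) are all hypothetical and are not taken from any draft.

```python
class FlapDamper:
    """Hypothetical layer between a BFD session and its client applications."""

    def __init__(self, hold_time, notify):
        self.hold_time = hold_time   # how long a recoverable Down may stay hidden
        self.notify = notify         # callback into the applications
        self.reported_up = False     # state the applications currently see
        self.hold_deadline = None    # set while a Down is being hidden

    def session_up(self, now):
        self.hold_deadline = None    # recovered in time: the flap stays hidden
        if not self.reported_up:
            self.reported_up = True
            self.notify("up")

    def session_down(self, now, recoverable):
        if not self.reported_up:
            return                   # applications already see the session as down
        if recoverable:
            # Assumed case: state loss (e.g. sequence number reset). Hide for a while.
            if self.hold_deadline is None:
                self.hold_deadline = now + self.hold_time
        else:
            self._report_down()      # generic failure: no added delay

    def tick(self, now):
        # Called periodically; reports a hidden Down once the hold time expires.
        if self.hold_deadline is not None and now >= self.hold_deadline:
            self._report_down()

    def _report_down(self):
        self.hold_deadline = None
        self.reported_up = False
        self.notify("down")


events = []
damper = FlapDamper(hold_time=3.0, notify=events.append)
damper.session_up(now=0.0)                         # applications see "up"
damper.session_down(now=10.0, recoverable=True)    # hidden
damper.session_up(now=11.5)                        # recovered, nothing reported
damper.session_down(now=20.0, recoverable=False)   # reported immediately
damper.session_up(now=21.0)
damper.session_down(now=30.0, recoverable=True)    # hidden
damper.tick(now=33.0)                              # hold time expired, reported
print(events)  # ['up', 'down', 'up', 'down']
```

In this sketch only the Down events flagged as recoverable are delayed, which mirrors the tradeoff above: a far-end key that really stopped working is signaled later than a detection time, while generic failures are passed through at once.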

Another scheme could involve establishing a new session and abandoning the old one, which could be done in less than a detection time, but this opens up a giant denial-of-service hole.

--Dave


On Jan 10, 2008, at 1:42 PM, Alexander Vainshtein wrote:


	Hi all,
	I have a question related to the expected behavior of sequence numbers in an authenticated (MD5 or SHA1) BFD session.
	 
	The corresponding sections of draft-ietf-bfd-base-06 state that, once the packet has been authenticated by the receiver, its sequence number MUST be checked; if its value is outside the range defined by the last received sequence number and the Detect Mult field, the packet MUST be discarded.
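
A minimal sketch of such a range check, for illustration only. The window width of 3 * Detect Mult and the stricter lower bound for the "meticulous" variants are assumptions based on my reading of the BFD base specification and should be verified against the exact draft revision; the function and parameter names are hypothetical.

```python
SEQ_MODULUS = 2 ** 32  # sequence numbers are 32-bit and wrap around


def seq_in_window(seq, rcv_auth_seq, detect_mult, meticulous=False, factor=3):
    """Return True if seq lies in the acceptance window that follows rcv_auth_seq."""
    low = 1 if meticulous else 0   # meticulous mode: must be strictly newer
    high = factor * detect_mult    # assumed window width: 3 * Detect Mult
    # Circular distance by which seq is ahead of the last accepted number.
    distance = (seq - rcv_auth_seq) % SEQ_MODULUS
    return low <= distance <= high


print(seq_in_window(105, 100, detect_mult=3))                   # True
print(seq_in_window(100, 100, detect_mult=3, meticulous=True))  # False
print(seq_in_window(7, 100, detect_mult=3))                     # False
print(seq_in_window(2, 2 ** 32 - 3, detect_mult=3))             # True
```

The third call is the case of interest here: a sender that restarts from an unrelated sequence number falls outside the window and is discarded, even though its packets authenticate correctly.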
	 
	This may result in a BFD session going down when the transmitter "loses" the information about its last transmitted sequence number. A suitable use case is a multilink interface (LAG, ML-PPP, etc.) with the links residing in different line cards and BFD implemented in one of these cards: if this card fails, BFD could be restarted in one of the remaining cards. Such a restart would not affect the local session because the BFD machine would be restarted with bfd.AuthSeqKnown = 0, but keeping bfd.XmitAuthSeq consistent between different line cards seems problematic. (Implementing BFD in some common card would resolve the situation with the multilink interfaces but would raise similar issues when the common card fails.)
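
The failover scenario can be illustrated with a toy simulation. This is a sketch under assumptions, not an implementation of the draft: the Receiver class and run helper are hypothetical, the window width (3 * Detect Mult) is assumed, the non-meticulous check is used, and the sender increments its sequence number on every packet for simplicity.

```python
SEQ_MODULUS = 2 ** 32


class Receiver:
    """Receive-side state, modeled on bfd.AuthSeqKnown and bfd.RcvAuthSeq."""

    def __init__(self, detect_mult, factor=3):
        self.auth_seq_known = False
        self.rcv_auth_seq = 0
        self.window = factor * detect_mult  # assumed window width

    def accept(self, seq):
        if self.auth_seq_known:
            if (seq - self.rcv_auth_seq) % SEQ_MODULUS > self.window:
                return False  # outside the window: packet is discarded
        self.auth_seq_known = True
        self.rcv_auth_seq = seq
        return True


def run(xmit_seq, receiver, packets):
    """Send consecutive sequence numbers and record which ones are accepted."""
    results = []
    for _ in range(packets):
        results.append(receiver.accept(xmit_seq))
        xmit_seq = (xmit_seq + 1) % SEQ_MODULUS
    return results


remote = Receiver(detect_mult=3)
before = run(1_000_000, remote, 5)  # original line card is transmitting
after = run(42, remote, 5)          # replacement card lost bfd.XmitAuthSeq
print(before)  # [True, True, True, True, True]
print(after)   # [False, False, False, False, False]

# The restarted card's own receive side starts with AuthSeqKnown = 0,
# so it accepts whatever sequence number the remote end is currently using.
restarted = Receiver(detect_mult=3)
print(restarted.accept(777))  # True
```

Every packet from the replacement card is discarded by the remote end, so the remote side would declare the session down after its detection time, while the restarted card itself has no trouble receiving.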
	 
	Note that this problem would not occur for a non-authenticated BFD session.
	 
	IMHO this problem is real, and I do not see a simple solution for it. 
	I would highly appreciate any feedback from the draft authors and/or from the WG.
	 
	Regards,
	                  Sasha