Re: draft-palanivelan-bfd-v2-gr-02

David Ward <dward@cisco.com> Fri, 17 July 2009 00:41 UTC


One change to the BFD base spec requested by the IESG is a
congestion control mechanism. We've reached agreement with the
transport, internet and routing ADs on the algorithm, and you will see
it in the last rev of the base spec. In very short summary: if one
can detect persistent congestion, use the adaptive timers to send
fewer packets and see whether the congestion clears. Then ramp the
packet rate back up slowly until the session has self-stabilized and
no observable, persistent congestion is detected.
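
A rough Python sketch of that back-off-and-recover idea, for illustration only. It is not the algorithm in the base spec: the class name, the constants, and the multiplicative slow-down with additive recovery are all assumptions made here, and how congestion is detected is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveTxTimer:
    """Hypothetical transmit-interval controller (all values are assumptions)."""
    base_interval_ms: int = 50     # configured (desired) transmit interval
    max_interval_ms: int = 1000    # assumed ceiling while backing off
    backoff_factor: int = 2        # assumed multiplicative slow-down
    recovery_step_ms: int = 25     # assumed additive speed-up per clear period
    interval_ms: int = 0           # current interval, set in __post_init__

    def __post_init__(self) -> None:
        self.interval_ms = self.base_interval_ms

    def update(self, congested: bool) -> int:
        """Call once per observation period; returns the new interval in ms."""
        if congested:
            # Persistent congestion seen: send fewer packets.
            self.interval_ms = min(self.interval_ms * self.backoff_factor,
                                   self.max_interval_ms)
        else:
            # Congestion cleared: ramp back up slowly toward the base rate.
            self.interval_ms = max(self.interval_ms - self.recovery_step_ms,
                                   self.base_interval_ms)
        return self.interval_ms


timer = AdaptiveTxTimer()
slowed = [timer.update(congested=True) for _ in range(2)]
recovered = [timer.update(congested=False) for _ in range(6)]
```

Two congested periods stretch the interval from 50 ms to 200 ms; six clear periods then step it back down to the configured 50 ms, where it stays.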

In the case described below, no protocol extensions are needed to
cover this issue. System design choices can impact your
implementation in many ways.

-DWard

On Jul 16, 2009, at 11:40 AM, Vishwas Manral wrote:

> Hi A. Palanivelan,
>
> I agree with Nitin here. If BFD packets are not prioritized over
> others, we need to fix that instead of any other changes.
>
> Thanks,
> Vishwas
>
> On Thu, Jul 16, 2009 at 8:49 AM, Nitin Bahadur<nitinb@juniper.net>  
> wrote:
>> Hi,
>>
>>      I would think that if there are multiple things happening at
>> the same time, then the system needs to prioritize BFD over other
>> things in some way, or offload BFD to some place where it would
>> not be affected.
>>
>> Thanks
>> Nitin
>>
>> On 7/16/09 2:11 AM, "Palanivelan A (apvelan)" <apvelan@cisco.com>  
>> wrote:
>>
>> Hi Nitin,
>> It is very true that BFD works well in planned restarts and we  
>> don't need GR extensions there.
>> But we have seen issues when BFD runs alongside other
>> time-intensive, high-priority events.
>>
>> In such a scenario, for GR, we are likely to hit BFD timer expiry,
>> which will be treated as a failure, thus bringing down the
>> adjacencies (ISIS/OSPF).
>> One such example is a large (scaled) number of PPPoE sessions
>> on an SP router that also has BFD enabled (with strict timers).
>>
>> This would mean looking for a workaround at the architecture level  
>> of your router to make this work.
>>
>> In fact, this sort of experience led me to write this draft.
>> Do write back for more discussion.
>> PS: Sorry for the late reply. I had a recent road accident that
>> kept me away for a long time.
>> Regards,
>> A.Palanivelan
>> To: <apvelan at cisco.com>, <rtg-bfd at ietf.org>
>>
>> ________________________________
>>
>> Hi,
>>
>>  You really do not need GR extensions to BFD to help with
>> planned restart. There is enough text in draft-ietf-bfd-generic
>> (Section 4.3.2.2) to help you accomplish what you want.
>>
>>  Off the top of my head, I can think of two ways:
>>
>> 1) Restarting router brings down BFD session with diag code of
>>   ADMIN_DOWN. ADMIN_DOWN should not bring down the adjacency
>>   on the peer. So the peer BFD session will go down but ISIS/OSPF
>>   will not treat it as an adjacency down event.
>>
>> 2) Restarting router increases the BFD session timer to XXX
>>   seconds (XXX > restart time). It can save some state locally
>>   to note this. After restart, the restarting router reads the local
>>   state and continues the BFD session from the UP state.
>>
>> Thanks
>> Nitin
>>
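
For illustration, the two options Nitin lists above can be sketched in Python. This is a sketch under assumptions, not any implementation's behaviour: the function names are hypothetical, only a subset of diagnostic codes is shown (numbered as in the BFD base spec), and the detection time is simplified to interval times detect multiplier.

```python
from enum import IntEnum


class Diag(IntEnum):
    """Subset of BFD diagnostic codes, numbered as in the BFD base spec."""
    NO_DIAGNOSTIC = 0
    CONTROL_DETECTION_TIME_EXPIRED = 1
    NEIGHBOR_SIGNALED_SESSION_DOWN = 3
    ADMINISTRATIVELY_DOWN = 7


def should_drop_adjacency(diag: Diag) -> bool:
    """Option 1 (hypothetical helper): an administrative shutdown of the
    BFD session is not reported to ISIS/OSPF as an adjacency-down event."""
    return diag != Diag.ADMINISTRATIVELY_DOWN


def detection_time_ms(interval_ms: int, detect_mult: int) -> int:
    """Simplified detection time: interval multiplied by detect multiplier."""
    return interval_ms * detect_mult


def timer_covers_restart(interval_ms: int, detect_mult: int,
                         restart_ms: int) -> bool:
    """Option 2 (hypothetical helper): the stretched detection time must
    exceed the expected restart time, or the peer will still time out."""
    return detection_time_ms(interval_ms, detect_mult) > restart_ms


# Option 1: the peer keeps the adjacency when the session goes AdminDown.
keep_on_admin_down = not should_drop_adjacency(Diag.ADMINISTRATIVELY_DOWN)

# Option 2: a 40 s interval with multiplier 3 covers an assumed 90 s restart.
covers = timer_covers_restart(40_000, 3, 90_000)
```

Either check lives entirely in local policy on the routers involved, which fits the point that no protocol extension is needed.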