RE: [Lsr] [Idr] draft-merciaz-idr-bgp-bfd-strict-mode

"Albert Fu (BLOOMBERG/ 120 PARK)" <afu14@bloomberg.net> Mon, 29 July 2019 23:57 UTC

Date: Mon, 29 Jul 2019 23:57:36 -0000
From: "Albert Fu (BLOOMBERG/ 120 PARK)" <afu14@bloomberg.net>
Reply-To: Albert Fu <afu14@bloomberg.net>
To: gregimirsky@gmail.com, acee@cisco.com, ginsberg@cisco.com
Cc: idr@ietf.org, ketant@cisco.com, lsr@ietf.org, rtg-bfd@ietf.org, albert.f168@gmail.com, shares@ndzh.com
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/KISo-2TpcPpEUJfXqHsoPr7jftU>

I will leave the implementation aspect to the experts.

I do agree with Les, based on our current network deployment, that most BFD use cases for BGP are for single-hop eBGP sessions. For multihop iBGP sessions, we do not run BFD on the iBGP sessions themselves and instead run BFD on the underlying IGP.

I would also like to add that it would be good for a BFD dampening implementation to be "smart": the "Up" signal from BFD to the client (ISIS/OSPF/BGP) should be sent only if the BFD session has remained up for the whole dampening/hold-up period. This prevents the client (ISIS/OSPF/BGP) from being established prematurely, before the network is proven stable. For example, if a hardware fault causes a break-in-the-middle failure every 5s continuously, and the dampening/hold-up period is 10s, we would prefer not to bring the client protocol up. This will help to reduce network churn and improve stability.
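
As a rough illustration of the hold-up behaviour described above, here is a minimal Python sketch. It is not taken from any vendor implementation or from the draft; the class name HoldUpDampener, its fields, and the use of plain seconds as timestamps are all assumptions made for the example. The client is told "Up" only once BFD has stayed up continuously for the hold-up period, and any BFD Down restarts the count.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HoldUpDampener:
    """Hypothetical hold-up gate between a BFD session and its client."""

    hold_up_s: float                  # required continuous Up time, in seconds
    up_since: Optional[float] = None  # when BFD last went Up; None while Down

    def on_bfd_state(self, now: float, is_up: bool) -> None:
        """Record a BFD state change observed at time `now`."""
        if is_up:
            # Keep the earliest Up time of the current continuous Up period.
            if self.up_since is None:
                self.up_since = now
        else:
            # Any Down event discards the progress made so far.
            self.up_since = None

    def client_up(self, now: float) -> bool:
        """True only if BFD has been Up continuously for the hold-up period."""
        return self.up_since is not None and now - self.up_since >= self.hold_up_s
```

With hold_up_s=10 and a fault that takes BFD down every 5s, up_since is reset before 10s ever elapse, so client_up() stays False until the flapping stops.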

Thanks

Albert


From: ginsberg@cisco.com At: 07/27/19 20:23:21 To: gregimirsky@gmail.com, acee@cisco.com
Cc: Albert Fu (BLOOMBERG/ 120 PARK), idr@ietf.org, ketant@cisco.com, lsr@ietf.org, rtg-bfd@ietf.org, albert.f168@gmail.com, shares@ndzh.com
Subject: RE: [Lsr] [Idr] draft-merciaz-idr-bgp-bfd-strict-mode

     

Greg – 
  
I have a very different opinion. 
  
Dampening should always be done at the lowest layer possible. 
In most cases this argues for interface layer, but there are cases (switches in the path to the directly connected neighbor) where interface dampening doesn’t always tell you what you need to know.  So I acknowledge the usefulness of dampening  at the BFD layer. 
But doing it at the BFD client layer is wasteful. It forces the same logic to be implemented in multiple places and introduces race conditions where what BFD thinks and what the BFD client thinks about the same state differ. 
I would argue against such an approach. 
  
I have a related question: 
  
In the case where the BGP neighbor is multiple hops away, what benefit does BFD dampening provide? 
(Note that I am assuming that there would likely be single-hop BFD sessions used by the IGPs (for example) along the path to the BGP neighbor, and expecting that BFD dampening would be used for the single-hop sessions when appropriate.) 
  
   Les 
  

From: Lsr <lsr-bounces@ietf.org> On Behalf Of  Greg Mirsky
Sent: Thursday, July 25, 2019 3:41 PM
To: Acee Lindem (acee) <acee@cisco.com>
Cc: idr@ietf.org; Albert Bloomberg <afu14@bloomberg.net>; Ketan Talaulikar (ketant) <ketant@cisco.com>; lsr@ietf.org; rtg-bfd@ietf.org; Albert F <albert.f168@gmail.com>; Susan Hares <shares@ndzh.com>
Subject: Re: [Lsr] [Idr] draft-merciaz-idr-bgp-bfd-strict-mode 
  

Hi Acee, 

I imagine that there could be multiple clients of the same BFD session with different requirements in regard to dampening behavior. For example, the delay each client desires may differ among the clients of the BFD session. If that is a plausible use case, I think that placing dampening in the client may be a better choice. 
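
To make the use case concrete, here is a small Python sketch of several clients sharing one BFD session, each with its own delay. The function name clients_ready, the client names, and the delay values are hypothetical and chosen only for illustration; the sketch shows the scenario and does not settle which layer should own the timers.

```python
from typing import Dict, List, Optional


def clients_ready(up_since: Optional[float], now: float,
                  client_delays: Dict[str, float]) -> List[str]:
    """Return the clients whose own hold-up delay has elapsed.

    up_since      -- time the shared BFD session last went Up (None if Down)
    now           -- current time, in the same units as up_since
    client_delays -- hypothetical per-client delay in seconds
    """
    if up_since is None:
        return []  # BFD is Down, so no client may come up
    stable_for = now - up_since
    return sorted(name for name, delay in client_delays.items()
                  if stable_for >= delay)


# Illustrative values only: one BFD session, three clients, three delays.
delays = {"ospf": 5.0, "isis": 5.0, "bgp": 60.0}
ready_early = clients_ready(up_since=100.0, now=110.0, client_delays=delays)
ready_late = clients_ready(up_since=100.0, now=160.0, client_delays=delays)
```

Here ready_early contains only the two IGP clients, while ready_late also includes bgp once the session has been stable for the full 60 seconds.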

  

Regards, 

Greg 
  

On Thu, Jul 25, 2019 at 6:23 PM Acee Lindem (acee) <acee@cisco.com> wrote: 

Hi Albert, Ketan,  
The authors will document dampening in the operational considerations. I’m also of the mind that the dampening should be done in BFD rather than the BFD clients (e.g., BGP).  
Thanks,
Acee 
  

From: Lsr <lsr-bounces@ietf.org> on behalf of Albert F <albert.f168@gmail.com>
Date: Thursday, July 25, 2019 at 5:14 PM
To: "Ketan Talaulikar (ketant)" <ketant@cisco.com>
Cc: IDR List <idr@ietf.org>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>, Albert Bloomberg <afu14@bloomberg.net>,  Susan Hares <shares@ndzh.com>, "lsr@ietf.org" <lsr@ietf.org>
Subject: Re: [Lsr] [Idr] draft-merciaz-idr-bgp-bfd-strict-mode 

  

Hi Ketan,  

  

I think it will be good to mention this in the doc, as I expect most large networks concerned with network stability impacted by link flaps to enable the BFD hold-up feature. 

  

For example, if one side has BFD hold-up enabled (> BGP hold time) and the other side does not, the BGP keepalive message from one side may be delayed even if BFD is up. This may have implications for the BGP session transitioning to the Established state. 

  

Thanks 

Albert 

  

  
  

On Thu, Jul 25, 2019, 4:27 PM Ketan Talaulikar (ketant) <ketant@cisco.com> wrote: 

Hi Albert, 
  
Thanks for your feedback from an operator perspective – it is valuable. This “BFD hold up” behaviour that you desire is best handled by BFD since I would expect that similar behaviour would be desired across routing protocols (OSPF, ISIS,  BGP) and perhaps other clients. 
  
IMHO this is not something that we should be tackling within the scope of this BGP draft. Would you agree? 
  
That said, this seems like a local implementation aspect to me. We should however discuss within the BFD WG if there is value in documenting this. 
  
Thanks, 
Ketan 
  

From: Idr <idr-bounces@ietf.org> On Behalf Of Susan Hares
Sent: 25 July 2019 16:21
To: 'Albert Fu' <afu14@bloomberg.net>; idr@ietf.org
Subject: Re: [Idr] draft-merciaz-idr-bgp-bfd-strict-mode 
  
Albert:  
  
To clarify, do you support WG adoption with the draft as is?
  
As a WG chair, I have to trust that all drafts are improved during the WG process. Can this small change be made after adoption, or should it be made before the draft is considered for adoption?
  
Sue Hares 
  

From: Idr [mailto:idr-bounces@ietf.org] On Behalf Of Albert Fu (BLOOMBERG/ 120 PARK)
Sent: Thursday, July 25, 2019 4:19 PM
To: idr@ietf.org
Subject: [Idr] draft-merciaz-idr-bgp-bfd-strict-mode 
  

I am in support of this draft, and would like to request a small change to make this draft more operationally useful.

We have encountered several traffic blackhole problems in our production network without this feature. As such, we have deployed BGP with strict BFD mode on a proprietary vendor implementation for a while.
 
Since a lot of MetroE circuit failures occur with interfaces still up, i.e. break-in-the-middle issues, the traditional knobs like the interface hold-time/debounce timer cannot be used to dampen interface flaps.


We have observed that interface issues tend to occur in bursts and would like to request that an option be added in "Section 4 Operation:" to delay BGP from coming up until BFD is proven stable continuously for a period of time (i.e. BFD hold up feature).  


This is a feature that we are currently using in the proprietary vendor deployment. In our case, since we have multiple redundant paths, we have some links where we delay BGP from coming up until BFD has been stable continuously for 60 seconds.

Thanks
Albert Fu
Bloomberg 


 
 
_______________________________________________
Idr mailing list
Idr@ietf.org
https://www.ietf.org/mailman/listinfo/idr