RE: BFD WG adoption for draft-haas-bfd-large-packets

"Albert Fu (BLOOMBERG/ 120 PARK)" <> Mon, 22 October 2018 21:08 UTC

Date: Mon, 22 Oct 2018 21:08:27 -0000
From: "Albert Fu (BLOOMBERG/ 120 PARK)" <>
Reply-To: Albert Fu <>
Message-ID: <5BCE3C4B028807F6003909C3_0_92814@msllnjpmsgsv06>
Subject: RE: BFD WG adoption for draft-haas-bfd-large-packets
List-Id: "RTG Area: Bidirectional Forwarding Detection DT" <>

Sorry for the late response, and thanks for the comments so far.

We have observed this "MTU" issue on Telco WAN circuits (typically, the p2p WAN links are deployed using an MPLS L2VPN service), so the cause is outside of our control. When the MTU issue happens, there are no network events or alarms, since most protocol keepalive packets are small. It is quite time-consuming to troubleshoot the issue, especially in networks with many ECMP paths.

I would agree that this issue is less likely to occur within customer infrastructure that uses back-to-back links, assuming there are good provisioning tools in place.

This draft proposes adding padding at the IP layer, without any change to the BFD/UDP packet. Depending on the implementation, the padding bytes will be stripped off at the IP layer and may not be visible to the BFD process, hence potentially little impact on performance (the increase in link utilization is small relative to today's bandwidth).

The advantage of having padding implemented in BFD is that it will enable the issue to be detected very quickly as it happens, so traffic can be diverted to working paths with minimal network downtime. Of all the routing protocols, IS-IS is the only one I am aware of that supports hello padding, but since it runs as a control-plane process, we have to use conservative detection timers, which means there will be noticeable network downtime with that method.

The BFD padding will be a user-configurable option on a per-neighbor basis. In our case, we will look at setting the total padded BFD packet size to 1512 bytes: 1500 bytes of IP payload plus up to three 4-byte MPLS headers. If scaling is an issue, the network designer can choose to enable the padding feature on WAN circuits, but not on circuits within customer infrastructure where this is unlikely to be an issue.
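As a rough illustration of that arithmetic (a sketch only; the helper name is mine, and the 4-byte figure is the standard MPLS label-stack-entry size):

```python
MPLS_LSE_SIZE = 4  # each MPLS label stack entry is 4 bytes

def padded_bfd_size(ip_payload: int, mpls_labels: int) -> int:
    """Total padded packet size needed to exercise the full path MTU."""
    return ip_payload + mpls_labels * MPLS_LSE_SIZE

# 1500 bytes of IP payload plus up to 3 MPLS headers:
print(padded_bfd_size(1500, 3))  # 1512
```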

From:  At: 10/22/18 16:43:05  To:
Subject: Rtg-bfd Digest, Vol 152, Issue 20


Today's Topics:

   1. RE: BFD WG adoption for draft-haas-bfd-large-packets
      (Les Ginsberg (ginsberg))


Message: 1
Date: Mon, 22 Oct 2018 20:40:49 +0000
From: "Les Ginsberg (ginsberg)" <>
To: "Reshad Rahman (rrahman)" <>, "Naiming Shen
       (naiming)" <>
Cc: "" <>
Subject: RE: BFD WG adoption for draft-haas-bfd-large-packets

Reshad -


From: Reshad Rahman (rrahman)
Sent: Monday, October 22, 2018 1:02 PM
To: Les Ginsberg (ginsberg) <>; Naiming Shen (naiming) <>
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets


1)      <chair hat>From Albert's presentation @ IETF 102, the motivation for doing this is to have BFD fail if the expected MTU cannot be met, and therefore for traffic to be rerouted as a result. So I think this is the use case we should stick with, but I'd like to know if others think otherwise.</chair hat>
2)      Les, I don't understand the question regarding client MTU requirement. Are you saying if OSPF and BGP are clients of the same BFD session they might have different MTU requirements? I don't think this is the problem the solution is trying to solve: the solution is trying to figure out if packets of up to size X can make it to the other end; this is independent of the BFD client.
[Les:] I don't think the example of OSPF/BGP is a good one. Think tunneled traffic vs. non-tunneled.
I raised the question because different BFD clients can have different MTU limitations, and I wanted to know how this would be addressed when they share the same BFD session.
Does BFD use the minimum or the maximum of all the client MTUs?

3)      Les, I agree we need to balance the cost of a new protocol vs. overloading BFD at the risk of "polluting" it. I believe BFD is a good fit for this, but we have to be careful, so it's good to have these discussions.

[Les:] My current feeling is that this isn't a good fit for BFD. But I agree with you: discussing this at this early stage is a good thing to do.



From: "Les Ginsberg (ginsberg)" <<>>
Date: Sunday, October 21, 2018 at 8:36 PM
To: "Naiming Shen (naiming)" <<>>
Cc: "Reshad Rahman (rrahman)" <<>>, "<>" <<>>
Subject: RE: BFD WG adoption for draft-haas-bfd-large-packets

Naiming -

Thanx for the good discussion. Responses inline.

From: Naiming Shen (naiming)
Sent: Sunday, October 21, 2018 3:36 PM
To: Les Ginsberg (ginsberg) <<>>
Cc: Reshad Rahman (rrahman) <<>>;<>
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets


On Oct 21, 2018, at 3:26 PM, Les Ginsberg (ginsberg) <<>> wrote:

Naiming -


From: Naiming Shen (naiming)
Sent: Sunday, October 21, 2018 3:12 PM
To: Les Ginsberg (ginsberg) <<>>
Cc: Reshad Rahman (rrahman) <<>>;<>
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets

It probably should say "the payload size MAY be increased to this value, and it is
not recommended for a BFD session to always use the large packet size for padding.
How frequently the large packet size is used is application specific".

[Les:] This does not address the question as to why we want to use a mechanism specifically designed for sub-second detection for this case. (Note that it does not come for free.)

Since we already have a session between the two end-points, a BFD session, why not utilize that
instead of having to explicitly configure another "MTU discovery protocol" session, with its burden
of configuration and management?

[Les:] Because it comes with costs and risks and problems of its own.

We do not know how many of the existing BFD implementations will be able to handle this w/o changes. Some may not be able to handle this at all.
The draft already acknowledges that this may affect BFD scaling. It is not much of a leap to think that, in order to handle BFD at scale, current implementations have made certain assumptions, one of which could be the maximum expected size of a BFD packet.
And the user will, of course, have to configure this as a BFD option (I believe this was acknowledged in the Montreal presentation), so it is not as if no additional config is required.

I am sure we can come up with other risks/costs with a bit more thought.

Since most application traffic does not fill the pipe to the full MTU, this
detection does not need to be sub-second; unlike a normal BFD down, we do not have to switch
the traffic immediately. An MTU change can be detected by varying the BFD packet size, say once
every minute (just like the BFD authentication proposal, once in a while is sort of OK). Not knowing
the path MTU has changed for days is bad, but getting notified in 2 minutes is good :-)

For the variety of encaps, the internal application can probably deduce its own limit from the
BFD one, as long as we have a number for the path MTU.

[Les:] If "your" MTU requirements are smaller than "mine", would you want the BFD session to go down even though you could continue to use the link successfully?

No, I think this document can also specify that: BFD should not go "DOWN" if the MTU has been reduced; it should
only be used as a "discovery" mechanism on top of BFD itself. Say I'm sending large packets every 5 minutes
for 10 packets; this can be on top of the existing BFD schedule. If smaller packets still come back, they keep the
session alive. The big packets give us the "indication", the extra data we have learned.

[Les:] So, this has some implications:

We have both a transmit interval and a multiplier for a BFD session because we allow that some BFD packets may be dropped for reasons which do not represent a true loss of connectivity. Therefore, up to N-1 consecutive packets may be dropped w/o triggering a session state change. If large BFD packets are "occasionally" inserted, this means there are intervals during which N-2 packet drops (not counting the BFD large packet) would be sufficient to trigger a BFD session state change. Further, if the processing of the large BFD packets makes it more likely that subsequent BFD packets will be dropped (e.g., because the processing of the large BFD packet simply takes longer), then it is possible that a BFD session state change might be triggered simply as a side effect of the insertion of the large packet into the data stream.
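The loss-budget point can be stated numerically (a sketch; the multiplier value 3 below is just a common default, not a value from the draft or this thread):

```python
def tolerated_drops(detect_multiplier, padded_probe_dropped):
    """Ordinary consecutive losses tolerated before a session state change.

    With detect multiplier N, up to N-1 consecutive packets may be lost
    without bringing the session down; a dropped padded probe consumes
    one slot of that budget, leaving only N-2 for ordinary packets.
    """
    budget = detect_multiplier - 1
    return budget - 1 if padded_probe_dropped else budget

print(tolerated_drops(3, False), tolerated_drops(3, True))  # 2 1
```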

You are also now defining a "child session" which is embedded within the parent session. BFD large packets are then not meant to influence the parent BFD session state and therefore have to be processed separately. This, in many ways, is equivalent to defining "another protocol". I'll grant it might be a bit simpler, as it can inherit some things from the parent session, but it certainly is no longer simply a transparent part of existing BFD session operation.

And you still have not fully addressed the differing client MTU requirements, unless you are proposing that BFD report to its clients which set of MTUs are being "tested", which ones have failed, and which have not.

It is conceivable that all of this could be addressed in some way, but it gives me pause as to whether this is the best solution.


- Naiming



On Oct 20, 2018, at 5:14 PM, Les Ginsberg (ginsberg) <<>> wrote:

I have some concerns.

It has been stated that there is a need for sub-second detection of this condition, but I really question that requirement.
What I would expect is that MTU changes only occur as a result of some maintenance operation (configuration change, link addition/bringup, insertion of a new box in the physical path, etc.). The idea of using a mechanism which is specifically tailored for sub-second detection to monitor something that is only going to change occasionally seems inappropriate. It makes me think that other mechanisms (some form of OAM, enhancements to routing protocols to do what IS-IS already does, etc.) could be more appropriate and would still meet the operational requirements.

I have listened to the Montreal recording, and I know there was discussion related to these issues (not sending padded packets all the time, use of BFD echo, etc.), but I would be interested in more discussion of the need for sub-second detection.

Also, given that a path might be used with a variety of encapsulations, how do you see such a mechanism being used when multiple BFD clients share the same BFD session and their MTU constraints are different?



From: Rtg-bfd <<>> On Behalf Of Reshad Rahman (rrahman)
Sent: Wednesday, October 17, 2018 6:06 PM
Subject: BFD WG adoption for draft-haas-bfd-large-packets

Hello BFD WG,

We have received an adoption request for "BFD encapsulated in large packets".

The adoption call will end on Friday Nov 9th.

Please send email to the list indicating "yes/support" or "no/do not support". If you do not support adoption, please state your reasons.

Reshad & Jeff.




End of Rtg-bfd Digest, Vol 152, Issue 20