Re: BFD WG adoption for draft-haas-bfd-large-packets

"Naiming Shen (naiming)" <> Mon, 22 October 2018 19:38 UTC

From: "Naiming Shen (naiming)"
To: Mahesh Jethanandani
CC: "Les Ginsberg (ginsberg)", "Reshad Rahman (rrahman)"
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets
Date: Mon, 22 Oct 2018 19:38:18 +0000
List-Id: "RTG Area: Bidirectional Forwarding Detection DT" <>

Probably just a bandwidth increase; and if you need encryption/decryption on the packets,
then large packets will cost more CPU.
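
As a rough sketch of the bandwidth side of this point: the cost scales with packet size times send rate. The function name, the packet sizes, and the interval used below are illustrative assumptions, not figures from the draft or from this thread.

```python
def bfd_bandwidth_bps(packet_bytes: int, tx_interval_ms: float) -> float:
    """Bits per second consumed by one direction of a session that sends
    one packet of `packet_bytes` every `tx_interval_ms` milliseconds."""
    packets_per_second = 1000.0 / tx_interval_ms
    return packet_bytes * 8 * packets_per_second


def padding_overhead_ratio(small_bytes: int, large_bytes: int) -> float:
    """How many times more bandwidth the padded packets use at the same rate."""
    return large_bytes / small_bytes


if __name__ == "__main__":
    # Hypothetical numbers: 50 ms interval, 1500-byte padded packets.
    print(bfd_bandwidth_bps(1500, 50))
```

Per session the absolute number is small; the concern grows with the number of sessions and with any per-byte work such as authentication or encryption.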

- Naiming

On Oct 22, 2018, at 11:08 AM, Mahesh Jethanandani wrote:

I think it is important to understand why the large packets are being sent. For example, if the idea is to be able to transport a large packet without it being dropped in the path, then there might be little or no processing of the payload of the packet; what matters is just the fact that we received it. I can understand that if we start to process the payload, that might delay processing of the next packet. Do we believe that receiving a large packet, with little or no processing performed on it, will cause us to drop or fail to receive another packet?

On Oct 21, 2018, at 5:36 PM, Les Ginsberg (ginsberg) wrote:

Naiming -

Thanks for the good discussion. Responses inline.

From: Naiming Shen (naiming)
Sent: Sunday, October 21, 2018 3:36 PM
To: Les Ginsberg (ginsberg)
Cc: Reshad Rahman (rrahman)
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets


On Oct 21, 2018, at 3:26 PM, Les Ginsberg (ginsberg) wrote:

Naiming –


From: Naiming Shen (naiming)
Sent: Sunday, October 21, 2018 3:12 PM
To: Les Ginsberg (ginsberg)
Cc: Reshad Rahman (rrahman)
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets

It probably should say “the payload size MAY be increased to this value, and it is
not recommended for a BFD session to always use the large packet size for padding.
How frequently the large packet is used is application specific”.

[Les:] This does not address the question as to why we want to use a mechanism specifically designed for sub-second detection for this case.
(Note that it does not come for free. ☺)

Since we already have a session between two end-points, a BFD session, why not utilize that
instead of having to explicitly configure another ‘MTU discovery protocol’ session, with its burden
of configuration and management?

[Les:] Because it comes with costs and risks and problems of its own.

We do not know how many of the existing BFD implementations will be able to handle this w/o changes. Some may not be able to handle this at all.
The draft already acknowledges that this may affect BFD scaling. It is not much of a leap to think that in order to handle BFD at scale current implementations have made certain assumptions – one of which could be what is the maximum expected size of a BFD packet.
And the user will – of course – have to configure this as a BFD option (I believe this was acknowledged in the Montreal presentation) – so it is not as if no additional config is required.

I am sure we can come up with other risks/costs with a bit more thought.

Since most application traffic does not fill the pipe up to the MTU,
this detection does not need to be sub-second, unlike a normal BFD down event, where we have to switch
the traffic immediately. An MTU change can be detected by varying the BFD packet size, say, once
every minute (just like the BFD authentication proposal, once in a while is sort of OK). Not knowing
for days that the path MTU has changed is bad, but getting notified within 2 minutes is good :-)

For the variety of encaps, the internal applications can probably deduce their own MTU from the
BFD one, as long as we have a number for the path MTU.

[Les:] If “your” MTU requirements are smaller than “mine” – would you want the BFD session to go down even though you could continue to use the link successfully?

No, I think this document can also specify that the BFD session should not go “DOWN” if the MTU has been reduced; it should
only be used as a ‘discovery’ mechanism on top of BFD itself. Say I’m sending large packets every 5 minutes,
10 packets at a time; this can be on top of the existing BFD schedule. The smaller packets still come back to keep the
session alive. The big packets will give us the “indication” of extra data we have learned.
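
A minimal sketch of that separation, assuming the large packets are tracked as their own burst and never feed into the session state. The class name, the field names, and the way delivery is learned are hypothetical; only the "every 5 minutes, 10 packets" figures come from the example above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MtuProbeBurst:
    probe_size: int                 # padded packet size in bytes (hypothetical value)
    burst_len: int = 10             # "10 packets" per burst, as in the example
    delivered: List[bool] = field(default_factory=list)

    def record(self, was_delivered: bool) -> None:
        """Record the outcome of one large packet; extra results are ignored."""
        if len(self.delivered) < self.burst_len:
            self.delivered.append(was_delivered)

    def verdict(self) -> Optional[bool]:
        """None while the burst is incomplete; otherwise True if any large
        packet got through, False if all of them were lost."""
        if len(self.delivered) < self.burst_len:
            return None
        return any(self.delivered)


def next_burst_start(last_start_s: float, period_s: float = 300.0) -> float:
    """Start time of the next burst; 300 s corresponds to 'every 5 minutes'."""
    return last_start_s + period_s
```

The verdict is only an indication to be reported; nothing in this sketch touches the up/down state driven by the regular-size packets.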

[Les:] So, this has some implications:

We have both a transmit interval and a multiplier for a BFD session because we allow that some BFD packets may be dropped for reasons which do not represent a true loss of connectivity. Therefore up to N-1 consecutive packets may be dropped w/o triggering a session state change. If large BFD packets are “occasionally” inserted this means there are intervals during which N-2 packet drops (not counting the BFD large packet) would be sufficient to trigger a BFD session state change. Further, if the processing of the large BFD packets makes it more likely that subsequent BFD packets will be dropped (e.g., because the processing of the large BFD packet simply takes longer) then it is possible that a BFD session state change might be triggered simply as a side effect of the insertion of the large packet into the data stream.
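
To make the multiplier argument concrete, here is a minimal sketch. It assumes a simplified model that is not taken from the draft: the session is declared down once N consecutive expected packets are missed, and a lost large packet occupies one slot in that run. The function names are hypothetical.

```python
def session_goes_down(received, multiplier):
    """Return True if `multiplier` consecutive expected packets were missed."""
    missed = 0
    for ok in received:
        missed = 0 if ok else missed + 1
        if missed >= multiplier:
            return True
    return False


def max_tolerated_regular_drops(multiplier, large_packet_lost):
    """Largest run of dropped regular packets that keeps the session up,
    with or without a lost large packet in the same run."""
    tolerated = 0
    for drops in range(multiplier + 1):
        run = [False] * drops
        if large_packet_lost:
            run.append(False)  # the lost large packet takes one slot in the run
        stream = [True] + run + [True]
        if session_goes_down(stream, multiplier):
            break
        tolerated = drops
    return tolerated
```

In this model the margin for ordinary drops shrinks by one packet whenever a large packet in the same window is also lost, which is the risk being described.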

You also are now defining a “child session” which is embedded within the parent session. BFD large packets are then not meant to influence the parent BFD session state and therefore have to be processed separately. This – in many ways – is equivalent to defining “another protocol”. I’ll grant it might be a bit simpler as it can inherit some things from the parent session – but it certainly is no longer simply a transparent part of existing BFD session operation.

And you still have not fully addressed the differing client MTU requirement – unless you are proposing that BFD report to its clients what set of MTUs are being “tested” and which ones have failed and which have not.
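
One way to picture such a report, as a sketch only: BFD would expose which sizes were tested and whether they passed, and each client would be judged against its own requirement. The function, the client names, and the sizes are hypothetical and are not part of the draft.

```python
from typing import Dict, Optional


def client_mtu_status(tested: Dict[int, bool],
                      required: Dict[str, int]) -> Dict[str, Optional[bool]]:
    """Map each client to True (path known to carry its MTU), False (known
    not to), or None (no tested size settles the question)."""
    status: Dict[str, Optional[bool]] = {}
    for client, need in required.items():
        if any(ok and size >= need for size, ok in tested.items()):
            status[client] = True    # a size at least as large as needed passed
        elif any((not ok) and size <= need for size, ok in tested.items()):
            status[client] = False   # a size no larger than needed failed
        else:
            status[client] = None    # untested range
    return status
```

The None case shows the gap in the proposal: a client whose requirement falls between the tested sizes learns nothing from the probes.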

It is conceivable that all of this could be addressed in some way, but it gives me pause as to whether this is the best solution.


- Naiming

On Oct 20, 2018, at 5:14 PM, Les Ginsberg (ginsberg) wrote:

I have some concerns.

It has been stated that there is a need for sub-second detection of this condition – but I really question that requirement.
What I would expect is that MTU changes only occur as a result of some maintenance operation (configuration change, link addition/bringup, insertion of a new box in the physical path etc.). The idea of using a mechanism which is specifically tailored for sub-second detection to monitor something that is only going to change occasionally seems inappropriate. It makes me think that other mechanisms (some form of OAM, enhancements to routing protocols to do what IS-IS already does ☺) could be more appropriate and would still meet the operational requirements.

I have listened to the Montreal recording – and I know there was discussion related to these issues (not sending padded packets all the time, use of BFD echo, etc.) – but I would be interested in more discussion of the need for sub-second detection.

Also, given that a path might be used with a variety of encapsulations, how do you see such a mechanism being used when multiple BFD clients share the same BFD session and their MTU constraints are different?



From: Rtg-bfd On Behalf Of Reshad Rahman (rrahman)
Sent: Wednesday, October 17, 2018 6:06 PM
Subject: BFD WG adoption for draft-haas-bfd-large-packets

Hello BFD WG,

We have received an adoption request for “BFD encapsulated in large packets”.

The adoption call will end on Friday Nov 9th.

Please send email to the list indicating “yes/support”  or “no/do not support”. If you do not support adoption, please state your reasons.

Reshad & Jeff.

Mahesh Jethanandani