Re: Zaheduzzaman Sarker's Discuss on draft-ietf-bfd-unsolicited-11: (with DISCUSS and COMMENT)

Jeffrey Haas <jhaas@pfrc.org> Tue, 18 April 2023 15:24 UTC

Cc: Reshad Rahman <reshad@yahoo.com>, The IESG <iesg@ietf.org>, "draft-ietf-bfd-unsolicited@ietf.org" <draft-ietf-bfd-unsolicited@ietf.org>, "bfd-chairs@ietf.org" <bfd-chairs@ietf.org>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
To: Zaheduzzaman Sarker <zaheduzzaman.sarker@ericsson.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/BuLwZc9qx8flEPRxUp4HRc-DLXc>

Zahed,


> On Apr 18, 2023, at 8:44 AM, Zaheduzzaman Sarker <zaheduzzaman.sarker@ericsson.com> wrote:
> 
> Hi,
> 
> Thanks for the update. I am afraid I would like to discuss it a bit more so that we understand better the things we are agreeing to.
> 
> Please see inline responses.

> On Thursday, December 15, 2022, 05:39:32 PM EST, Jeffrey Haas <jhaas@pfrc.org> wrote:
>> 
>> It's also not uncommon for implementations to dynamically adjust their
>> timers based on load within some constraints.  When that's not possible,
>> BFD traffic that becomes unsustainable causes the BFD sessions to start
>> losing packets, which in many cases will cause the session to transition to
>> the Down state - and thus back to slow PDU transmission.
>> 
> Ok, I see. Thanks for the pointer. So, there is a process of scaling down the number of sessions based on the system load, and the way to get there is observed packet loss.

Exactly.  See in particular RFC 5880, §6.8.3.  Short form: A poll sequence is the protocol machinery to latch into the new negotiated timers.

Implementations are not required to do such dynamic negotiation, but many implementations do so. See, for example, the use of the term "adaptation" in Juniper's manual:
https://www.juniper.net/documentation/us/en/software/junos/high-availability/topics/topic-map/bfd.html


> Here the packet loss is not that harmful if I understand correctly.

Packet loss may result in the session going down.  The session going down will take the dependent client resource using BFD down as well.

I suspect you'd find the critical detail here is that in such circumstances BFD isn't an effective denial-of-service vector.  If it's over-aggressive, it kills itself.  Many implementations also provide features that keep unstable sessions down to avoid repeated flapping.  This is an implementation choice.
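
As a rough illustration of "it kills itself", here is a minimal sketch of the asynchronous-mode timer arithmetic from RFC 5880 (§6.8.3, §6.8.4, §6.8.7). The `Session` class, its field names, and `on_silence` are hypothetical names chosen for this example. Jitter, the poll sequence, and the rest of the state machine are left out.

```python
from dataclasses import dataclass

# RFC 5880 §6.8.3: when the session is not Up, bfd.DesiredMinTxInterval
# must be at least one second.
SLOW_TX_US = 1_000_000


@dataclass
class Session:
    """Hypothetical, heavily simplified view of one BFD session (microseconds)."""
    desired_min_tx_us: int    # local bfd.DesiredMinTxInterval
    required_min_rx_us: int   # local bfd.RequiredMinRxInterval
    remote_min_tx_us: int     # last received Desired Min TX Interval
    remote_min_rx_us: int     # last received Required Min RX Interval
    remote_detect_mult: int   # last received Detect Mult
    state: str = "Up"

    def tx_interval_us(self) -> int:
        # Never transmit faster than the peer is willing to receive,
        # and fall back to the slow rate whenever the session is not Up.
        local = self.desired_min_tx_us
        if self.state != "Up":
            local = max(local, SLOW_TX_US)
        return max(local, self.remote_min_rx_us)

    def detection_time_us(self) -> int:
        # Remote Detect Mult times the agreed transmit interval of the peer.
        return self.remote_detect_mult * max(
            self.required_min_rx_us, self.remote_min_tx_us
        )

    def on_silence(self, elapsed_us: int) -> None:
        # Assumption for this sketch: the caller reports how long it has been
        # since the last valid control packet arrived.
        if self.state == "Up" and elapsed_us >= self.detection_time_us():
            self.state = "Down"
```

With 50 ms timers and a Detect Mult of 3 on both sides, three consecutive lost packets (150 ms of silence) take the session Down, and the transmit interval returned by `tx_interval_us()` rises from 50 ms to one second.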

> If again my understanding is not correct, then we need a mechanism so that we don’t reach a situation where the system takes the hit due to packet loss. That was not that clear to me.

Hopefully this clarifies the situation.

> The caveat in this draft is related to an unexpected number of BFD sessions

In general, BFD scales by the number of sessions along with the associated packet rates.  Operators already must make judgment calls about balancing the ability to quickly detect failures with aggressive timers vs. the number of sessions the device can support.  Thus, this is not new territory operationally.  And similarly, since BFD can negotiate to the least aggressive side of a session (RFC 5880, §6.8.7), a well-behaved implementation can simply decide that unsolicited BFD sessions may be limited by: number of sessions, list of acceptable prefixes the sessions come from, and the aggressiveness of the timers based on the desire for slow or fast sessions.
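
The three limits above could be sketched roughly as follows. This is an illustration only and is not taken from the draft or from any implementation; `UnsolicitedPolicy` and its fields are hypothetical names, and real systems will express these knobs differently.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class UnsolicitedPolicy:
    """Hypothetical admission policy for unsolicited BFD sessions."""
    max_sessions: int          # cap on unsolicited sessions
    allowed_prefixes: list     # ipaddress networks that sessions may come from
    required_min_rx_us: int    # local Required Min RX Interval, in microseconds
    active: int = 0

    def admit(self, source: str) -> bool:
        # Refuse a new session once the cap is reached or when the source
        # address is outside every acceptable prefix.
        addr = ipaddress.ip_address(source)
        if self.active >= self.max_sessions:
            return False
        if not any(addr in net for net in self.allowed_prefixes):
            return False
        self.active += 1
        return True

    def peer_tx_interval_us(self, remote_desired_min_tx_us: int) -> int:
        # The peer may not transmit faster than the local Required Min RX
        # Interval, so the less aggressive side wins.
        return max(remote_desired_min_tx_us, self.required_min_rx_us)


policy = UnsolicitedPolicy(
    max_sessions=2,
    allowed_prefixes=[ipaddress.ip_network("192.0.2.0/24")],
    required_min_rx_us=300_000,
)
```

Here a peer asking for 50 ms transmission is held to 300 ms by the local receive interval, a source outside 192.0.2.0/24 is refused, and a third session is refused once two are active.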

-- Jeff