Re: [RTG-DIR] Rtgdir last call review of draft-ietf-mpls-p2mp-bfd-06

Joel Halpern <jmh@joelhalpern.com> Sun, 25 February 2024 01:56 UTC

To: Greg Mirsky <gregimirsky@gmail.com>
Cc: rtg-dir@ietf.org, draft-ietf-mpls-p2mp-bfd.all@ietf.org, last-call@ietf.org, mpls@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-dir/eBy2uCtVksvdpDylCvJlioKIOXQ>

Mostly.  There is one other aspect.  You may consider it irrelevant, in 
which case we can simply say so.  Can the inbound notifications coming 
from a large number of leaves at the same time cause data plane congestion?

Yours,

Joel

On 2/24/2024 8:44 PM, Greg Mirsky wrote:
> Hi Joel,
> thank you for your quick response. I consider two risks that may 
> stress the root's control plane:
>
>   * notifications transmitted by the leaves reporting the failure of
>     the p2mp LSP
>   * notifications transmitted by the root to every leaf closing the
>     Poll sequence
>
> As I understand it, you refer to the former as inbound congestion. The 
> latter - outbound. Is that correct? I agree that even the inbound 
> stream of notifications may overload the root's control plane. And the 
> outbound process further increases the probability of the congestion 
> in the control plane. My proposal is to apply a rate limiter to 
> control inbound flow of BFD Control messages punted to the control plane.
> What would you suggest in addition to the proposed text?
>
> Best regards,
> Greg
>
> On Sat, Feb 24, 2024 at 3:28 PM Joel Halpern 
> <jmh.direct@joelhalpern.com> wrote:
>
>     What you say makes sense.  I think we need to acknowledge the
>     inbound congestion risk, even if we choose not to try to
>     ameliorate it.  Your approach seems to address the outbound
>     congestion risk from the root.
>
>     Yours,
>
>     Joel
>
>     On 2/24/2024 6:25 PM, Greg Mirsky wrote:
>>     Hi Joel,
>>     thank you for the clarification. My idea is to use a rate limiter
>>     at the root of the p2mp LSP that may receive notifications from
>>     the leaves affected by the failure. I imagine that the threshold
>>     of the rate limiter might be exceeded and the notifications will
>>     be discarded. As a result, some notifications will be processed
>>     by the headend of the p2mp BFD session later, as the tails
>>     transmit notifications periodically until they receive the BFD
>>     Control message with the Final flag set.  Thus, we cannot avoid
>>     the congestion but can mitigate the negative effect it might cause by
>>     extending the convergence. Does that make sense?
>>
>>     Regards,
>>     Greg
>>
>>     On Sat, Feb 24, 2024 at 2:39 PM Joel Halpern
>>     <jmh@joelhalpern.com> wrote:
>>
>>         That covers part of my concern.  But....  A failure near the
>>         root means that a lot of leaves will see failure, and they
>>         will all send notifications converging on the root.  Those
>>         notifications themselves, not just the final messages, seem
>>         able to cause congestion.  I am not sure what can be done
>>         about it, but we aren't allowed to ignore it.
>>
>>         Yours,
>>
>>         Joel
>>
>>         On 2/24/2024 3:34 PM, Greg Mirsky wrote:
>>>         Hi Joel,
>>>         thank you for your support of this work and the suggestion.
>>>         Would the following update of the last paragraph of Section
>>>         5 help:
>>>         OLD TEXT:
>>>            An ingress LSR that has received the BFD Control packet,
>>>         as described
>>>            above, sends the unicast IP/UDP encapsulated BFD Control
>>>         packet with
>>>            the Final (F) bit set to the egress LSR.
>>>         NEW TEXT:
>>>            As described above, an ingress LSR that has received the
>>>         BFD Control
>>>            packet sends the unicast IP/UDP encapsulated BFD Control
>>>         packet with
>>>            the Final (F) bit set to the egress LSR.  In some
>>>         scenarios, e.g.,
>>>            when a p2mp LSP is broken close to its root, and the
>>>         number of egress
>>>            LSRs is significantly large, the control plane of the
>>>         ingress LSR
>>>            might be congested by the BFD Control packets transmitted
>>>         by egress
>>>            LSRs and the process of generating unicast BFD Control
>>>         packets, as
>>>            noted above.  To mitigate that, a BFD implementation that
>>>         supports
>>>            this specification is RECOMMENDED to use a rate limiter
>>>         of received
>>>            BFD Control packets passed to processing in the control
>>>         plane of the
>>>            ingress LSR.
>>>
>>>         Regards,
>>>         Greg
>>>
>>>         On Thu, Feb 22, 2024 at 4:10 PM Joel Halpern via Datatracker
>>>         <noreply@ietf.org> wrote:
>>>
>>>             Reviewer: Joel Halpern
>>>             Review result: Ready
>>>
>>>             Hello,
>>>
>>>             I have been selected as the Routing Directorate reviewer
>>>             for this draft. The
>>>             Routing Directorate seeks to review all routing or
>>>             routing-related drafts as
>>>             they pass through IETF last call and IESG review, and
>>>             sometimes on special
>>>             request. The purpose of the review is to provide
>>>             assistance to the Routing ADs.
>>>             For more information about the Routing Directorate,
>>>             please see
>>>             https://wiki.ietf.org/en/group/rtg/RtgDir
>>>
>>>             Although these comments are primarily for the use of the
>>>             Routing ADs, it would
>>>             be helpful if you could consider them along with any
>>>             other IETF Last Call
>>>             comments that you receive, and strive to resolve them
>>>             through discussion or by
>>>             updating the draft.
>>>
>>>             Document: draft-name-version
>>>             Reviewer: your-name
>>>             Review Date: date
>>>             IETF LC End Date: date-if-known
>>>             Intended Status: copy-from-I-D
>>>
>>>             Summary:  This document is ready for publication as a
>>>             Proposed Standard.
>>>                 I do have one question that I would appreciate being
>>>             considered.
>>>
>>>             Comments:
>>>                 The document is clear and readable, with careful
>>>             references for those
>>>                 needing additional details.
>>>
>>>             Major Issues: None
>>>
>>>             Minor Issues:
>>>                 I note that the security considerations (section 6)
>>>             does refer to
>>>                 congestion issues caused by excessive transmission
>>>             of BFD requests.   I
>>>                 wonder if section 5 ("Operation of Multipoint BFD
>>>             with Active Tail over
>>>                 P2MP MPLS LSP") should include a discussion of the
>>>             congestion implications
>>>                 of multiple tails sending notifications at the rate
>>>             of 1 per second to the
>>>                 head end, particularly if the failure is near the
>>>             head end.  While I
>>>                 suspect that the 1 / second rate is low enough for
>>>             this to be safe,
>>>                 discussion in the document would be helpful.
>>>
>>>
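
The rate limiter discussed in this thread could be sketched as a token
bucket in front of the root's control plane. The following is only an
illustration under stated assumptions, not text from the draft and not
any implementation's behavior: the class and function names, the number
of leaves, and the admitted rate are all hypothetical. The 1-per-second
retransmission interval is the one mentioned in the review. The sketch
assumes the worst case where every affected leaf sends at the same
instant, and that a leaf stops retransmitting as soon as one of its
notifications is admitted (standing in for the root answering with the
Final flag set).

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical limiter for BFD Control packets punted to the control plane."""
    rate: float          # tokens added per second (assumed admitted packet rate)
    burst: float         # bucket capacity (assumed maximum burst)
    tokens: float = 0.0  # tokens currently available
    last: float = 0.0    # time of the previous refill, in seconds

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True   # packet is passed to the control plane
        return False      # packet is discarded; the leaf will retransmit


def rounds_until_all_processed(num_leaves: int, rate: float, burst: float,
                               interval: float = 1.0) -> int:
    """Count retransmission rounds until every leaf's notification is admitted."""
    bucket = TokenBucket(rate=rate, burst=burst, tokens=burst)
    pending = set(range(num_leaves))
    now = 0.0
    rounds = 0
    while pending:
        rounds += 1
        # Worst case: all still-pending leaves send at the same instant.
        for leaf in sorted(pending):
            if bucket.allow(now):
                pending.discard(leaf)
        now += interval
    return rounds


if __name__ == "__main__":
    # Assumed numbers: 1000 leaves, 100 packets/s admitted, burst of 100.
    print(rounds_until_all_processed(1000, rate=100.0, burst=100.0))
```

With these assumed numbers the sketch takes 10 one-second rounds, which
illustrates the trade-off described above: the control plane never sees
more than the configured rate, at the cost of a longer convergence. It
says nothing about congestion in the data plane on the links toward the
root, which the limiter does not reduce.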