Re: [mpls] Rtgdir early review of draft-ietf-mpls-bfd-directed-07

Greg Mirsky <gregimirsky@gmail.com> Tue, 01 September 2020 21:25 UTC

From: Greg Mirsky <gregimirsky@gmail.com>
Date: Tue, 1 Sep 2020 14:24:44 -0700
Message-ID: <CA+RyBmX6HnnLBC16ppwQuM9KphfOreOeHxW_uKpgH=BCvrsJQg@mail.gmail.com>
To: mpls <mpls@ietf.org>, MPLS Working Group <mpls-chairs@ietf.org>
Cc: Routing Directorate <rtg-dir@ietf.org>, "draft-ietf-mpls-bfd-directed@ietf.org" <draft-ietf-mpls-bfd-directed@ietf.org>, "Carlos Pignataro (cpignata)" <cpignata@cisco.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/mpls/wmRYNmEkhShGZQc5RIKYkhvHsvw>
Subject: Re: [mpls] Rtgdir early review of draft-ietf-mpls-bfd-directed-07

Dear MPLS WG Chairs,
as noted at the WG meeting during IETF-108, all technical comments from the
early RtgDir review have been addressed. The draft was updated and now
includes an Operational Considerations section explaining the use of the
controlled reverse path of a BFD session in combination with periodic MPLS
LSP Ping. Also, an Implementation Status section has been added to report
one known implementation of the mechanism defined in the draft. The authors
discussed the status of the draft and ask for your consideration of a WG LC.

Regards,
Greg (on behalf of the authors)

On Tue, Aug 4, 2020 at 7:39 PM Carlos Pignataro (cpignata) <
cpignata@cisco.com> wrote:

> Hi,
>
> Since I originated the “early review”, and my email is included in the
> “To” line, I feel I should respond and comment.
>
> The review mentioned below is over 3 years old, over 1,100 days old.
>
> Before and after that ‘early review’ there have been several others.
>
> My perspective continues to be that the method in mpls-bfd-directed is not
> sensible — what works well for MPLS LSP Ping as a command/response paradigm
> to choose a return path, does not extend to long-lived sessions such as
> BFD. Scanning through the current revision of the document, it continues to
> be underspecified to the point of being harmful.
>
> Best,
>
> Carlos.
>
>
> On 2020/08/04, at 6:39 PM, Greg Mirsky <gregimirsky@gmail.com> wrote:
>
>
> Dear All,
> as we've discussed at the MPLS WG meeting at IETF-108, I've been given an
> action point to match updates to the draft-ietf-mpls-bfd-directed with the
> RtgDir early review comments. Please find notes from the authors in-line
> under GIM>> and Mach>> tags.
> We welcome your questions and comments.
>
> Regards,
> Greg (on behalf of the authors)
>
> ---------- Forwarded message ---------
> From: Carlos Pignataro <cpignata@cisco.com>
> Date: Tue, Jul 11, 2017 at 7:00 AM
> Subject: Rtgdir early review of draft-ietf-mpls-bfd-directed-07
> To: <rtg-dir@ietf.org>
> Cc: <mpls@ietf.org>, <draft-ietf-mpls-bfd-directed.all@ietf.org>, <
> bfd-chairs@ietf.org>
>
>
> Reviewer: Carlos Pignataro
> Review result: Has Issues
>
> Hello
>
> I have been selected to do a routing directorate “early” review of this
> draft.
> https://datatracker.ietf.org/doc/draft-ietf-mpls-bfd-directed/
>
> The routing directorate will, on request from the working group chair,
> perform
> an “early” review of a draft before it is submitted for publication to the
> IESG. The early review can be performed at any time during the draft’s
> lifetime
> as a working group document. The purpose of the early review depends on the
> stage that the document has reached.
>
> The MPLS chairs have requested an early review from the directorate with
> the
> objective of improving document quality.  This document has had three
> unsuccessful WG LCs.
>
> For more information about the Routing Directorate, please see
> http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir
>
> Document: draft-ietf-mpls-bfd-directed-07.txt
> Reviewer: Carlos Pignataro
> Review Date: Early July, 2017
> Intended Status: Standards Track
>
> Summary:
> I have significant concerns about this document.
> I also recommend a BFD WG Chair or appointee to review this document.
>
> Comments:
>
> First, I have a general concern about the architectural approach in this
> document.
>
> This document is modeled after RFC 7110. RFC 7110 describes the
> specification
> of a return path for MPLS LSP Ping. MPLS LSP Ping uses a request/reply
> command/response paradigm, in which receipt of an Echo Request elicits the
> generation of an Echo Reply.
>
> BFD for MPLS, however, uses a different approach and paradigm (as per RFC
> 5884). An MPLS LSP Ping packet is used as a bootstrap, signaling
> discriminator
> value for a persistent BFD session. After the MPLS LSP Ping signals the
> Discriminator (via MPLS LSP Ping TLV) to use, then BFD control messages are
> sent back and forth.
>
> However, while the BFD session is UP and BFD control messages are being
> sent
> back and forth, and while no MPLS LSP Ping packets are sent after
> bootstrapping
> -- what happens if the return path changes (e.g., the return LSP goes down,
> gets unconfigured, etc.)?
> GIM>> Note that after bootstrapping a BFD session over an MPLS LSP with
> the MPLS LSP Ping that includes the BFD Discriminator TLV, MPLS LSP Ping
> is sent periodically, according to Section 3.2 of RFC 5884:
>    Hence, BFD is used in conjunction with LSP Ping for MPLS LSP fault
>    detection:
> ...
>     iii) LSP Ping is used to periodically verify the control plane
>          against the data plane by ensuring that the LSP is mapped to
>          the same FEC, at the egress, as the ingress.
> [Mach] If the return LSP goes down, then the BFD session should go down as
> well; this is one of the goals that this solution is trying to achieve. For
> example, in some cases the forward and reverse paths are required to share
> the same route and fate. The return path being down normally means the
> forward path is down as well.
>
> In that case, not only can this mechanism actually make things worse,
> because
> it results in a false negative, but also the document does not specify how
> the
> system should recover.
> [Mach] As above, this result is desired in the case that requires the
> forward and reverse paths to share route and fate. The session should be
> kept down until the specified return path has recovered.
>
> The I-D seems to assume complete topological
> invariability for it to work long-term, since it does not specify any
> mechanism
> to update or to deal with such a failure or change scenario.
> GIM>> We've added the Operational Considerations section that describes
> how the local BFD system using periodic MPLS LSP Ping can monitor the
> viability of the Reverse path. Particularly, the text that describes this:
>    If any of defined in
>    [RFC7110] sub-TLVs used in BFD Reverse Path TLV, then the periodic
>    verification of the control plane against the data plane, as
>    recommended in Section 4 [RFC5884], MUST use the Return Path TLV, as
>    per [RFC7110], with that sub-TLV.  By using the LSP Ping with Return
>    Path TLV, an operator monitors whether at the egress BFD node the
>    reverse LSP is mapped to the same FEC as the BFD session.  Selection
>    and control of the rate of LSP Ping with Return Path TLV follows the
>    recommendation of [RFC5884]: "The rate of generation of these LSP
>    Ping Echo request messages SHOULD be significantly less than the rate
>    of generation of the BFD Control packets.  An implementation MAY
>    provide configuration options to control the rate of generation of
>    the periodic LSP Ping Echo request messages."
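
To make the rate relationship in the quoted text concrete, here is a
minimal Python sketch. It assumes the usual asynchronous-mode detection
time from RFC 5880 (Detect Mult times the agreed transmit interval). The
function names and the 100x threshold for "significantly less" are
illustrative assumptions; RFC 5884 does not give a number.

```python
def bfd_detection_time_ms(tx_interval_ms: int, detect_mult: int) -> int:
    """Asynchronous-mode detection time: Detect Mult x agreed interval."""
    return tx_interval_ms * detect_mult


def ping_is_significantly_slower(ping_interval_ms: int,
                                 tx_interval_ms: int,
                                 min_ratio: int = 100) -> bool:
    """Check that LSP Ping Echo requests are generated far less often
    than BFD Control packets. min_ratio is an assumed local policy knob,
    not a value taken from any RFC."""
    return ping_interval_ms >= min_ratio * tx_interval_ms


# Example values (assumed): 10 ms BFD interval, Detect Mult 3,
# one LSP Ping with Return Path TLV per minute.
detect_ms = bfd_detection_time_ms(10, 3)
slow_enough = ping_is_significantly_slower(60_000, 10)
```

With numbers like these, BFD declares a failure within tens of
milliseconds, long before the next periodic LSP Ping is due.
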
>
> On the other hand, there is already a BFD mechanism without the
> bootstrapping
> setup and with a command/response like behavior, that is S-BFD, RFC 7881.
> That
> one is notably missing from this draft.
> GIM>> The goal of the proposed mechanism is not to avoid bootstrapping a
> BFD over MPLS LSP session or simulate Echo Response/Reply behavior but to
> control the path the remote BFD system uses to transmit BFD control packets
> in BFD Asynchronous mode.
> [Mach] The purpose of this draft is to specify a particular return path.
> S-BFD cannot specify the return path and thus cannot satisfy the requirement.
>
> Further, there seem to be a number of potentially erroneous assumptions
> made,
> see below.
>
> Additional Comments:
>
>      Bidirectional Forwarding Detection (BFD) Directed Return Path
>
> The title should include that this is *only* for MPLS BFD.
> GIM>> Thank you for the helpful editorial suggestion.
>
>    When a BFD session monitors an explicitly routed unidirectional path
>    there may be a need to direct egress BFD peer to use a specific path
>    for the reverse direction of the BFD session.
>
> Scope: is this solution targeted only for "explicitly routed unidirectional
> path", and the solution to have the reply come back the exact reverse
> direction? That does not seem to be the case and the solution.
> GIM>> The path from the remote BFD system may traverse the same nodes and
> links traversed by the BFD control packets transmitted from the ingress
> LER, i.e., be co-routed. But the proposed mechanism is more generic and
> allows an operator to control the path traversed by a BFD control packet
> transmitted by the egress LER.
>
>    [RFC5880], [RFC5881], and [RFC5883] established the BFD protocol for
>    IP networks.  [RFC5884] and [RFC7726] set rules of using BFD
>    asynchronous mode over IP/MPLS LSPs.  These standards implicitly
>    assume that the egress BFD peer will use the shortest path route
>    regardless of route being used to send BFD control packets towards
>    it.
>
> Is "These standards" referring to the three former or the four latter?
> GIM>> The latter two. We've clarified that with the following update:
> OLD TEXT:
>    [RFC5884] and [RFC7726] set rules of using BFD
>    asynchronous mode over IP/MPLS LSPs.  These standards implicitly
>    assume that the egress BFD peer will use the shortest path route
>    regardless of route being used to send BFD control packets towards
>    it.
> NEW TEXT:
>    [RFC5884] and [RFC7726] set rules for using BFD
>    asynchronous mode over IP/MPLS LSPs, while not defining means to
>    control the path an egress BFD system uses to send BFD control
>    packets towards the ingress BFD system.
>
>    For the case where a LSP is explicitly routed it is likely that the
>    shortest return path to the ingress BFD peer would not follow the
>    same path as the LSP in the forward direction.  The fact that BFD
>    control packets are not guaranteed to follow the same links and nodes
>    in both forward and reverse directions is a significant factor in
>    producing false positive defect notifications, i.e. false alarms, if
>    used by the ingress BFD peer to deduce the state of the forward
>    direction.
>
> There may be an implicit mis-assumption in this text and overall approach:
> the
> fact that traffic flows on one direction does not imply that the reverse
> direction using the same interfaces and nodes would actually be
> consequently
> properly programmed and working.
> GIM>> If the objective of controlling the path from the egress to the
> ingress system is to monitor such path, then detecting a defect in the
> forwarding path fulfills the objective.
>
>    This document defines the BFD Reverse Path TLV as an extension to LSP
>    Ping [RFC8029] and proposes that it is to be used to instruct the
>    egress BFD peer to use an explicit path for its BFD control packets
>    associated with a particular BFD session.
>
> This text assumes that the BFD return path is MPLS. However, my
> understanding
> from RFC 5884 is that this is not necessarily the case, and the return can
> be
> IP.
> GIM>> The document defines an optional mechanism to control the BFD
> Reverse path. The default behavior, if the BFD Reverse Path TLV is not
> included in the MPLS LSP Ping with the BFD Discriminator TLV, remains, as
> you've noted - over IP.
>
>    When BFD is used to monitor unidirectional explicitly routed path,
>    e.g.  MPLS-TE LSP, BFD control packets in forward direction would be
>    in-band using the mechanism defined in [RFC5884] and [RFC5586].
>
> Which BFD uses RFC 5586? RFC5586 says that is not needed:
> GIM>> We've removed references to RFC 5586.
>
>    "Some of these functions can be supported using existing
>    tools such as Virtual Circuit Connectivity Verification (VCCV)
>    [RFC5085], Bidirectional Forwarding Detection for MPLS LSPs (BFD-
>    MPLS) [BFD-MPLS], LSP-Ping [RFC4379], or BFD-VCCV [BFD-VCCV]."
>
> And then:
>
>    o  a failure detection by ingress node on the reverse path cannot be
>       interpreted as bi-directional failure unambiguously and thus
>       trigger, for example, protection switchover of the forward
>       direction without possibility of being a false positive.
>
>    To address this scenario the egress BFD peer would be instructed to
>    use a specific path for BFD control packets.
>
> But using a specific path for return cannot either imply "interpreted as
> bi-directional failure unambiguously", so the scenario is not *addressed*.
>
>    The BFD Reverse Path TLV carries information about the path onto
>    which the egress BFD peer of the BFD session referenced by the BFD
>    Discriminator TLV MUST transmit BFD control packets.  The format of
>    the BFD Reverse Path TLV is as presented in Figure 1.
>
> What does the remote endpoint do with that "MUST" if the return FEC goes
> away?
> GIM>> The Operational Considerations section includes the following text:
>    Suppose an operator planned network maintenance activity that
>    possibly affects FEC used in the BFD Reverse Path TLV.  In that case,
>    the operator MUST avoid the unnecessary disruption using the LSP Ping
>    with a new FEC in the BFD Reverse Path TLV.  But in some scenarios,
>    proactive measures cannot be taken.  Because the frequency of LSP
>    Ping messages will be lower than the defect detection time provided
>    by the BFD session.  As a result, a change in the reverse-path FEC
>    will first be detected as the BFD session's failure.  In such a case,
>    the ingress BFD node SHOULD immediately transmit the LSP Ping Echo
>    request with Return Path TLV to verify whether the FEC is still
>    valid.  If the failure was caused by the change in the FEC used for
>    the reverse direction of the BFD session, the ingress BFD node SHOULD
>    bootstrap a new BFD session using another FEC in BFD Reverse Path
>    TLV.
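
The recovery steps in the quoted Operational Considerations text can be
sketched roughly as below. This is only an illustration under stated
assumptions: verify_reverse_fec is a hypothetical stand-in for sending an
LSP Ping Echo request with the Return Path TLV, FECs are plain strings,
and picking the first differing alternate is an assumed policy. What the
ingress does when the FEC is still valid is not covered by the quoted
text, so the sketch only reports that outcome.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Optional, Sequence


class Action(Enum):
    FEC_STILL_VALID = auto()           # failure not caused by a FEC change
    REBOOTSTRAP_WITH_NEW_FEC = auto()  # bootstrap a new session, other FEC
    NO_ALTERNATE_FEC = auto()          # FEC invalid, nothing else configured


@dataclass(frozen=True)
class Decision:
    action: Action
    fec: Optional[str] = None


def on_bfd_session_down(current_fec: str,
                        alternates: Sequence[str],
                        verify_reverse_fec: Callable[[str], bool]) -> Decision:
    """Ingress reaction to a BFD session failure (hypothetical helper)."""
    # Step 1: immediately verify the reverse-path FEC via LSP Ping.
    if verify_reverse_fec(current_fec):
        return Decision(Action.FEC_STILL_VALID, current_fec)
    # Step 2: the FEC changed; pick another FEC for a new BFD session.
    for fec in alternates:
        if fec != current_fec:
            return Decision(Action.REBOOTSTRAP_WITH_NEW_FEC, fec)
    return Decision(Action.NO_ALTERNATE_FEC)
```

For example, calling on_bfd_session_down("lsp-H-A-1", ["lsp-H-A-2"],
lambda fec: False) yields a decision to re-bootstrap with "lsp-H-A-2".
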
>
> There also seems to be some self-contradiction. This document says:
>
>    LSP ping, defined in [RFC8029], uses BFD Discriminator TLV [RFC5884]
>    to bootstrap a BFD session over an MPLS LSP.  This document defines a
>    new TLV, BFD Reverse Path TLV, that MUST contain a single sub-TLV
>    that can be used to carry information about the reverse path for the
>    BFD session that is specified by value in BFD Discriminator TLV.
>
> And then says:
>
>    Reverse Path field contains a sub-TLV.
>
> But then says:
>
>    None, one or more sub-TLVs MAY be included in the BFD Reverse
>    Path TLV.  If none sub-TLVs found in the BFD Reverse Path TLV, the
>    egress BFD peer MUST revert to using the default, i.e., over IP
>    network, reverse path.
>
> So is it only one, or none/one/multiple?
> GIM>> Thank you for pointing this out. Below is the update in the working
> version of the draft:
> OLD TEXT:
>    This
>    document defines a new TLV, BFD Reverse Path TLV, that MUST contain a
>    single sub-TLV that can be used to carry information about the
>    reverse path for the BFD session that is specified by the value in
>    BFD Discriminator TLV.
> NEW TEXT:
>    This
>    document defines a new TLV, BFD Reverse Path TLV, that MAY contain
>    none, one or more sub-TLVs that can be used to carry information
>    about the reverse path for the BFD session that is specified by the
>    value in BFD Discriminator TLV.
>
> and one more:
> OLD TEXT:
> Reverse Path field contains a sub-TLV
> NEW TEXT:
> Reverse Path field contains none, one or more sub-TLVs.
>
> I believe it needs to be multiple since then a Tunnel can be specified.
> But the
> document as-is seems self-contradicting.
>
> Further, where has that "default" been defined as "over IP network"?
> GIM>> The text was re-worked to avoid using "default":
>    If no sub-TLVs are found in the BFD
>    Reverse Path TLV, the egress BFD peer MUST revert to using the local
>    policy based decision as described in Section 7 [RFC5884], i.e.,
>    routed over IP network.
>
> There's another contradiction here:
>
>    If the egress LSR cannot find the path specified in the Reverse Path
>    TLV it MUST send Echo Reply with the received Reverse Path TLV and
>    set the Return Code to "Failed to establish the BFD session.  The
>    specified reverse path was not found" Section 3.3.  The egress BFD
>    peer MAY establish the BFD session over IP network as defined in
>    [RFC5884].
>
> So the response is "Failed to establish the BFD session." But then it MAY
> establish the session?
> GIM>> Echo Reply with the error code is to indicate that the reverse path
> is not available at the egress LSR. The local policy at the egress BFD
> system may command the transmission of a BFD control packet from egress to
> the ingress if the Reverse Path is not available.
>
> And, again, what if the path is found at bootstrap but
> lost afterwards?
> GIM>> Section 5 includes text that provides information on handling such a
> scenario.
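
The egress behaviour discussed in the last few exchanges (no sub-TLVs,
path found, path not found) can be summarised in a small Python sketch.
Everything here is an assumption made for illustration: paths are
modelled as tuples of opaque strings, known_paths stands in for whatever
lookup the egress LSR performs, and allow_ip_fallback models the local
policy that MAY still establish the session over IP.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Set, Tuple

RC_REVERSE_PATH_NOT_FOUND = (
    "Failed to establish the BFD session. "
    "The specified reverse path was not found"
)


@dataclass(frozen=True)
class EgressResult:
    mode: str                   # "explicit", "ip-routed", or "none"
    return_code: Optional[str]  # error text for the Echo Reply, if any


def select_reverse_path(sub_tlvs: Sequence[str],
                        known_paths: Set[Tuple[str, ...]],
                        allow_ip_fallback: bool = False) -> EgressResult:
    """Hypothetical egress decision for a received BFD Reverse Path TLV."""
    # No sub-TLVs: local policy per RFC 5884, i.e. routed over IP.
    if not sub_tlvs:
        return EgressResult("ip-routed", None)
    # The sub-TLVs together describe one reverse path (assumption).
    if tuple(sub_tlvs) in known_paths:
        return EgressResult("explicit", None)
    # Path not found: the Echo Reply carries the error code; local
    # policy may still run the session over IP.
    mode = "ip-routed" if allow_ip_fallback else "none"
    return EgressResult(mode, RC_REVERSE_PATH_NOT_FOUND)
```

Keeping the return code separate from the chosen mode reflects the reply
above: the error reports that the reverse path is unavailable, while
local policy decides whether BFD control packets are still sent.
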
>
> 4.  Use Case Scenario
>
> The fact that A-B-C-D-G-H works does not mean that the reverse,
> H-G-D-C-B-A,
> will work.
> GIM>> If the objective is to verify that the reverse path works, then
> detecting a defect fulfills the task.
>
> 6.  Security Considerations
>
>    Security considerations discussed in [RFC5880], [RFC5884], [RFC7726],
>    and [RFC8029], apply to this document.
>
> There seem to be additional security considerations with returns taking
> explicit paths, and should be expanded in here.
> GIM>> That scenario is discussed in the Security Considerations section in
> RFC 7110. A reference to RFC 7110 was added:
>
>    Security considerations discussed in [RFC5880], [RFC5884], [RFC7726],
>    [RFC8029], and [RFC7110] apply to this document.
>
> Net-net, I do have concerns about this document. I believe it is not ready
> to
> advance, and could use more whiteboard time as well as a review by BFD
> experts.
>
> Best,
>
> Carlos Pignataro.
>
>
>