Re: [bess] Alvaro Retana's Discuss on draft-ietf-bess-mvpn-fast-failover-13: (with DISCUSS and COMMENT)

Greg Mirsky <gregimirsky@gmail.com> Wed, 23 December 2020 03:59 UTC

From: Greg Mirsky <gregimirsky@gmail.com>
Date: Tue, 22 Dec 2020 19:59:06 -0800
Message-ID: <CA+RyBmVyOrsavvTYTF81VmM9k9Ckp+u4BgXz4Ba3+Ocm_FpXLQ@mail.gmail.com>
To: Alvaro Retana <aretana.ietf@gmail.com>
Cc: "bfd-chairs@ietf.org" <bfd-chairs@ietf.org>, Stephane Litkowski <slitkows.ietf@gmail.com>, "draft-ietf-bess-mvpn-fast-failover@ietf.org" <draft-ietf-bess-mvpn-fast-failover@ietf.org>, "bess-chairs@ietf.org" <bess-chairs@ietf.org>, The IESG <iesg@ietf.org>, "bess@ietf.org" <bess@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/K437T1F9DXI3YaOM5hBh4_9YDGg>
Subject: Re: [bess] Alvaro Retana's Discuss on draft-ietf-bess-mvpn-fast-failover-13: (with DISCUSS and COMMENT)

Hi Alvaro,
thank you for giving me more details to work with. Please find my notes and
the new proposed updates in-line below under the GIM2>> tag. Attached is
the diff that highlights changes related to your DISCUSS and COMMENT.

Regards,
Greg

On Mon, Dec 21, 2020 at 12:01 PM Alvaro Retana <aretana.ietf@gmail.com>
wrote:

> On December 20, 2020 at 7:24:34 PM, Greg Mirsky wrote:
>
> Greg:
>
> Hi!
>
> I'm leaving in only the parts where we don't agree or where I have
> comments.
>
> Thanks!
>
> Alvaro.
>
>
> ...
> > > ----------------------------------------------------------------------
> > > DISCUSS:
> > > ----------------------------------------------------------------------
> > >
> > > (1) This document describes several methods to determine the status of a
> > > tunnel (in §3), none of which "provide a "fast failover" solution when
> > > used alone, but can be used together with the mechanism described in
> > > Section 4" (§1). §3 also says this:
> > >
> > > An implementation may support any combination of the methods
> > > described in this section and provide a network operator with control
> > > to choose which one to use in the particular deployment.
> > >
> > > While §3.1 is clear in the fact that it is not a requirement for all
> > > downstream PEs to use the same mechanism, there are no guidelines to aid
> > > the operator to choose which mechanism to use. Some cases may be obvious
> > > (e.g. §3.1.3 applies to tunnels of a specific type), but others are not.
> > > I would like to see deployment considerations related to the advantages/
> > > disadvantages that each method may have in specific situations (including
> > > their possible combination).
> >
> > GIM>> I think it might be not that simple to compare deployment challenges
> > and benefits resulting from the deployment of different P-tunnel monitoring
> > methods because we have to somehow abstract from the impact of choices made
> > by respective implementations. Also, there probably should be a reference
> > environment, a use case we agree on. I agree that such a comparative
> > analysis of P-tunnel monitoring methods is useful, but it doesn't seem to
> > benefit this document, as a developer's choice of which ones to implement
> > and an operator's choice of which one to use don't affect the functionality
> > discussed in the draft. Would you agree?
>
> No, I don't.  That's the reason for this DISCUSS point. :-)
>
> Why is this document specifying multiple mechanisms?  Why not specify
> just one?  At a high level, I assume the answer is that one size
> doesn't fit all, and that specific deployments might benefit from
> specific mechanisms while others may not be as useful.   I don't think
> that to provide guidance (I'm not asking for in-depth analysis) there
> needs to be a priori agreement on use cases, etc...beyond what the WG
> has already been considering for the development of this document.
>
> Your initial statement that "it might be not that simple to compare"
> is precisely the reason I think that operational guidance is needed.
>
GIM2>> I propose a new sub-section to summarize the benefits and challenges
of methods described in Section 3.1:
3.1.8.  Operational Considerations for Monitoring a P-Tunnel's Status

   Several methods to monitor the status of a P-tunnel are described in
   Section 3.1.  Though there might be no perfect method, a comparison
   of benefits and challenges of each technique could be helpful to both
   implementors and network operators.

   Tracking the root of an MVPN (Section 3.1.1) reveals the status of a
   P-tunnel based on the control plane information.
   Because, in general, the MPLS data plane is not fate-sharing with the
   control plane, this method might produce false positive or false
   negative alarms.  On the other hand, because BGP next-hop tracking is
   broadly supported and deployed, this method might be the easiest to
   deploy.

   The method described in Section 3.1.2 monitors the state of the data
   plane but only for an egress P-PE link of a P-tunnel.  As a result,
   network failures that affect upstream links might not be detected
   using this method and the MVPN convergence would be determined by the
   convergence of the BGP control plane.

   Using the state change of a P2MP RSVP-TE LSP as the trigger to re-
   evaluate the status of the P-tunnel (Section 3.1.3) relies on the
   mechanism used to monitor the state of the P2MP LSP.

   The method described in Section 3.1.4 is simple and prone to false
   alarms but it is applicable to a sub-set of MVPNs, those that use the
   leaf-triggered x-PMSI tunnels.

   Though some MVPNs might be used to provide a multicast service with a
   predictable inter-packet interval (Section 3.1.5), the number of such
   cases seems limited.

   Monitoring the status of a P-tunnel using a P2MP BFD session
   (Section 3.1.6) may produce the most accurate and expedient failure
   notification of all monitoring methods discussed.  On the other hand,
   it requires careful consideration of the additional load that BFD
   places on the network and PE nodes.

>
>
>
> > > (2) The BFD Discriminator Attribute has a very narrow application in this
> > > document when compared to the potential other uses given the extensibility
> > > possibilities related to bootstrapping BFD. I have serious concerns about
> > > the attribute being defined in this document, amongst a series of other
> > > mechanisms.
> > >
> > > (2a) The tunnel can be monitored without the new BGP Attribute (assuming
> > > proper configuration of course). Why is that option not even mentioned
> > > in the document?
> >
> > GIM>> You are right, there could be other methods to bootstrap a p2mp BFD
> > session. But since RFC 8562 has no discussion of bootstrapping a session,
> > having it in this draft seemed somewhat out of place. We can add an
> > informational reference to Section 4 of draft-mirsky-mpls-p2mp-bfd
> > (https://datatracker.ietf.org/doc/draft-mirsky-mpls-p2mp-bfd/), where
> > several bootstrapping options are discussed. Would that be acceptable?
>
> No, but thanks for pointing at that other draft which supports the
> fact that the attribute is not the only way to bootstrap the session.
>
> My point here is that you don't need the attribute for BFD to be a
> useful tool.  However, the description of the tool (monitoring) is
> tied to the attribute (including the point below about deleting the
> BFD session).  It doesn't have to be: the use of BFD should be
> independent of the bootstrapping mechanism.
>
> Do you consider other possible ways of bootstrapping the session
> valid?  Would it be ok to use them to enable the use of BFD?  Can a
> BFD session that is set up using a different mechanism be used to monitor
> the tunnel?
>

GIM2>> Thank you for clarifying the question. Yes, a p2mp BFD session can
be instantiated using a method other than the BFD Discriminator attribute.
draft-mirsky-mpls-p2mp-bfd gives several options and describes how
MPLS LSP Ping can be used. But that method, in my opinion, has some
operational issues and is less preferable than using an extension of
a control plane protocol. Other examples of extending a protocol to
bootstrap a p2mp BFD session can be found in the draft.
I've added a note in the first paragraph of Section 3.1.6 to clarify that
the use of the BFD Discriminator attribute is optional:
NEW TEXT:
   The P-tunnel status may be derived from the status of a multipoint
   BFD session [RFC8562] whose discriminator is advertised along with an
   x-PMSI A-D Route.  A P2MP BFD session can be instantiated using a
   mechanism other than the BFD Discriminator attribute, e.g., MPLS LSP
   Ping ([I-D.mirsky-mpls-p2mp-bfd]).  The description of these methods
   is outside the scope of this document.

>
>
>
> ...
> > > (2b) The fact that BFD monitoring can be achieved without the new
> > > attribute makes me think that the bootstrapping of BFD using BGP would be
> > > better served in a document produced by the BFD WG. One of the editors has
> > > expressed the same opinion [1] [2]. Has a discussion taken place in the
> > > BFD WG (or at least with the Chairs) about this work? Why was it not taken
> > > up there?
> > >
> > GIM>> Work on draft-mirsky-mpls-p2mp-bfd is progressing at MPLS WG. AFAIK,
> > it is in the state Candidate for WG Adoption. Moving the definition of the
> > BFD Discriminator attribute to that draft, as I understand, would require
> > using it as the normative reference. Given the current states of both
> > documents, that would severely delay the publication of the MVPN Fast
> > Failover specification and likely affect implementations of P-tunnel
> > monitoring mechanisms. I hope you can agree with the current organization
> > of the document.
>
> That doesn't address my question about whether this work should have
> been done in the BFD WG -- as you had suggested (links above).
>
> I'm sure you saw the message I sent to Reshad/Jeff on this topic.
> I'll rely on their and Martin's opinion.  BTW, I agree with Jeff in
> that bfd/idr should be given the opportunity to review this document.
>
GIM2>> I'm leaving this decision to the AD and Chairs of BESS and BFD WGs.

>
>
>
>
>
> > > ----------------------------------------------------------------------
> > > COMMENT:
> > > ----------------------------------------------------------------------
> ...
> > > (2) s/is an OPTIONAL procedure/is an optional procedure
> > > This is not a normative statement to require capitalization.
> >
> > GIM>> I think that the use of the normative form is reasonable. The sentence
> > can be re-worded using MAY. For example:
> > OLD TEXT:
> > The procedure described here is an OPTIONAL procedure that is based
> > on a downstream PE taking into account the status of P-tunnels rooted
> > at each possible Upstream PE, for including or not including each
> > given PE in the list of candidate UMHs for a given (C-S, C-G) state.
> > NEW TEXT:
> > The procedure described here MAY be used in a BGP/MPLS MVPN [RFC6513]. It is
> > based on a downstream PE taking into account the status of P-tunnels rooted
> > at each possible Upstream PE, for including or not including each given PE
> > in the list of candidate UMHs for a given (C-S, C-G) state.
> >
> > We wanted to stress that without this mechanism BGP/MPLS MVPN, as defined in
> > RFCs 6513 and 6514, is fully functional and architecturally complete. This
> > draft discusses mechanisms that support the protection and faster
> > convergence in MVPN's control plane. And since these mechanisms are only an
> > improvement compared to the BGP mechanisms, we wanted to emphasize that by
> > using the normative form. Would you agree?
>
> No, I don't.  Normative language was not meant for emphasis.  In any
> case, this is not a blocking comment.
>
GIM2>> Switched to the lower-case version of "optional".

>
>
>
>
> ...
> > > (4) §3.1.1: "similar to BGP next-hop tracking" Is this specified
> > > somewhere? I don't remember seeing a specification for next-hop tracking,
> > > but do know that implementations do it -- in an implementation-specific
> > > way. Please add a little more text about what is meant/expected.
> >
> > GIM>> Indeed, BGP next-hop tracking is a behavior internal to a system that
> > has not, to the best of my knowledge, been documented at IETF or any
> > SDO. Would the following text provide reasonable context information:
> >
> > That is similar to BGP next-hop tracking for VPN routes, except that
> > the address considered is not the BGP next-hop address but the root
> > address in the x-PMSI Tunnel attribute. BGP next-hop tracking is a
> > feature that reduces the BGP convergence time comparing to the
> > "regular" BGP by monitoring BGP next-hop address changes in the
> > routing table. It's event-based because it detects changes in the
> > routing table. When it detects a change, it performs a next-hop scan
> > to find if any of the next hops in the BGP table is affected and updates
> > it accordingly.
>
> Suggestion:
>
> OLD>
>    BGP next-hop tracking is a feature that reduces the BGP convergence time
>    comparing to the "regular" BGP by monitoring BGP next-hop address changes
>    in the routing table. It's event-based because it detects changes in the
>    routing table. When it detects a change, it performs a next-hop scan to
>    find if any of the next hops in the BGP table is affected and updates it
>    accordingly.
>
> NEW>
>    BGP next-hop tracking monitors BGP next-hop address changes in the routing
>    table.  In general, when a change is detected, it performs a next-hop scan
>    to find if any of the next hops in the BGP table is affected and updates it
>    accordingly.
>
GIM2>> Thank you for the proposed text. Accepted.
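
To illustrate the next-hop scan described in the accepted text, here is a
minimal Python sketch. It is not taken from the draft or from any
implementation (as noted above, next-hop tracking is implementation-specific).
The names TrackedEntry and on_routing_table_change are hypothetical, and the
prefix match is simplified to a containment test rather than a longest-prefix
lookup. The same scan applies to root tracking if the tracked address is the
root address from the x-PMSI Tunnel attribute instead of the BGP next hop.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass
class TrackedEntry:
    """One tracked address: a BGP next hop, or a P-tunnel root for root tracking."""
    name: str
    tracked_address: str
    reachable: bool = True


def on_routing_table_change(entries: List[TrackedEntry], prefix: str,
                            reachable: bool) -> List[str]:
    """Scan entries after a routing-table event and update the affected ones.

    Returns the names of the entries whose reachability actually changed.
    """
    network = ipaddress.ip_network(prefix)
    changed = []
    for entry in entries:
        covered = ipaddress.ip_address(entry.tracked_address) in network
        if covered and entry.reachable != reachable:
            entry.reachable = reachable
            changed.append(entry.name)
    return changed


entries = [
    TrackedEntry("vpn-route-a", "192.0.2.1"),         # BGP next hop
    TrackedEntry("p-tunnel-root-b", "198.51.100.7"),  # x-PMSI tunnel root
]
# The route covering 198.51.100.0/24 disappears from the routing table.
print(on_routing_table_change(entries, "198.51.100.0/24", reachable=False))
# ['p-tunnel-root-b']
```

The scan is event-driven: it runs only when the routing table reports a
change, and it touches only the entries covered by the changed prefix.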

>
>
>
> ...
> > > (6) The "reachability condition" is mentioned in §3.1.1/§3.1.3/§3.1.4.
> > > Does this mean that root tracking (§3.1.1) should be used with the
> > > other mechanisms? The specific text says that "the downstream PE can
> > > immediately update its UMH when the reachability condition changes",
> > > giving the impression that the combination is possible but not required.
> > >
> > > Note that §4.3 is titled "Reachability Determination", which I hoped would
> > > shed more light, but all it does is point back to §3.1.
> >
> > GIM>> I don't see any benefit of concurrently using more than one of the
> > mechanisms described in the document that can monitor the state of a
> > P-tunnel. That is certainly possible but, in my opinion, would add
> > unnecessary complexity. The last paragraph in Section 3.1 is intended as the
> > introduction to sub-sections that discuss different monitoring mechanisms:
> >
> > An implementation may support any combination of the methods
> > described in this section and provide a network operator with control
> > to choose which one to use in the particular deployment.
> > Would you suggest an update to this paragraph to clarify the statement?
>
> See the first DISCUSS point above.
>
> You didn't explicitly answer the question:  is the text in
> §3.1.1/§3.1.3/§3.1.4 an implication that root tracking should be used
> with the other mechanisms?  If not, then what does it mean?
>
GIM2>> I hope that the new sub-section added in response to the first
DISCUSS point answers this question.

>
>
>
> ...
> > > (8) §3.1.2 mentions that "careful consideration and coordination" is
> > > needed when using other mechanisms such as rfc4090 "because uncorrelated
> > > timers might cause unnecessary switchovers and destabilize the network."
> > > What are the associated timers related to the mechanisms in this section?
> >
> > GIM>> This is in reference to the defect detection timers. When using
> > multi-layer protection, particular consideration must be given to the
> > interaction of defect detections at different layers of a network. It is
> > recommended to use longer detection intervals at the higher layers. Some
> > recommendations suggest using a multiplier of 3 or larger, e.g., 10 msec
> > detection for FRR and at least 100 msec for e2e detection.
>
> Can you add something like that to the document?
>
GIM2>> I've used that text to update the last paragraph of Section 3.1.2:
NEW TEXT:
   Using this method when a fast restoration mechanism (such as MPLS FRR
   [RFC4090]) is in place for the link requires careful consideration
   and coordination of defect detection intervals for the link and the
   tunnel.  When using multi-layer protection, particular consideration
   must be given to the interaction of defect detections at different
   layers of a network.  It is recommended to use longer detection
   intervals at the higher layers.  Some recommendations suggest using a
   multiplier of 3 or larger, e.g., 10 msec detection for the link
   failure detection and at least 100 msec for the tunnel failure
   detection.  In many cases, it is not practical to use both protection
   methods at the same time because uncorrelated timers might cause
   unnecessary switchovers and destabilize the network.
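
As an illustration only (not part of the proposed draft text), the
coordination rule above can be written as a simple check in Python. The
function name and the default multiplier of 3 are assumptions for this
sketch; the text only says that some recommendations suggest "a multiplier
of 3 or larger".

```python
def detection_intervals_coordinated(link_ms: float, tunnel_ms: float,
                                    min_multiplier: float = 3.0) -> bool:
    """Check that the higher layer waits long enough for the lower layer.

    link_ms is the defect detection interval of the lower layer (the link),
    tunnel_ms the interval of the higher layer (the P-tunnel). The higher
    layer should use an interval at least min_multiplier times longer.
    """
    if link_ms <= 0 or tunnel_ms <= 0:
        raise ValueError("detection intervals must be positive")
    return tunnel_ms >= min_multiplier * link_ms


# The example values from the text: 10 msec for the link, 100 msec for the tunnel.
print(detection_intervals_coordinated(10, 100))  # True
# A tunnel interval only twice the link interval does not meet the multiplier.
print(detection_intervals_coordinated(10, 20))   # False
```

A check like this could run at configuration time so that the tunnel-level
detection does not fire before link-level restoration has had a chance to
complete.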

>
>
>
> ...
> > > (10) §3.1.4: "An Upstream PE SHOULD be removed from the UMH candidate
> > > list...if...the upstream one-hop branch of the tunnel from P to PE cannot
> > > be built." When is it ok to not remove the PE? IOW, why is this action not
> > > required?
> >
> > GIM>> Thank you for bringing up this case. It was discussed during the
> > Shepherd's review and we've asked for the expert's opinion. Jeffrey Zhang
> > kindly suggested the current text
> > (https://mailarchive.ietf.org/arch/browse/bess/?gbt=1&index=LxKYi9F6u1tl2qKtR2Q8wkQPMOA).
> > I'll check with him and get back with the answer.
>
> ok.
>
GIM2>> After the discussion, we've decided to make it a requirement,
i.e., s/SHOULD/MUST/

>
>
>
>
> ...
> > > (12) §3.1.5 says that "where this mechanism is used in conjunction with
> > > the method described in Section 5...downstream PEs can compare reception
> > > on the two P-tunnels to determine when one of them is down", but §5 says
> > > that "downstream PEs accept traffic from the primary or standby tunnel,
> > > based on the status of the tunnel (based on Section 3)". IOW, §3.1.5
> > > points at §5 as providing a way to determine if a tunnel is down, while §5
> > > points back at §3 as the way to determine which tunnel to receive from.
> > > This pointing back and forth is not a total contradiction, but it needs to
> > > be clarified.
> >
> > GIM>> Though it might appear as a circular reference, it was not the
> > intention. An implementation of the method described in the last paragraph
> > of Section 3.1.5 can periodically accept traffic from primary and standby
> > tunnels as the method of determining the state of the primary P-tunnel. I've
> > updated that paragraph to note that the comparison can be done periodically:
> >
> > OLD TEXT:
> > In cases where this mechanism is used in conjunction with the method
> > described in Section 5, no prior knowledge of the rate or maximum
> > inter-packet time on the multicast streams is required; downstream
> > PEs can compare actual packet reception statistics on the two
> > P-tunnels to determine when one of them is down. The detailed
> > specification of this mechanism is outside the scope of this
> > document.
> >
> > NEW TEXT:
> > In cases where this mechanism is used in conjunction with the method
> > described in Section 5, no prior knowledge of the rate or maximum
> > inter-packet time on the multicast streams is required; downstream
> > PEs can periodically compare actual packet reception statistics on
> > the two P-tunnels to determine when one of them is down. The
> > detailed specification of this mechanism is outside the scope of this
> > document.
>
> Let me try again.
>
> §3.1.5 says that the "PEs can compare reception on the two P-tunnels
> to determine when one of them is down".  OTOH, §5 says that the "PEs
> accept traffic from the primary or standby tunnel, based on the status
> of the tunnel".  The difference that I'm trying to point at is that §5
> says that traffic is accepted from only *one* tunnel ("primary *or*
> standby"), but §3.1.5 talks about receiving from *both*.
>
> Adding "periodically" doesn't help because the basic contradiction
> (receiving from one or both) is still there.
>
GIM2>> Thank you for further clarifying your point. I propose to update
that bullet point in Section 5 as follows:
   o  a policy controls from which tunnel downstream PEs accept
      traffic.  For example, the policy could be based on the status of
      the tunnel or tunnel monitoring method (Section 3.1.5).

>
>
>
>
> > > (13) §3.1.6: "An implementation that does not recognize or is configured
> > > not to support this attribute MUST follow procedures defined for optional
> > > transitive path attributes in Section 5 of [RFC4271]."
> > >
> > > There cannot be a Normative action specified for a node that "does not
> > > recognize...this attribute" because, by definition, it can't be assumed
> > > that it is aware of this specification. In this case, it is not necessary
> > > to say anything about unrecognized attributes because that is already
> > > specified in rfc4271.
> > >
> > > For the "configured not to support this attribute" case, it should be
> > > pointed out that the node should operate as if the attribute was
> > > unrecognized.
> > >
> > > Suggestion>
> > > An implementation that is configured not to support this attribute MUST
> > > follow the procedures defined in Section 5 of [RFC4271] as if the
> > > attribute was unrecognized.
> >
> > GIM>> Thank you for pointing this out. In the course of addressing comments
> > from other IESG reviewers, this text was updated to: This document defines
> > the format and ways of using a new BGP attribute called the "BFD
> > Discriminator". It is an optional transitive BGP attribute. Thus it is
> > expected that an implementation that does not recognize or is configured not
> > to support this attribute follows procedures defined for optional transitive
> > path attributes in Section 5 of [RFC4271].
> > Would the current text address your concern?
>
> Mmmm...sure.
>
> rfc4271 doesn't talk about the case where a node is "configured not to
> support" an attribute.  The reason I suggested the specific link
> ("configured not to support this attribute...as if the attribute was
> unrecognized.") is that the behavior of ignoring (which may be an
> interpretation of "configured not to support") is different than if
> the attribute was unrecognized.  The text above doesn't make that link
> and, while some implementations may interpret it the way you meant it,
> others may not.
>
GIM2>> Thank you for the suggested clarification of the text. I hope the
updated text is clearer:
NEW TEXT:
   Thus it is expected that an implementation that does not recognize
   this attribute, or that is configured not to support it, follows the
   procedures defined for optional transitive path attributes in
   Section 5 of [RFC4271], as if the attribute was unrecognized.
>
>
>
>
>
> > > (15) §3.1.6: "The BFD Discriminator attribute MUST be considered malformed
> > > if its length is not a non-zero multiple of four." Ok, except that the
> > > specification of the attribute doesn't mention the length (only the length
> > > of the TLVs). Please specify the length and any considerations related to
> > > the Extended Length bit. Also, given that this is a new attribute, with an
> > > unspecified potential number of TLVs, and that the length is apparently
> > > unbounded, all leading to the potential need for extended messages, please
> > > specify how to handle peers that cannot accommodate more than 4k octet
> > > messages (rfc8654).
> >
> > GIM>> I've noticed a note from Jeff Haas. Would you agree with his opinion?
>
> Partially. ;-)
>
> He's right about the Extended Length bit.
>
>
> My reference to rfc8654 is because of this text from §5:
>
>    It is RECOMMENDED that BGP protocol developers and implementers are
>    conservative in their application and use of BGP Extended Messages.
>    Future protocol specifications MUST describe how to handle peers that
>    can only accommodate 4,096 octet messages.
>
> As I mentioned before, it concerns me that the attribute (while having
> a narrow use in this document) can have wider applicability.  The
> extensibility is significant by allowing sequential or nested TLVs.
> This combination may result in a large attribute, leading to large
> Updates, especially when considering other MVPN-related
> attribute/communities, etc.
>
>
> Considering the text in rfc8654, the question is: is the attribute
> necessary always, or can it be removed/omitted in some cases?  This is
> not an attribute that is necessary for route selection, for example.
> Given that rfc4271 says this:
>
>    If, due to the limits on the maximum size of an UPDATE message (see
>    Section 4), a single route doesn't fit into the message, the BGP
>    speaker MUST not advertise the route to its peers and MAY choose to
>    log an error locally.
>
> ...it may be possible that by not propagating the new attribute the
> size of the update will be reduced so that the advertisement can be
> made.  OTOH, if the attribute is removed then the tunnel can't be
> monitored.
>
> All I want is for you to consider any potential implications of the
> new attribute.  The conclusion may be that no further change is
> needed.
>
GIM2>> Thank you for further clarifying your comment. We are not defining
any extensions to the BFD Discriminator attribute, and it may not be
possible to imagine all the potential extensions that could be proposed in
the future. It seems more appropriate to review the impact of a specific
extension of the BFD Discriminator attribute when it is proposed. At that
time, we'll need to consider that excluding the BFD Discriminator attribute
from an update will be interpreted as the removal of the corresponding p2mp
BFD session. That would not cause a switchover but would just make MVPN
failover rely on the convergence of the BGP control plane.

>
>
>
>
> ...
> > > (17) §3.1.6.1: "MUST use its IP address as the source IP address" Which
> > > address? Please be specific.
> >
> > GIM>> Section 3.1.6.1 includes the list that an Upstream PE is required to
> > follow: To enable downstream PEs to track the P-tunnel status using a
> > point-to-multipoint (P2MP) BFD session the Upstream PE:
> > ....
> > o MUST use its IP address as the source IP address when transmitting
> > BFD Control packets;
> > Would adding the reference to the Upstream PE address your concern:
> > o MUST use its, i.e., the Upstream PE's, IP address as the source IP address
> > when transmitting BFD Control packets;
>
> No -- the question is which address of the Upstream PE?  I'm assuming
> that an upstream PE can have several addresses, which one is to be
> used?  I can guess it is the address of the interface used to reach
> the destination (or something like that)...but I would just want it to
> be clear.
>
GIM2>> Thank you, I now better understand the question. It is the PE
address in the given VRF. I propose the following update:
NEW TEXT:
   o  MUST use its PE address as the source IP address when transmitting
      BFD Control packets;
Also, I think that the text in the next section that defines the process on
a downstream PE needs clarification. I propose to update text as follows:
OLD TEXT:
   o  MUST use the source IP address of the BFD Control packet, the
      value of the BFD Discriminator field, and the x-PMSI Tunnel
      Identifier [RFC6514] the BFD Control packet was received on to
      properly demultiplex BFD sessions.
NEW TEXT:
   o  MUST use the following to properly demultiplex BFD sessions:

         the address of the PE that included the BFD Discriminator
         attribute in the x-PMSI A-D Route;

         the value of the BFD Discriminator field in the BFD
         Discriminator attribute;

         the x-PMSI Tunnel Identifier [RFC6514] the BFD Control packet
         was received on.
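
To show why all three values are needed, here is a minimal Python sketch of
a downstream PE's session table keyed by that triple. This is an illustration
under stated assumptions, not text for the draft: SessionKey and SessionTable
are hypothetical names, addresses come from the documentation range, and the
x-PMSI Tunnel Identifier is treated as an opaque string.

```python
from typing import Dict, NamedTuple, Optional


class SessionKey(NamedTuple):
    upstream_pe: str    # address of the PE that attached the BFD Discriminator attribute
    discriminator: int  # value of the BFD Discriminator field in that attribute
    tunnel_id: str      # x-PMSI Tunnel Identifier, opaque in this sketch


class SessionTable:
    """Maps the demultiplexing triple to a p2mp BFD session on a downstream PE."""

    def __init__(self) -> None:
        self._sessions: Dict[SessionKey, str] = {}

    def add(self, key: SessionKey, session_name: str) -> None:
        if key in self._sessions:
            raise ValueError(f"duplicate session key: {key}")
        self._sessions[key] = session_name

    def lookup(self, key: SessionKey) -> Optional[str]:
        # Returns None when no session matches the received packet.
        return self._sessions.get(key)


table = SessionTable()
# Two Upstream PEs independently picked the same discriminator value.
table.add(SessionKey("192.0.2.1", 100, "tunnel-1"), "session-from-pe1")
table.add(SessionKey("192.0.2.2", 100, "tunnel-2"), "session-from-pe2")
print(table.lookup(SessionKey("192.0.2.2", 100, "tunnel-2")))  # session-from-pe2
```

Because each Upstream PE assigns its own discriminator, the discriminator
alone cannot identify a session; the PE address and the tunnel identifier
complete the key.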

>
>
>
>
> > > (18) §3.1.6.2: If the IP address doesn't map correctly at the downstream
> > > PE (for example, a different local address is used that doesn't correspond
> > > to the information in the PMSI attribute), what action should it take? Can
> > > the tunnel still be monitored?
> >
> > GIM>> There's a possibility that the same downstream PE is monitoring
> more
> > than one P-tunnel. Since each Upstream PE assigns its own BFD
> Discriminator,
> > there's a chance that the same value is picked by more than one Upstream
> PE.
> > According to Section 5.7 of the RFC 8562:
> > IP and MPLS multipoint tails MUST demultiplex BFD packets based on a
> > combination of the source address, My Discriminator, and the identity
> > of the multipoint path that the multipoint BFD Control packet was
> > received from. Together they uniquely identify the head of the
> > multipoint path.
> >
> > We may consider adding the source address in the BFD Discriminator
> attribute
> > as an optional TLV. I think that might be a good extension that can be
> > introduced in a new document.
>
> Why wait for a new document?  You made a pretty good case for
> signaling the source address.
>
GIM2>> I'd like to defer this question to our AD and BESS WG Chairs.

>
>
>
> ...
> > > (22) §4: "Such behavior is referred to as "revertive" behavior and
> MUST be
> > > supported." The text around this sentence seems to indicate that the
> > > revertive behavior is the default, is that the intent? Or if the intent
> > > for it just to be supported (as written)? Please be clear.
> >
> > GIM>> This part of the document has been updated in the course of
> addressing
> > IESG comments:
> > Such behavior is referred to as "revertive" behavior and MUST be
> supported.
> > Non-revertive behavior refers to the behavior of continuing to select the
> > backup PE as the UMH even after the Primary has come up. This
> non-revertive
> > behavior MAY also be supported by an implementation and would be enabled
> > through some configuration. Selection of the behavior, revertive or
> > non-revertive, is an operational issue, but it MUST be consistent on all
> PEs
> > in the given MVPN.
> >
> > Do you find the updated text clear enough?
>
> Yes.  But it now brings up a new question: can you provide operational
> guidance on the selection of this behavior?
>
GIM2>> I think that the decision to prefer revertive behavior is based on
the resource allocation model for primary and standby P-tunnels. In cases
where the switchover to the standby tunnel does not affect other services
and provides the required quality of service, an operator might use
non-revertive behavior to avoid a switchover that is unnecessary in such
a case and thus minimize disruption to the multicast service. I propose
adding that as follows:
NEW TEXT:
While revertive is considered the default
   behavior, there might be cases where the switchover to the standby
   tunnel does not affect other services and provides the required
   quality of service.  An operator might use non-revertive behavior to
   avoid a switchover that is unnecessary in such a case and thus
   minimize disruption to the multicast service.
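
To make the difference between the two behaviors concrete, here is a minimal
sketch of the UMH choice on a downstream PE. It is not from the draft: the
function name, the "primary"/"standby" labels, and the boolean liveness
inputs are assumptions made for the example.

```python
from typing import Optional


def select_umh(current: Optional[str], primary_up: bool,
               standby_up: bool, revertive: bool) -> Optional[str]:
    """Return "primary", "standby", or None (no usable Upstream PE)."""
    if current == "standby" and standby_up:
        # Already on the standby. Revertive behavior returns to the
        # Primary once it is up again; non-revertive stays put.
        if revertive and primary_up:
            return "primary"
        return "standby"
    # Otherwise prefer the Primary and fall back to the standby.
    if primary_up:
        return "primary"
    if standby_up:
        return "standby"
    return None
```

For example, after a failover to the standby, select_umh("standby", True,
True, revertive=False) keeps the standby, while the same call with
revertive=True moves back to the Primary.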

>
>
>
> > > (23) §4.1: "...routes that carry the "Standby PE" BGP Community MUST
> have
> > > the LOCAL_PREF attribute set to zero." What should a receiver do if the
> > > LOCAL_PREF is not zero?
> >
> > GIM>> I believe that the preceding text describes the situation when the
> > LOCAL_PREF != 0:
> > ... two different downstream PEs
> > consider different Upstream PEs to be the primary one. In that case,
> > without any precaution taken, both Upstream PEs would process a
> > standby C-multicast route and possibly stop forwarding at the same
> > time.
>
> No, it doesn't.  The text specifies that the update "MUST have the
> LOCAL_PREF attribute set to zero".  The question is not about the
> potential effect, but about the value being anything else (> 0).  Note
> that the value can be low enough (> 0) and no adverse effect may take
> place, but the text doesn't specify a "low value", it explicitly
> specifies 0.
>
GIM2>> I think I see it better now. Would the following text work:
NEW TEXT:
   For this purpose, routes that carry the Standby PE BGP
   Community MUST have the LOCAL_PREF attribute set to a value lower
   than the value of the LOCAL_PREF attribute of the route that does
   not carry the Standby PE BGP Community.  The value of zero is
   RECOMMENDED.
Also, the text regarding setting the LOCAL_PREF after the switchover may,
in my view, need an update so that it no longer requires the value to be zero:
NEW TEXT:
   The new Upstream PE MUST set the LOCAL_PREF attribute
   for that C-multicast route to the same value as when the Standby PE
   BGP Community was included in the advertisement.
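
A small sketch of how I read the two proposed rules, for illustration only.
The CMulticastRoute type and both helper functions are hypothetical and are
not part of any BGP implementation or of the draft.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CMulticastRoute:
    local_pref: int
    standby_community: bool  # True if the Standby PE BGP Community is attached


def standby_local_pref_is_valid(standby: CMulticastRoute,
                                primary: CMulticastRoute) -> bool:
    # Rule 1: the route with the Standby PE BGP Community carries a
    # LOCAL_PREF lower than the route without the community.
    return (standby.standby_community
            and not primary.standby_community
            and standby.local_pref < primary.local_pref)


def promote(standby: CMulticastRoute) -> CMulticastRoute:
    # Rule 2: after the switchover the community is dropped, and the
    # LOCAL_PREF keeps the value it had while the community was present.
    return replace(standby, standby_community=False)


primary = CMulticastRoute(local_pref=100, standby_community=False)
standby = CMulticastRoute(local_pref=0, standby_community=True)  # zero is RECOMMENDED
promoted = promote(standby)
```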


>
>
>
> > > (24) §4.1: In the last paragraph of this section, if I follow
> correctly,
> > > the text talks about the case where the standby becomes the primary and
> > > the updated advertisement doesn't have the Standby PE community. If
> that
> > > is correct, then s/ presence/absence of the Standby PE BGP Community/
> > > absence of the Standby PE BGP Community
> > >
> > > Also, the last sentence says that the "LOCAL_PREF attribute MUST be
> set to
> > > zero". If the community is not present, how can a receiver enforce
> this?
> > > What action should it take if the LOCAL_PREF has a different value?
> >
> > GIM>> Thank you for the suggestion, it clarifies the text. As I
> understand,
> > the requirement is for the Standby Upstream PE, not for a downstream PE.
> Would
> > the following update make that clearer:
> > OLD TEXT:
> > The LOCAL_PREF attribute MUST be set to zero.
> > NEW TEXT:
> > The new Upstream PE MUST set the LOCAL_PREF attribute to
> > zero for that C-multicast route.
>
> I don't know what the difference in the meaning is. :-(
>
> There are three parts to my question:
>
> (1) If the community is not present, then this would be a "normal"
> Update.  How does the receiver know the difference so it can enforce
> the "MUST"?
>
> (2) What should the receiver do if the value is not as specified?
>
> (3) Given that the LOCAL_PREF for Updates *with* the community MUST be
> 0, what's the use of sending Updates *without* the community with the
> same value?
>
GIM2>> Has the change above addressed this comment?