Re: [bess] Benjamin Kaduk's Discuss on draft-ietf-bess-mvpn-fast-failover-13: (with DISCUSS and COMMENT)

Greg Mirsky <gregimirsky@gmail.com> Wed, 16 December 2020 20:18 UTC

Return-Path: <gregimirsky@gmail.com>
X-Original-To: bess@ietfa.amsl.com
Delivered-To: bess@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2316D3A0FC5; Wed, 16 Dec 2020 12:18:23 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -0.696
X-Spam-Level:
X-Spam-Status: No, score=-0.696 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_COMMENT_SAVED_URL=1.391, HTML_MESSAGE=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_HTML_ATTACH=0.01, URIBL_BLOCKED=0.001] autolearn=no autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id XjYvhng2nu6I; Wed, 16 Dec 2020 12:18:09 -0800 (PST)
Received: from mail-lf1-x12e.google.com (mail-lf1-x12e.google.com [IPv6:2a00:1450:4864:20::12e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 622DB3A0F39; Wed, 16 Dec 2020 12:18:08 -0800 (PST)
Received: by mail-lf1-x12e.google.com with SMTP id m12so51649881lfo.7; Wed, 16 Dec 2020 12:18:08 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=PhHhkvNu0E8VTohwXSDTW0u1aAOlyuV2JsprGGJpi+8=; b=X4jTDSOLi7u8SRSs4Ch0/dqmm0jdO6/FtbyfSVWT0alJRifH06iec2OQMe+AcXDTF0 IP6QOi3y1PVtySLbc3mtLawFJCpG5/Gp7zF+u6UshQ7yU8D2ZJFmL4UCcYBihiCBzNpR HSpxrB94J+Vos3ZdsBbEe9HGWCKPlNuTf8G308svcT+jd1Ow8BeaaQELN1OA32uzDfhu 9r7Ha/JvKLbsL372iR/w3bfXk/Couyc+qKMxwrH24iJwVIkoW0NO5+SVYHhmSIk7D5P2 CWUcttHuHPYyyNrWV8zb9PDaKOZlgXnwydB4WEwvQYpQN6UgNNZauq4/kBAKuwfpXQn4 z35g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=PhHhkvNu0E8VTohwXSDTW0u1aAOlyuV2JsprGGJpi+8=; b=EQB1F03TiBirP6npeE9h0rlBZfz7uG4DWAf+tawLYiLp5nI4rAUGiO88hJZ/qW4X2e 4Z7mXuWt/6e3E7UU8AkspO+ky/Rr9iLlhp9Ex6HYKzpDc2XRGIVuwv+lnVzbJpbDHwGM 7iBJZvwiVTvSU9gvr0lJWW2K5iasYW2jrspRr85ocqnoTUuLcV5bGpvV8/J+YNyJMpp3 4t3yEKDD29pbfiojO3Ih5g4cYrFjs+wGljVuTuMT2dd3Mv67M0bD1nZQkmp/geCGqmk7 4ArZUb98i7/v3CtX+PvtxnVbzCkQmtL7Lal4ZRO097V5S3fFR4ZStq9GNncfgOKDcIRl 5L+Q==
X-Gm-Message-State: AOAM531tyThTPKp+gbwZFfrOxhRjaweXQCyMeiSmtNf4nlcrUsO4K5is pwF/HJ/1KF/XtaMFq2kMCnIlZtBOhLedBT5HVYg=
X-Google-Smtp-Source: ABdhPJx2g4JE7NvTfSWOXsf5mvWsaU8gG5y9x49p2PxbjkEjPq4plXyC5l56ZbTw0AZi0nsU73r7oqLRI+OlgfJ67o4=
X-Received: by 2002:a05:6512:3305:: with SMTP id k5mr13119994lfe.35.1608149886202; Wed, 16 Dec 2020 12:18:06 -0800 (PST)
MIME-Version: 1.0
References: <160799346570.23086.9963825503754113381@ietfa.amsl.com>
In-Reply-To: <160799346570.23086.9963825503754113381@ietfa.amsl.com>
From: Greg Mirsky <gregimirsky@gmail.com>
Date: Wed, 16 Dec 2020 12:17:55 -0800
Message-ID: <CA+RyBmUMjPkJ836CUAGN8eevWhS8BEdiFn7WzgAtu3zseUf8Uw@mail.gmail.com>
To: Benjamin Kaduk <kaduk@mit.edu>
Cc: The IESG <iesg@ietf.org>, draft-ietf-bess-mvpn-fast-failover@ietf.org, bess-chairs@ietf.org, BESS <bess@ietf.org>, Stephane Litkowski <slitkows.ietf@gmail.com>
Content-Type: multipart/mixed; boundary="000000000000b6a4ab05b69a9222"
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/AZRM4-B2RkviBSiiYsRD3AOFdDE>
Subject: Re: [bess] Benjamin Kaduk's Discuss on draft-ietf-bess-mvpn-fast-failover-13: (with DISCUSS and COMMENT)
X-BeenThere: bess@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: BGP-Enabled ServiceS working group discussion list <bess.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/bess>, <mailto:bess-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/bess/>
List-Post: <mailto:bess@ietf.org>
List-Help: <mailto:bess-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/bess>, <mailto:bess-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 16 Dec 2020 20:18:24 -0000

Hi Ben,
thank you for the review, and your detailed comments and direct questions.
Please find my answers and proposed updates in-lined below under the GIM>>
tag.
Attached, please find, the diff highlighting the updates and the new
working version of the draft.

Regards,
Greg

On Mon, Dec 14, 2020 at 4:51 PM Benjamin Kaduk via Datatracker <
noreply@ietf.org> wrote:

> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-bess-mvpn-fast-failover-13: Discuss
>
> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)
>
>
> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.
>
>
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-bess-mvpn-fast-failover/
>
>
>
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
>
> Let's talk about what the requirements are for consistency across PEs in
> the algorithm for selecting the Primary Upstream PE.  Section 4 notes
> that "all the PEs of that MVPN [are required] to follow the same UMH
> selection procedure", but leaves the option of non-revertive behavior as
> something that "MAY also be supported by an implementation", without
> requirement for consistency across all PEs.  It seems to me that if some
> PEs use non-revertive behavior and others do not, then they will
> disagree as to which PE is the Primary (or active) PE in some cases,
> which seems to conflict with the initial guidance that all PEs needed to
> pick the same one.  Is it perhaps that the PEs need to agree on which PE
> is to be advertised as Primary but not necessarily to actually be using
> that one for traffic?  Or am I missing something?
>
GIM>> Thank you for pointing out this inconsistency. I agree that the text
needs some tightening. Below is the proposed update:
OLD TEXT:
   Such behavior is referred to as
   "revertive" behavior and MUST be supported.  Non-revertive behavior
   refers to the behavior of continuing to select the backup PE as the
   UMH even after the Primary has come up.  This non-revertive behavior
   MAY also be supported by an implementation and would be enabled
   through some configuration.
NEW TEXT:
Such behavior is referred to as
   "revertive" behavior and MUST be supported.  Non-revertive behavior
   refers to the behavior of continuing to select the backup PE as the
   UMH even after the Primary has come up.  This non-revertive behavior
   MAY also be supported by an implementation and would be enabled
   through some configuration.  Selection of the behavior, revertive or
   non-revertive, is an operational issue, but it MUST be consistent on
   all PEs in the given MVPN.

>
>
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
>
> Section 1
>
>    Section 3 describes local procedures allowing an egress PE (a PE
>    connected to a receiver site) to take into account the status of
>    P-tunnels to determine the Upstream Multicast Hop (UMH) for a given
>    (C-S, C-G).  [...]
>
> Does it also apply to (C-*, C-G)?  (I'll just mention it once, but the
> handling seems to be somewhat inconsistent throughout the document, with
> (C-*,C-G) getting mentioned sometimes but not always, and no pattern
> obvious to me for when it is or is not included.  I think I see some
> instances where (C-*, C-G) does not make sense, so it would probably not
> be a universal replacement.)
>
GIM>> Yes, it cannot be used interchangeably. We've followed the notation
as defined in the last paragraph of Section 3.1 RFC 6513:
   ... C-group address would be a group address in a
   VPN's address space.  A C-tree is a multicast distribution tree
   constructed and maintained by the PIM C-instances.  A C-flow is a
   stream of multicast packets with a common C-source address and a
   common C-group address.  We will use the notation "(C-S,C-G)" to
   identify specific C-flows.  If a particular C-tree is a shared tree
   (whether unidirectional or bidirectional) rather than a source-
   specific tree, we will sometimes speak of the entire set of flows
   traveling that tree, identifying the set as "(C-*,C-G)".
It is my understanding, that the one reference to (C-*,C-G) in the draft is
in Section 3 bullet B:
          The S-PMSI can be advertised only after the
          Upstream PE receives a C-multicast route for (C-S, C-G)/(C-*,
          C-G) to be carried over the advertised S-PMSI.
The reference is not an introduced text but an informational summary of
Section 5.1 RFC 6513. The text preceding that reference is intended to
clarify its status:
  There are three options specified in Section 5.1 of [RFC6513] for a
   downstream PE to select an Upstream PE.
Would you suggest an additional text?

>
>    Section 5 describes a "hot leaf standby" mechanism that can be used
>    to improve failover time in MVPN.  The approach combines mechanisms
>    defined in Section 3 and Section 4 has similarities with the solution
>    described in [RFC7431] to improve failover times when PIM routing is
>    used in a network given some topology and metric constraints.
>
> nit: grammar issue around "has similarities with" (maybe needs a leading
> "and"?)
>
GIM>> Thank you. Updated to:
NEW TEXT:
   Section 5 describes a "hot leaf standby" mechanism that can be used
   to improve failover time in MVPN.  The approach combines mechanisms
   defined in Section 3 and Section 4, and has similarities with the
   solution described in [RFC7431] to improve failover times when PIM
   routing is used in a network given some topology and metric
   constraints.
>
>
>    VPNs.  An operator would enable these mechanisms using a method
>    discussed in Section 3 in combination with the redundancy provided by
>    a standby PE connected to the source of the multicast flow, and it is
>    assumed that all PEs in the network would support these mechanisms
>    for the procedures to work.  In the case that a BGP implementation
>
> Is it a matter of "the procedure will not work at all unless all PEs in
> the network support it", or "only the PEs that support it will get the
> benefits of it"?  [The next sentence suggests an anwer...]
>
GIM>> The sentence might be too long. Yes, only PEs that support the new
Standby PE community and use any of UMH monitoring methods would converge
faster than PEs that don't support both features. Would the re-wording as
below make the text clear:
NEW TEXT:
   An operator would enable these mechanisms using a method
   discussed in Section 3 combined with the redundancy provided by a
   standby PE connected to the multicast flow source.  PEs that support
   these mechanisms would converge faster and thus provide a more stable
   multicast service.

>
> Section 3
>
>    Section 9.1.1 of [RFC6513] are applicable when using I-PMSI
>    P-tunnels.  That document is a foundation for this document, and its
>    processes all apply here.  Section 9.1.1 mandates the use of specific
>    procedures for sending intra-AS I-PMSI A-D Routes.
>
> (nit) the second "Section 9.1.1" is also referring to RFC 6513, not this
> document, which would be the default interpretation of a bare section
> reference.


> (not-nit) The referenced procedure seems to be about processing, not
> sending, intra-AS I-PMSI A-D routes.  Am I misreading something?
>
GIM>> You are right. The sentence mischaracterizes Section 9.1.1 and has no
informational value. Removed it altogether.

>
> Section 3.1
>
>    Different factors can be considered to determine the "status" of a
>    P-tunnel and are described in the following sub-sections.  The
>    optional procedures described in this section also handle the case
>    the downstream PEs do not all apply the same rules to define what the
>    status of a P-tunnel is (please see Section 6), and some of them will
>    produce a result that may be different for different downstream PEs.
>
> nit: I think it's better to put a word like "where" in "the case the
> downtream PEs".
>
GIM>> I've tried it like the following:
NEW TEXT:
   The
   optional procedures described in this section also handle the case
   when the downstream PEs do not all apply the same rules to define
   what the status of a P-tunnel is (please see Section 6), and some of
   them will produce a result that may be different for different
   downstream PEs.

>
> Section 3.1.3
>
>    corresponding P-tunnel MUST be re-evaluated.  If the P-tunnel
>    transitions from Up to Down state, the Upstream PE that is the
>    ingress of the P-tunnel MUST NOT be considered a valid UMH.
>
> (nit?) I'm not sure how much precedent there is for using "valid" in
> this context -- IIUC the previous discussion of this process referred
> only to whether a PE is a candidate for being the UMH.
>
GIM>> I agree, "candidate" is missing here. Proposed update:
NEW TEXT:
   If the P-tunnel
   transitions from Up to Down state, the Upstream PE that is the
   ingress of the P-tunnel MUST NOT be considered as a valid candidate
   UMH.

>
> Section 3.1.5
>
>    When such a procedure is used, in the context where fast restoration
>    mechanisms are used for the P-tunnels, a configurable timer MUST be
>    set on the downstream PE to wait before updating the UMH, to let the
>    P-tunnel restoration mechanism to execute its actions.  An
>    implementation SHOULD use three seconds as the default value for this
>    timer.
>
> How does this interact with the value of the maximum inter-packet time?
> Suppose that I know to expect at least one packet every ten seconds.  Do
> I wait ten seconds after receiving the last packet and then another
> three seconds, before engaging in an UMH change?
>
GIM>> This scenario is similar to the use of an active OAM detecting a
network failure. The role of the timer to trigger an action if a certain
number of packets have not arrived. Since the maximum inter-packet time is
known, a downstream PE has an expectation of receiving a packet within a
time interval large than the maximum inter-packet interval. In practice,
the timer could be set three times the maximum inter-packet interval, so
that it expires if three consecutive packets were not received. In the case
you've described, I think that the timer must be set larger than 10
seconds, probably 30+ seconds. Would you suggest any additional text here?

>
>    In cases where this mechanism is used in conjunction with the method
>    described in Section 5, no prior knowledge of the rate of the
>    multicast streams is required; downstream PEs can compare reception
>    on the two P-tunnels to determine when one of them is down.
>
> This feels a little underspecified; is there a reference or more
> guidance that we could give about turning a stream of received packets
> on one tunnel into a maximum inter-packet time on another tunnel,
> supposedly carrying the same traffic?
>
GIM>> I think that this text refers to 1+1 protection, i.e., Active-Active
P-tunnels. In that scenario, the determination of the P-tunnel's state can
be done by comparing it to the reception state of the other P-tunnel in the
redundancy group. But there might be corner cases, like a significant delay
in one of P-tunnels, that may need consideration before recommending this
method. I don't think that it is in the scope of the document. Would
appending the paragraph with "The detailed specification of this mechanism
is outside the scope of this document" be acceptable?

>
> Section 3.1.6
>
>       *  one octet-long field of TLV's Type value (Section 7.3)
>
>       *  one octet-long field of the length of the Value field in octets
>
>       *  variable length Value field.
>
>       The length of a TLV MUST be multiple of four octets.
>
> I assume this is the total length, not the value in the length field?
>
GIM>> Correct. Would the following update make it clearer?
NEW TEXT:
      Figure 2 presents the Optional TLV format TLV that
      consists of:

      *  Type - a one-octet-long field that characterizes the
         interpretation of the Value field (Section 7.3)

      *  Length - a one-octet-long field equal to the length of the
         Value field in octets

      *  Value - a variable-length field.

      The length of a TLV as a whole MUST be multiple of four octets.

>
>    The BFD Discriminator attribute MUST be considered malformed if its
>    length is not a non-zero multiple of four.  If the attribute
>    considered malformed, the UPDATE message SHALL be handled using the
>    approach of Attribute Discard per [RFC7606].
>
> nit: s/attribute considered/attribute is considered/
>
GIM>> Thank you! Updated text to:
NEW TEXT:
   The BFD Discriminator attribute MUST be considered malformed if its
   length is not a non-zero multiple of four.  If the attribute is
   deemed to be malformed, the UPDATE message SHALL be handled using the
   approach of Attribute Discard per [RFC7606].

>
> Section 3.1.6.1
>
>    o  MUST periodically transmit BFD Control packets over the x-PMSI
>       P-tunnel after the P-tunnel is considered established.  Note that
>       the methods to declare a P-tunnel has been established are outside
>       the scope of this specification.
>
> Is there a good reference for how to choose the period of transmission?
>
GIM>> Not really. There are many factors that an operator should consider
when configuring the frequency of BFD packets on the MultipointHead system.
One of the aspects to keep in mind is that unlike p2p BFD, there's no
interval negotiation phase in p2mp BFD. As a result, a tail has no
influence over the interval at which the head of the p2mp BFD session
transmits BFD Control messages.

>
>    If the tracking of the P-tunnel by using a P2MP BFD session is
>    enabled after the x-PMSI A-D Route has been already advertised, the
>    x-PMSI A-D Route MUST be re-sent with precisely the same attributes
>    as before and the BFD Discriminator attribute included.
>
> Pedantically, it seems like "precisely the same attributes as before"
> is incompatible with adding the BFD Discriminator attribute.  Phrasing
> that discusses "the only change between the previous advertisement and
> the new advertisement" would not suffer from such a potential issue.
> (And similarly for when the BFD Discriminator attribute is to be
> removed, a couple paragraphs later.)
>
GIM>> Great, thank you. Applied in both cases:
NEW TEXT:
   If the tracking of the P-tunnel by using a P2MP BFD session is
   enabled after the x-PMSI A-D Route has been already advertised, the
   x-PMSI A-D Route MUST be re-sent with the only change between the
   previous advertisement and the new advertisement to be the inclusion
   of the BFD Discriminator attribute.
and
   o  x-PMSI A-D Route MUST be re-sent with the only change between the
      previous advertisement and the new advertisement be the exclusion
      of the BFD Discriminator attribute;

>
> Section 3.1.6.2
>
>    o  MUST use the source IP address of the BFD Control packet, the
>       value of the BFD Discriminator field, and the x-PMSI Tunnel
>       Identifier [RFC6514] the BFD Control packet was received to
>       properly demultiplex BFD sessions.
>
> nit: missing word around "the BFD Control packet was received" (maybe
> "received on/in"?).
>
GIM>> "on" seems the better option. Updated accordingly.

>
>    According to [RFC8562], if the downstream PE receives Down or
>    AdminDown in the State field of the BFD Control packet or associated
>    with the BFD session Detection Timer expires, the BFD session is
>
> nit: "the BFD Detection Timer associated with the BFD session expires"
>
GIM>> Thank you for the helpful suggestion. Updated.

>
>    PE, while others are considered as Standby Upstream PEs.  In such a
>    scenario, when the P-tunnel is considered down, the downstream PE MAY
>    initiate a switchover of the traffic from the Primary Upstream PE to
>    the Standby Upstream PE only if the Standby Upstream PE is deemed
>    available.
>
> I'm not sure that we've defined what it means for an Upstream PE to be
> deemed "available', yet.  I guess it's possible that there is not an
> established P-Tunnel between the (selected) Standby Upstream PE and the
> donstream PE, so just using the Up/Down/not-known-to-be-Down status of
> that P-tunnel is not an option...
>
GIM>> The wording is sloppy, agree. I think that the intention was to say
"deemed in the Up state". That can be determined using the p2mp BFD session
with Standby Upstream PE acting as its MultipointHead. The proposed update
is as follows:
NEW TEXT:
   In such a scenario, when the P-tunnel is considered
   down, the downstream PE MAY initiate a switchover of the traffic from
   the Primary Upstream PE to the Standby Upstream PE only if the
   Standby Upstream PE is deemed to be in the Up state.  That MAY be
   determined from the state of a P2MP BFD session with the Standby
   Upstream PE as the MultipointHead.

>
>    If the downstream PE's P-tunnel is already established when the
>    downstream PE receives the new x-PMSI A-D Route with BFD
>    Discriminator attribute, the downstream PE MUST associate the value
>    of BFD Discriminator field with the P-tunnel and follow procedures
>    listed above in this section if and only if the x-PMSI A-D Route was
>    properly processed as per [RFC6514], and the BFD Discriminator
>    attribute was validated.
>
> We did not discuss any validation of the BFD Discriminator attribute in
> §3.1.6; what procedures would this process entail?
>
GIM>> There's, so far, only one validation condition:

The length of a TLV as a whole MUST be multiple of four octets.


> Section 4
>
>    The procedures described below are limited to the case where the site
>    that contains C-S is connected to two or more PEs, though, to
>    simplify the description, the case of dual-homing is described.  The
>
> I suggest giving at least some considerations to how to choose between
> multiple standby Upstream PEs when there are more than one available.
>
GIM>> I understand your idea but that might be a whole new specification
similar to the selection of UMH from the list of candidates. Perhaps
stating that the selection might use known methods but the specifics are
outside the scope of this document be acceptable? For example (sorry for a
longer quote):
NEW TEXT:
   The procedures described below are limited to the case where the site
   that contains C-S is connected to two or more PEs, though, to
   simplify the description, the case of dual-homing is described.  In
   the case where more than two PEs are connected to the C-s site,
   selection of the Standby PE can be performed using one of the methods
   of selecting a UMH.  Details of the selection are outside the scope
   of this document.  The procedures require all the PEs of that MVPN to
   follow the same UMH selection procedure, as specified in [RFC6513],
   whether the PE selected based on its IP address, hashing algorithm
   described in section 5.1.3 of [RFC6513], or Installed UMH Route.  The
   procedures assume that if a site of a given MVPN that contains C-S is
   dual-homed to two PEs, then all the other sites of that MVPN would
   have two unicast VPN routes (VPN-IPv4 or VPN-IPv6) to C-S, each with
   its RD.

>
>    procedures require all the PEs of that MVPN to follow the same UMH
>    selection procedure, as specified in [RFC6513], whether the PE
>    selected based on its IP address, hashing algorithm described in
>    section 5.1.3 of [RFC6513], or Installed UMH Route.  The procedures
>
> I assume that how the PEs agree on which procedure is in use does not
> involve something being advertised in-band, and is out of scope for this
> document.  But please say so!
>
GIM>> You are right, that is an operational issue and the management
plane's responsibility.
NEW TEXT:
The procedures require all the PEs of that MVPN to
   follow the same UMH selection procedure, as specified in [RFC6513],
   whether the PE selected based on its IP address, the hashing
   algorithm described in section 5.1.3 of [RFC6513], or Installed UMH
   Route.  The consistency of the UMH selection method used among all
   PEs is expected to be provided by the management plane.

>
>    assume that if a site of a given MVPN that contains C-S is dual-homed
>    to two PEs, then all the other sites of that MVPN would have two
>    unicast VPN routes (VPN-IPv4 or VPN-IPv6) to C-S, each with its RD.
>
> nit: s/its RD/its own RD/
>
GIM>> Ack

> Also, please confirm that the unicast routes are *to* C-S, vs *from* it.
>
GIM>> Though it might be somewhat counterintuitive, in the context of MVPN
"to" is correct.

>
> Section 4.1
>
>    o  the NLRI is constructed as the C-multicast route with an RT that
>       identifies the Primary Upstream PE, except that the RD is the same
>       as if the C-multicast route was built using the Standby Upstream
>       PE as the UMH (it will carry the RD associated to the unicast VPN
>       route advertised by the Standby Upstream PE for S and a Route
>       Target derived from the Standby Upstream PE's UMH route's VRF RT
>       Import EC);
>
> This part is a bit confusing to me, since the first part says that the
> RT identifies the Primary Upstream PE, but the second part says that the
> RT is derived from the Standy Upstream PE's [stuff].  But I'm happy to
> trust you that the [stuff] makes it correct!
>
GIM>> Thank you for putting your trust in our collective thinking. AFAIK,
it works.

>
> Section 4.2
>
>       when the PE determines (the use of the particular method to detect
>       the failure is outside the scope of this document) that C-S is not
>       reachable through some other PE, the PE SHOULD install VRF PIM
>
> It seems like a forward reference to §4.3 might be helpful.
>
GIM>> Thank you for your suggestion, the reference is added in the working
version.

>
>    Section 9.3.2 of [RFC6514], describes the procedures of sending a
>    Source-Active A-D Route as a result of receiving the C-multicast
>    route.  These procedures MUST be followed for both the normal and
>    Standby C-multicast routes.
>
> There is no section 9.3.2 in RFC 6514.  There is a 9.2.3 that looks
> perhaps plausible, though the string "Source-Active" does not appear in
> it.
>
GIM>> Great catch, thank you! I believe that the correct section is in RFC
6513, not RFC 6514. The former opens with:
   The issue described in Section 9.3.1 is resolved through the use of
   Source Active A-D routes.  In the remainder this section, we provide
   an example of how this works, along with an informal description of
   the procedures.
Would you agree RFC 6513 makes sense?

>
> Section 4.4.2
>
>    Source AS carried in the C-multicast route.  If the match is found,
>    and the C-multicast route carries the Standby PE BGP Community, then
>    the ASBR MUST perform as follows:
>
> (I assume that there is room for local policy to modify this "MUST",
> e.g., if needed to protect against some form of attack ... perhaps it
> even goes without saying.)
>
GIM>> Indeed. Perhaps the following change makes it more accurate:
NEW TEXT:
   If the match is found,
   and the C-multicast route carries the Standby PE BGP Community, then
   the ASBR implementation that supports this specification MUST be
   configurable to perform as follows:

   o  if the route was received over iBGP and its LOCAL_PREF attribute
      is set to zero, then it MUST be re-advertised in eBGP with a MED
      attribute (MULTI_EXIT_DISC) set to the highest possible value
      (0xffff)

   o  if the route was received over eBGP and its MED attribute set to
      0xffff, then it MUST be re-advertised in iBGP with a LOCAL_PREF
      attribute set to zero

   Other ASBR procedures are applied without modification and, when
   applied, MAY modify the above-listed behavior.

>
> Section 5
>
>    o  Upstream PEs use the "hot standby" optional behavior and thus will
>       forward traffic for a given multicast state as soon as they have
>       whether a (primary) BGP C-multicast route or a Standby BGP
>       C-multicast route for that state (or both)
>
> nit: the grammar is a bit weird here, after "as soon as they have"; I'm
> not confident that I could make an accurate suggestion for a fix.
>
GIM>> Would with a minor update it all reads better:
NEW TEXT:
      Upstream PEs use the "hot standby" optional behavior and thus will
      start forwarding traffic for a given multicast state after they
      have whether a (primary) BGP C-multicast route or a Standby BGP
      C-multicast route for that state (or both)

>
> Section 6
>
> I could almost see the discussion of duplicate packets as being a
> subsection of the security considerations, though I don't mind leaving
> it as-is.
>
GIM>> Thank you for agreeing.

>
> Section 8
>
> We could perhaps make some pro forma note that the BFD Discriminator
> attribute, like all BGP attributes, typically does not benefit from
> cryptographic integrity protection and thus could be spoofed so as to be
> different than what is actually used by the multipoint BFD head.  That
> said, I'm willing to let this fall under the incorporated-by-reference
> BGP security considerations.
>
GIM>> Thank you.

>
> Is it worth noting that operating in "hot" standby mode will increase
> the general level of traffic on the VPN and thus susceptibility to DoS?
>
GIM>> We use hot standby in the control plane only. That would add some BGP
traffic but would not as much as 1+1 protection in the data plane. I think
that the amount of the additional load in the VPN with the "hot standby"
defined in the draft unlikely to make PEs more volnurable to DoS. What do
you think?

>
>    This document uses P2MP BFD, as defined in [RFC8562], which, in turn,
>    is based on [RFC5880].  Security considerations relevant to each
>    protocol are discussed in the respective protocol specifications.  An
>    implementation that supports this specification MUST use a mechanism
>    to control the maximum number of P2MP BFD sessions that can be active
>    at the same time.
>
> What is the objective that this control is designed to achieve?  I can
> "control the maximum number of sessions" by asserting the maximum number
> to be an absurdly large value, but I don't think that would meet the
> spirit of this requirement (it does meet the letter of the requirement).
>
GIM>> Though this recommendation may look as too vague, I think it is
helpful to a developer. I imagine, as we've discussed in regard to the
selection of the interval between BFD Control packets, an operator will
consider the overall load of BFD Control packets across all active BFD
sessions. Do you think that a sentence that connects the number of p2mp BFD
sessions and the rate of BFD Control packets be helpful in this context?

>
>    The methods described in Section 3.1 may produce false-negative state
>    changes that can be the trigger for an unnecessary convergence in the
>    control plane, ultimately negatively impacting the multicast service
>    provided by the VPN.  An operator is expected to consider the network
>    environment and use available controls of the mechanism used to
>    determine the status of a P-tunnel.
>
> We mentioned earlier (e.g., in §3.1) that similar negative effects can
> occur when resiliency mechanisms at different layers interact; that
> might be worth repeating here.
>
GIM>> One of such references is in Section 3.1.2:
   In many cases, it is not practical to use both protection
   methods at the same time because uncorrelated timers might cause
   unnecessary switchovers and destabilize the network.
Thus we referred to Section 3.1 as the encompassing reference to all
possible scenarios. Would you agree with that?