Re: [bess] Alvaro Retana's Discuss on draft-ietf-bess-mvpn-fast-failover-13: (with DISCUSS and COMMENT)

Greg Mirsky <gregimirsky@gmail.com> Wed, 23 December 2020 18:51 UTC

From: Greg Mirsky <gregimirsky@gmail.com>
Date: Wed, 23 Dec 2020 10:51:01 -0800
Message-ID: <CA+RyBmW4X9PROzZWOpEDtxGyg6p+h8dL_gKLXfSTL0Ts=+RARw@mail.gmail.com>
To: Alvaro Retana <aretana.ietf@gmail.com>
Cc: "bfd-chairs@ietf.org" <bfd-chairs@ietf.org>, Stephane Litkowski <slitkows.ietf@gmail.com>, "draft-ietf-bess-mvpn-fast-failover@ietf.org" <draft-ietf-bess-mvpn-fast-failover@ietf.org>, "bess-chairs@ietf.org" <bess-chairs@ietf.org>, The IESG <iesg@ietf.org>, "bess@ietf.org" <bess@ietf.org>
Content-Type: multipart/mixed; boundary="000000000000dbaffc05b7262cf7"
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/KKPiNvL2Cg4UYUWKal4in4spk8o>
Subject: Re: [bess] Alvaro Retana's Discuss on draft-ietf-bess-mvpn-fast-failover-13: (with DISCUSS and COMMENT)

Hi Alvaro,
happy Holidays to you and everyone you care about!

Thank you for the great discussion and thought-stimulating clues. I've
added more follow-up notes in-line, now tagged GIM3>>.

Regards,
Greg

On Wed, Dec 23, 2020 at 8:16 AM Alvaro Retana <aretana.ietf@gmail.com>
wrote:

> On December 22, 2020 at 10:59:19 PM, Greg Mirsky wrote:
>
>
> Greg:
>
> Hi!
>
> I think we have converged on the first two DISCUSS points.  On the
> third one, I do want to give bfd/idr an opportunity to comment -- I
> agree with you that we should wait for Martin (I believe he's out for
> the rest of the year).
>
GIM3>> I'm glad you feel we've converged on the two DISCUSSes.

>
> There are just a couple of comments left.
>
> Thanks!!
>
> Happy Holidays!
>
> Alvaro.
>
>
>
> > > > > ----------------------------------------------------------------------
> > > > > DISCUSS:
> > > > > ----------------------------------------------------------------------
> > > > >
> > > > > (1) ...
> > > > >
> > > Your initial statement that "it might be not that simple to compare"
> > > is precisely the reason I think that operational guidance is needed.
> >
> > GIM2>> I propose a new sub-section to summarize the benefits and
> > challenges of methods described in Section 3.1: 3.1.8. Operational
> > Considerations for Monitoring P-Tunnel's Status
>
> This is a good start.
>
> There are some nits/grammar enhancements that I'll leave for now.  A
> couple of other comments/questions in-line.
>
>
> > Several methods to monitor the status of a P-tunnel are described in
> > Section 3.1. Though there might be no perfect method, a comparison
> > of benefits and challenges of each technique could be helpful to both
> > implementors and network operators.
>
> "a comparison of benefits and challenges of each technique could be
> helpful"
>
> The "could" makes it sound as if that is work for another
> day/document.  Simply get rid of that last sentence.
>
GIM3>> Okay

>
>
> > Tracking the root of an MVPN (Section 3.1.1) concludes about the
> > status of a P-tunnel based on the control plane information.
> > Because, in general, the MPLS data plane is not fate-sharing with the
> > control plane, this method might produce false positive or false
> > negative alarms. On the other hand, because BGP next-hop tracking is
> > broadly supported and deployed, this method might be the easiest to
> > deploy.
>
> "...might produce false positive or false negative alarms."
>
> Maybe add something like "resulting in tunnels that seem up but are
> not able to reach the root, or ones that are declared down
> prematurely" -- just to explain a little the potential effect.
>
GIM3>> Thank you for the specific examples.

>
>
> > Method described in Section 3.1.2 monitors the state of the data
> > plane but only for an egress P-PE link of a P-tunnel. As a result,
> > network failures that affect upstream links might not be detected
> > using this method and the MVPN convergence would be determined by the
> > convergence of the BGP control plane.
>
> "...would be determined by the convergence of the BGP control plane."
>
> This is a case where it seems that combining §3.1.1/§3.1.2 would make
> sense. In fact, tracking the state of the root seems helpful in other
> cases too (below) that are looking at different things.  You said
> before that you didn't think combining the methods make sense -- can
> you please explain why in this section?
>
GIM3>> But that would be my personal opinion, with which the WG might not
agree. I'm always glad to discuss technical ideas and the pros and cons of
this or that approach to solving the problem, but I feel uneasy adding my
personal opinions to the WG document. The document lists a set of techniques,
but how they are combined in a product is left for product managers and
developers to decide. Would you agree?

>
>
> > Using the state change of a P2MP RSVP-TE LSP as the trigger to re-
> > evaluate the status of the P-tunnel (Section 3.1.3) relies on the
> > mechanism used to monitor the state of the P2MP LSP.
> >
> > The method described in Section 3.1.4 is simple and prone to false
> > alarms but it is applicable to a sub-set of MVPNs, those that use the
> > leaf-triggered x-PMSI tunnels.
>
> "false alarms"
>
> As with the false indicators above, maybe include a simple example of
> the effect.
>
GIM3>> Ooops, I meant to write "not prone" as "safe from causing false
alarms".
NEW TEXT:
   The method described in Section 3.1.4 is simple and is safe from
   causing false alarms, e.g., considering a tunnel operationally up
   even though its data path has a defect or, conversely, declaring a
   tunnel failed when it is unaffected.  But the method applies to a
   sub-set of MVPNs, those that use the leaf-triggered x-PMSI tunnels.

>
>
> > Though some MVPN might be used to provide a multicast service with
> > predictable interpacket interval (Section 3.1.5), the number of such
> > cases seem limited.
> >
> > Monitoring the status of a P-tunnel using p2mp BFD session
> > (Section 3.1.6) may produce the most accurate and expedient failure
> > notification of all monitoring methods discussed. On the other hand,
> > it requires careful consideration of the additional load of BFD onto
> > network and PE nodes.
>
> "requires careful consideration of the additional load of BFD"
>
> A reference would be nice.
>
GIM3>> I've updated the text with some details. Please let me know if it
works:
NEW TEXT:
   Operators should consider the rate of BFD
   Control packets transmitted by root PEs combined with the number of
   such PEs in the network.  In addition, the number of P2MP BFD
   sessions per PE determines the amount of state information that a PE
   maintains.
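
To make the two load factors in the text above concrete, here is a rough
back-of-the-envelope sketch. The function name, parameters, and the example
numbers are illustrative assumptions, not values from the draft.

```python
def aggregate_bfd_pps(num_root_pes: int,
                      tx_interval_ms: float,
                      sessions_per_root: int = 1) -> float:
    """Estimate BFD Control packets per second injected by all root PEs.

    Assumes every session transmits at the same fixed interval, which is a
    simplification made only for this estimate.
    """
    if tx_interval_ms <= 0:
        raise ValueError("tx_interval_ms must be positive")
    per_session_pps = 1000.0 / tx_interval_ms
    return num_root_pes * sessions_per_root * per_session_pps


# Hypothetical network: 20 root PEs, one P2MP BFD session each, 10 ms interval.
print(aggregate_bfd_pps(num_root_pes=20, tx_interval_ms=10))
```

The amount of state on a given PE grows with the number of P2MP BFD sessions
it participates in, so the same estimate can be repeated per PE.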
>
>
>
>
>
> > > > > (2) ...
> > > > >
> > > My point here is that you don't need the attribute for BFD to be a
> > > useful tool. However, the description of the tool (monitoring) is
> > > tied to the attribute (including the point below about deleting the
> > > BFD session). It doesn't have to be: the use of BFD should be
> > > independent of the bootstrapping mechanism.
> > >
> > > Do you consider other possible ways of bootstrapping the session
> > > valid? Would it be ok to use them to enable the use of BFD? Can a
> > > BFD session that is set up using a different mechanism be used to
> > > monitor the tunnel?
> >
> > GIM2>> Thank you for clarifying the question. Yes, a p2mp BFD session can
> > be instantiated using a method other than the BFD Discriminator
> > attribute. draft-mirsky-mpls-p2mp-bfd gives several options and describes
> > how MPLS LSP Ping can be used. But that method, in my opinion, has some
> > operational issues and is less preferable compared to using an extension
> > of a control plane protocol. Other examples of extending a protocol to
> > bootstrap a p2mp BFD session can be found in the draft.
> >
> > I've added a note in the first paragraph of Section 3.1.6 to clarify that
> > the use of the BFD Discriminator attribute is optional:
> >
> > NEW TEXT:
> > The P-tunnel status may be derived from the status of a multipoint
> > BFD session [RFC8562] whose discriminator is advertised along with an
> > x-PMSI A-D Route. A P2MP BFD session can be instantiated using a
> > mechanism other than the BFD Discriminator attribute, e.g., MPLS LSP
> > Ping ([I-D.mirsky-mpls-p2mp-bfd]). Description of these methods is
> > outside the scope of this document.
>
> Ok, that's fine.  See a related comment below (15).
>
>
>
>
> > > > > (2b) ...
> > > > >
> > > BTW, I agree with Jeff in > that bfd/idr should be given the
> > > opportunity to review this document.
> >
> > GIM2>> I'm leaving this decision to the AD and Chairs of BESS and BFD WGs.
>
> Yup.
>
>
>
>
> > > > > ----------------------------------------------------------------------
> > > > > COMMENT:
> > > > > ----------------------------------------------------------------------
> ...
> > > > > (15) §3.1.6: "The BFD Discriminator attribute MUST be considered
> > > > > malformed if its length is not a non-zero multiple of four." Ok,
> > > > > except that the specification of the attribute doesn't mention the
> > > > > length (only the length of the TLVs). Please specify the length...
>
> Don't forget to specify the length of the attribute.
>
GIM3>> Resulting from the discussion with Jeff, the text we've agreed to is
now as follows:
NEW TEXT:
   The BFD Discriminator attribute MUST be considered malformed if its
   length is smaller than five octets or if Optional TLVs are present,
   but not well-formed.
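
As a rough illustration of the check in the text above, here is a minimal
sketch. It assumes the attribute value is laid out as a 1-octet BFD Mode, a
4-octet BFD Discriminator, and then zero or more Optional TLVs, each with a
1-octet Type and a 1-octet Length that counts only the Value. That layout and
the function name are assumptions made for the example.

```python
MIN_ATTR_LEN = 5  # assumed: 1 octet BFD Mode + 4 octets BFD Discriminator


def is_malformed(attr: bytes) -> bool:
    """Return True if the attribute value fails the length/TLV checks."""
    if len(attr) < MIN_ATTR_LEN:
        return True
    offset = MIN_ATTR_LEN
    # Walk the Optional TLVs; each needs a 2-octet header (assumed format).
    while offset < len(attr):
        if len(attr) - offset < 2:
            return True  # truncated TLV header
        value_len = attr[offset + 1]
        offset += 2 + value_len
    # A TLV whose Length runs past the end leaves offset beyond len(attr).
    return offset != len(attr)


base = bytes([1, 0, 0, 0, 42])           # mode + discriminator only
print(is_malformed(base))                # well-formed under the assumptions
print(is_malformed(base[:4]))            # shorter than five octets
print(is_malformed(base + b"\x01\x04"))  # TLV claims 4 octets, none present
```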
>
>
>
>
> ...
> > > All I want is for you to consider any potential implications of the
> > > new attribute. The conclusion may be that no further change is
> > > needed.
> >
> > GIM2>> Thank you for further clarifying your comment. We are not defining
> > any extensions to the BFD Discriminator attribute and it may not be
> > possible to imagine all the potential extensions that could be proposed in
> > the future. It seems that it will be more appropriate to review the impact
> > of a specific extension of the BFD Discriminator attribute when it is
> > proposed. At that time, we'll need to consider that excluding the BFD
> > Discriminator attribute from an update will be interpreted as the removal
> > of the corresponding p2mp BFD session. That would not cause a switchover
> > but just make MVPN failover rely on the convergence of the BGP control
> > plane.
>
> Fair enough.
>
>
> Two more comments (related, but different):
>
> (a) Being that the attribute (just like other attributes) is not
> protected, it can be removed in flight (for example, by a rogue node).
> As you mentioned, that action will not cause a switchover, but (from
> the end of §3.1.6) the BFD session will be deleted causing the tunnel
> to be left unmonitored.  There isn't much that can be done about this,
> but it should be mentioned somewhere (maybe the Security
> Considerations section).
>
> (b) Back to my DISCUSS point about not needing the attribute to
> bootstrap the BFD session:  If the BFD session is already up (maybe it
> was done using the attribute or maybe it was manually configured, it
> doesn't matter), then not including the attribute disables the
> monitoring, regardless of how the session was set up.  The last
> paragraph in §3.1.6 says this:
>
>    If the downstream PE's P-tunnel is already established, its state
>    being monitored by the P2MP BFD session, and the downstream PE
>    receives the new x-PMSI A-D Route without the BFD Discriminator
>    attribute, and the x-PMSI A-D Route was processed without any error
>    as per the relevant specifications, the downstream PE:
>
> I guess the intent was to say: "if the BFD session was set up using
> the attribute and the attribute is not present anymore".  But the text
> doesn't make that clarification, and a direct reading would result in
> deleting the BFD session even if the attribute was not used.
>
> Suggestion>
>    If the downstream PE's P-tunnel is already established, its state being
>    monitored by the P2MP BFD session set up using the BFD Discriminator
>    Attribute...
>
GIM3>> Thank you for the detailed and clear explanation. I've used the
suggested text.
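
One possible reading of the clarified condition, written out as a small
sketch. The type and function names are hypothetical, and the sketch only
captures the decision discussed here, not the rest of the procedure in
§3.1.6.

```python
from dataclasses import dataclass
from enum import Enum


class Bootstrap(Enum):
    BFD_DISCRIMINATOR_ATTR = "attribute"
    OTHER = "other"  # e.g., LSP Ping or manual configuration


@dataclass
class MonitoredTunnel:
    monitored_by_bfd: bool
    bootstrap: Bootstrap


def should_delete_session(tunnel: MonitoredTunnel,
                          new_route_has_attr: bool,
                          route_processed_ok: bool) -> bool:
    """Delete only sessions that were set up using the attribute."""
    return (tunnel.monitored_by_bfd
            and tunnel.bootstrap is Bootstrap.BFD_DISCRIMINATOR_ATTR
            and not new_route_has_attr
            and route_processed_ok)


via_attr = MonitoredTunnel(True, Bootstrap.BFD_DISCRIMINATOR_ATTR)
manual = MonitoredTunnel(True, Bootstrap.OTHER)
print(should_delete_session(via_attr, False, True))  # attribute withdrawn
print(should_delete_session(manual, False, True))    # session kept
```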


>
>
>
>
> > > > > (17) ...
> > > > >
> > > No -- the question is which address of the Upstream PE? I'm assuming
> > > that an upstream PE can have several addresses, which one is to be
> > > used? I can guess it is the address of the interface used to reach
> > > the destination (or something like that)...but I would just want it to
> > > be clear.
> >
> > GIM2>> Thank you, I now better understand the question. It is the PE
> > address in the given VRF. I propose the following update:
> >
> > NEW TEXT:
> > o MUST use its PE address as the source IP address when transmitting
> > BFD Control packets;
>
> I'm not sure that "PE address" is the best clarification.  I'll let
> you figure it out. :-)
>
GIM3>> RFC 6513 refers to the Upstream PE Address. The PE Distinguisher
Labels Attribute, defined in RFC 6514, includes the PE Address field. BFD
must use the same PE address.

>
>
> > Also, I think that the text in the next section that defines the process
> > on a downstream PE needs clarification. I propose to update text as
> > follows:
> > OLD TEXT:
> > o MUST use the source IP address of the BFD Control packet, the
> > value of the BFD Discriminator field, and the x-PMSI Tunnel
> > Identifier [RFC6514] the BFD Control packet was received on to
> > properly demultiplex BFD sessions.
> > NEW TEXT:
> > o to properly demultiplex BFD session MUST use:
> >
> > the address of the PE that included the BFD Discriminator
> > attribute in the x-PMSI A-D Route;
> >
> > the value of the BFD Discriminator field in the BFD
> > Discriminator attribute;
> >
> > the x-PMSI Tunnel Identifier [RFC6514] the BFD Control packet
> > was received on.
>
> This looks good.
>
GIM3>> Thank you.
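
The three-part demultiplexing key agreed above can be sketched as a simple
lookup table. The class names, the string form of the tunnel identifier, and
the example addresses (taken from the documentation range) are assumptions
for illustration only.

```python
from typing import Dict, NamedTuple, Optional


class SessionKey(NamedTuple):
    pe_address: str        # PE that included the BFD Discriminator attribute
    discriminator: int     # value from the BFD Discriminator attribute
    tunnel_id: str         # x-PMSI Tunnel Identifier the packet arrived on


class BfdDemux:
    """Maps the three-part key to a locally named session."""

    def __init__(self) -> None:
        self._sessions: Dict[SessionKey, str] = {}

    def register(self, key: SessionKey, session_name: str) -> None:
        self._sessions[key] = session_name

    def lookup(self, pe_address: str, discriminator: int,
               tunnel_id: str) -> Optional[str]:
        return self._sessions.get(
            SessionKey(pe_address, discriminator, tunnel_id))


demux = BfdDemux()
# Two upstream PEs happen to pick the same discriminator value.
demux.register(SessionKey("192.0.2.1", 7, "tunnel-A"), "session-1")
demux.register(SessionKey("192.0.2.2", 7, "tunnel-B"), "session-2")
print(demux.lookup("192.0.2.1", 7, "tunnel-A"))
print(demux.lookup("192.0.2.2", 7, "tunnel-B"))
```

Because the discriminator alone is not unique across upstream PEs, the
address and tunnel identifier are what keep the two sessions apart.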

>
>
>
> > > > > (18) §3.1.6.2: If the IP address doesn't map correctly
> > > > > at the downstream PE (for example, a different local address is used
> > > > > that doesn't correspond to the information in the PMSI attribute),
> > > > > what action should it take? Can the tunnel still be monitored?
> > > >
> > > > GIM>> There's a possibility that the same downstream PE is monitoring
> > > > more than one P-tunnel. Since each Upstream PE assigns its own BFD
> > > > Discriminator, there's a chance that the same value is picked by more
> > > > than one Upstream PE.
> > > > According to Section 5.7 of the RFC 8562:
> > > > IP and MPLS multipoint tails MUST demultiplex BFD packets based on a
> > > > combination of the source address, My Discriminator, and the identity
> > > > of the multipoint path that the multipoint BFD Control packet was
> > > > received from. Together they uniquely identify the head of the
> > > > multipoint path.
> > > >
> > > > We may consider adding the source address in the BFD Discriminator
> > > > attribute as an optional TLV. I think that might be a good extension
> > > > that can be introduced in a new document.
> > >
> > > Why wait for a new document? You made a pretty good case for
> > > signaling the source address.
> >
> > GIM2>> I'd like to defer this question to our AD and BESS WG Chairs.
>
> Again, you made a good case for why it is needed for the mechanism to
> work.  Leaving it for later might just leave a hole.  Sure, let's hear
> from the Chairs/AD.
>
>
>
>
> ...
> > > > > (23) ...
> > > > >
> > > No, it doesn't. The text specifies that the update "MUST have the
> > > LOCAL_PREF attribute set to zero". The question is not about the
> > > potential effect, but about the value being anything else (> 0). Note
> > > that the value can be low enough (> 0) and no adverse effect may take
> > > place, but the text doesn't specify a "low value", it explicitly
> > > specifies 0.
> >
> > GIM2>> I think I see it better now. Would the following text work:
> >
> > NEW TEXT:
> > For this purpose, routes that carry the Standby PE BGP
> > Community MUST have the LOCAL_PREF attribute set to the value lower
> > than the value specified as the LOCAL_PREF attribute for the route
> > that does not carry the Standby PE BGP Community. The value of zero
> > is RECOMMENDED.
>
> Yeah...I still have a problem with "MUST have the LOCAL_PREF attribute
> set to the value lower than..." because there's no practical way to
> enforce that by the receiver: if the update with the community comes
> in first then the receiver has to remember to check when the other
> update comes in later...
>
> s/MUST/must
> The "RECOMMENDED" part should be enough.
>
GIM3>> Applied
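
A toy sketch of the intended effect: the route carrying the Standby PE BGP
Community gets a lower LOCAL_PREF, so it loses a comparison against the route
without the community. The default of 100, the names, and the example
addresses are assumptions, and real BGP best-path selection has many more
steps than this single comparison.

```python
from dataclasses import dataclass
from typing import Iterable

DEFAULT_LOCAL_PREF = 100  # assumed default, for illustration only
STANDBY_LOCAL_PREF = 0    # the value the proposed text recommends


@dataclass(frozen=True)
class CMulticastRoute:
    origin: str
    standby: bool
    local_pref: int


def make_route(origin: str, standby: bool) -> CMulticastRoute:
    """Build a route, lowering LOCAL_PREF when it is a standby route."""
    pref = STANDBY_LOCAL_PREF if standby else DEFAULT_LOCAL_PREF
    return CMulticastRoute(origin, standby, pref)


def most_preferred(routes: Iterable[CMulticastRoute]) -> CMulticastRoute:
    """Pick the route with the highest LOCAL_PREF (LOCAL_PREF step only)."""
    return max(routes, key=lambda r: r.local_pref)


primary = make_route("192.0.2.10", standby=False)
backup = make_route("192.0.2.11", standby=True)
print(most_preferred([backup, primary]).origin)
```

Note that the comparison works regardless of which update arrives first,
which is why a lowercase "must" plus the RECOMMENDED value of zero is
workable without receiver-side enforcement.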

>
>
> > Also, the text regarding setting the LOCAL_PREF after the switchover may
> > need, in my view, an update so that it no longer requires the value to be
> > set to zero:
> >
> > NEW TEXT:
> > The new Upstream PE MUST set the LOCAL_PREF attribute
> > for that C-multicast route to the same value as when the Standby PE
> > BGP Community was included in the advertisement.
>
> Same thing here: s/MUST/must
>
GIM3>> Applied