[bess] John Scudder's Discuss on draft-ietf-bess-evpn-pref-df-11: (with DISCUSS and COMMENT)

John Scudder via Datatracker <noreply@ietf.org> Fri, 04 August 2023 01:13 UTC

From: John Scudder via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-bess-evpn-pref-df@ietf.org, bess-chairs@ietf.org, bess@ietf.org, Stephane Litkowski <slitkows.ietf@gmail.com>, slitkows.ietf@gmail.com
Reply-To: John Scudder <jgs@juniper.net>
Message-ID: <169111163293.58993.7675372752038172128@ietfa.amsl.com>
Date: Thu, 03 Aug 2023 18:13:52 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/qiWLbqjl_g2l9cQy-5Y-SHi7UcA>

John Scudder has entered the following ballot position for
draft-ietf-bess-evpn-pref-df-11: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/ 
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-bess-evpn-pref-df/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

# John Scudder, RTG AD, comments for draft-ietf-bess-evpn-pref-df-11
CC @jgscudder

## DISCUSS

Thanks for this specification. It seems needed and useful. Having put in the
effort, I think I understand what you're doing. I think it could be presented
even more clearly, and I offer some suggestions toward that below. I also have
several points I want to raise about possible problem areas, which I hope we'll
be able to work through quickly.

### General, Please Update RFC 8584, add IANA Consideration

Since you modify the definition of the DF Election Extended Community, I think
you should update (with the Updates: metadata) RFC 8584, and in your IANA
Considerations you should add <this document> as an additional reference for DF
Election Extended Community in the EVPN Extended Community Sub-Types registry.

### Section 3, potentially breaks other DF algorithms

You have:

      -  Bit 0 (corresponds to Bit 24 of the Designated Forwarder
         Election Extended Community and it is defined by this
         document): D bit or 'Don't Preempt' bit (DP hereafter),
         determines if the PE advertising the Ethernet Segment route
         requests the remote PEs in the Ethernet Segment not to preempt
         it as Designated Forwarder.  The default value is DP=0, which
         is compatible with the 'preempt' or 'revertive' behavior in the
         Default DF Algorithm [RFC7432].  The DP capability is supported
         by DF Algorithms Highest-Preference or Lowest-Preference, and
         MAY be used with the default DF Algorithm or HRW [RFC8584].
         The procedures of the "Don't Preempt" capability for the
         default DF Algorithm or HRW are out of the scope of this
         document.

A cursory skim of RFC 7432 Section 8.5 leads me to think that the Default DF
Algorithm (at least) works by having all PEs run the election independently;
since they run the same algorithm over the same data they are assumed to come
to the same conclusion, and pick the same DF.

If that understanding is correct then I'm concerned about the last
sentence-and-a-half of the quoted text:

         [the DP capability]
         MAY be used with the default DF Algorithm or HRW [RFC8584].
         The procedures of the "Don't Preempt" capability for the
         default DF Algorithm or HRW are out of the scope of this
         document.

because you've taken the fully-specified algorithm defined in RFC 7432, and made
it underspecified by saying that the DP capability MAY be used with it while
explicitly not saying *how* it is to be used.

Perhaps there's some reason I shouldn't worry about this, or perhaps you should
revise the quoted text. Let's discuss.
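
To make the "same algorithm over the same data" point concrete, here is a
minimal sketch of the modulo-based default election in RFC 7432 Section 8.5.
The function name `default_df`, the sample addresses, and the single address
family are assumptions for illustration only, not anything taken from the draft.

```python
from ipaddress import ip_address


def default_df(pe_addresses, ethernet_tag):
    """Pick the DF for one Ethernet Tag with the RFC 7432 modulo rule.

    Hypothetical helper: PEs are ordered by increasing IP address, numbered
    from 0, and the PE whose ordinal equals (tag mod N) is the DF.
    Assumes all addresses belong to one address family.
    """
    ordered = sorted(pe_addresses, key=ip_address)
    return ordered[ethernet_tag % len(ordered)]


# Two PEs that learned the same routes in a different order still agree,
# because the result depends only on the set of addresses and the tag.
view_a = ["192.0.2.3", "192.0.2.1", "192.0.2.2"]
view_b = list(reversed(view_a))
for tag in range(4):
    print(tag, default_df(view_a, tag), default_df(view_b, tag))
```

Every input to this election is fully specified, which is why the PEs
converge. A DP bit whose effect on this election is left undefined would be an
input that each implementation could treat differently.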

### Section 4.1, extra "not" inverts meaning of sentence

                                                        a given PE in
   the Ethernet Segment is not considered as candidate for Designated
   Forwarder Election until its corresponding Ethernet A-D per ES and
   Ethernet A-D per EVI routes are not received, as described in
   [RFC8584].

Is the second "not" mistaken, in the quoted text, i.e. should "are not
received" actually be "are received"?

### Section 4.2, "any other logic" unspecified, non-interoperability ensues

   For Ethernet Segments attached to three or more PEs, any other logic
   that provides a fair distribution of the Designated Forwarder
   function among the PEs is valid, as long as that logic is consistent
   in all the PEs in the Ethernet Segment.

On the surface of it, this seems fine (well, for small values of "fine", once
one has accepted that we're going to give up on using the control plane to
signal this stuff and are relying on local configuration instead). But in
practice, it seems to me as though it's a recipe for non-interoperable
implementations.
Indeed I think that's explicit in "... as long as that logic is consistent in
all the PEs in the Ethernet Segment" and the dire warnings that follow.

While the example presented earlier in Section 4.2 is clean enough -- two PEs,
half of the tags get assigned lowest-preference algorithm and the others get
highest-preference -- once you've got three or more, I assume you're going to
use a hash or something like that. Since you only have two defined algorithms,
you can't cleanly accommodate more than two PEs.

This seems like an important point to be abdicating in a Standards Track
document. Is it the WG's position that two is almost always going to be enough?
(But then why have Section 4.2 at all?) Did the WG discuss and explicitly
accept that by leaving this unspecified, we almost guarantee that different
implementations won't interoperate in this use case?
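
The two-PE versus three-PE difference can be sketched as follows. This is an
illustration under stated assumptions, not the draft's procedure: the even/odd
split of Ethernet Tags is a made-up local policy, the `PE` class, `elect_df`,
and `df_counts` are hypothetical names, and preferences are assumed distinct so
that tie-breaking never comes into play.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PE:
    name: str
    preference: int


def elect_df(pes, ethernet_tag):
    # Assumed local policy: even tags use Highest-Preference,
    # odd tags use Lowest-Preference.
    pick = max if ethernet_tag % 2 == 0 else min
    return pick(pes, key=lambda pe: pe.preference)


def df_counts(pes, tags):
    """Count how many Ethernet Tags each PE is elected DF for."""
    counts = {pe.name: 0 for pe in pes}
    for tag in tags:
        counts[elect_df(pes, tag).name] += 1
    return counts


two = [PE("PE1", 100), PE("PE2", 200)]
three = two + [PE("PE3", 300)]
print(df_counts(two, range(1, 101)))    # {'PE1': 50, 'PE2': 50}
print(df_counts(three, range(1, 101)))  # {'PE1': 50, 'PE2': 0, 'PE3': 50}
```

With only a highest and a lowest rule to choose from, the PE whose preference
sits in the middle is never elected, so any fair split across three or more
PEs needs some extra logic that the document leaves to each implementation.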


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------


## COMMENTS

### Abstract

   The Designated Forwarder (DF) in Ethernet Virtual Private Networks
   (EVPN) is defined as the PE responsible for sending Broadcast,
   Unknown unicast and Broadcast traffic (BUM)

Should be "and Multicast", surely?

### Section 2, orphan terms

In the terminology section, these terms are defined but never used:
  - BD
  - NDF
  - BDF
  - Ethernet A-D per ES route
  - ISID
  - MAC-VRF

And these terms are defined but never used outside the Abstract and
Introduction, where they're already defined in-line (thanks for that), so
they're also not needed:
  - BUM
  - ESI

### Section 4.1, need forward reference to Section 4.3 about DP bit

In 4.1(e) where you discuss the use of the DP bit, I strongly suggest that you
add a forward reference to Section 4.3 since it's not possible to understand
the use of the DP bit until you walk through the fact that it's going to be
twiddled dynamically to effect the desired behavior.

### General, need an overview somewhere early on; Introduction feels unfinished

I would have been spared a lot of confusion if there had been a brief overview
early on so that I didn't keep hitting surprises. Now that I look at it, the
Introduction seems unfinished, with only the problem statement and
requirements. The natural thing would be to add a Section 1.3, Solution
Overview, that presents just the bare bones outline of the solution, something
like:

1.3. Solution Overview

        To provide a solution that satisfies the above requirements, we
        introduce two new DF Algorithms that can be advertised in the DF
        Election Extended Community [Section 3]. Carried with the new DF
        Election Extended Community variants is a DF election preference
        advertised for each PE that influences which PE will become DF
        [Section 4.1]. The advertised DF election preference can dynamically
        vary from the administratively configured preference to provide
        non-revertive behavior [Section 4.3].

        An optional solution is discussed in [Section 4.2], for use in Ethernet
        segments that support large numbers of Ethernet Tags and therefore need
        to balance load among multiple DFs.

Feel free to use any of that, or not, as you see fit.

### Section 4.3, why is highest-preference more complex than lowest-preference?

In the paragraph right before point (1), you have, "The procedure is described
assuming Highest-Preference Algorithm in the Ethernet Segment, where local
policy overrides the tie-breaker for a given Ethernet Tag, since this is the
most complex case."

I don't see why highest-preference is any more complex than lowest-preference.

### Section 4.3, algorithm is strangely specified

In point (5), you specify picking *two* reference-PEs, both a Highest-PE and a
Lowest-PE:

       *  Select two "reference-PEs" among the Ethernet Segment routes
          in the virtual Ethernet Segment, the "Highest-PE" and the
          "Lowest-PE":

[and so on]

But then in the paragraph right after point (6), you say

   If the Ethernet Segment uses Highest-Preference Algorithm (for all
   the Ethernet Tags, no local policy), the PEs only need to select the
   "Highest-PE" as the "reference-PE" (i.e., no need to select the
   "Lowest-PE").  If the Ethernet Segment uses Lowest-Preference
   Algorithm for all the Ethernet Tags, the PEs only need to select the
   "Lowest-PE" as the "reference-PE".  The rest of the procedure remains
   the same.

While it's possible to patch together a coherent algorithm after considering
all this, I can't imagine why you trick the reader by telling them to pick
both, but then at the end saying "ha ha, I was only joking, you need just one".
The more straightforward approach, IMO, would be something like this:

OLD:
       *  Select two "reference-PEs" among the Ethernet Segment routes
          in the virtual Ethernet Segment, the "Highest-PE" and the
          "Lowest-PE":

NEW:
       *  Select a "reference-PE" among the Ethernet Segment routes
          in the virtual Ethernet Segment. If the Ethernet Segment uses
          the Highest-Preference algorithm, select a "Highest-PE". If it
          uses the Lowest-Preference algorithm, select a "Lowest-PE", as
          follows:

Or perhaps (since in my suggested text I elided the "no local policy" stuff)
that's the hint for me about why you did it that way. But I think this can
still be done in line, and more clearly, as in:

NEW2:
       *  Select a "reference-PE" among the Ethernet Segment routes
          in the virtual Ethernet Segment. If the Ethernet Segment uses
          the Highest-Preference algorithm, select a "Highest-PE". If it
          uses the Lowest-Preference algorithm, select a "Lowest-PE". If
          some more complex local policy is in use, as discussed in
          [Section 4.2], it may be necessary to select both a Highest-PE
          and a Lowest-PE. They are selected as follows:

And then get rid of the paragraph following (6).

### Section 5, attacker can force revertive behavior

I wasn't going to mention this, but since you point to the non-revertive
behavior as a security benefit (I agree), I think you also have to consider
that an attacker who gets access to the configuration of a PE in the Ethernet
Segment will be able to easily disable non-revertive behavior, by advertising a
conflicting DF election algorithm and thereby forcing fallback to the Default
algorithm.
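
A minimal sketch of that fallback, assuming the RFC 8584 rule that the PEs in
an Ethernet Segment use an advertised DF Algorithm only when all of them
advertise the same one. The function `effective_alg` and the string labels are
hypothetical stand-ins, not registry code points.

```python
DEFAULT_ALG = "default"  # the modulo-based election from RFC 7432


def effective_alg(advertised_algs):
    """Return the DF Algorithm the Ethernet Segment ends up running.

    advertised_algs holds one entry per PE; None stands for a route that
    carries no DF Election Extended Community at all.
    """
    unique = set(advertised_algs)
    if len(unique) == 1 and None not in unique:
        return unique.pop()
    return DEFAULT_ALG


healthy = ["highest-preference"] * 3
tampered = ["highest-preference", "highest-preference", "hrw"]
print(effective_alg(healthy))   # highest-preference
print(effective_alg(tampered))  # default
```

A single reconfigured PE is enough to move the whole segment back to the
default algorithm, and with it to revertive behavior.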

## NITS

- s/candidate lits/candidate list/
- s/The existence of both provide/The existence of both provides/

## Notes

This review is in the ["IETF Comments" Markdown format][ICMF]. You can use the
[`ietf-comments` tool][ICT] to automatically convert this review into
individual GitHub issues.

[ICMF]: https://github.com/mnot/ietf-comments/blob/main/format.md
[ICT]: https://github.com/mnot/ietf-comments