[bess] Benjamin Kaduk's Discuss on draft-ietf-bess-evpn-df-election-framework-07: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Tue, 08 January 2019 17:51 UTC

From: Benjamin Kaduk <kaduk@mit.edu>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-bess-evpn-df-election-framework@ietf.org, Stephane Litkowski <stephane.litkowski@orange.com>, bess-chairs@ietf.org, stephane.litkowski@orange.com, bess@ietf.org
Message-ID: <154696991858.25531.7342921270701060263.idtracker@ietfa.amsl.com>
Date: Tue, 08 Jan 2019 09:51:58 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/hGG0hEk3qsffMORTymWVfyBMcjU>
Subject: [bess] Benjamin Kaduk's Discuss on draft-ietf-bess-evpn-df-election-framework-07: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-bess-evpn-df-election-framework-07: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-bess-evpn-df-election-framework/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

It's not really clear to me that the question of Updating 7432 has been
settled by the responses to the directorate reviews; I've noted a few
places in the text that are problematic in this regard, in the COMMENT
section.

I'm a little worried that we're setting future extensions up for a
combinatoric explosion of required analysis, by requiring new capabilities
and DF algorithms to determine their applicability/compatibility with all
previously defined mechanisms of the other type.  But maybe there's some
easy math to show that it won't be too bad.  Let's have a discussion about
this topic, even if we don't end up needing to make any changes to the
document as a result.  (I do note that this explosion would not really
happen if we combined the two into a single enumerated codepoint space that
combines DF algorithm with the enablement state of all then-defined feature
flags.)
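
As a rough illustration of the counting involved (a sketch only, not part of
the draft; the function names and the sample sizes are hypothetical), the
pairwise analysis grows as the product of the two registries, while a single
combined codepoint space grows with every on/off combination of the flags.
The pairwise figure assumes capabilities do not interact with each other; if
they do, the required analysis moves toward the combined figure.

```python
def pairwise_assessments(num_algs: int, num_caps: int) -> int:
    """Number of (DF Alg, capability) pairs needing a compatibility statement."""
    return num_algs * num_caps


def combined_codepoints(num_algs: int, num_caps: int) -> int:
    """Size of one enumerated space: each DF Alg paired with every on/off
    combination of the defined capability flags."""
    return num_algs * 2 ** num_caps


# Hypothetical registry sizes, chosen only to show the growth rates.
for algs, caps in [(2, 1), (4, 4), (8, 8)]:
    print(algs, caps,
          pairwise_assessments(algs, caps),
          combined_codepoints(algs, caps))
```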

Section 3.3

   Section 7.6 of [RFC7432] describes how the value of the ES-Import
   Route Target for ESI types 1, 2, and 3 can be auto-derived by using
   the high-order six bytes of the nine byte ESI value. The same auto-
   derivation procedure can be extended to ESI types 0, 4, and 5 as long
   as it is ensured that the auto-derived values for ES-Import RT among
   different ES types don't overlap.

How do I ensure that the auto-derived values don't overlap?

Section 4.2

                     The ESI value MAY be set to all 0's in the Weight
   function below if the operator so chooses.

I'm not 100% sure I'm interpreting this correctly, but this sounds like a
piece of device-specific configuration (i.e., configured by the operator)
that must be the same across all devices for correct operation, but is not
covered by the advertisement of the DF Election Extended Community.  This
would decrease the robustness of the system to basically the "experimental"
level of DF election algorithm 31, which also relies on universal agreement
of manual configuration.  Is this actually something we want to include?

Section 5

   The AC-DF capability MAY be used with any "DF Alg" algorithm. It MUST

As written, this suggests that it is true for any current or future
algorithm, which is in conflict with the text in Section 3.2 that notes
that "for any new DF Alg defined in future, its applicability/compatibility
to the existing capabilities must be assessed on a case by case basis."  It
seems more prudent to make the assessment after the relevant technologies
are both extant, so I would suggest this be non-normative text, perhaps
"the AC-DF capability is expected to be of general applicability with any
future 'DF Alg' algorithm".


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 1.2.1

I wonder a little whether the risk of poor distribution of DFs with the
default algorithm is being oversold -- any "hash identifiers into buckets"
scheme will be susceptible to pessimal input, but if the inputs are not
attacker-controlled and the pessimal inputs are unlikely to occur randomly,
we may not need to care.

   2- Even in the case when the Ethernet Tag distribution is uniform the
      instance of a PE being up or down results in re-computation ((v
      mod N-1) or (v mod N+1) as is the case); the resulting modulus
      value need not be uniformly distributed because it can be subject
      to the primality of N-1 or N+1 as may be the case.

This is making some assumptions about the (potential) distribution of the
tag values that could be made more clear, as otherwise the primality is
not particularly relevant (particularly for an actual uniform distribution
that covers all possible values).  Similarly below, by the CLRS reference
(CLRS probably has the ability to assume that we're running on binary
computers and may even be doing things like operating on pointers, which
tend to have fixed structure in the low-order bits due to alignment
considerations, etc.  For these (human-allocated?) integer identifiers it's
less clear what assumptions should come into play.)
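
To make the dependence on the tag distribution concrete, here is a minimal
sketch (the tag ranges and the helper name df_counts are made up for
illustration). It counts how many tags land on each PE ordinal under "v mod N".
Tags allocated with a fixed stride collapse onto one PE when the stride and N
share a factor, and spread evenly otherwise; a contiguous range is even
regardless of whether N is prime.

```python
from collections import Counter


def df_counts(tags, n):
    """Count how many Ethernet Tags map to each PE ordinal under v mod n."""
    counts = Counter(v % n for v in tags)
    return [counts.get(i, 0) for i in range(n)]


# Hypothetical allocation: 60 tags handed out with a stride of 4.
strided = range(100, 340, 4)
print(df_counts(strided, 4))  # [60, 0, 0, 0]  stride shares a factor with N
print(df_counts(strided, 3))  # [20, 20, 20]
print(df_counts(strided, 5))  # [12, 12, 12, 12, 12]

# A contiguous range is evenly spread even for composite N.
print(df_counts(range(60), 4))  # [15, 15, 15, 15]
```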

Section 1.3

   Section 2.2 describes some of the issues that exist in the Default DF

There is no section 2.2; presumably this is supposed to be 1.2.

   o HRW and AC-DF mechanisms are independent of each other. Therefore,
     a PE MAY support either HRW or AC-DF independently or MAY support
     both of them together. A PE MAY also support AC-DF capability along
     with the Default DF election algorithm per [RFC7432].

This seems a little confusing since just a couple paragraphs ago you are
distinguishing between "election algorithms" and "capabilities", but here
the two new things (one of each type) are lumped together as "mechanisms".
If election algorithms and capabilities are inherently independent things,
then maybe there is not a need to reiterate the independence of HRW and
AC-DF here.

Section 3

   This section describes the BGP extensions required to support the new
   DF Election procedures. In addition, since the EVPN specification
   [RFC7432] does leave several questions open as to the precise final
   state machine behavior of the DF election, section 3.1 describes
   precisely the intended behavior.

This text sounds like we should be Update:ing 7432.

Section 3.2

     - Otherwise if even a single advertisement for the type-4 route is
       not received with the locally configured DF Alg and capability,

nit: shouldn't this be "received without"?

Section 3.2.1

   [RFC7432] implementations (i.e., those that predate this
   specification) will not advertise the DF Election Extended Community.

This wording also suggests that we should be Update:ing 7432.

Section 4

I note that the state of the art in non-cryptographic fast hashing has
improved a lot since 1998 and we have things like the Jenkins hash that are
supposed to be superior to CRC-32 and such.

                                   [HRW1999] provides pseudo-random
   functions based on the Unix utilities rand and srand and easily
   constructed XOR functions that perform considerably well. This
   imparts very good properties in the load balancing context. Also each
   server independently and unambiguously arrives at the primary server
   selection. [...]

It's not really clear to me that this text adds much value -- we go on
later to say that we explicitly use a Wrand() function from HRW1999.

Section 4.2

   1.  DF(v) = Si: Weight(v, Es, Si) >= Weight(v, Es, Sj), for all j. In
       case of a tie, choose the PE whose IP address is numerically the
       least. Note 0 <= i,j < Number of PEs in the redundancy group.

I strongly suggest expanding out the notation with more words, e.g. "DF(v)
is defined to be the address Si such that [...]".  We probably shouldn't
assume much abstract math background from RFC readership.  (Similarly for
BDF(v).  The BDF(v) expression doesn't even say what the i, j, and k are
evaluated over.)
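
For example, the definition could be spelled out along these lines. This is a
sketch under stated assumptions: the weight() function below is a CRC-32
stand-in and not the draft's Wrand(), the addresses are documentation
addresses, and reading BDF(v) as "the highest-weight PE once the DF is
excluded" is an interpretation rather than the draft's wording.

```python
import ipaddress
import zlib


def weight(v, es, s):
    """Stand-in weight (assumption): CRC-32 over tag, ESI and PE address."""
    data = v.to_bytes(4, "big") + es + ipaddress.ip_address(s).packed
    return zlib.crc32(data)


def elect(v, es, pes, weight_fn=weight):
    """Return (DF, BDF) for Ethernet Tag v on segment es.

    DF is the PE whose weight is >= that of every other PE in the
    redundancy group; ties go to the numerically lowest IP address.
    BDF is the PE that would win if the DF were removed from the list.
    """
    ranked = sorted(
        pes,
        key=lambda s: (-weight_fn(v, es, s), int(ipaddress.ip_address(s))),
    )
    df = ranked[0]
    bdf = ranked[1] if len(ranked) > 1 else None
    return df, bdf


es = bytes(10)  # all-zero ESI placeholder
pes = ["192.0.2.10", "192.0.2.9", "192.0.2.20"]
print(elect(100, es, pes))
```

Every PE evaluating this over the same candidate list arrives at the same
answer regardless of the order in which it learned the candidates.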

   HRW solves the disadvantages pointed out in Section 2.2.1 and
   ensures:

Again, this is now Section 1.2.1

   o More importantly it avoids the needless disruption case of Section
     2.2.1 (3), that is inherent in the existing Default DF Election.

and here.
(Also, this bullet point is just describing the same situation as the
previous one, if I understand correctly.)
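
The disruption property can be checked with a small experiment. The sketch
below makes several assumptions: CRC-32 stands in for the draft's weight
function, the ESI is left out of the hash for brevity, and the PE addresses
and VLAN range are invented. It removes one PE from a group of four and lists
the VLANs whose DF changes under each scheme.

```python
import ipaddress
import zlib


def modulo_df(v, pes):
    """Default election: order PEs by IP address, pick ordinal v mod N."""
    ordered = sorted(pes, key=lambda s: int(ipaddress.ip_address(s)))
    return ordered[v % len(ordered)]


def hrw_df(v, pes):
    """Highest weight wins; ties go to the numerically lowest address."""
    def key(s):
        packed = ipaddress.ip_address(s).packed
        w = zlib.crc32(v.to_bytes(4, "big") + packed)  # stand-in weight
        return (w, -int.from_bytes(packed, "big"))
    return max(pes, key=key)


def moved(df_fn, vlans, before, after):
    """VLANs whose DF differs between the two candidate lists."""
    return [v for v in vlans if df_fn(v, before) != df_fn(v, after)]


pes = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
survivors = pes[:-1]  # 192.0.2.4 goes down
vlans = range(120)

print(len(moved(modulo_df, vlans, pes, survivors)))  # 90
print(len(moved(hrw_df, vlans, pes, survivors)))
```

Under the modulus scheme 90 of the 120 VLANs change DF although only 30 were
served by the failed PE. Under HRW the VLANs that move are exactly those whose
DF was the failed PE, whatever weight function is used.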

Section 5

   modify the DF Election procedures by removing from consideration any
   candidate PE in the ES that cannot forward traffic on the AC that
   belongs to the BD. [...]

What guarantees that the ACS information is available on all PEs involved
in the election?

   In particular, when used with the Default DF Alg, the AC-DF
   capability modifies the Step 3 in the DF Election procedure described
   in [RFC7432] Section 8.5, as follows:

Only a single paragraph follows, but the referenced document has three
paragraphs in the indicated step.  Are the last two paragraphs no longer
intended to apply?  In particular, if we apply this paragraph as a direct
replacement for the RFC 7432 step 3, then there is no longer a normative
description of the modulus-based algorithm, which seems incorrect.  Also,
there are a lot of style/editorial changes that make the difference in
behavior harder to read from the diff.  (Side note: I don't think this
particular text implies that this document needs an Updates: relation to
RFC 7432, since it is a behavior change conditional on the use of a
negotiated feature.)

   a) When PE1 and PE2 discover ES12, they advertise an ES route for
      ES12 with the associated ES-import extended community and the DF
      Election Extended Community indicating AC-DF=1; they start a timer
      at the same time. [...]

(nit?) This text implies some synchronization between PE1 and PE2 for
starting the timer, whereas I think the intent is just to note that they
each start a timer as they advertise the route, independently of each other.

   In addition to the events defined in the FSM in Section 3.1, the
   following events SHALL modify the candidate PE list and trigger the
   DF re-election in a PE for a given <ES,VLAN> or <ES,VLAN Bundle>. In
   the FSM of Figure 3, the events below MUST trigger a transition from
   DF_DONE to DF_CALC:

Then why are they not listed as part of the referenced FSM (or at least
mentioned with a forward-reference)?

Section 7

Are there any considerations to discuss about increased resource
consumption (e.g., for storing and transmitting Ethernet A-Ds per-<ES,VLAN>
vs. per-<ES,VLAN Bundle>) and the risk of DoS due to reaching resource
caps?

                Note that the network will not benefit of the new
   procedures if the configuration of one of the PEs in the ES is
   changed to the Default [RFC7432] DF Election.

Isn't this the case if there is not unanimity among all PEs in the ES about
what election algorithm is preferred, which is a broader possible case than
one being changed to use the default algorithms?
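
Put differently, the condition looks like a unanimity check, roughly as in the
sketch below. The names are hypothetical, the value 0 for the Default
algorithm is an assumption made for the example, and capability bits are
ignored for simplicity.

```python
DEFAULT_ALG = 0  # assumed codepoint for the RFC 7432 modulus-based election


def effective_alg(local_alg, advertised):
    """Pick the algorithm a PE actually runs for an ES.

    advertised holds the DF Alg seen in each candidate PE's ES route, or
    None when the route carries no DF Election Extended Community.
    Anything short of full agreement with the local choice falls back
    to the default.
    """
    if all(a == local_alg for a in advertised):
        return local_alg
    return DEFAULT_ALG


print(effective_alg(1, [1, 1, 1]))     # 1: everyone agrees
print(effective_alg(1, [1, None, 1]))  # 0: one PE predates the community
print(effective_alg(1, [1, 2, 1]))     # 0: one PE prefers a different new Alg
```

The third case is the broader one: no PE was reconfigured to the default, yet
the ES still loses the benefit of the new procedures.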

Section 8

   o Allocate Sub-Type value 0x06 in the "EVPN Extended Community Sub-
     Types" registry defined in [RFC7153] as follows:

Sometimes we see language about "confirm the existing early allocation",
but I assume that the RFC Editor and IANA have a standard way of sorting
this stuff out.