[bess] Benjamin Kaduk's Discuss on draft-ietf-bess-nsh-bgp-control-plane-13: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 18 December 2019 23:02 UTC

Return-Path: <noreply@ietf.org>
X-Original-To: bess@ietf.org
Delivered-To: bess@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 7728812008C; Wed, 18 Dec 2019 15:02:03 -0800 (PST)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-bess-nsh-bgp-control-plane@ietf.org, bess-chairs@ietf.org, slitkows.ietf@gmail.com, bess@ietf.org
X-Test-IDTracker: no
X-IETF-IDTracker: 6.113.0
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <157671012348.4907.819299139449107547.idtracker@ietfa.amsl.com>
Date: Wed, 18 Dec 2019 15:02:03 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/D0KetNEQdRyLQJ4IWE4NaS3fOzg>
Subject: [bess] Benjamin Kaduk's Discuss on draft-ietf-bess-nsh-bgp-control-plane-13: (with DISCUSS and COMMENT)
X-BeenThere: bess@ietf.org
X-Mailman-Version: 2.1.29
List-Id: BGP-Enabled ServiceS working group discussion list <bess.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/bess>, <mailto:bess-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/bess/>
List-Post: <mailto:bess@ietf.org>
List-Help: <mailto:bess-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/bess>, <mailto:bess-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 18 Dec 2019 23:02:03 -0000

Benjamin Kaduk has entered the following ballot position for
draft-ietf-bess-nsh-bgp-control-plane-13: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-bess-nsh-bgp-control-plane/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

Section 3.2.1.3 seems to talk about intermingling SFIR-RDs and SFIR Pool
Identifiers in a common list, but I do not see how it's defined to
intermingle the (six-octet) Pool Identifier Value with the (eight-octet)
RDs.

Section 3.1 seems to allow multiple SFIRs associated with a given RD,
but the rest of the document seems to assume that any RD has at most one
associated SFIR (as is stated explicitly for SFPRs).  (A few specific
mentions in the COMMENT.)

Within my own limited understanding, it seems like this document is
expanding the boundaries of the SFC Architecture in ways not envisioned
by RFC 7665 or 8300, and the shepherd writeup is pretty quiet on to what
extent this architectural change is accepted by the WG (as opposed to
being contained to just the BGP control plane mechanism).  I'd like to
get a positive affirmation from someone more familiar with the
discussions that this is moving the architecture in the right direction
with respect to things like:
- the introduction of the Service Function Type intermediate
  classification
- the more prominent treatment of looping, jumping, and branching as
  operations within a single SFP without reclassification by using the
  "Change Sequence" SFT entries to indicate these behaviors within the
  SFP definition itself (I note that RFC 7665 does not mention "jump" at all)
- the introduction of "gaps" in the SI sequence of a given SFP

With respect to SFT in particular, it sounds like this is intended to
help with scalability, which makes the genart reviewer's comment about
lack of implementation experience particularly poingant.  It seems like
SFIR pools perform a similar role, though of course not identical; I
didn't get a clear sense of why pools without SFTs are insufficient.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I support Roman's Discuss.

Section 2.1

   In fact, each SI is mapped to one or more SFs that are implemented by
   one or more Service Function Instances (SFIs) that support those
   specified SFs.  Thus an SI may represent a choice of SFIs of one or
   more Service Function Types.  [...]

In my head a SI that maps to more than one SF implies a requirement to
do all of the listed SFs and in a specific order; "a choice of SFIs of
one or more types" seems to relax that a bit so that the order is
relaxed and perhaps the requirement to do all is also relaxed.  I think
"choice of SFIs for one or more" would alleviate that concern (if it is
even a real concern).

   in the underlay network.  How this indicator is generated and
   supplied, and how an SFF generates a new entropy indicator when it
   forwards a packet to the next SFF are out of scope of this document.

nit: comma after "next SFF".

Section 2.2

   This approach assumes that there is an underlay network that provides
   connectivity between SFFs and Controllers, and that the SFFs are
   grouped to form one or more service function overlay networks through
   which SFPs are built.  We assume BGP connectivity between the
   Controllers and all SFFs within each service function overlay
   network.

Are we thus implicitly not assuming BGP connectivity between arbitrary
pairs of SFFs?

Section 3

   The nexthop field of the MP_REACH_NLRI attribute of the SFC NLRI MUST
   be set to loopback address of the advertising SFF.

nit: "a loopback address" (?)

Section 3.1

   Per [RFC4364] the RD field comprises a two byte Type field and a six
   byte Value field.  If two SFIRs are originated from different
   administrative domains, they MUST have different RDs.  In particular,
   SFIRs from different VPNs (for different service function overlay
   networks) MUST have different RDs, and those RDs MUST be different
   from any non-VPN SFIRs.

I think we should probably be a little more explicit that we reuse the
RD element (and the RD type/value semantics!) wholesale from RFC 4364
and
https://www.iana.org/assignments/route-distinguisher-types/route-distinguisher-types.xhtml
.  That is, we are not just reusing the layout, but the logical type as
well.

Also, are all of these "MUST"s indicating requirements that are new with
this specification, or are some of them inherited from RFC 4364?

The discussion later on in Section 4.2 suggests that perhaps these RDs
need to be unique per SFI, akin to how we say they are unique per SFP in
Section 3.2?

   The Service Function Type identifies the functions/features of
   service function can offer, e.g., classifier, firewall, load
   balancer, etc.  There may be several SFIs that can perform a given
   Service Function.  Each node hosting an SFI MUST originate an SFIR

This seems a bit of a non-sequitur: we go straight from SF*T*s to
discussion of the SFI/SF relationship, and we only mention "type" (but
not "SFT") in the rest of the paragraph, so the connection is harder to
return to than it need be.

   Note that a node may advertise all SFIs of one SFT in one shot using
   normal BGP Update packing.  That is, all of the SFIRs in an Update
   share a common Tunnel Encapsulation and RT attribute.  See also

I think we need to expand "route target".
Also, I suggest s/may advertise all/may advertise all its/

Section 3.1.2

I suggest explicitly stating that these MPLS labels are used to direct
incoming traffic to the SFI in question (fulfilling the promise made by
the abstract).

   Note that it is assumed that each SFF has one or more globally unique
   SFC Context Labels and that the context label space and the SPI
   address space are disjoint.

What I'm taking away from the last clause is roughly "we do not assume
that the values of SFC context labels have any structured relationship
to the corresponding SPI values", even though a strict reading of
"disjoint" would seem to be that "if a value is used in one setting it
is guaranteed to not be used in the other setting".  I think some
rewording may be in order.

Section 3.2

[our treatment of RD should presumably be consistent with Section 3.1,
if we make any changes there]

   byte Value field.  All SFPs MUST be associated with different RDs.
   The association of an SFP with an RD is determined by provisioning.

So this means that any given RD value can have at most one SFP
associated with it?  I guess that's needed in order to provide the level
of separation between SFPs that we need...

Section 3.2.1

Is there intended to be a distinction between the error-handling
behavior for error cases 1/2/6/7 and error case 4?  (If there is, I'm
not seeing it.)

      5., 8.:  The absence of an RD with which to correlate is nothing
         more than a soft error.  The receiver SHOULD store the
         information from the SFP attribute until a corresponding
         advertisement is received.

Should any sort of timeout also be applied?

Section 3.2.1.1

      SFP.  An SFP attribute SHOULD NOT contain more than one
      Association TLV with Association Type 1: if more than one is
      present, the first one MUST be processed and subsequent instances
      MUST be ignored.  Note that documents that define new Association

Do we want to say anything about checking consistency that (per these
rules) the forward SFP is associated with the backward, and the backward
SFP is associated with the forward one?  [I guess Section 7.1 is
probably enough.]

Section 3.2.1.3

[It's not clear to me that grouping by type gives any efficiency savings
here, as grouping by pool is available, and the semantics seem to be
equivalent to if the list of allowed SFI (pool)s was explicitly given
without the container.  A single pool containing all the SFIs in the
type would replace the SFIR-RD-value-zero semantics.]

Section 3.2.1.5

   The SFP Traversal With MPLS Label Stack TLV (Type value 5) is a zero
   length sub-TLV that can be carried in the SFP Attribute and indicates

If it's carried directly in the SFP Attribute, doesn't that make it a
non-sub TLV?

Section 3.2.2

   In all cases, there is no way for the receiving SFF to know which
   SFPR to process, and the SFPRs could be received in any order.  At
   any point in time, when multiple SFPRs have the same SPI but
   different SFPR-RDs, the SFF MUST use the SFPR with the numerically
   lowest SFPR-RD.  The SFF SHOULD log this occurrence to assist with

(The "interpreted as an 8-octet integer in big-endian form" is
implicit?)

Section 4.1

   The main feature introduced by this document is the ability to create
   multiple service function overlay networks through the use of Route
   Targets (RTs) [RFC4364].

As mentioned above, this seems like the second (not first) usage of
"RT".

Section 4.5

   When an SFF gets an SFPR advertisement it will first determine
   whether to import the route by examining the RT.  If the SFPR is
   imported the SFF then determines whether it is on the SFP by looking
   for its own SFIR-RDs in the SFPR.  For each occurrence in the SFP,

Also for non-specific SFT mentions that would apply, right?

Section 5

   o  There may be an obvious branch needed in an SFP such as the
      processing after a firewall where admitted packets continue along
      the SFP, but suspect packets are diverted to a "penalty box".  In
      this case, the next hop in the SFP will be indicated with two
      different SFT values.

My (admittedly poor) understanding of the SFC Architecture was that this
was best done as a reclassification, and that the various SFI options
for a given SI were intended to be roughly equivalent.  Am I just
imagining that?

   2.  From the SFP attribute of that SFPR, it finds the Hop TLV with SI
       value set to SI-y.  If there is no such Hop TLV, then such
       packets cannot be forwarded.

This seems to exclude the "gaps" in SI space that this document is
trying to allow.  Or is the gap processing to have already happened by
the time the SFF is at the "needs to forward" stage?

       A.  An SFIR is relevant if it carries RT-z, the SFT in its NLRI
           matches the SFT value in one of the SFT TLVs, and the RD
           value in its NLRI matches an entry in the list of SFIR-RDs in
           that SFT TLV.

       B.  If an entry in the SFIR-RD list of an SFT TLV contains the
           value zero, then an SFIR is relevant if it carries RT-z and
           the SFT in its NLRI matches the SFT value in that SFT TLV.
           I.e., any SFIR in the service function overlay network
           defined by RT-z and with the correct SFT is relevant.

Do we need to talk about expansion of pools as well?

   Thus, at any point in time when an SFF selects its next hop, it
   chooses from the intersection of the set of next hop RDs contained in
   the SFPR and the RDs contained in its local set of SFIRs.  If the
   intersection is null, the SFPR is unusable.  Similarly, when this
   condition obtains the originator of the SFPR SHOULD either withdraw
   the SFPR or re-advertise it with a new set of RDs for the affected
   hop.

(This seems to require a definition of "contained in" that includes
various types of expansion having been performed.)
nit: s/obtains/is detected/ (??)

Section 7.4

   o  If an SFC Classifiers Extended Community is received with SFT = 0
      then there are two sub-cases:

      *  If there is a choice of SFT in the hop indicated by the value
         of the SI (including SI = 0) then SFT = 0 means there is a free
         choice according to local policy of which SFT to use).

      *  If there is no choice of SFT in the hop indicated by the value
         of SI, then SFT = 0 means that the value of the SFT at that hop
         as indicated in the SFPR for the indicated SPI MUST be used.

side note(?): I think it would be possible to include the second case as
a special case of the first, as a "free choice" among all one allowed
values by the SFP for that SI.

   which the traffic belongs.  Additionally, note that to put the
   indicated SPI into context when multiple SFC overlays are present in
   one network, each FlowSpec update MUST be tagged with the route
   target of the overlay or VPN network for which it is intended.

I'd suggest making this requirement more prominent by putting it earlier
in the section.

Section 7.5

The description of these flags as describing "how the originating SFF
expects to see the SPI/SI represented" has some connotation of
exclusivity ("I expect to see an NSH" implying "I don't expect to see
anything else, including MPLS"), which does not seem like a good fit for
a bitmask where multiple bits can be set (as opposed to just a codepoint
for the encoding to use).

   The following bits are defined by this document:

   Bit 0:  If this bit is set the NSH is to be used to carry the SPI/SI
      in the data plane.

How are we numbering the bits?

Section 7.6

   When a classifier constructs an MPLS label stack for an SFP it starts
   with that SFP' last hop.  If the last hop requires an {SPI, SI} label

nit: "SFP's"

Section 8.1

   SFF1 followed by an SF of type 43 located at SFF2.  This path is
   fully explicit and each SFF is offered no choice in forwarding packet
   along the path.

nit: "forwarding packets" or "forwarding a packet"

Section 8.4

   This example provides a choice of SF type in the second hop in the
   path.  The SI of 250 indicates a choice between SF type 43 located
   through SF2 and SF type 44 located at SF3.

nit: s/located through/located at/?

   When the packets are returned to SFF1 by the SFI the SI will be
   decreased to 250 for the next hop.  SFF1 now has a free choice of
   next hop SFF to execute the next hop in the path selecting between
   all SFF2 that support an SF of type 43 and SFF3 that supports an SF
   of type 44.  These may be completely different functions that are to

nit: I think s/all SFF2 that support/SFF2 that supports/ ?

Section 8.8

Some pedantic part of me wants to complain that this is not a "branch",
since switching to SPF10 is the only option allowed at SI=250 on SPF11.

Section 8.9.1

It might make more sense to mvoe SPF13's definition before the paragraph
that talks about "sufficiently precisely specified SFPs", since SPF13
does not actually "specify [...] exact SFIs to use"

Section 9

I really like the way this section is written; thank you!

I think the core SFC documents cover enough of the considerations
inherent to looping, jumping, and branching that there's probably not
more to say here.

I'll list a handful of potential security topics I can think of after
reading the document, but that's not meant to imply that they all need
to be covered individually in the document: they are *potential* topics,
but may not end up being worth talking about.

We could perhaps note (again) the risk of misidentifying the overlay/VPN
on which a SFI receives a packet and its impact on the SPI/SI lookup,
but it's probably not strictly necessary.

With SFTs being allocated via FCFS, there is perhaps some risk that
different devices that actually provide qualitatively different
functionality will claim to provide the same SFT-number's service, which
potentially could cause problems.

There is perhaps room to talk about the specific bad things that can
happen if unsecured BGP is used (so that attackers can make false SFIR
announcements, false SFPR declarations, etc.), but it's not entirely
clear how useful that would be.

It might also be worth a specific mention of perimeter filtering so the
SFC configuration doesn't escape the local domain where it's intended to
be used.  (I would like to think this is sufficiently obvious, but am
not properly calibrated to make that call.)

Making "live" updates to an existing SFP is likely to have some level of
skew; if this involves both additions and removals it seems like there
is some risk of traversing "unsafe" configurations even when both old
and new configurations on their own would be safe.  (We do have
something of a recommendation to not do this earlier in the document
already, which is good.)

Is it too banal to say that adding more stuff to BGP messages make them
bigger and consume more resources to process?

   Chaining system.  The control plane mechanisms are very similar to
   those used for BGP/MPLS IP VPNs as described in [RFC4364], and so the
   security considerations in that document (Section 23) provide good

s/23/13/

Section 10.4

[When I first saw "association TLV" I was expecting it to share a
registry with some other association grouping usages.  But I don't have
a particular complaint about doing it this way.]