Re: [sfc] Last Call: <draft-ietf-sfc-nsh-18.txt> (Network Service Header (NSH)) to Proposed Standard

Warren Kumari <warren@kumari.net> Sat, 19 August 2017 00:52 UTC

MIME-Version: 1.0
In-Reply-To: <150249274009.24438.5552125434164888108.idtracker@ietfa.amsl.com>
References: <150249274009.24438.5552125434164888108.idtracker@ietfa.amsl.com>
From: Warren Kumari <warren@kumari.net>
Date: Fri, 18 Aug 2017 20:51:37 -0400
Message-ID: <CAHw9_iJWhJxJGngi6yv2Di3XsF-cMZv-mxKV=NNGruEDJcM+eg@mail.gmail.com>
To: IETF Discuss <ietf@ietf.org>
Cc: IETF-Announce <ietf-announce@ietf.org>, draft-ietf-sfc-nsh@ietf.org, "Joel M. Halpern" <jmh@joelhalpern.com>, sfc-chairs@ietf.org, sfc@ietf.org, Alia Atlas <akatlas@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/sfc/MDKZqA5xUEk8_nb4jvX90mEUUQo>
Subject: Re: [sfc] Last Call: <draft-ietf-sfc-nsh-18.txt> (Network Service Header (NSH)) to Proposed Standard
Precedence: list

On Fri, Aug 11, 2017 at 7:05 PM, The IESG <iesg-secretary@ietf.org> wrote:
>
> The IESG has received a request from the Service Function Chaining WG (sfc)
> to consider the following document: - 'Network Service Header (NSH)'
>   <draft-ietf-sfc-nsh-18.txt> as Proposed Standard
>
> The IESG plans to make a decision in the next few weeks, and solicits final
> comments on this action. Please send substantive comments to the
> ietf@ietf.org mailing lists by 2017-08-25. Exceptionally, comments may be
> sent to iesg@ietf.org instead. In either case, please retain the beginning of
> the Subject line to allow automated sorting.
>
> Abstract
>
>
>    This document describes a Network Service Header (NSH) inserted onto
>    packets or frames to realize service function paths.  NSH also
>    provides a mechanism for metadata exchange along the instantiated
>    service paths.  NSH is the SFC encapsulation required to support the
>    Service Function Chaining (SFC) architecture (defined in RFC7665).
>

<no hats>

In my opinion this document still needs significant work. Christian's
great SecDir review lists many issues which need to be addressed;
since he has already raised concerns, I am not focusing on the
security and privacy stuff in my but review. Instead, I'll discuss
some of the other issues.

I only skimmed sections 6.1 - Section 8.

As this is output of the SFC WG, which is chartered to, amongst other
thing, create a "Generic SFC Encapsulation" (with some more details)
I'll take it as given that this is desired / needed.

Issues:

Issue 1: I find the entire document to be confusing about what an NSH
actually is, and after spending much time reading, am still fairly
confused.
A clear model at the beginning of the document which explains how
these bits fit together would be very helpful. I am familiar with
RFC7665 (the architecture document); my confusion stems from this
draft.

For example the Abstract says:
"This document describes a Network Service Header (NSH) inserted onto
packets or frames to realize service function paths. NSH also provides
a mechanism for metadata exchange along the instantiated service
paths.  NSH is the SFC encapsulation required to support the Service
Function Chaining (SFC) architecture ".

Oh, ok, so it is something inserted onto packets or frames...
*inserted onto* sounds like a shim, however,  "NSH is the SFC
encapsulation...".  indicates that I had the wrong mental model and
this is actually an encapsulation, kinda like GRE. (This is reinforced
with (end of Section 1) "NSH is the SFC encapsulation referenced in
[RFC7665].") Ok, so this encapsulates the packet, got it...
I read some more, and then find (in section 2) that "An outer
transport header is imposed, on NSH and the original packet/frame, for
network forwarding.". Oh, so this is *actually* a shim / header which
goes before the packet, and then the whole thing gets encapsulated? It
shouldn't have been this hard to figure out, but Ok, now I get it...Or
do I? Oh, hang on, what transport header do I use?!
It is only much much later, in Section 4 (page 15) that I find "Once
NSH is added to a packet, an outer encapsulation is used to forward
the original packet and the associated metadata to the start of a
service chain. ... The service header is independent of the
encapsulation used and is encapsulated in existing transports.  The
presence of NSH is indicated via protocol type or other indicator in
the outer encapsulation." - I'm now really confused - first I thought
it was a shim / header, then I thought it was an encapsulation, now I
(finally) see that it is a header which goes before the packet
(encapsulating it) before it gets encapsulated again into something
else. It would be much better if the document was clearer at the
beginning about how all this fits together / what the NSH actually
*is* -- an analogy to layering, or as a shim which goes between a
tunneling header and the original packet would make this much clearer
(and the reader much less annoyed!).

Issue 2: Section 1.2.  Definition of Terms
"Metadata:  Defined in [RFC7665]." - The definition of metadata in
RFC7665 says: "Metadata: Provides the ability to exchange context
information between classifiers and SFs, and among SFs.". This is not
an adequate definition for use in this document - the document defines
a context header which carries metadata, and it need to be better
defined than something which provides an ability.
Apart from RFC7665 actually *defining* metadata, it doesn't really
describe what you do with it. As this document discusses metadata much
more (and how to carry it), I think that it needs to much better
define what it is, what its purpose is, and how to use it.

Issue 3: Section 2.  Network Service Header
"An outer transport header is imposed, on NSH and the original
packet/frame, for network forwarding." - this is very poorly worded,
and I don't really know what I'm supposed to do here. I'm assuming
that I'm supposed to concatenate the NSH and the packet, and then
encapsulate this in some transport of my choosing, but this needs more
precision.

Issue 4: The document talks about using the TTL to avoid loops, and
also the Service Index (SI). "The Service Index MUST be decremented by
a value of 1 by Service Functions or by SFC Proxy nodes after
performing required services  and the new decremented SI value MUST be
used in the egress packet’s NSH. ... Additionally, while the TTL field
is the main mechanism for service plane loop detection, the SI can
also be used for detecting service plane loops.". Much better text is
needed here to discuss how these interact. Lets say the TTL is 14, and
the SI is 1. The packet is now handed to the next SF, which decrements
it to 0. Now what? It this the same as the TTL hitting 0? If it an
error? Do I wrap and go back to 255? What are there 2 mechanisms for
this? Much further down, in Section 6.1, we have: "SI serves as a
mechanism for detecting invalid service function paths.  In
particular, an SI value of zero indicates that forwarding is incorrect
and the packet must be discarded." This answers many of the above, but
having the text better organized would simplify the document.

Issue 5: Section 6. Service Path Forwarding with NSH Subsection 6.1.
SFFs and Overlay Selection
I'm assuming that there is much more detail in other documents which
cover how to use the SPI / SI to do the forwarding -- can you please
insert a reference? I'm somewhat confused by the level of detail
between this document and <whatever the other one is>. This doesn't
contain enough to implement, but seem to include more than I'd expect
to clarify how to interface with some other doc.

Questions / concerns:
Section 2.2.  NSH Base Header
"TTL: Indicates the maximum SFF hops for an SFP.  This field is used
for service plane loop detection. ...  Each SFF involved in forwarding
an NSH packet MUST decrement the TTL value by 1 prior to NSH
forwarding lookup.  Decrementing by 1 from an incoming value of 0
shall result in a TTL value of 63.  The packet MUST NOT be forwarded
if TTL is, after decrement, 0."
Why does decrementing by 1 from 0 result in 63? Under what (valid)
conditions will I receive a packet with an incoming value of 0? (why
is this not an error?)

Section 2.4.  NSH MD Type 1
"When the Base Header specifies MD Type = 0x1, a Fixed Length Context
Header (16-bytes) MUST be present immediately following the Service
Path Header, as per Figure 4.  The value of a Fixed Length Context
Header that carries no metadata MUST be set to zero.". "This
specification does not make any assumptions about the content of the
16 byte Context Header that must be present when the MD Type field is
set to 1, and does not describe the structure or meaning of the
included metadata.". So, the specification doesn't make any
assumptions about the Fixed Length Context Header (I guess it is an
opaque blob), but if it has no metadata is must be 0? And if there is
no metadata, why  would I need to waste 16 bytes? And this document
defines 2 MD types -- why would I use this one instead of a Type 2?
(which seems cleaner, and shorter). Kind of related to Issue 2, I
think that there needs to be more discussions (perhaps in some other
document, which needs a reference) what the purpose of the metadata
is, why it is sometimes metadata and sometimes Context Header, what a
receiver is supposed to do with this, etc.

Section 2.4:
"An SFC-aware SF MUST receive the data semantics first in order to
process the data placed in the mandatory context field.  The data
semantics include both the allocation schema and the meaning of the
included data.  How an SFC-aware SF gets the data semantics is outside
the scope of this specification."  -- related to the above, I still
don't get this. Is "the data placed in the mandatory context field" ==
"metadata"?  And there is no type information associated, so receiver
cannot look at it and know what it is, other than "I expected a blob,
this is a blob, blobs I receive should be of type X, let me try
decoded it as one?" -- shouldn't there be some sort of hint in the NSH
as to what the context field is carrying (like Type 2 has)? I've
looked at  I-D.guichard-sfc-nsh-dc-allocation and
I-D.napper-sfc-nsh-broadband-allocation which discuss how bits in the
context header can be allocated, but would a receiver differentiate
which type is in use (other than being configured to know that blobs
are always of type X). Again, I'm hoping that there is an SFC document
which explains the architecture in more detail then RFC7665.

Section  2.5.1.  Optional Variable Length Metadata
Optional variable length Context Headers have a Class (scope of the
Type), a Type (explicit type of metadata, defined by the Class owner),
and a Length.
A length of 0 means that the context header has no data. It would be
useful to explain what this actually means -- it sounds like a message
with no content, but perhaps receivers are supposed to use the Type in
this case to do something?

Section 3.  NSH Actions
"NSH-aware nodes are the only nodes that may alter the content of NSH
headers." -- this sentence is nonsensical. If a node doesn't know
something is an NSH (is it not NSH-aware), telling it not to touch NSH
headers is basically "Devices on the Internet shouldn't poke at things
that they don't understand".

Section 3:
"At the end of a service function path, an SFF, MUST be the last node
operating on the service header and MUST remove NSH before forwarding
or delivering the un-encapsulated packet." - I'm somewhat confused
what this is trying to say, and assume that it is simply superfluous
commas. But, with those removed the sentence is still confusing to me
-- a SFF is the thing that does the forwarding, and so, by definition
it has to be the last thing operating on the service header (unless
the packet is dropped) -- so, what does the first MUST mean?

Section 4.  NSH Transport Encapsulation
"The presence of NSH is indicated via protocol type or other indicator
in the outer encapsulation." -- just for my interest, what all
encapsulations already have this defined / allocated?

Section 5.  Fragmentation Considerations
"As discussed in [I-D.ietf-rtgwg-dt-encap], within an administrative
domain, an operator can ensure that the underlay MTU is sufficient to
carry SFC traffic without requiring fragmentation." -- I think that
that is stretching what I-D.ietf-rtgwg-dt-encap says a bit far. NSH
Type 1 adds 24 octets (4 octet Base Header, 4 octet Service Path
Header, 16 octet Context Header), plus the gets wrapped in a new
transport, so let's say 24 octets for GRE. This means that, for a
1500byte packet I need an MTU of 1548 (I'm not sure all operators can
support that everywhere). The Type 2 header is of variable length (and
I don't think I saw a limit).
" For example, when NSH is encapsulated in IP, IP-level fragmentation
coupled with Path MTU Discovery (PMTUD) is used. When, on the other
hand, the underlay does not support fragmentation procedures, an error
message SHOULD be logged when dropping a packet too big." -- I think
that more guidance / text is needed here, e.g for v6. Just dropping
the packet and logging an error doesn't solve the base issue causing
the problem. If this protocol wasn't an "encapsulation" I wouldn't
feel so strongly, but if the protocol says that it providing an
encapsulation it needs to handle things that encapsulation protocols
do.

Nits:

Section 1. Introduction
... current service function deployment models have been relatively
static, and bound to topology
[O] static, and bound
[P] static and bound
[R] grammar

...
Specifically, the following functions are necessary:

      The movement of service functions and application workloads in the
      network.

      The ability to easily bind service policy to granular information,
      such as per-subscriber state.

[R]: Numbers or bullets would make the much easier to read.

Section 1.2 - Definition of Terms
"Byte:  All references to "bytes" in this document refer to 8-bit
bytes, or octets." -- this is a circular definition (the section is
Definition of Terms, not Terminology) - suggest dropping the 8-bit
bytes, octets covers it.

Throughout the document:
The document says "A Service Classifier adds NSH.  NSH is removed by
...", and "NSH offers a common and standards-based header" -- is NSH a
technology (as implied by the second phrase), or a thing added to a
packet (like the first implies). Whatever the case, for the first, "A
Service Classifier adds *the* NSH.  *The* NSH is removed by ..." (or,
"an").

Section 1.4. NSH-based Service Chaining
   NSH creates a dedicated service plane, more specifically, NSH
[O] plane, more
[P] plane; more
[R] grammar

 1.  Topological Independence: Service forwarding occurs within the
       service plane, the underlying network topology does not require

[O] plane, the
[P] plane; the    (or "plane, so the")
[R] grammar

Section  2.2.  NSH Base Header
"MD Type: Indicates the format of NSH beyond the mandatory Base Header
and the Service Path Header.  MD Type defines the format of the
metadata being carried." -- please expand MD here. I wasted much time
before guessing that this was MetaData Type.

Given the
   widespread implementation of existing hardware that uses the first
   nibble after an MPLS label stack for ECMP decision processing, this
   document reserves version 01b and this value MUST NOT be used in
[O] 01b and this value
[P] 01b. This value
[R] grammar -- run on sentence, and it reads awkwardly with a
semicolon before the "and" (but that would be grammatically correct,
if preferred.)

Forwarding OAM
   packets unmodified by SFC elements that do not support SFC OAM
   procedures may be acceptable for a subset of OAM functions, but can
   result in unexpected outcomes for others, thus it is recommended to
[
O] others, thus
[P] others; thus
[R] grammar

This specification does not disallow the MD Type value from changing
   along an SFP; however, the specification of the necessary mechanism
   to allow the MD Type to change along an SFP are outside the scope of
   this document, and would need to be defined for that functionality to
[
O] document, and
[P] document and
[R] grammar

Section 2.4:
As SF or SFC Proxy -> A SF or SFC Proxy

Section 2.5.1:
"the SFC-aware SF MUST NOT process the packet and MUST log at least
once per the SPI for which the mandatory metadata is missing." --
feels like you are missing some words after "log" -- presumably "must
log an error", or "this event", or something.

Section 6.1. SFFs and Overlay Selection
This indirection -- SPI to overlay -- creates a true service plane.
   That is, the SFF/SF topology is constructed without impacting the
   network topology but more importantly, service plane only

[O] topology but more importantly,
[P] topology; but more importantly,
[R] grammar

This can be via the overlay or
   underlay and in some case require additional configuration on the SF.

[O] some case
[P] some cases
[R] I think this is what was intended

Section 6.4. Service Graphs
These classifiers may also of course

[O] may also of course
[P] may, of course, also
[R] readability

   modify the metadata associated with the packet.

Section 7.1. NSH Metadata and Policy Enforcement

   As described in Section 2, NSH provides the ability to carry metadata
   along a service path.  This metadata may be derived from several
   sources, common examples include:

[O] sources, common
[P] sources. Common
[R] grammar

      Network nodes/devices: Information provided by network nodes can
      indicate network-centric information (such as VRF or tenant) that
      may be used by service functions, or conveyed to another network

[O] functions, or conveyed
[P] functions or conveyed
[R] grammar

 The granularity of classification may vary.  For
   example, a network switch, acting as a classifier, might only be able
   to classify based on a 5-tuple, whereas, a service function may be

[O] whereas, a service
[P] while a service
[R] readability

In both of the examples above, the service functions perform policy
   decisions based on the result of the initial classification: the SFs
   did not need to perform re-classification, rather they rely on a

[O] re-classification, rather
[P] re-classification; instead,
[R] grammar/readability

   antecedent classification for local policy enforcement.

   Depending on the information carried in the metadata, data privacy
   considerations may need to be considered.  For example, if the

This feels like its punting on the problem more than it should, but
I'm avoiding privacy and security in this particular review.

   metadata conveys tenant information, that information may need to be
   authenticated and/or encrypted between the originator and the
   intended recipients (which may include intended SFs only) .  NSH

[O] ) .
[P] ).
[R] punctuation

   itself does not provide privacy functions, rather it relies on the
   transport/overlay layer.  An operator can select the appropriate
   transport to ensure confidentially (and other security)

[O] confidentially
[P] confidentiality
[R] word choice

Section 8. Security Considerations

 NSH is always encapsulated in a transport protocol (as detailed in
   Section 4 of this specification) and therefore, when required,

[O] ) and therefore,
[P] ); and, therefore,
[R] grammar

W

-- 
I don't think the execution is relevant when it was obviously a bad
idea in the first place.
This is like putting rabid weasels in your pants, and later expressing
regret at having chosen those particular rabid weasels and that pair
of pants.
   ---maf

[sfc] Last Call: <draft-ietf-sfc-nsh-18.txt> (Net… The IESG
Re: [sfc] Last Call: <draft-ietf-sfc-nsh-18.txt> … Warren Kumari
Re: [sfc] Last Call: <draft-ietf-sfc-nsh-18.txt> … Carlos Pignataro (cpignata)
Re: [sfc] Last Call: <draft-ietf-sfc-nsh-18.txt> … Warren Kumari
Re: [sfc] Last Call: <draft-ietf-sfc-nsh-18.txt> … Carlos Pignataro (cpignata)