[nvo3] Benjamin Kaduk's Discuss on draft-ietf-nvo3-geneve-14: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 05 December 2019 00:36 UTC

MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-nvo3-geneve@ietf.org, Matthew Bocci <matthew.bocci@nokia.com>, nvo3-chairs@ietf.org, matthew.bocci@nokia.com, nvo3@ietf.org
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <157550620414.11231.17700978164671458642.idtracker@ietfa.amsl.com>
Date: Wed, 04 Dec 2019 16:36:44 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/nvo3/vFyp55feuhCxjl3IQyU6UnG_VkE>
Subject: [nvo3] Benjamin Kaduk's Discuss on draft-ietf-nvo3-geneve-14: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-nvo3-geneve-14: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-nvo3-geneve/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

This first point is a "discuss discuss" for which I'd like to get a
sense of what the rest of the IESG feels.  I've read the discussion at
https://mailarchive.ietf.org/arch/msg/last-call/ywRKREnxWAlunHR7MSaTM4ScsDs
but I'm left with a sense of uncertainty similar to Daniel's as to
whether the question is fully resolved.  Specifically, "the question"
that I have in mind is to what extent the Geneve architecture includes
support for middleboxes that inspect (but do not modify!) the Geneve
header and inner payload, to what extent the Geneve architecture is
intended to be applicable to scenarios where (end-to-end per-tunnel)
underlay confidentiality protection is necessary, and whether those two
requirements are each strong enough that, taken together, they amount to
an internal inconsistency of requirements/applicability.  "Interposing advanced
middleboxes" and "service interposition" are conceived as possible uses
for Geneve metadata in Sections 1 and 2.2 as a consideration for why
structured tagging is needed on the data plane and not just the control
plane, which to me suggests that such usage is considered a first-class
use case for Geneve.  Section 6.1.1 discusses encryption for traffic
traversing untrusted links between geographically separated data
centers (though perhaps in this case an encrypted tunnel would be used
just for that untrusted transit, leaving the in-datacenter traffic
visible to middleboxes), but Section 6.1 discusses cases where the tenant
may expect the service provider to provide confidentiality as part of
the service.  Would this be above or below the Geneve encapsulation?
Might some customers insist on one or the other?  The consideration from
Section 6.1 that the provider of the underlay and the provider of the
overlay may not be the same could be taken to imply that the overlay
provider itself wants (cryptographic) protection from the underlay
provider.  I don't have a clear picture of how these considerations
interact.  (I also note that, since DTLS is mentioned, DTLS 1.3 is going
the way of TLS 1.3 and not defining any authentication-only
ciphersuites, so if authentication-only service is desired, DTLS may not
be the way of the future, leaving IPsec AH as the leading candidate.)

Some other section-by-section discuss-level points follow, mostly
self-contained/localized issues.

Section 3.5.1

   o  Some options may be defined in such a way that the position in the
      option list is significant.  Options MUST NOT be changed by
      transit devices.

   o  An option SHOULD NOT be dependent upon any other option in the
      packet, i.e., options can be processed independently of one
      another.  [...]

As was already noted, I don't see how these two requirements are
self-consistent.

   size.  A particular option is specified to have either a fixed
   length, which is constant, or a variable length, which may change
   over time or for different use cases.  This property is part of the
   definition of the option and conveyed by the 'Type'.  For fixed

This text is written as if this specification is going to specify
further substructure for the "Type", with respect to certain types that
have fixed length and others that may vary.  Otherwise the property
would be attached to the option value and not the type value, in my
understanding.  With the current way the registry is laid out it seems
like we need to explicitly say that the entity allocating the option
class value needs to specify the interpretation of the 'type' field when
used with that option class.

Section 4.3.1

   2.  If Geneve is used with zero UDP checksum over IPv6 then such
       tunnel endpoint implementation MUST meet all the requirements
       specified in section 4 of [RFC6936] and requirements 1 as
       specified in section 5 of [RFC6936].

This seems to implicitly be saying that the other numbered requirements
in Section 5 of RFC 6936 can be ignored, which is updating the behavior
of a standards-track document.  We need to either be explicit about the
update or justify why (the rest of) that applicability statement is not
applicable here.  If, as the paragraph following the enumerated list
says, the requirements specified in RFC 6936 continue to apply in full,
why do we need to call out a MUST-level requirement here?

   4.  The Geneve tunnel endpoint that encapsulates the tunnel MAY use
       different IPv6 source addresses for each Geneve tunnel that uses
       Zero UDP checksum mode in order to strengthen the decapsulator's
       check of the IPv6 source address (i.e the same IPv6 source
       address is not to be used with more than one IPv6 destination
       address, irrespective of whether that destination address is a
       unicast or multicast address).  When this is not possible, it is
       RECOMMENDED to use each source address for as few Geneve tunnels
       that use zero UDP checksum as is feasible.

This functionality is not usable without some mechanism to signal from
encapsulator to decapsulator that it is in use.

   The requirement to check the source IPv6 address in addition to the
   destination IPv6 address, [...]

I do not see this specified as a requirement, only a MAY-level
suggestion.

Section 4.6

   o  When performing LSO, a NIC MUST replicate the entire Geneve header
      and all options, including those unknown to the device, onto each
      resulting segment.  However, a given option definition may
      override this rule and specify different behavior in supporting
      devices.  [...]

This second sentence makes the MUST in the first no longer a MUST.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 2.2.1

   recipient.  As new functionality becomes sufficiently well defined to
   add to tunnel endpoints, supporting options can be designed using
   ordering restrictions and other techniques to ease parsing.

I'm having trouble parsing the second half of this sentence -- what does
"supporting options" mean as a noun?

   Further, either tunnel endpoints or transit devices MAY use offload
   capabilities of NICs such as checksum offload to improve the
   performance of Geneve packet processing.  The presence of a Geneve
   variable length header SHOULD NOT prevent the tunnel endpoints and
   transit devices from using such offload capabilities.

I agree with the directorate reviewer that this implementation guidance
is unenforceable as normative keywords.

Section 3.1, 3.2

If we're going to give concrete values for the IPv4 protocol/IPv6
NextHeader (17) and destination port (6081), shouldn't we also use the
concrete value for Geneve protocol type (0x6558) that corresponds to the
inner Ethernet frame?
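
For illustration, here is a minimal sketch of the three concrete values
mentioned above sitting side by side.  It assumes the 8-byte base header
layout from Section 3.4 of the draft (2-bit Ver, 6-bit Opt Len, O and C
flags, 16-bit Protocol Type, 24-bit VNI, 8 reserved bits).  The function
and constant names are hypothetical and do not come from the draft.

```python
import struct

IP_PROTO_UDP = 17        # IPv4 protocol / IPv6 NextHeader value quoted above
GENEVE_UDP_PORT = 6081   # UDP destination port quoted above
PROTO_TYPE_TEB = 0x6558  # protocol type for an inner Ethernet frame


def build_geneve_base_header(vni, opt_len_words=0, oam=False,
                             critical=False, protocol_type=PROTO_TYPE_TEB):
    """Pack the 8-byte Geneve base header (hypothetical helper)."""
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    if not 0 <= opt_len_words < (1 << 6):
        raise ValueError("Opt Len must fit in 6 bits")
    byte0 = (0 << 6) | opt_len_words  # Ver = 0, then Opt Len in 4-byte units
    byte1 = (0x80 if oam else 0) | (0x40 if critical else 0)  # O and C flags
    # The VNI occupies the top 24 bits of the last word; the low 8 bits
    # are reserved and sent as zero.
    return struct.pack("!BBHI", byte0, byte1, protocol_type, vni << 8)
```

With no options and VNI 0x123456 this yields the bytes
00 00 65 58 12 34 56 00, which is the sort of fully concrete example the
figures could carry.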

I'd also suggest some visual distinction that the "Variable Length
Options" do in fact have variable length, perhaps using the '~'
character in vertical lines.
Similarly, the original Ethernet payload need not be 4-byte-aligned and
the figure could make that more prominent.

It's a little awkward to expand FCS on second usage, not first usage.

Section 3.4

      The critical bit allows hardware implementations the flexibility
      to handle options processing in the hardware fastpath or in the
      exception (slow) path without the need to process all the options.
      For example, a critical option such as secure hash to provide
      Geneve header integrity check must be processed by tunnel
      endpoints and typically processed in the hardware fastpath.

I think I'm failing to make a connection between some of these steps.
How does having a critical bit let a header integrity check happen in
the hardware fastpath while deferring other option processing to
software?

   Transit devices MUST maintain consistent forwarding behavior
   irrespective of the value of 'Opt Len', including ECMP link
   selection.  These devices SHOULD be able to forward packets
   containing options without resorting to a slow path.

There seem to be two broad aspects in play here.  First, requiring
insensitivity to "Opt Len" might be because the value would change as a
packet traverses the network.  I think this is forbidden by virtue of
transit devices not being allowed to add/delete options, but please
confirm.  Second, this affects the ability of transit devices to look
past the Geneve header to the inner Ethernet header and payload.  Given
the substantial discussion we've had in the broader IETF about IPv6
extension headers and the inability of hardware to examine such
variable-length chains to get to the actual upper layer protocol (with
the result that extension headers are largely unusable on substantial
portions of the internet), it seems like we might conclude from this
statement that either we expect transit devices to not inspect the
upper-layer content or there's a significant chance that this
requirement will be ignored (possibly just by capping the 'Opt Len'
value that is supported), or both.  What makes this setup different from
IPv6 EH such that we expect hardware compliance and a usable deployment?
This is particularly poignant given that we claim this to be a
requirement on transit devices but allow (in Section 4.5) for endpoints
to use profiles that have a restricted maximum length for the options.
If such profiles are common, the incentive for transit devices to slip
and use the lower maximum length increases.
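
To make the variable-offset concern concrete, the following is a small
sketch of what a device has to compute before it can look at the inner
frame.  It assumes the base header layout from Section 3.4 of the draft,
where 'Opt Len' is the low 6 bits of the first byte and counts 4-byte
units.  The helper name is hypothetical.

```python
GENEVE_BASE_LEN = 8  # fixed part of the Geneve header, in bytes


def inner_frame_offset(geneve: bytes) -> int:
    """Return the byte offset of the inner frame within a Geneve packet.

    The input starts at the first byte of the Geneve header, i.e. just
    after the outer UDP header.
    """
    if len(geneve) < GENEVE_BASE_LEN:
        raise ValueError("truncated Geneve base header")
    opt_len_words = geneve[0] & 0x3F  # low 6 bits of byte 0
    return GENEVE_BASE_LEN + 4 * opt_len_words
```

The offset therefore ranges from 8 bytes (no options) to 260 bytes
(Opt Len = 63), so a device with a fixed parsing window shorter than that
has to either stop at the Geneve header or treat longer option lists
differently.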

Section 3.5

      The high order bit of the option type indicates that this is a
      critical option.  If the receiving tunnel endpoint does not
      recognize this option and this bit is set then the packet MUST be
      dropped.  If the 'C' bit (critical bit) is set in any option then
      the 'C' bit in the Geneve base header MUST also be set.  Transit
      devices MUST NOT drop packets on the basis of this bit.  The

nit: since we mention the Geneve header, one might claim that "this bit"
in "MUST NOT drop packets on the basis of this bit" is ambiguous (but
since we said this before for the Geneve header one, I assume we're
talking about the one in the Type field now).
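
As an illustration of the two distinct 'C' bits in play, here is a rough
sketch that walks the option list and checks the one rule quoted above:
if any option has the high-order bit of its Type set, the 'C' bit in the
base header must also be set.  It assumes the layouts in Sections 3.4 and
3.5 of the draft (a 4-byte option header with a 16-bit Option Class, an
8-bit Type, and a 5-bit Length in 4-byte units).  The function name is
hypothetical, and the converse direction is deliberately not checked.

```python
def critical_bits_consistent(geneve: bytes) -> bool:
    """True unless an option is marked critical while the base 'C' bit is clear."""
    if len(geneve) < 8:
        raise ValueError("truncated Geneve base header")
    opt_bytes = (geneve[0] & 0x3F) * 4   # Opt Len, converted to bytes
    base_c = bool(geneve[1] & 0x40)      # 'C' bit in the base header
    options = geneve[8:8 + opt_bytes]
    if len(options) != opt_bytes:
        raise ValueError("truncated option list")

    any_critical = False
    offset = 0
    while offset < len(options):
        opt_type = options[offset + 2]                  # byte after Option Class
        data_len = (options[offset + 3] & 0x1F) * 4     # 5-bit Length field
        if opt_type & 0x80:                             # per-option critical bit
            any_critical = True
        offset += 4 + data_len
    if offset != len(options):
        raise ValueError("option lengths overrun Opt Len")
    return base_c or not any_critical
```

A transit device would not act on either bit; the check is only
meaningful at a tunnel endpoint.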

Section 4.4.1

   It is strongly RECOMMENDED that Path MTU Discovery ([RFC1191],
   [RFC8201]) be used by setting the DF bit in the IP header when Geneve
   packets are transmitted over IPv4 (this is the default with IPv6).

Is it the default or the only specified behavior for IPv6?

Section 4.4.3

   outside of the scope of this document.  When physical multicast is in
   use, the 'C' bit in the Geneve header may be used with groups of
   devices with heterogeneous capabilities as each device can interpret
   only the options that are significant to it if they are not critical.

Please double-check this sentence, particularly the "may be used".  If
the intent is, as written, to note that packets with the 'C' bit set
might reach devices with heterogeneous capabilities, I suggest being
more explicit about the consequence that the traffic might only be
delivered to some but not all endpoints.

Section 6

   untrusted boundaries.  In addition, tunnel endpoints should only be
   operated in environments controlled by the service provider, such as
   the hypervisor itself rather than within a customer VM.

Can you say a bit more about how this "should only be operated in
environments controlled by the service provider" meshes with the note in
Section 4.1 that "[i]t is intended for use in public or private data
center environments" (specifically the "public data center" portion) and
the note in Section 6.1 that the provider of the overlay may not be the
same as the provider of the underlay?

Section 6.1.1

   traversing public networks.  Any Geneve overlay data leaving the data
   center network beyond the operator's security domain SHOULD be
   secured by encryption mechanisms such as IPsec or other VPN
   mechanisms to protect the communications between the NVEs when they
   are geographically separated over untrusted network links.

Since we use "mechanisms" in both the IPsec clause and the "other VPN"
clause, the "encryption" does not automatically bind to both clauses
from a grammatical perspective.  Given that "VPN" is currently in use
for both encrypted and non-encrypted schemes (much to my chagrin),
please clarify that the other VPN mechanisms also need to provide
cryptographic confidentiality protection.  (Replacing "VPN mechanisms"
with "VPN technologies" would probably suffice.)

Section 6.2

   network.  To prevent such attacks, an NVE MUST NOT propagate Geneve
   packets beyond the NVE to tenant systems and SHOULD employ packet

We also care about not propagating Geneve packets from the tenant
systems past the NVE, right?

   filtering mechanisms so as not to forward unauthorized traffic
   between TSs in different tenant networks.

What does "TS" stand for, here?

Section 10.2

RFCs 1191, 2460 (er, 8200), 6040, and 8201 should be listed as normative
references.

   [ETYPES]   The IEEE Registration Authority, "IEEE 802 Numbers", 2013,
              <http://www.iana.org/assignments/ieee-802-numbers/ieee-
              802-numbers.xml>.

Hmm, Firefox claims the content of this resource is invalid XML, sigh.
