Re: [nvo3] Benjamin Kaduk's Discuss on draft-ietf-nvo3-geneve-14: (with DISCUSS and COMMENT)

"Ganga, Ilango S" <ilango.s.ganga@intel.com> Thu, 12 December 2019 20:00 UTC

From: "Ganga, Ilango S" <ilango.s.ganga@intel.com>
To: Benjamin Kaduk <kaduk@mit.edu>, The IESG <iesg@ietf.org>
CC: "draft-ietf-nvo3-geneve@ietf.org" <draft-ietf-nvo3-geneve@ietf.org>, Matthew Bocci <matthew.bocci@nokia.com>, "nvo3-chairs@ietf.org" <nvo3-chairs@ietf.org>, "nvo3@ietf.org" <nvo3@ietf.org>
Date: Thu, 12 Dec 2019 20:00:04 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/nvo3/G37hH5brjYzYPQLHAfUr54_-Fwg>

Hello Benjamin,

Thanks for your review and comments. Please see below for our responses to some of your comments, enclosed in <Response> </Response>. Let us know if you are satisfied with this resolution.
We will send responses to your other comments separately. 

Thanks,
Ilango Ganga
Geneve Editor

From: Benjamin Kaduk via Datatracker <noreply@ietf.org> 
Sent: Wednesday, December 4, 2019 4:37 PM
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-nvo3-geneve@ietf.org; Matthew Bocci <matthew.bocci@nokia.com>; nvo3-chairs@ietf.org; matthew.bocci@nokia.com; nvo3@ietf.org
Subject: Benjamin Kaduk's Discuss on draft-ietf-nvo3-geneve-14: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-nvo3-geneve-14: Discuss

1. Security and transit devices comments:

Benjamin> “to what extent the Geneve architecture includes support for middleboxes that inspect (but do not modify!) the Geneve header and inner payload”

IG> <Response> 1.   It looks like you are interpreting “middlebox” as a transit device, but middlebox has a broader meaning that includes service functions (appliances). To clarify, transit devices are forwarding elements such as switches and routers (see Section 1.2 for the definition). We reiterate that only NVEs can generate and terminate Geneve headers; transit devices can only interpret Geneve headers, and doing so is optional. A transit device that does not interpret, or is not able to interpret, options can forward the packets like any other UDP packet, so end-to-end forwarding of Geneve packets is not affected. As an example use case, a transit device (such as a switch or router) may interpret information in Geneve headers to perform ECMP forwarding. If a transit device is not able to interpret the Geneve header, it will still forward based on outer UDP header information.
</Response>
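
To illustrate the last point of Response 1, here is a minimal sketch of ECMP next-hop selection that looks only at the outer IP/UDP header, so it behaves the same whether or not the device understands Geneve. The function name, the CRC32 hash, and the next-hop list are illustrative assumptions, not anything specified by the draft.

```python
import socket
import struct
import zlib

UDP_PROTOCOL = 17  # outer IP protocol number for UDP


def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, next_hops):
    """Pick a next hop from the outer 5-tuple only (hypothetical helper).

    Nothing past the outer UDP header is read, so Geneve options
    (or an encrypted payload) cannot influence the choice.
    """
    key = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BHH", UDP_PROTOCOL, src_port, dst_port)
    )
    # CRC32 is an arbitrary stand-in for whatever hash the hardware uses.
    return next_hops[zlib.crc32(key) % len(next_hops)]
```

Packets of one flow share the same outer 5-tuple and therefore always map to the same next hop; any per-flow entropy has to be placed in the outer header (typically the UDP source port) by the encapsulating NVE.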

Benjamin> “to what extent the Geneve architecture is intended to be applicable to scenarios where (end-to-end per-tunnel) underlay confidentiality protection is necessary”

IG> <Response> 2.   NVE-to-NVE end-to-end encryption (underlay confidentiality) is prevalent in multi-tenant data center use cases (see Section 6.1). In these cases a transit device will still forward Geneve packets like any other UDP/IP packets, so encryption does not affect the forwarding operation of a transit device.
</Response>

Benjamin> "Interposing advanced middleboxes" and "service interposition" are conceived as possible uses for Geneve metadata in Sections 1 and 2.2 as a consideration for why structured tagging is needed on the data plane and not just the control plane, which to me suggests that such usage is considered a first-class use case for Geneve.

IG> <Response> 3.   In the “service interposition” use case noted above, the middlebox referred to is a network service function, not a transit device. The service would implement an NVE function to generate and terminate Geneve headers, which is consistent with the Geneve specification. If data confidentiality is required by the data center operator, it is provided by NVE-to-NVE encryption, as expected by the Geneve/NVO3 architecture.
</Response>

Benjamin> “Section 6.1.1 discusses encryption for traffic traversing untrusted links between geographically separated data centers (though perhaps in this case an encrypted tunnel would be used just for that untrusted transit and leaving the in-datacenter traffic visible to middleboxes), but Section 6.1 discusses cases where the tenant may expect the service provider to provide confidentiality as part of the service.  Would this be above or below the Geneve encapsulation?”

IG> <Response> 4.   In the scenario described in Section 6.1.1 (geographically separated data centers), the Geneve packets will be transported over an encrypted service (e.g., a VPN service).
5.   In the scenario described in Section 6.1 (multi-tenancy), encryption may be used either at the Geneve payload level or at the IP/UDP level; in the latter case the Geneve header will not be visible to transit devices.
</Response>

Benjamin> "The consideration from Section 6.1 that the provider of the underlay and the provider of the overlay may not be the same could be taken to imply that the overlay provider itself wants (cryptographic) protection from the underlay provider.  I don't have a clear picture of how these considerations interact.  (I also note that, since DTLS is mentioned, DTLS 1.3 is going the way of TLS 1.3 and not defining any authentication-only ciphersuites, so if authentication-only service is desired, DTLS may not be the way of the future, leaving IPsec AH as the leading candidate.)"

IG> <Response> 6.   We expect that the overlay network provider will provide the cryptographic protection with the entire payload being encrypted. 
</Response>

2. UDP Checksum with IPv6 comment:

Benjamin> Section 4.3.1
   2.  If Geneve is used with zero UDP checksum over IPv6 then such
       tunnel endpoint implementation MUST meet all the requirements
       specified in section 4 of [RFC6936] and requirements 1 as
       specified in section 5 of [RFC6936].

"This seems to implicitly be saying that the other numbered requirements in Section 5 of RFC 6936 can be ignored, which is updating the behavior of a standards-track document.  We need to either be explicit about the update or justify why (the rest of) that applicability statement is not applicable here.  If, as the paragraph following the enumerated list says, the requirements specified in RFC 6936 continue to apply in full, why do we need to call out a MUST-level requirement here?"

IG> <Response> 7.  Requirement 1 is a consideration for Geneve implementers, which is why it is highlighted. The others are design considerations for the protocol itself, which have already been incorporated or are not applicable to Geneve. The text does not imply that the other requirements should be ignored; we have tried to indicate this with the statement that all requirements continue to apply. This text is the resolution of the TSVART early review (David Black) and is very similar to the text in RFC 8086 (and RFC 7510).
</Response>

</End of Responses>

-----Original Message-----
From: Benjamin Kaduk via Datatracker <noreply@ietf.org> 
Sent: Wednesday, December 4, 2019 4:37 PM
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-nvo3-geneve@ietf.org; Matthew Bocci <matthew.bocci@nokia.com>; nvo3-chairs@ietf.org; matthew.bocci@nokia.com; nvo3@ietf.org
Subject: Benjamin Kaduk's Discuss on draft-ietf-nvo3-geneve-14: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-nvo3-geneve-14: Discuss

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-nvo3-geneve/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

This first point is a "discuss discuss" for which I'd like to get a sense of what the rest of the IESG feels.  I've read the discussion at https://mailarchive.ietf.org/arch/msg/last-call/ywRKREnxWAlunHR7MSaTM4ScsDs but I'm left with a similar sense of uncertainty that Daniel has as to whether the question is fully resolved.  Specifically, "the question" that I have in mind is to what extent the Geneve architecture includes support for middleboxes that inspect (but do not modify!) the Geneve header and inner payload, to what extent the Geneve architecture is intended to be applicable to scenarios where (end-to-end per-tunnel) underlay confidentiality protection is necessary, and whether those requirements are both strong enough to be deemed an internal inconsistency of requirements/applicability.  "Interposing advanced middleboxes" and "service interposition" are conceived as possible uses for Geneve metadata in Sections 1 and 2.2 as a consideration for why structured tagging is needed on the data plane and not just the control plane, which to me suggests that such usage is considered a first-class use case for Geneve.  Section 6.1.1 discusses encryption for traffic traversing untrusted links between geographically separated data centers (though perhaps in this case an encrypted tunnel would be used just for that untrusted transit and leaving the in-datacenter traffic visible to middleboxes), but Section 6.1 discusses cases where the tenant may expect the service provider to provide confidentiality as part of the service.  Would this be above or below the Geneve encapsulation?  Might some customers insist on one or the other?  The consideration from Section 6.1 that the provider of the underlay and the provider of the overlay may not be the same could be taken to imply that the overlay provider itself wants (cryptographic) protection from the underlay provider.  I don't have a clear picture of how these considerations interact.  (I also note that, since DTLS is mentioned, DTLS 1.3 is going the way of TLS 1.3 and not defining any authentication-only ciphersuites, so if authentication-only service is desired, DTLS may not be the way of the future, leaving IPsec AH as the leading candidate.)

Some other section-by-section discuss-level points follow, mostly self-contained/localized issues.

Section 3.5.1

   o  Some options may be defined in such a way that the position in the
      option list is significant.  Options MUST NOT be changed by
      transit devices.

   o  An option SHOULD NOT be dependent upon any other option in the
      packet, i.e., options can be processed independently of one
      another.  [...]

As was already noted, I don't see how these two requirements are self-consistent.

   size.  A particular option is specified to have either a fixed
   length, which is constant, or a variable length, which may change
   over time or for different use cases.  This property is part of the
   definition of the option and conveyed by the 'Type'.  For fixed

This text is written as if this specification is going to specify further substructure for the "Type", with respect to certain types that have fixed length and others that may vary.  Otherwise the property would be attached to the option value and not the type value, in my understanding.  With the current way the registry is laid out it seems like we need to explicitly say that the entity allocating the option class value needs to specify the interpretation of the 'type' field when used with that option class.

Section 4.3.1

   2.  If Geneve is used with zero UDP checksum over IPv6 then such
       tunnel endpoint implementation MUST meet all the requirements
       specified in section 4 of [RFC6936] and requirements 1 as
       specified in section 5 of [RFC6936].

This seems to implicitly be saying that the other numbered requirements in Section 5 of RFC 6936 can be ignored, which is updating the behavior of a standards-track document.  We need to either be explicit about the update or justify why (the rest of) that applicability statement is not applicable here.  If, as the paragraph following the enumerated list says, the requirements specified in RFC 6936 continue to apply in full, why do we need to call out a MUST-level requirement here?

   4.  The Geneve tunnel endpoint that encapsulates the tunnel MAY use
       different IPv6 source addresses for each Geneve tunnel that uses
       Zero UDP checksum mode in order to strengthen the decapsulator's
       check of the IPv6 source address (i.e the same IPv6 source
       address is not to be used with more than one IPv6 destination
       address, irrespective of whether that destination address is a
       unicast or multicast address).  When this is not possible, it is
       RECOMMENDED to use each source address for as few Geneve tunnels
       that use zero UDP checksum as is feasible.

This functionality is not usable without some mechanism to signal from encapsulator to decapsulator that it is in use.

   The requirement to check the source IPv6 address in addition to the
   destination IPv6 address, [...]

I do not see this specified as a requirement, only a MAY-level suggestion.

Section 4.6

   o  When performing LSO, a NIC MUST replicate the entire Geneve header
      and all options, including those unknown to the device, onto each
      resulting segment.  However, a given option definition may
      override this rule and specify different behavior in supporting
      devices.  [...]

This second sentence makes the MUST in the first no longer a MUST.
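
For reference, the default rule in the first quoted sentence (copy the whole Geneve header, known and unknown options alike, onto every segment) can be sketched roughly as below. This is a simplified illustration only: the function and parameter names are hypothetical, and a real NIC would also rewrite lengths, sequence numbers, and checksums per segment, which is omitted here.

```python
def segment_with_geneve(outer_headers, geneve_header, inner_headers, payload, mss):
    """Split `payload` into chunks of at most `mss` bytes (hypothetical helper).

    `geneve_header` is the base header plus all options as raw bytes.
    It is copied verbatim onto each segment, so options the device does
    not understand are replicated too.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    prefix = outer_headers + geneve_header + inner_headers
    return [prefix + payload[i:i + mss] for i in range(0, len(payload), mss)]
```

An option whose definition overrides this rule would need option-specific handling instead of the blind copy, which is exactly the tension with the MUST noted above.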


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 2.2.1

   recipient.  As new functionality becomes sufficiently well defined to
   add to tunnel endpoints, supporting options can be designed using
   ordering restrictions and other techniques to ease parsing.

I'm having trouble parsing the second half of this sentence -- what does "supporting options" mean as a noun?

   Further, either tunnel endpoints or transit devices MAY use offload
   capabilities of NICs such as checksum offload to improve the
   performance of Geneve packet processing.  The presence of a Geneve
   variable length header SHOULD NOT prevent the tunnel endpoints and
   transit devices from using such offload capabilities.

I agree with the directorate reviewer that this implementation guidance is unenforceable as normative keywords.

Section 3.1, 3.2

If we're going to give concrete values for the IPv4 protocol/IPv6 NextHeader (17) and destination port (6081), shouldn't we also use the concrete value for Geneve protocol type (0x6558) that corresponds to the inner Ethernet frame?

I'd also suggest some visual distinction that the "Variable Length Options" do in fact have variable length, perhaps using the '~' character in vertical lines.  Similarly, the original Ethernet payload need not be 4-byte-aligned and the figure could make that more prominent.

It's a little awkward to expand FCS on second usage, not first usage.
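
As a rough illustration of how the concrete values mentioned in these comments (UDP destination port 6081, protocol type 0x6558) fit together, the sketch below packs a Geneve base header. The field layout (2-bit version, 6-bit Opt Len in 4-byte units, O and C flag bits, 16-bit protocol type, 24-bit VNI, 8 reserved bits) is assumed from the draft's header figure, and the function and constant names are hypothetical.

```python
import struct

GENEVE_UDP_PORT = 6081                 # destination port in the outer UDP header
PROTO_TRANS_ETHER_BRIDGING = 0x6558    # inner payload is an Ethernet frame


def pack_geneve_base_header(vni, options=b"", oam=False, critical=False,
                            protocol_type=PROTO_TRANS_ETHER_BRIDGING):
    """Return the 8-byte base header followed by the raw options."""
    if len(options) % 4 != 0:
        raise ValueError("options must be a multiple of 4 bytes")
    opt_len = len(options) // 4        # Opt Len is counted in 4-byte units
    if opt_len > 0x3F:
        raise ValueError("options exceed the 6-bit Opt Len field")
    if not 0 <= vni < (1 << 24):
        raise ValueError("VNI must fit in 24 bits")
    byte0 = (0 << 6) | opt_len         # version 0 in the top two bits
    byte1 = (int(oam) << 7) | (int(critical) << 6)
    # VNI occupies the top 24 bits of the last word; the low 8 bits are reserved.
    return struct.pack("!BBHI", byte0, byte1, protocol_type, vni << 8) + options
```

With no options, `pack_geneve_base_header(0x123456)` yields the 8 bytes `00 00 65 58 12 34 56 00`.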

Section 3.4

      The critical bit allows hardware implementations the flexibility
      to handle options processing in the hardware fastpath or in the
      exception (slow) path without the need to process all the options.
      For example, a critical option such as secure hash to provide
      Geneve header integrity check must be processed by tunnel
      endpoints and typically processed in the hardware fastpath.

I think I'm failing to make a connection between some of these steps.
How does having a critical bit let a header integrity check happen in the hardware fastpath while deferring other option processing to software?

   Transit devices MUST maintain consistent forwarding behavior
   irrespective of the value of 'Opt Len', including ECMP link
   selection.  These devices SHOULD be able to forward packets
   containing options without resorting to a slow path.

There seem to be two broad aspects in play here.  First, requiring insensitivity to "Opt Len" might be because the value would change as a packet traverses the network.  I think this is forbidden by virtue of transit devices not being allowed to add/delete options, but please confirm.  Second, this affects the ability of transit devices to look past the Geneve header to the inner Ethernet header and payload.  Given the substantial discussion we've had in the broader IETF about IPv6 extension headers and the inability of hardware to examine such variable-length chains to get to the actual upper layer protocol (with the result that extension headers are largely unusable on substantial portions of the internet), it seems like we might conclude from this statement that either we expect transit devices to not inspect the upper-layer content or there's a significant chance that this requirement will be ignored (possibly just by capping the 'Opt Len' value that is supported), or both.  What makes this setup different from IPv6 EH such that we expect hardware compliance and a usable deployment?  This is particularly poignant given that we claim this to be a requirement on transit devices but allow (in Section 4.5) for endpoints to use profiles that have a restricted maximum length for the options.  If such profiles are common, the incentive for transit devices to slip and use the lower maximum length increases.

Section 3.5

      The high order bit of the option type indicates that this is a
      critical option.  If the receiving tunnel endpoint does not
      recognize this option and this bit is set then the packet MUST be
      dropped.  If the 'C' bit (critical bit) is set in any option then
      the 'C' bit in the Geneve base header MUST also be set.  Transit
      devices MUST NOT drop packets on the basis of this bit.  The

nit: since we mention the Geneve header, one might claim that "this bit" in "MUST NOT drop packets on the basis of this bit" is ambiguous (but since we said this before for the Geneve header one, I assume we're talking about the one in the Type field now).
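
To make the per-option critical bit concrete, here is a minimal sketch of how a receiving tunnel endpoint might apply the drop rule quoted above. The option layout (16-bit option class, 8-bit type whose high-order bit is the 'C' bit, 3 reserved bits, 5-bit length in 4-byte units) is assumed from the draft, and the function names and the `known` set are hypothetical.

```python
import struct


def parse_options(option_bytes):
    """Return a list of (option_class, option_type, critical, data) tuples."""
    options = []
    offset = 0
    while offset < len(option_bytes):
        if offset + 4 > len(option_bytes):
            raise ValueError("truncated option header")
        opt_class, opt_type, rsvd_len = struct.unpack_from("!HBB", option_bytes, offset)
        data_len = (rsvd_len & 0x1F) * 4     # low 5 bits, counted in 4-byte units
        end = offset + 4 + data_len
        if end > len(option_bytes):
            raise ValueError("truncated option data")
        critical = bool(opt_type & 0x80)     # high-order bit of the type field
        options.append((opt_class, opt_type, critical, option_bytes[offset + 4:end]))
        offset = end
    return options


def endpoint_should_drop(option_bytes, known):
    """True if any critical option is not in `known`, a set of (class, type) pairs."""
    return any(
        critical and (opt_class, opt_type) not in known
        for opt_class, opt_type, critical, _ in parse_options(option_bytes)
    )
```

This check applies to tunnel endpoints only; per the quoted text, a transit device must not drop packets on the basis of this bit.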

Section 4.4.1

   It is strongly RECOMMENDED that Path MTU Discovery ([RFC1191],
   [RFC8201]) be used by setting the DF bit in the IP header when Geneve
   packets are transmitted over IPv4 (this is the default with IPv6).

Is it the default or the only specified behavior for IPv6?

Section 4.4.3

   outside of the scope of this document.  When physical multicast is in
   use, the 'C' bit in the Geneve header may be used with groups of
   devices with heterogeneous capabilities as each device can interpret
   only the options that are significant to it if they are not critical.

Please double-check this sentence, particularly the "may be used".  If the intent is, as written, to note that the packets with the 'C' bit set might take heterogeneous paths, I suggest being more explicit about the consequences that the traffic might only be delivered to some but not all endpoints.

Section 6

   untrusted boundaries.  In addition, tunnel endpoints should only be
   operated in environments controlled by the service provider, such as
   the hypervisor itself rather than within a customer VM.

Can you say a bit more about how this "should only be operated in environments controlled by the service provider" meshes with the note in Section 4.1 that "[i]t is intended for use in public or private data center environments" (specifically the "public data center" portion) and the note in Section 6.1 that the provider of the overlay may not be the same as the provider of the underlay?

Section 6.1.1

   traversing public networks.  Any Geneve overlay data leaving the data
   center network beyond the operator's security domain SHOULD be
   secured by encryption mechanisms such as IPsec or other VPN
   mechanisms to protect the communications between the NVEs when they
   are geographically separated over untrusted network links.

Since we use "mechanisms" in both the IPsec clause and the "other VPN" clause, the "encryption" does not automatically bind to both clauses from a grammatical perspective.  Given that "VPN" is currently in use for both encrypted and non-encrypted schemes (much to my chagrin), please clarify that the other VPN mechanisms also need to provide cryptographic confidentiality protection.  (Replacing "VPN mechanisms" with "VPN technologies" would probably suffice.)

Section 6.2

   network.  To prevent such attacks, an NVE MUST NOT propagate Geneve
   packets beyond the NVE to tenant systems and SHOULD employ packet

We also care about not propagating Geneve packets from the tenant systems past the NVE, right?

   filtering mechanisms so as not to forward unauthorized traffic
   between TSs in different tenant networks.

What does "TS" stand for, here?

Section 10.2

RFCs 1191, 2460 (er, 8200), 6040, and 8201 should be listed as normative references.

   [ETYPES]   The IEEE Registration Authority, "IEEE 802 Numbers", 2013,
              <http://www.iana.org/assignments/ieee-802-numbers/ieee-
              802-numbers.xml>.

Hmm, firefox claims the content of this resource is invalid XML, sigh.