Re: [nvo3] Draft Geneve

Tom Herbert <therbert@google.com> Sat, 01 March 2014 22:28 UTC

Date: Sat, 01 Mar 2014 14:28:19 -0800
Message-ID: <CA+mtBx_fGH8SdLScfstpA7u9O39m17Jq0ovxs22tazr0-q9_yQ@mail.gmail.com>
From: Tom Herbert <therbert@google.com>
To: "Anton Ivanov (antivano)" <antivano@cisco.com>
Content-Type: text/plain; charset="UTF-8"
Archived-At: http://mailarchive.ietf.org/arch/msg/nvo3/iDckbj5K_MAuC4cquQdd5fC3LSk
Cc: "nvo3@ietf.org" <nvo3@ietf.org>
Subject: Re: [nvo3] Draft Geneve

On Sat, Mar 1, 2014 at 1:00 PM, Anton Ivanov (antivano)
<antivano@cisco.com> wrote:
> Hi Tom,
>
> Based on your comments you have not followed the discussion. I think it will
> be good if you go through the thread in the archive.
>
This discussion is about geneve, which is what I was commenting on.

> First, we started the discussion by announcing the fact that we have open
> sourced a working static tunnel L2TPv3 implementation as an overlay at vNIC
> level (allowing for off-host switching, direct overlay to physical and
> direct vm to vm overlay). Based on this discussion I will add back (I
> actually removed it as I saw it as unnecessary) the "application-specific
> data between header and payload" feature.
>
> Second, I have been very specific about using _STATIC_ tunnels. What you are
> saying in your mail is mostly invalid in a static tunnel context.
>
> Static tunnels are just PWEs - same as any other encaps.
>
> 1. There is no control plane. RFC 4719 does not apply.
>
> 2. For an example - see
> http://tools.ietf.org/html/draft-mkonstan-l2tpext-keyed-ipv6-tunnel-00 .
> This is one example use case, the limitations specified in it are not
> necessary in most others.
>
> 3. L2TPv3 static tunnels do not specify what the payload is and have no
> in-band information on this. You can have PPP, Ethernet, IP, ZigBee, RFC1149
> or whatever else you may like. If you want a special packet type in the
> static tunnel case it is up to you. L2TPv3 will carry it for you.
>
If you don't carry a protocol type in each packet, doesn't this prevent
network devices that don't have access to tunnel state from parsing
the inner packet? How would you implement LRO in a NIC, deep packet
inspection, or network flow monitoring in that case?
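
To make the question concrete, here is a minimal sketch (Python, for
illustration only) of what a device with no tunnel state can do when the
type travels in the packet. It assumes a basic GRE header with no optional
fields, where the second 16 bits carry the protocol type; the function name
and the small type table are my own and do not come from any draft.

```python
import struct

# A few well-known EtherType values that GRE can carry in its protocol field.
ETHERTYPES = {0x0800: "ipv4", 0x86DD: "ipv6", 0x6558: "ethernet"}


def inner_protocol(gre: bytes) -> str:
    """Classify the inner payload from a basic 4-byte GRE header.

    Assumes no optional GRE fields are present (illustrative only).
    """
    if len(gre) < 4:
        raise ValueError("truncated GRE header")
    # First 16 bits: flags and version. Second 16 bits: protocol type.
    _flags_version, proto = struct.unpack("!HH", gre[:4])
    return ETHERTYPES.get(proto, "unknown")
```

With a static L2TPv3 session there is no such field to read, so the same
device would need the payload type and offset configured out of band.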

> 4. There is no issue with offload in static tunnels as there is nothing
> inband in a static tunnel to signal actual header length. In fact, even the
> cookie size is unknown. If you want an offload you have to specify where the
> offload should start looking. This is no different from any other arbitrary
> case of variable header. So if a NIC or NPU can offload starting from an
> arbitrary offset into the packet (e.g. geneve) it should also be possible to
> make it offload L2TPv3.
>
> 5. I am not surprised with what you saw with GRE. However it does not apply
> here - see 4. You saw that because GRE provides info on what the payload is
> in the header. So an offload implementation is entitled to know where to
> find the payload packet and how to treat it. Protocols that do not provide
> this information in the header (L2TPv3) do not have this problem. With these
> you need to configure explicitly where to look for the packet and how to
> offload.
>
> I am not a VXLAN expert. However, I suspect that most of this is applicable
> there (or can be made to apply) too.
>
> So I am going to repeat what has been said quite a few times - the world
> does not need another encapsulation, the existing one(s) can do the job.
> Please use them.
>
> A
>
>
> On 01/03/14 17:13, Tom Herbert wrote:
>
> On Fri, Feb 28, 2014 at 11:01 PM, Anton Ivanov (antivano)
> <antivano@cisco.com> wrote:
>
> On 01/03/14 01:51, Tom Herbert wrote:
>
> On Fri, Feb 28, 2014 at 3:24 PM, Sam Aldrin <aldrin.ietf@gmail.com> wrote:
>
> Hi all,
>
> I read the draft but have a few questions along the same lines others have asked.
>
> - Is this draft intended for standardizing within NVo3 WG? The status
> indicates it as informational. Also it is good to have it as draft-nvo3....,
> if it is meant for NVo3 WG.
> - I fail to find good reasoning, in the current version of the draft, on why
> design of encap transport header should be closely associated with metadata
> OR closely tied together? Could you add more details to clarify?
>
> The draft alludes to the general need for extensibility, but does not
> provide any example uses, so maybe I can suggest one. We have a real
> use case for an encapsulation protocol with security to allow
> validation of the virtual network identifier. In their current form,
> vxlan and nvgre have no provisions for authenticating or integrity
> checking the vni, and existing mechanisms in the network were not deemed
> robust enough to guarantee integrity of vni and ensure strict
> isolation between tenants. UDP checksum is not sufficient for this. We
> need a mechanism to at least enforce an unpredictable security
> token, or possibly stronger authentication using something like a
> message digest. This is intrinsic to the encapsulation, we cannot
> deploy network virtualization without this security, hence an
> extensible protocol is desirable. Additionally, as the network scales
> and new threats emerge, we may need further extensions to adapt.
> All of this needs to be efficient and amenable to HW performance
> optimizations.
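
As a rough illustration of the two options mentioned above (an unpredictable
token, or a message digest covering the vni), here is a minimal Python
sketch. The key handling, the 3-byte vni encoding, and the function names
are assumptions made for illustration; none of this is defined by vxlan,
nvgre, or any draft.

```python
import hashlib
import hmac


def token_ok(expected: bytes, received: bytes) -> bool:
    """Check an opaque per-tenant token using a constant-time comparison."""
    return hmac.compare_digest(expected, received)


def vni_digest(key: bytes, vni: int, payload: bytes) -> bytes:
    """HMAC-SHA256 over a 24-bit vni plus the payload (hypothetical layout)."""
    return hmac.new(key, vni.to_bytes(3, "big") + payload, hashlib.sha256).digest()


def digest_ok(key: bytes, vni: int, payload: bytes, received: bytes) -> bool:
    """Recompute the digest and compare it with the one carried in the packet."""
    return hmac.compare_digest(vni_digest(key, vni, payload), received)
```

The token check only shows that the sender knew the token; the digest
variant also binds the vni to the packet contents, so a packet whose vni was
altered in flight fails the check.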
>
>
> Tom, you are describing the L2TPv3 cookie.
> http://tools.ietf.org/html/rfc3931#section-4.1.1 That has already been
> defined and standardized in 2005.
>
> That's great, and I would certainly want to adapt that to a data
> center encapsulation protocol, but L2TP is *not* an equivalent
> protocol to encapsulations like GRE. It is a tunnel protocol, more
> than encapsulation. It's circuit based needing negotiation, and there
> is no way to specify Ethertype or IP protocol. As I mentioned before,
> I'm not going to artificially force IP packets in Ethernet frames just
> to satisfy the needs of an encapsulation protocol-- this needs to work
> the other way around, we need an encapsulation that is generic to
> directly encapsulate IP packets and other protocols.
>
> As quite a few people said - we do not need to invent a new
> encapsulation for the goals of this draft or for the goals of NVO3 for
> that matter. This just proves the point.
>
> Saying we don't need a new encapsulation is not proof we don't need one.
> :-)
>
> We can copy that option to VXLAN or NVGRE as an extension if we wish too.
>
> Unfortunately, that's not feasible. In the optional-fields model of vxlan
> and nvgre, in order to compute the offset of the next header, an
> implementation needs to know the lengths of all the present optional
> fields. So if a new optional field is used, a device that doesn't know
> about it won't be able to skip over it. This manifests itself when
> hardware devices implement based on parsing the encapsulated headers.
> We saw exactly this problem when trying to add the security token to
> GRE; this broke ECMP in network switches as well as LRO on the NIC. So
> once vxlan and nvgre are deployed in HW, there really is no way to
> extend them without breaking compatibility-- for all intents and
> purposes these protocols are not extensible. The solution, which we
> advocate in GUE, is that protocols with variable length headers need
> to have a header length field to allow devices to skip over unknown
> fields.
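
A minimal Python sketch of the header length idea. The 4-byte layout used
here (option length in 4-byte words, one reserved byte, a 16-bit protocol
type) is hypothetical and is not the GUE or geneve wire format; it only
shows how a device can step over options it does not understand.

```python
import struct

BASE_LEN = 4  # hypothetical fixed part: opt_words (1), reserved (1), proto (2)


def split_packet(pkt: bytes):
    """Return (protocol type, inner payload), skipping all options unparsed."""
    if len(pkt) < BASE_LEN:
        raise ValueError("truncated header")
    opt_words, _reserved, proto = struct.unpack("!BBH", pkt[:BASE_LEN])
    # The length field alone gives the payload offset, so the content of
    # the options never has to be understood in order to skip them.
    offset = BASE_LEN + opt_words * 4
    if len(pkt) < offset:
        raise ValueError("options extend past end of packet")
    return proto, pkt[offset:]
```

For example, a header with `opt_words = 2` is followed by 8 bytes of options,
and the payload is found at offset 12 whether or not the parser knows what
those 8 bytes mean.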
>
> Another deficiency that I see in the current encapsulations that
> really needs to be addressed is the interaction with IPsec.  Just
> saying we can use IPsec with any of these encapsulations to provide
> security is *not* sufficient! For instance, we can secure vxlan
> packets with IPsec by encrypting the UDP packet. This provides packet
> security, but now the network has no visibility into the encapsulation
> so we can't route or firewall based on vni so we've lost the value.
> For this reason we really want the encapsulation in the outside header
> (which actually would be the same property if vlan were used). I don't
> see a reasonable way to do this with protocols encapsulating by
> Ethertype, which is a reason why GUE uses IP protocol.
>
> As far as metadata extensions - I believe there is an agreement that we
> should do it. Similarly there is a consensus that they should not be
> welded into the network header. That particular aspect of the design has
> no other function but to be a "mono-culture monopoly license".
>
> Unless meta data extensions, or for that matter TLVs, are defined
> which are intrinsic to the operation of the protocol, it's exceedingly
> likely that hardware vendors will implement their fast path assuming
> no extensions or options. This is precisely why IP options have been
> rendered useless, and does not bode well for meta data extensions or
> TLVs. Besides, what is important enough to be directly in the header
> versus what should be in an extension seems arbitrary to me.
>
> Any options we deploy associated with encapsulation will be important
> and may very well appear in *all* packets sent so they need to be
> super efficient for processing. Neither do we want any baked-in
> restrictions on what fields we might want to route or firewall on; for
> instance, some day we might add a new QoS classification field for
> special tenants or groups. The encapsulation protocol should support
> this, this should not kill HW optimizations, and I would expect that
> we can program switches to perform QoS routing based on the new field
> without needing a HW change.
>
> Tom
>
> A.
> _______________________________________________
> nvo3 mailing list
> nvo3@ietf.org
> https://www.ietf.org/mailman/listinfo/nvo3