Re: [nvo3] Draft Geneve
"Anton Ivanov (antivano)" <antivano@cisco.com> Sat, 01 March 2014 21:00 UTC
From: "Anton Ivanov (antivano)" <antivano@cisco.com>
To: "nvo3@ietf.org" <nvo3@ietf.org>
Date: Sat, 01 Mar 2014 21:00:46 +0000
Message-ID: <53124A4B.6070608@cisco.com>
References: <53104916.5040606@cisco.com> <1278160553.35330292.1393596841752.JavaMail.root@vmware.com> <53109FF1.1020904@cisco.com> <1219445865.35385295.1393600136972.JavaMail.root@vmware.com> <CF362062.65F9%kegray@cisco.com> <278b5c26711e4ae3a2ddba4bdb4f190b@BY2PR03MB128.namprd03.prod.outlook.com> <5310BE26.7060004@cisco.com> <db29312dcbfa4cb881a458b5eca8fcfe@BY2PR03MB128.namprd03.prod.outlook.com> <5310CB8E.5010006@cisco.com> <e0661d11f2184bc08a9651da702a0b93@BY2PR03MB128.namprd03.prod.outlook.com> <5310DC89.1050508@cisco.com> <CA+C0YO2=N1LGVKwwTRXVBYZ6oy6b1AHw935uw-RQy8A2B4gz3Q@mail.gmail.com> <CA+mtBx_QQCNxOnJBZWFh8bSc3wHqiQ7uSUBQD1YQb4Ayr1XjpQ@mail.gmail.com> <53118586.2080306@cisco.com> <CA+mtBx8sCVvZwL2az7HKpXcVZdf2KSnibBXBtGsYHyfFhdJ_Hg@mail.gmail.com>
In-Reply-To: <CA+mtBx8sCVvZwL2az7HKpXcVZdf2KSnibBXBtGsYHyfFhdJ_Hg@mail.gmail.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/nvo3/uzWo6wJEs1jowEp4O-E_WUjZW1k
Subject: Re: [nvo3] Draft Geneve
Hi Tom,

Based on your comments you have not followed the discussion. I think it would be good if you went through the thread in the archive.

First, we started the discussion by announcing that we have open sourced a working static tunnel L2TPv3 implementation as an overlay at vNIC level (allowing for off-host switching, direct overlay to physical and direct VM to VM overlay). Based on this discussion I will add back the "application-specific data between header and payload" feature (I had actually removed it as I saw it as unnecessary).

Second, I have been very specific about using _STATIC_ tunnels. What you are saying in your mail is mostly invalid in a static tunnel context. Static tunnels are just PWEs - same as any other encaps.

1. There is no control plane. RFC 4719 does not apply.

2. For an example - see http://tools.ietf.org/html/draft-mkonstan-l2tpext-keyed-ipv6-tunnel-00 . This is one example use case; the limitations specified in it are not necessary in most others.

3. L2TPv3 static tunnels do not specify what the payload is and have no in-band information on this. You can have PPP, Ethernet, IP, ZigBee, RFC1149 or whatever else you may like. If you want a special packet type in the static tunnel case it is up to you. L2TPv3 will carry it for you.

4. There is no issue with offload in static tunnels as there is nothing in-band in a static tunnel to signal the actual header length. In fact, even the cookie size is unknown. If you want an offload you have to specify where the offload should start looking. This is no different from any other arbitrary case of a variable header. So if a NIC or NPU can offload starting from an arbitrary offset into the packet (e.g. geneve) it should also be possible to make it offload L2TPv3.

5. I am not surprised by what you saw with GRE. However it does not apply here - see 4. You saw that because GRE provides info on what the payload is in the header.
So an offload implementation is entitled to know where to find the payload packet and how to treat it. Protocols that do not provide this information in the header (L2TPv3) do not have this problem. With these you need to configure explicitly where to look for the packet and how to offload.

I am not a VXLAN expert. However, I suspect that most of this is applicable there (or can be made to apply) too.

So I am going to repeat what has been said quite a few times - the world does not need another encapsulation, the existing one(s) can do the job. Please use them.

A

On 01/03/14 17:13, Tom Herbert wrote:

On Fri, Feb 28, 2014 at 11:01 PM, Anton Ivanov (antivano) <antivano@cisco.com> wrote:

On 01/03/14 01:51, Tom Herbert wrote:

On Fri, Feb 28, 2014 at 3:24 PM, Sam Aldrin <aldrin.ietf@gmail.com> wrote:

Hi all,

Read the draft but have a few questions along the same lines others have asked.

- Is this draft intended for standardizing within NVo3 WG? The status indicates it as informational. Also it is good to have it as draft-nvo3...., if it is meant for NVo3 WG.

- I fail to find good reasoning, in the current version of the draft, on why the design of the encap transport header should be closely associated with metadata OR closely tied together? Could you add more details to clarify?

The draft alludes to the general need for extensibility, but does not provide any example uses, so maybe I can suggest one. We have a real use case for an encapsulation protocol with security to allow validation of the virtual network identifier. In their current form, vxlan and nvgre have no provisions for authenticating or integrity checking the vni, and existing mechanisms in the network were not deemed robust enough to guarantee integrity of the vni and ensure strict isolation between tenants. UDP checksum is not sufficient for this.
We need a mechanism to at least enforce an unpredictable security token, or possibly stronger authentication using something like a message digest. This is intrinsic to the encapsulation, we cannot deploy network virtualization without this security, hence an extensible protocol is desirable. Additionally, as the network scales and new threats emerge, we may need further extensions to adapt. All of this needs to be efficient and amenable to HW performance optimizations.

Tom, you are describing the L2TPv3 cookie.

http://tools.ietf.org/html/rfc3931#section-4.1.1

That has already been defined and standardized in 2005.

That's great, and I would certainly want to adapt that to a data center encapsulation protocol, but L2TP is *not* an equivalent protocol to encapsulations like GRE. It is a tunnel protocol, more than encapsulation. It's circuit based needing negotiation, and there is no way to specify Ethertype or IP protocol. As I mentioned before, I'm not going to artificially force IP packets in Ethernet frames just to satisfy the needs of an encapsulation protocol-- this needs to work the other way around, we need an encapsulation that is generic to directly encapsulate IP packets and other protocols.

As quite a few people said - we do not need to invent a new encapsulation for the goals of this draft or for the goals of NVO3 for that matter. This just proves the point.

Saying we don't need a new encapsulation is not proof we don't need one. :-)

We can copy that option to VXLAN or NVGRE as an extension if we wish to.

Unfortunately, that's not feasible. In the optional fields model of vxlan and nvgre, in order to compute the offset of the next header, an implementation needs to know the lengths of all the optional fields that are present. So if a new optional field is used, a device that doesn't know about it won't be able to skip over it. This manifests itself when hardware devices implement based on parsing the encapsulated headers.
We saw exactly this problem when trying to add the security token to GRE, this broke ECMP in network switches as well as LRO on the NIC. So once vxlan and nvgre are deployed in HW, there really is no way to extend them without breaking compatibility-- for all intents and purposes these protocols are not extensible. The solution, which we advocate in GUE, is that protocols with variable length headers need to have a header length field to allow devices to skip over unknown fields.

Another deficiency that I see in the current encapsulations that really needs to be addressed is the interaction with IPsec. Just saying we can use IPsec with any of these encapsulations to provide security is *not* sufficient! For instance, we can secure vxlan packets with IPsec by encrypting the UDP packet. This provides packet security, but now the network has no visibility into the encapsulation so we can't route or firewall based on vni so we've lost the value. For this reason we really want the encapsulation in the outside header (which actually would be the same property if vlan were used). I don't see a reasonable way to do this with protocols encapsulating by Ethertype, which is a reason why GUE uses IP protocol.

As far as metadata extensions - I believe there is an agreement that we should do it. Similarly there is a consensus that they should not be welded into the network header. That particular aspect of the design has no other function but to be a "mono-culture monopoly license".

Unless meta data extensions, or for that matter TLVs, are defined which are intrinsic to the operation of the protocol, it's exceedingly likely that hardware vendors will implement their fast path assuming no extensions or options. This is precisely why IP options have been rendered useless, and does not bode well for meta data extensions or TLVs. Besides, what is important enough to be directly in the header versus what should be in an extension seems arbitrary to me.
Any options we deploy associated with encapsulation will be important and may very well appear in *all* packets sent, so they need to be super efficient for processing. Neither do we want any baked-in restrictions on what fields we might want to route or firewall on; for instance, some day we might add a new QoS classification field for special tenants or groups. The encapsulation protocol should support this, this should not kill HW optimizations, and I would expect that we can program switches to perform QoS routing based on the new field without needing a HW change.

Tom

A.

_______________________________________________
nvo3 mailing list
nvo3@ietf.org
https://www.ietf.org/mailman/listinfo/nvo3
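
The two positions argued in this thread about locating the inner packet can be sketched side by side. The following is a minimal illustration only, assuming a made-up 4-byte toy header: it is not the GUE, Geneve, GRE, or L2TPv3 wire format, and every name in it (`build_packet`, `payload_via_length_field`, `payload_via_configured_offset`, `BASE_LEN`) is hypothetical. The first parser relies on an in-band length field, so it can skip options it does not understand; the second models the static tunnel case, where nothing in-band describes the header and the offset has to be configured out of band.

```python
import struct

# Hypothetical toy header (an assumption for illustration, not a real protocol):
#   byte 0    : length of the options area, in 4-byte words
#   byte 1    : reserved
#   bytes 2-3 : payload protocol (EtherType-style value)
#   bytes 4.. : options, opaque to devices that do not understand them
BASE_LEN = 4


def build_packet(proto, options, payload):
    """Prepend the toy header and options to the payload."""
    if len(options) % 4:
        raise ValueError("options must be padded to a multiple of 4 bytes")
    return struct.pack("!BBH", len(options) // 4, 0, proto) + options + payload


def payload_via_length_field(packet):
    """Locate the payload using only the in-band length field.

    Unknown options are skipped without being interpreted.
    """
    opt_words, _reserved, proto = struct.unpack_from("!BBH", packet, 0)
    offset = BASE_LEN + 4 * opt_words
    if offset > len(packet):
        raise ValueError("header length exceeds packet length")
    return proto, packet[offset:]


def payload_via_configured_offset(packet, offset):
    """Static tunnel style: the offset is configured out of band.

    Nothing in the packet says where the payload starts or what it is.
    """
    if offset > len(packet):
        raise ValueError("configured offset exceeds packet length")
    return packet[offset:]


inner = b"inner-frame"
token = bytes.fromhex("deadbeefcafef00d")  # made-up 8-byte security token
pkt = build_packet(0x6558, token, inner)   # 0x6558: Transparent Ethernet Bridging

# A device that has never heard of the token option still finds the payload.
assert payload_via_length_field(pkt) == (0x6558, inner)

# A static tunnel endpoint must be told the offset: 4-byte header + 8-byte token.
assert payload_via_configured_offset(pkt, 12) == inner
```

In the first case, adding a new option changes nothing for a device that only needs the payload. In the second, every endpoint and any offload engine has to be reconfigured with the new offset, which is the trade-off both sides of the thread are describing.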
- [nvo3] Draft Geneve Anton Ivanov (antivano)
- Re: [nvo3] Draft Geneve Brad Hedlund
- Re: [nvo3] Draft Geneve Anton Ivanov (antivano)
- Re: [nvo3] Draft Geneve Brad Hedlund
- Re: [nvo3] Draft Geneve Pankaj Garg
- Re: [nvo3] Draft Geneve Anton Ivanov (antivano)
- Re: [nvo3] Draft Geneve Anton Ivanov (antivano)
- Re: [nvo3] Draft Geneve Ken Gray (kegray)
- Re: [nvo3] Draft Geneve Pankaj Garg
- Re: [nvo3] Draft Geneve Anton Ivanov (antivano)
- Re: [nvo3] Draft Geneve Pankaj Garg
- Re: [nvo3] Draft Geneve Tom Herbert
- Re: [nvo3] Draft Geneve Anton Ivanov (antivano)
- Re: [nvo3] Draft Geneve Pankaj Garg
- Re: [nvo3] Draft Geneve Dino Farinacci
- Re: [nvo3] Draft Geneve Anton Ivanov (antivano)
- Re: [nvo3] Draft Geneve Sam Aldrin
- Re: [nvo3] Draft Geneve Tom Herbert
- Re: [nvo3] Draft Geneve Anton Ivanov (antivano)
- Re: [nvo3] Draft Geneve Tom Herbert
- Re: [nvo3] Draft Geneve Anton Ivanov (antivano)
- Re: [nvo3] Draft Geneve Tom Herbert
- Re: [nvo3] Draft Geneve Anton Ivanov