Re: [nvo3] Draft NVO3 WG Charter

Roger Jørgensen <rogerj@gmail.com> Sat, 18 February 2012 12:50 UTC

On Fri, Feb 17, 2012 at 3:51 PM, Thomas Narten <narten@us.ibm.com> wrote:
> Below is a draft charter for this effort. One detail is that we
> started out calling this effort NVO3 (Network Virtualization Over L3),
> but have subsequently realized that we should not focus on just "over
> L3". One goal of this effort is to develop an overlay standard that
> works over L3, but we do not want to restrict ourselves only to "over
> L3". The framework and architecture that we are proposing to work on
> should be applicable to other overlays as well (e.g., L2 over
> L2). This is (hopefully) captured in the proposed charter.
>
> Comments?

Plenty; much has already been said in other mails in this thread.

I've only experienced this problem on a small scale: we (the VM/server
guys) caused a major headache for the network group when they realized
what virtualization meant, that a machine could be running anywhere in
the network, or in any of the datacenters involved. They had no
control over the traffic flow, and were even more alarmed when they
realized there were several clusters of VMs, spread across different
datacenters and different parts of the network, exchanging lots of
traffic.
That the traffic crossed the backbone up to 8 times or more (yes, it's
true) for a group of 4 VMs made the network group consider the server
group insane... Some of these problems can be solved for VMware, but I
haven't seen the same solved for other virtualization platforms.


The other side of the VM-versus-network problem I meet daily is VMs in
the same group/LAN running from different datacenters... There are
plenty of new tools, standards, products and technologies out there,
but none that make it easy. Not to mention single VMs, or groups of
them, moving between datacenters.



LISP, TRILL, VXLAN and the various types of VPNs solve some problems,
but they don't solve The Big Picture problem, especially not the one
with VMs moving around at random across your entire datacenter(s).
Most of these also share another problem: MTU size. Add something on
top of something else and you use up packet size...
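
To put rough numbers on the MTU point: a VXLAN-style L2-over-L3
encapsulation adds about 50 bytes (outer Ethernet 14, outer IPv4 20,
UDP 8, VXLAN 8), so either the underlay has to carry 1550-byte packets
or the tenant MTU has to shrink. A back-of-the-envelope sketch in
Python (the header sizes are the usual ones; the rest is mine):

    # Approximate per-packet overhead of a VXLAN-style encapsulation.
    OUTER_ETH = 14   # outer Ethernet header
    OUTER_IP4 = 20   # outer IPv4 header (an IPv6 underlay would add 40)
    OUTER_UDP = 8    # UDP header
    VXLAN_HDR = 8    # VXLAN header
    OVERHEAD = OUTER_ETH + OUTER_IP4 + OUTER_UDP + VXLAN_HDR  # 50 bytes

    underlay_mtu = 1500
    tenant_mtu = underlay_mtu - OVERHEAD  # 1450 bytes left for the inner frame
    print(tenant_mtu)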



So the problem is real, and it needs to consider both L2 and L3; who
said we will always be running IP?



Just to add another aspect: who said the IP addresses the VMs have
configured need to be the same ones they use outside their own
interconnect? Maybe they could have one IP range configured for all
communication between themselves, but use another set of IPs toward
everyone else, a bit like LISP's concept of EIDs, except that the
EIDs aren't known to the outside.
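
A crude way to picture that split is a per-VM pairing of an internal
(EID-like) address with an externally visible one. All names and
addresses below are invented for illustration:

    # Hypothetical split between internal (EID-like) and external addresses.
    # Traffic between the VMs uses the internal range; everyone else only
    # ever sees the external address.
    vm_addresses = {
        "vm-a": {"internal": "10.99.0.1", "external": "192.0.2.10"},
        "vm-b": {"internal": "10.99.0.2", "external": "192.0.2.11"},
    }

    def address_for(vm, peer_is_inside):
        # Peers inside the interconnect see the internal address only.
        kind = "internal" if peer_is_inside else "external"
        return vm_addresses[vm][kind]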




... in my current job I'm lucky, because we have enough fibre in the
ground to support multi-site datacenters with virtually unlimited
bandwidth between them. However, I still have to solve the split-brain
issue that _will_ happen sooner or later given a particular set of
outages.



--- Roger J ---

>
> Thomas
>
> NVO: Network Virtualization Overlays
>
> Support for multi-tenancy has become a core requirement of data
> centers, especially in the context of data centers which include
> virtualized servers known as virtual machines (VMs).  With
> multi-tenancy, a data center can support the needs of many thousands
> of individual tenants, ranging from individual groups or departments
> within a single organization all the way up to supporting thousands of
> individual customers.  A key multi-tenancy requirement is traffic
> isolation, so that a tenant's traffic (and internal address usage) is
> not visible to any other tenant and does not collide with addresses
> used within the data center itself.  Such isolation can be achieved by
> creating and assigning one or more virtual networks to each tenant
> such that traffic within a virtual network is isolated from traffic in
> other virtual networks.
>
> Tenant isolation is primarily achieved today within data centers using
> Ethernet VLANs. But the 12-bit VLAN tag field isn't large enough to
> support existing and future needs. A number of approaches to extending
> VLANs and scaling L2s have been proposed or developed, including IEEE
> 802.1aq Shortest Path Bridging (SPB) and TRILL (with the proposed
> fine-grained labeling extension).  At the L3 (IP) level, VXLAN and
> NVGRE have also been proposed. As outlined in
> draft-narten-nvo3-overlay-problem-statement-01.txt, however, existing
> L2 approaches are not satisfactory for all data center operators,
> e.g., larger data centers that desire to keep L2 domains small or push
> L3 further into the data center (e.g., all the way to top-of-rack
> switches). Furthermore, there is a desire to decouple the
> configuration of the data center network from the configuration
> associated with individual tenant applications and to seamlessly and
> rapidly update the network state to handle live VM migrations or fast
> spin-up and spin-down of new tenant VMs (or servers). Such tasks are
> complicated by the need to simultaneously reconfigure and update data
> center network state (e.g., VLAN settings on individual switches).
>
> This WG will develop an approach to multi-tenancy that does not rely
> on any underlying L2 mechanisms to support multi-tenancy. In
> particular, the WG will develop an approach where multi-tenancy is
> provided at the IP layer using an encapsulation header that resides
> above IP. This effort is explicitly intended to leverage the interest
> in L3 overlay approaches as exemplified by VXLAN
> (draft-mahalingam-dutt-dcops-vxlan-00.txt) and NVGRE
> (draft-sridharan-virtualization-nvgre-00.txt).
>
> Overlays are a form of "map and encap", where an ingress node maps the
> destination address of an arriving packet (e.g., from a source tenant
> VM) into the address of an egress node to which the packet can be
> tunneled. The ingress node then encapsulates the packet in an outer
> header and tunnels it to the egress node, which decapsulates the
> packet and forwards the original (unmodified) packet to its ultimate
> destination (e.g., a destination tenant VM). All map-and-encap
> approaches must address two issues: the encapsulation format (i.e.,
> the contents of the outer header) and how to distribute and manage the
> mapping tables used by the tunnel end points.
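
To make the map-and-encap idea concrete, here is a minimal sketch of
an ingress node with a plain dictionary as the mapping table. It
follows no particular draft; all names and addresses are mine:

    # Minimal map-and-encap ingress: look up the egress for the inner
    # destination, then wrap the packet in an outer header.
    mapping_table = {
        # inner (tenant VM) address -> outer (egress endpoint) address
        "10.1.0.5": "192.0.2.1",
        "10.1.0.9": "192.0.2.2",
    }

    def ingress(inner_packet, inner_dst):
        outer_dst = mapping_table.get(inner_dst)
        if outer_dst is None:
            return None  # no mapping: flood, query the control plane, or drop
        # Encapsulate: the egress strips this outer header and forwards
        # the original packet unmodified.
        outer_header = {"dst": outer_dst, "tenant_id": 42}
        return (outer_header, inner_packet)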
>
> The first area of work concerns encapsulation formats. This WG will
> develop requirements and desirable properties for any encapsulation
> format. Given the number of already existing encapsulation formats,
> it is not an explicit goal of this effort to choose exactly one format
> or to develop yet another new one.
>
> A second work area is in the control plane, which allows an ingress
> node to map the "inner" (tenant VM) address into an "outer"
> (underlying transport network) address in order to tunnel a packet
> across the data center. We propose to develop two control planes. One
> control plane will use a learning mechanism similar to IEEE 802.1D
> learning, and could be appropriate for smaller data centers. A second,
> more scalable control plane would be aimed at large sites, capable of
> scaling to hundreds of thousands of nodes. Both control planes will
> need to handle the case of VMs moving around the network in a dynamic
> fashion, meaning that they will need to support tunnel endpoints
> registering and deregistering mappings as VMs change location and
> ensuring that out-of-date mapping tables are only used for short
> periods of time. Finally, the second control plane must also be
> applicable to geographically dispersed data centers.
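
The learning control plane, at least, is easy to caricature: learn the
inner-to-outer mapping from the source of each decapsulated packet,
and age entries out so stale mappings die quickly after a VM moves.
A toy version, mine rather than anything from a draft:

    import time

    AGE_LIMIT = 300.0  # seconds; a real implementation would make this tunable
    table = {}         # inner address -> (outer address, last_seen)

    def learn(inner_src, outer_src):
        # Called on decapsulation: remember which endpoint this VM sits behind.
        table[inner_src] = (outer_src, time.time())

    def lookup(inner_dst):
        entry = table.get(inner_dst)
        if entry is None:
            return None  # unknown: flood or query a directory, bridge-style
        outer, last_seen = entry
        if time.time() - last_seen > AGE_LIMIT:
            del table[inner_dst]  # aged out; the VM may have moved
            return None
        return outer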
>
> Although a key objective of this WG is to produce a solution that
> supports an L2 over L3 overlay, an important goal is to develop a
> "layer agnostic" framework and architecture, so that any specific
> overlay approach can reuse the output of this working group. For
> example, there is no inherent reason why the same framework could not
> be used to provide for L2 over L2 or L3 over L3. The main difference
> would be in the address formats of the inner and outer headers and the
> encapsulation header itself.
>
> Finally, some work may be needed in connecting an overlay network with
> traditional L2 or L3 VPNs (e.g., VPLS). One approach appears
> straightforward, in that there is a clear boundary between a VPN device and
> the edge of an overlay network. Packets forwarded across the boundary
> would simply need to have the tenant identifier on the overlay side
> mapped into a corresponding VPN identifier on the VPN
> side. Conceptually, this would appear to be analogous to what is done
> already today when interfacing between L2 VLANs and VPNs.
>
> The specific deliverables for this group include:
>
> 1) Finalize and publish the overall problem statement as an
> Informational RFC (basis:
> draft-narten-nvo3-overlay-problem-statement-01.txt)
>
> 2) Develop requirements and desirable properties for any encapsulation
> format, and identify suitable encapsulations. Given the number of
> already existing encapsulation formats, it is not an explicit goal of
> this effort to choose exactly one format or to develop a new one.
>
> 3) Produce a Standards Track control plane document that specifies how
> to build mapping tables using a "learning" approach. This document is
> expected to be short, as the algorithm itself will use a mechanism
> similar to IEEE 802.1D learning.
>
> 4) Develop requirements (and later a Standards Track protocol) for a
> more scalable control plane for managing and distributing the mappings
> of "inner" to "outer" addresses. We will develop a reusable framework
> suitable for use by any mapping function in which there is a need to
> map "inner" to outer addresses. Starting point:
> draft-kreeger-nvo3-overlay-cp-00.txt
>
> _______________________________________________
> nvo3 mailing list
> nvo3@ietf.org
> https://www.ietf.org/mailman/listinfo/nvo3



-- 

Roger Jorgensen           |
rogerj@gmail.com          | - IPv6 is The Key!
http://www.jorgensen.no   | roger@jorgensen.no