Re: [nvo3] Draft NVO3 WG Charter

Stewart Bryant <stbryant@cisco.com> Fri, 17 February 2012 19:10 UTC

Message-ID: <4F3EA613.5040202@cisco.com>
Date: Fri, 17 Feb 2012 19:10:11 +0000
From: Stewart Bryant <stbryant@cisco.com>
To: Thomas Narten <narten@us.ibm.com>
References: <201202171451.q1HEptR3027370@cichlid.raleigh.ibm.com>
In-Reply-To: <201202171451.q1HEptR3027370@cichlid.raleigh.ibm.com>
Cc: nvo3@ietf.org
Subject: Re: [nvo3] Draft NVO3 WG Charter

On 17/02/2012 14:51, Thomas Narten wrote:
> Below is a draft charter for this effort. One detail is that we
> started out calling this effort NVO3 (Network Virtualization Over L3),
> but have subsequently realized that we should not focus on just "over
> L3". One goal of this effort is to develop an overlay standard that
> works over L3, but we do not want to restrict ourselves only to "over
> L3". The framework and architecture that we are proposing to work on
> should be applicable to other overlays as well (e.g., L2 over
> L2). This is (hopefully) captured in the proposed charter.

This worries me. It is going to be difficult to avoid getting into
a situation of boiling the ocean here, and we need to make sure we
start in the right place. That said, I think that if a simple,
general solution can be designed, as opposed to yet another
application-specific encapsulation, it would be a great service
to the industry over the long term.
>
> Comments?
>
> Thomas
>
> NVO: Network Virtualization Overlays
>
> Support for multi-tenancy has become a core requirement of data
> centers, especially in the context of data centers which include
> virtualized servers known as virtual machines (VMs).  With
> multi-tenancy, a data center can support the needs of many thousands
> of individual tenants, ranging from individual groups or departments
> within a single organization all the way up to supporting thousands of
> individual customers.  A key multi-tenancy requirement is traffic
> isolation, so that a tenant's traffic (and internal address usage) is
> not visible to any other tenant and does not collide with addresses
> used within the data center itself.  Such isolation can be achieved by
> creating and assigning one or more virtual networks to each tenant
> such that traffic within a virtual network is isolated from traffic in
> other virtual networks.
>
> Tenant isolation is primarily achieved today within data centers using
> Ethernet VLANs. But the 12-bit VLAN tag field isn't large enough to
> support existing and future needs. A number of approaches to extending
> VLANs and scaling L2s have been proposed or developed, including IEEE
> 802.1ah Shortest Path Bridging (SPB) and TRILL (with the proposed
> fine-grained labeling extension).  At the L3 (IP) level, VXLAN and
> NVGRE have also been proposed. As outlined in
> draft-narten-nvo3-overlay-problem-statement-01.txt, however, existing
> L2 approaches are not satisfactory for all data center operators,
> e.g., larger data centers that desire to keep L2 domains small or push
> L3 further into the data center (e.g., all the way to top-of-rack
> switches). Furthermore, there is a desire to decouple the
> configuration of the data center network from the configuration
> associated with individual tenant applications and to seamlessly and
> rapidly update the network state to handle live VM migrations or fast
> spin-up and spin-down of new tenant VMs (or servers). Such tasks are
> complicated by the need to simultaneously reconfigure and update data
> center network state (e.g., VLAN settings on individual switches).
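The scale gap behind the quoted paragraph (a 12-bit VLAN tag versus the 24-bit virtual-network IDs carried by VXLAN and NVGRE) can be checked directly; a quick sketch, with illustrative constant names:

```python
# The 802.1Q VLAN tag carries a 12-bit VLAN ID, while VXLAN's VNI
# and NVGRE's VSID are 24 bits wide, so the overlay ID space is
# 4096 times larger.
VLAN_ID_BITS = 12
VNI_BITS = 24

vlan_networks = 2 ** VLAN_ID_BITS   # 4096 virtual networks
vni_networks = 2 ** VNI_BITS        # 16777216 virtual networks

print(vlan_networks, vni_networks, vni_networks // vlan_networks)
# prints: 4096 16777216 4096
```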
>
> This WG will develop an approach to multi-tenancy that does not rely
> on any underlying L2 mechanisms to support multi-tenancy. In
> particular, the WG will develop an approach where multi-tenancy is
> provided at the IP layer using an encapsulation header that resides
> above IP. This effort is explicitly intended to leverage the interest
> in L3 overlay approaches as exemplified by VXLAN
> (draft-mahalingam-dutt-dcops-vxlan-00.txt) and NVGRE
> (draft-sridharan-virtualization-nvgre-00.txt).

The WG will need to consider the operations that it wants to
encode in the "transport" layer. Encapsulation, delivery and
multiplexing are compulsory, but are there others?

I am NOT proposing to push MPLS here, but think for a little
while about the subtlety of that small header, which encodes
not (encap + delivery + mux) but a set of opaque
instructions agreed between peers. So think very carefully
about what you want hard-coded and self-describing, and
what flexibility you want to provide.


>
> Overlays are a form of "map and encap", where an ingress node maps the
> destination address of an arriving packet (e.g., from a source tenant
> VM) into the address of an egress node to which the packet can be
> tunneled. The ingress node then encapsulates the packet in an outer
> header and tunnels it to the egress node, which decapsulates the
> packet and forwards the original (unmodified) packet to its ultimate
> destination (e.g., a destination tenant VM). All map-and-encap
> approaches must address two issues: the encapsulation format (i.e.,
> the contents of the outer header) and how to distribute and manage the
> mapping tables used by the tunnel end points.
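The map-and-encap sequence described above can be sketched as follows; the table layout and function names are illustrative only, not taken from any draft:

```python
# Illustrative map-and-encap: the ingress node maps the inner
# destination to an egress tunnel endpoint, wraps the unmodified
# packet in an outer header, and the egress node unwraps it.

# Mapping table: (virtual network ID, inner destination) -> egress node.
mapping_table = {
    (5001, "10.0.0.2"): "192.0.2.20",
}

def ingress(vn_id, inner_dst, inner_packet):
    """Map the inner destination to an egress node, then encapsulate."""
    egress_addr = mapping_table[(vn_id, inner_dst)]
    outer_header = {"outer_dst": egress_addr, "vn_id": vn_id}
    return (outer_header, inner_packet)       # tunneled across the DC

def egress(tunneled):
    """Decapsulate and recover the original, unmodified packet."""
    _outer_header, inner_packet = tunneled
    return inner_packet

frame = b"tenant VM frame"
assert egress(ingress(5001, "10.0.0.2", frame)) == frame
```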
>
> The first area of work concerns encapsulation formats. This WG will
> develop requirements and desirable properties for any encapsulation
> format. Given the number of already existing encapsulation formats,
> it is not an explicit goal of this effort to choose exactly one format
> or to develop yet another new one.
>
> A second work area is in the control plane, which allows an ingress
> node to map the "inner" (tenant VM) address into an "outer"
> (underlying transport network) address in order to tunnel a packet
> across the data center. We propose to develop two control planes. One
> control plane will use a learning mechanism similar to IEEE 802.1D
> learning, and could be appropriate for smaller data centers. A second,
> more scalable control plane would be aimed at large sites, capable of
> scaling to hundreds of thousands of nodes.
The WG clearly needs to solve both problems, but I think that
it is too early to say whether two control planes are needed
for scaling. However, the concept of mandating that the
encapsulation layer be decoupled from the control protocol adds
significantly to the utility of the encapsulation. The WG needs
to bear in mind that there may, in the long run, be many reasons
to create additional control protocols besides scaling.
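The "learning" control plane quoted above amounts to populating the mapping table from observed traffic, much as a bridge learns MAC addresses. A minimal sketch, with hypothetical class and method names:

```python
# Flood-and-learn sketch: on decapsulation, a tunnel endpoint learns
# that the inner source address sits behind the outer source address;
# traffic to an unknown inner destination must be flooded.

class LearningEndpoint:
    def __init__(self):
        self.table = {}                 # inner address -> outer address

    def learn(self, outer_src, inner_src):
        # Called on every decapsulated packet; refreshes the entry.
        self.table[inner_src] = outer_src

    def lookup(self, inner_dst):
        # Known destination: unicast tunnel. Unknown: flood.
        return self.table.get(inner_dst, "FLOOD")

ep = LearningEndpoint()
assert ep.lookup("10.0.0.2") == "FLOOD"     # nothing learned yet
ep.learn("192.0.2.20", "10.0.0.2")          # learn from return traffic
assert ep.lookup("10.0.0.2") == "192.0.2.20"
```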

> Both control planes will
> need to handle the case of VMs moving around the network in a dynamic
> fashion, meaning that they will need to support tunnel endpoints
> registering and deregistering mappings as VMs change location and
> ensuring that out-of-date mapping tables are only used for short
> periods of time. Finally, the second control plane must also be
> applicable to geographically dispersed data centers.

I think that we need to start by figuring out the properties that
are needed, and then figure out whether we need one, two, or some
other number of control protocols.
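The register/deregister behaviour the draft asks for, with out-of-date mappings usable only briefly, can be sketched as a lifetime-stamped registry; all names here are hypothetical:

```python
import time

# Registration-based mapping sketch: endpoints register a mapping when
# a VM arrives and deregister when it leaves; each entry carries an
# expiry time so a stale mapping ages out quickly even if the
# deregistration message is lost.

class MappingRegistry:
    def __init__(self, lifetime=30.0):
        self.lifetime = lifetime
        self.entries = {}               # inner address -> (outer, expiry)

    def register(self, inner, outer, now=None):
        now = time.monotonic() if now is None else now
        self.entries[inner] = (outer, now + self.lifetime)

    def deregister(self, inner):
        self.entries.pop(inner, None)

    def lookup(self, inner, now=None):
        now = time.monotonic() if now is None else now
        entry = self.entries.get(inner)
        if entry is None or entry[1] < now:   # missing or expired
            return None
        return entry[0]

reg = MappingRegistry(lifetime=30.0)
reg.register("10.0.0.2", "192.0.2.20", now=0.0)
assert reg.lookup("10.0.0.2", now=10.0) == "192.0.2.20"
assert reg.lookup("10.0.0.2", now=40.0) is None   # aged out
```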

>
> Although a key objective of this WG is to produce a solution that
> supports an L2 over L3 overlay, an important goal is to develop a
> "layer agnostic" framework and architecture, so that any specific
> overlay approach can reuse the output of this working group. For
> example, there is no inherent reason why the same framework could not
> be used to provide for L2 over L2 or L3 over L3. The main difference
> would be in the address formats of the inner and outer headers and the
> encapsulation header itself.

>
> Finally, some work may be needed in connecting an overlay network with
> traditional L2 or L3 VPNs (e.g., VPLS). One approach appears
> straightforward, in that there is a clear boundary between a VPN device and
> the edge of an overlay network. Packets forwarded across the boundary
> would simply need to have the tenant identifier on the overlay side
> mapped into a corresponding VPN identifier on the VPN
> side. Conceptually, this would appear to be analogous to what is done
> already today when interfacing between L2 VLANs and VPNs.
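The boundary mapping described in the quoted paragraph is essentially a bidirectional translation table at the gateway; a sketch, with made-up identifier values:

```python
# Gateway sketch: translate the overlay-side tenant identifier
# (e.g. a 24-bit VNI) to the VPN-side identifier and back.

overlay_to_vpn = {5001: "vpn-acme", 5002: "vpn-example"}
vpn_to_overlay = {v: k for k, v in overlay_to_vpn.items()}

def overlay_to_vpn_side(vn_id):
    """Packet leaving the overlay: swap tenant ID for VPN ID."""
    return overlay_to_vpn[vn_id]

def vpn_to_overlay_side(vpn_id):
    """Packet entering the overlay: swap VPN ID for tenant ID."""
    return vpn_to_overlay[vpn_id]

assert overlay_to_vpn_side(5001) == "vpn-acme"
assert vpn_to_overlay_side("vpn-acme") == 5001
```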
Remember that we broke this up into a number of work packages
in order to scale the IETF to the problem.
>
> The specific deliverables for this group include:
>
> 1) Finalize and publish the overall problem statement as an
> Informational RFC (basis:
> draft-narten-nvo3-overlay-problem-statement-01.txt)
OK.

However, consider producing a framework next, so that
work can proceed in parallel on the subsequent topics, with the
framework acting as glue to hold the independent components
together.
>
> 2) Develop requirements and desirable properties for any encapsulation
> format, and identify suitable encapsulations. Given the number of
> already existing encapsulation formats, it is not an explicit goal of
> this effort to choose exactly one format or to develop a new one.
OK. When we get here we will know whether we use one or more
existing formats, or a new one. We need to work really hard to
make sure the decision here is objective.
>
> 3) Produce a Standards Track control plane document that specifies how
> to build mapping tables using a "learning" approach. This document is
> expected to be short, as the algorithm itself will use a mechanism
> similar to IEEE 802.1D learning.
This is not how we should define this in the charter: "similar to
802.1D" sets all sorts of expectations concerning the design
and takes us into all sorts of territory we do not want to enter.

At this stage only the first sentence is needed.
>
> 4) Develop requirements (and later a Standards Track protocol) for a
> more scalable control plane for managing and distributing the mappings
> of "inner" to "outer" addresses. We will develop a reusable framework
> suitable for use by any mapping function in which there is a need to
> map "inner" to outer addresses. Starting point:
> draft-kreeger-nvo3-overlay-cp-00.txt
Conceptually the first sentence is correct, although you assume a
lot by talking about inner and outer addresses.
The WG will decide the starting point when it is formed.

For the purposes of going forward we need to talk about scalability
in the control plane, but I am not sure exactly how much
solution-oriented detail is appropriate at this stage.

- Stewart