Re: [nvo3] Draft NVO3 WG Charter

<david.black@emc.com> Mon, 20 February 2012 20:11 UTC

From: david.black@emc.com
To: yakov@juniper.net
Date: Mon, 20 Feb 2012 15:10:49 -0500
Cc: narten@us.ibm.com, jdrake@juniper.net, rbonica@juniper.net, nvo3@ietf.org, afarrel@juniper.net, nitinb@juniper.net
Subject: Re: [nvo3] Draft NVO3 WG Charter

Yakov,

> What are the specific *technical* reason(s) why MPLS over GRE is
> a "non-starter" for (a) ToR switches, (b) datacenter access switches,
> and (c) hypervisor softswitches ?

Sure, that was on my list of things to do, so thanks for asking, but
my original assertion was:

> > > > BGP and MPLS are non-starters for a lot of datacenter-internal
> > > > networks.

Let's start with one of the more important problems that has motivated
interest in overlays for virtual networking.  From the proposed charter:

   Support for multi-tenancy has become a core requirement of data
   centers, especially in the context of data centers which include
   virtualized servers known as virtual machines (VMs).  

In these datacenters, there is a sizeable population of virtual machines
whose networks are built on VLANs.  In a bit more detail, that means:
	- Data Plane: TCP/IP, Ethernet VLANs
	- Control Plane: IP Routing based on IGPs (e.g., OSPF), VLAN
		configuration, LLDP, etc.
Beyond that, management, operational practices and network admin skills
are matched to the environment.

Again from the proposed charter:

   Tenant isolation is primarily achieved today within data centers using
   Ethernet VLANs. But the 12-bit VLAN tag field isn't large enough to
   support existing and future needs.

This is an incremental growth problem - the datacenter is running fine
with VLANs, but the VLAN ID space is being exhausted.  The solution
should be incremental in impact and incrementally deployable.
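
To put rough numbers on the exhaustion, here is a minimal sketch.  The
24-bit width is the segment identifier size used by the VXLAN and NVGRE
drafts; the constant and function names are illustrative only and do not
come from the charter.

```python
# Illustrative arithmetic only; constant and function names are hypothetical.
VLAN_ID_BITS = 12      # IEEE 802.1Q VLAN ID field width
OVERLAY_ID_BITS = 24   # VXLAN VNI / NVGRE VSID width (per those drafts)


def id_space(bits: int) -> int:
    """Number of distinct identifiers a field of the given width can encode."""
    return 2 ** bits


# 802.1Q reserves VLAN IDs 0 and 4095, so two values are not usable.
usable_vlans = id_space(VLAN_ID_BITS) - 2
overlay_segments = id_space(OVERLAY_ID_BITS)

print(usable_vlans)      # 4094
print(overlay_segments)  # 16777216
```

The point is only the order of magnitude: a few thousand isolated
segments versus millions.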

Taking a look at MPLS and BGP, and assuming that the gaps previously
pointed out in the marques-l3vpn-end-system draft are addressed, I see
the following:
	- Introduce new data plane: MPLS
	- Introduce new control plane: BGP
	- Significant changes to management and operational practices
	- New network admin skills required
That's not the best incremental impact story.  The last one is particularly
important.

Incremental deployment also leaves a bit to be desired, as the
approach described in the marques-l3vpn-end-system draft does not work with
existing VM live migration implementations, because the VM's IP addressing
("VM route table" in the draft) has to be modified.  This has some
consequences:

	- Existing live VM migration implementations won't be able to move
		a VM between VLAN and MPLS-BGP environments because they
		don't reconfigure the VM route table.
	- New live VM migration implementations that want to support this
		sort of cross-environment migration will need additional
		functionality (e.g., may need to coax the VM to go renew
		its DHCP lease to discover what happened, and hope it copes).
	- Cold migration of a VM between VLAN and MPLS-BGP environments
		requires reconfiguration to change the VM route table. 
		Similarly, use of a common VM template across both environments
		requires an additional reconfiguration step.
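
The cases above can be sketched as a small decision model.  This is a
simplification under stated assumptions, not anything taken from the
draft: it assumes the VM route table needs to change only when a VM
crosses between the two environments, and all names are hypothetical.

```python
# Hypothetical model of the migration cases listed above; names are illustrative.
from enum import Enum


class Env(Enum):
    VLAN = "vlan"
    MPLS_BGP = "mpls-bgp"


def needs_route_table_change(src: Env, dst: Env) -> bool:
    """Assumption: the VM route table changes only when crossing environments."""
    return src != dst


def live_migration_feasible(src: Env, dst: Env,
                            reconfigures_route_table: bool) -> bool:
    """Feasible if no change is needed, or the migration tool can make it."""
    return (not needs_route_table_change(src, dst)) or reconfigures_route_table


# Existing implementations do not reconfigure the VM route table.
print(live_migration_feasible(Env.VLAN, Env.VLAN, False))      # True
print(live_migration_feasible(Env.VLAN, Env.MPLS_BGP, False))  # False
# A new implementation with the extra functionality could attempt it.
print(live_migration_feasible(Env.VLAN, Env.MPLS_BGP, True))   # True
```

"Feasible" here only means the tool can attempt the move; whether the
guest copes with the change is a separate question.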

If you'll pardon the double-negative, this isn't to say that an MPLS-BGP
approach can't be made to work, and I'd definitely like to see a common
protocol between end systems and the network for new live migration 
implementations that can be used by any relevant technology.  Rather, the
deployment and operational pain of the above makes it a non-starter
courtesy of its impacts (data plane, control plane, management, operational
practices, required netadmin skills), and operational drawbacks wrt
existing live VM migration deployments.

I believe that with enough engineering and design work, something workable
will emerge here, but I'm concerned that for a significant portion of the
problem space, this is akin to trying to turn a hammer into a powered
screwdriver.

Thanks,
--David

> -----Original Message-----
> From: nvo3-bounces@ietf.org [mailto:nvo3-bounces@ietf.org] On Behalf Of Yakov Rekhter
> Sent: Monday, February 20, 2012 9:31 AM
> To: Black, David
> Cc: narten@us.ibm.com; jdrake@juniper.net; rbonica@juniper.net; nvo3@ietf.org; afarrel@juniper.net;
> nitinb@juniper.net
> Subject: Re: [nvo3] Draft NVO3 WG Charter
> 
> David,
> 
> > Hi John,
> >
> > > > BGP and MPLS are non-starters for a lot of datacenter-internal
> > > > networks.
> > >
> > > [JD]  This is an assertion.  It also misses the fact that MPLS
> > > is only required to mux/demux packets at the edges of the VPN network.
> >
> > Indeed it is, but I stand by it.  The interesting "edges of the VPN
> > network" for NVO include datacenter ToR switches, datacenter access
> > switches and hypervisor softswitches - there are plenty of examples of
> > these for which MPLS and BGP are non-starters.
> 
> What are the specific *technical* reason(s) why MPLS over GRE is
> a "non-starter" for (a) ToR switches, (b) datacenter access switches,
> and (c) hypervisor softswitches ?
> 
> Yakov.
> 
> > I suggest reading the NVGRE and VXLAN drafts for more context:
> >
> >    http://tools.ietf.org/html/draft-sridharan-virtualization-nvgre-00
> >    http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-00
> >
> >
> >
> > Thanks,
> > --David
> > ----------------------------------------------------
> > David L. Black, Distinguished Engineer
> > EMC Corporation, 176 South St., Hopkinton, MA  01748
> > +1 (508) 293-7953             FAX: +1 (508) 293-7786
> > david.black@emc.com        Mobile: +1 (978) 394-7754
> > ----------------------------------------------------