Re: [armd] how does "draft-sridharan-virtualization-nvgre-00" advertise its external facing hosts' IP addresses to external world?

Gary Berger <gaberger@cisco.com> Fri, 23 September 2011 13:31 UTC

Date: Fri, 23 Sep 2011 09:33:40 -0400
From: Gary Berger <gaberger@cisco.com>
To: Vishwas Manral <vishwas.ietf@gmail.com>, Narasimhan Venkataramaiah <narave@microsoft.com>
Cc: "armd@ietf.org" <armd@ietf.org>
Subject: Re: [armd] how does "draft-sridharan-virtualization-nvgre-00" advertise its external facing hosts' IP addresses to external world?

I think these are all interesting approaches. Whether you demultiplex a
customer on an 802.1ah I-SID, a VXLAN VNI, an NVGRE TNI, or some other set of
bits, we still seem to be carrying excessive state across the network, which
will continue to make things more fragile. If we continue to drive the
problem into the host stack (because that's where the customer has some
choice), we will lose the benefit of building intelligent services in the
network.

The problem space is still the same (i.e., because the TCP pseudo-header
binds the IP address to the point of attachment, we have to encapsulate in
order to hide this limiting factor). The arguments about "whatever over GRE"
are beyond the problem space IMHO, but they are certainly a clear indication
of the flawed model.

Some have suggested the answer is a distributed directory service (for
broadcast containment) which allows the implementor to define their own
bindings, along with the programmability of the network (e.g. OF) in
order to instantiate those bindings into the forwarding plane. Certainly
that solution can only be based on the scope of control, but nonetheless I
thought that this example was where this working group derived its charter.

-g
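
The directory idea mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything taken from the drafts: the class name BindingDirectory, its methods, and the choice of (tenant identifier, inner MAC) as the lookup key are all hypothetical. The point is that a lookup replaces a flooded frame, and that a move only rewrites the binding.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key: (tenant identifier, inner MAC address of the VM).
BindingKey = Tuple[int, str]


class BindingDirectory:
    """Toy directory mapping tenant-scoped addresses to tunnel endpoints."""

    def __init__(self) -> None:
        self._bindings: Dict[BindingKey, str] = {}

    def register(self, tenant_id: int, inner_mac: str, endpoint_ip: str) -> None:
        # Called when a VM appears at (or moves to) a point of attachment.
        self._bindings[(tenant_id, inner_mac.lower())] = endpoint_ip

    def resolve(self, tenant_id: int, inner_mac: str) -> Optional[str]:
        # A lookup here stands in for a flooded ARP or unknown-unicast frame.
        return self._bindings.get((tenant_id, inner_mac.lower()))


directory = BindingDirectory()
directory.register(5001, "00:11:22:33:44:55", "192.0.2.10")
# The VM moves: only the binding changes, its own addresses do not.
directory.register(5001, "00:11:22:33:44:55", "192.0.2.20")
```

A real directory would also have to deal with distribution, staleness, and the scope of control noted above; none of that is modelled here.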


From:  Vishwas Manral <vishwas.ietf@gmail.com>
Date:  Thu, 22 Sep 2011 20:17:59 -0700
To:  Narasimhan Venkataramaiah <narave@microsoft.com>
Cc:  "armd@ietf.org" <armd@ietf.org>
Subject:  Re: [armd] how does
"draft-sridharan-virtualization-nvgre-00" advertise its external facing
hosts' IP addresses to external world?

Hi Simha,
 
I am talking about the difference between Layer-2 mobility and Layer-3
mobility.
 
Thanks,
Vishwas
On Thu, Sep 22, 2011 at 8:12 PM, Narasimhan Venkataramaiah
<narave@microsoft.com> wrote:
> Do you mean it's useful in the context of network virtualization, or in general?
> Mobility is already one aspect of network virtualization, and it's satisfied by
> the MAC-in-GRE option.
>  
> Simha
>  
> From: Vishwas Manral [mailto:vishwas.ietf@gmail.com]
> Sent: Thursday, September 22, 2011 8:08 PM
> 
> 
> To: Murari Sridharan
> Cc: Narasimhan Venkataramaiah; Linda Dunbar; david.black@emc.com;
> armd@ietf.org
> Subject: Re: [armd] how does "draft-sridharan-virtualization-nvgre-00"
> advertise its external facing hosts' IP addresses to external world?
>  
> 
> Murari,
> 
>  
> 
> You could interpret it that way, but I see the increase in scope as
> considerably more than just removing the MAC header, though that would be one
> such advantage.
> 
>  
> 
> Thanks,
> 
> Vishwas
> 
> On Thu, Sep 22, 2011 at 7:41 PM, Murari Sridharan <muraris@microsoft.com>
> wrote:
> 
> In the context of the proposal here, where you can carry your IP around
> wherever you go, isn't that already equivalent to IP mobility? If I
> understand you right, you are simply saying don't just restrict the draft to
> MAC-in-GRE but do IP-in-GRE, right?
>  
> From: Vishwas Manral [mailto:vishwas.ietf@gmail.com]
> Sent: Thursday, September 22, 2011 7:38 PM
> 
> 
> To: Murari Sridharan
> Cc: Narasimhan Venkataramaiah; Linda Dunbar; david.black@emc.com;
> armd@ietf.org
> Subject: Re: [armd] how does "draft-sridharan-virtualization-nvgre-00"
> advertise its external facing hosts' IP addresses to external world?
> 
>  
> 
> Murari, think IP mobility. :)
> 
>  
> 
> On Thu, Sep 22, 2011 at 5:00 PM, Murari Sridharan <muraris@microsoft.com>
> wrote:
> 
> Do you have a scenario in mind?
> 
> 
> From: Vishwas Manral
> Sent: 9/22/2011 4:55 PM
> 
> 
> To: Murari Sridharan
> Cc: Narasimhan Venkataramaiah; Linda Dunbar; david.black@emc.com;
> armd@ietf.org
> Subject: Re: [armd] how does "draft-sridharan-virtualization-nvgre-00"
> advertise its external facing hosts' IP addresses to external world?
> 
> Hi Murari,
> 
>  
> 
> Yes that is what I mean.
> 
>  
> 
> Thanks,
> 
> Vishwas
> 
> On Thu, Sep 22, 2011 at 4:50 PM, Murari Sridharan <muraris@microsoft.com>
> wrote:
> 
> You mean not an Ethernet frame but some IP payload?
>  
> From: Vishwas Manral [mailto:vishwas.ietf@gmail.com]
> Sent: Thursday, September 22, 2011 4:49 PM
> To: Murari Sridharan
> Cc: Narasimhan Venkataramaiah; Linda Dunbar; david.black@emc.com;
> armd@ietf.org 
> 
> 
> Subject: Re: [armd] how does "draft-sridharan-virtualization-nvgre-00"
> advertise its external facing hosts' IP addresses to external world?
> 
>  
> 
> Murari,
> 
>  
> 
> What I am saying is the inner header should be allowed to be L3.
> 
>  
> 
> From the diagram you have, that does not seem to be the case. Am I missing it
> totally?
> 
>  
> 
> Thanks,
> 
> Vishwas
> 
> On Thu, Sep 22, 2011 at 4:43 PM, Murari Sridharan <muraris@microsoft.com>
> wrote:
> 
> Vishwas, thanks for the feedback; we will definitely consider adding that. I am
> not sure what you mean by doing L3 instead of L2. We allow any arbitrary
> virtual topology including L3.
>  
> Thanks
>  
> From: Vishwas Manral [mailto:vishwas.ietf@gmail.com]
> Sent: Thursday, September 22, 2011 4:19 PM
> 
> 
> To: Narasimhan Venkataramaiah
> Cc: Linda Dunbar; Murari Sridharan; david.black@emc.com; armd@ietf.org
> Subject: Re: [armd] how does "draft-sridharan-virtualization-nvgre-00"
> advertise its external facing hosts' IP addresses to external world?
> 
>  
> 
> Hi Simha,
> 
>  
> 
> I see this as the only difference between VXLAN and the NVGRE solution
> (besides, of course, that the TNI needs to be parsed in the intermediate device
> for hashing, and that it uses fewer bytes).
> 
>  
> 
> I would think you should add it to your draft immediately. With tunneling you
> consolidate the addresses visible to the core and by providing a hash
> mechanism, you are providing some level of randomness.
> 
>  
> 
> The other thing you should look at is L3 (IPv4/ IPv6) over NVGRE instead of L2
> alone. I guess it would be the same comment for the VXLAN proposal too.
> 
>  
> 
> Thanks,
> 
> Vishwas
> 
> On Thu, Sep 22, 2011 at 4:11 PM, Narasimhan Venkataramaiah
> <narave@microsoft.com> wrote:
> 
> The draft mentions exactly this as one use of the reserved 8 bits in Section
> 4. An NVGRE endpoint could use the 8 bits to further distribute flows
> belonging to a particular TNI and the switches use all 32 bits to get entropy.
> One step further would be for the switches to get full entropy from the inner
> Ethernet frame. I take it that your comment would be to make it explicit in
> the draft. Right?
>  
>    One such example could be to use the upper 8 bits of the Key field to
>    add flow based entropy and tag all the packets from a flow with an
>    entropy label.
>  
> Simha
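
The use of the reserved bits described in the message above could look roughly like this. It is a sketch under assumptions: the layout (flow entropy in the upper 8 bits of the 32-bit GRE Key field, TNI in the lower 24) follows the sentence quoted from the draft, while the function names and the use of CRC32 as the flow hash are hypothetical choices made for illustration.

```python
import struct
import zlib

TNI_BITS = 24
TNI_MASK = (1 << TNI_BITS) - 1


def flow_entropy(src_ip: str, dst_ip: str, proto: int,
                 src_port: int, dst_port: int) -> int:
    """Reduce the inner flow tuple to 8 bits (CRC32 is an arbitrary choice)."""
    flow = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(flow) & 0xFF


def pack_key(tni: int, entropy: int) -> bytes:
    """Build the 4-byte Key field: 8 entropy bits, then the 24-bit TNI."""
    if not 0 <= tni <= TNI_MASK:
        raise ValueError("TNI must fit in 24 bits")
    if not 0 <= entropy <= 0xFF:
        raise ValueError("entropy must fit in 8 bits")
    return struct.pack("!I", (entropy << TNI_BITS) | tni)


def unpack_key(key: bytes) -> tuple:
    """Return (tni, entropy) from a 4-byte Key field."""
    (value,) = struct.unpack("!I", key)
    return value & TNI_MASK, value >> TNI_BITS
```

Because every packet of a flow produces the same 8-bit value, packets within a flow keep a stable key while different flows of the same tenant can differ in the upper byte.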
>  
> From: Vishwas Manral [mailto:vishwas.ietf@gmail.com]
> Sent: Thursday, September 22, 2011 4:04 PM
> To: Narasimhan Venkataramaiah
> Cc: Linda Dunbar; Murari Sridharan; david.black@emc.com; armd@ietf.org
> Subject: Re: [armd] how does "draft-sridharan-virtualization-nvgre-00"
> advertise its external facing hosts' IP addresses to external world?
> 
>  
> 
> Hi Simha,
> 
>  
> 
> The main (Standards Track) change in your draft is the addition of TNI.
> 
>  
> 
> A question I have: a TNI identifies a particular tenant, so all flows
> from/to a tenant will be hashed to the same path (even with the changes in
> switches to use the TNI for hashing).
> 
>  
> 
> Why do you not use the last 8 bits, which you have kept as reserved, to
> provide randomization so that flows between the same endpoints are hashed
> onto different paths?
> 
>  
> 
> Thanks,
> 
> Vishwas
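
The hashing concern in the message above can be illustrated with a small model. Everything here is an assumption made for the example: a switch is modelled as hashing the outer source address, outer destination address, and the 32-bit GRE key, SHA-256 stands in for whatever hash real hardware uses, and the key layout (8 reserved bits above a 24-bit TNI) and the names nvgre_key and pick_path are hypothetical.

```python
import hashlib
import struct


def nvgre_key(tni: int, entropy: int = 0) -> int:
    # Assumed layout: 8 reserved/entropy bits above a 24-bit TNI.
    return ((entropy & 0xFF) << 24) | (tni & 0xFFFFFF)


def pick_path(outer_src: str, outer_dst: str, key: int, n_paths: int) -> int:
    """Model of an ECMP choice made from the outer header plus the GRE key."""
    material = f"{outer_src}|{outer_dst}|".encode() + struct.pack("!I", key)
    digest = hashlib.sha256(material).digest()
    return int.from_bytes(digest[:4], "big") % n_paths


SRC, DST, TNI, PATHS = "192.0.2.10", "192.0.2.20", 0x00ABCD, 4

# TNI only: every flow between the two endpoints presents identical hash input.
tni_only = {pick_path(SRC, DST, nvgre_key(TNI), PATHS) for _ in range(256)}

# With per-flow entropy in the reserved bits, the hash input varies per flow.
with_entropy = {pick_path(SRC, DST, nvgre_key(TNI, e), PATHS) for e in range(256)}
```

In this model tni_only always contains a single path, whereas with_entropy can contain several, which is the spreading effect being asked for.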
> 
> On Sun, Sep 18, 2011 at 11:01 AM, Narasimhan Venkataramaiah
> <narave@microsoft.com> wrote:
> The easiest from the point of view of configuration would be to route
> everything back through the enterprise, which is not necessarily optimal from
> the enterprise point of view. Are you referring to a scenario where the VMs'
> subnet is split between the cloud and the enterprise? Otherwise I don't see
> the implication for virtualization, as it's no different from getting the
> traffic routed to the enterprise in the first case.
> 
> Simha
> 
> ________________________________________
> From: armd-bounces@ietf.org [armd-bounces@ietf.org] on behalf of Linda Dunbar
> [linda.dunbar@huawei.com]
> Sent: Sunday, September 18, 2011 7:06 AM
> To: Murari Sridharan; david.black@emc.com; armd@ietf.org
> Subject: [armd] how does "draft-sridharan-virtualization-nvgre-00" advertise
> its external facing hosts' IP addresses to external world?
> 
> 
> Hi Murari,
> 
> Thank you very much for sharing the presentation.
> 
> One question:
> 
> For a host within an Enterprise site which needs to communicate with external
> peers, the host either uses a public IP address which is visible to external
> peers or uses a private IP address which is translated to a public address at
> the Enterprise site's gateway.
> 
> When this host is moved to a "Cloud data center", will the "Cloud data center"
> advertise this host's address to external peers? Or will all external peers go
> through the enterprise's gateway to reach this host, which no longer resides in
> the enterprise site?
> 
> Thanks, Linda
> 
>> > -----Original Message-----
>> > From: armd-bounces@ietf.org [mailto:armd-bounces@ietf.org] On Behalf Of
>> > Murari Sridharan
>> > Sent: Saturday, September 17, 2011 3:02 PM
>> > To: david.black@emc.com; armd@ietf.org
>> > Subject: Re: [armd] soliciting typical network designs for ARMD
>> >
>> > FYI, here is a talk that I gave last week in relation to the nvgre
>> > draft below.
>> > http://channel9.msdn.com/Events/BUILD/BUILD2011/SAC-442T
>> >
>> > Thanks
>> > Murari
>> >
>> > -----Original Message-----
>> > From: armd-bounces@ietf.org [mailto:armd-bounces@ietf.org] On Behalf Of
>> > david.black@emc.com
>> > Sent: Friday, September 16, 2011 6:14 AM
>> > To: armd@ietf.org
>> > Subject: Re: [armd] soliciting typical network designs for ARMD
>> >
>> > And two more drafts on this topic:
>> >
>> > http://www.ietf.org/id/draft-mahalingam-dutt-dcops-vxlan-00.txt
>> > http://www.ietf.org/id/draft-sridharan-virtualization-nvgre-00.txt
>> >
>> > The edge switches could be the software switches in hypervisors.
>> >
>> > Thanks,
>> > --David
>> >
>> >
>>> > > -----Original Message-----
>>> > > From: armd-bounces@ietf.org [mailto:armd-bounces@ietf.org] On Behalf
>>> > > Of Warren Kumari
>>> > > Sent: Wednesday, August 31, 2011 3:16 PM
>>> > > To: Vishwas Manral
>>> > > Cc: armd@ietf.org
>>> > > Subject: Re: [armd] soliciting typical network designs for ARMD
>>> > >
>>> > >
>>> > > On Aug 11, 2011, at 11:40 PM, Vishwas Manral wrote:
>>> > >
>>>> > > > Hi Linda/ Anoop,
>>>> > > >
>>>> > > > Here is the example of the design I was talking about, as defined
>> > by google.
>>> > >
>>> > > Just a clarification -- s/as defined by google/as described by
>> > someone
>>> > > who happens to work for google/
>>> > >
>>> > > W
>>> > >
>>>> > > > http://www.ietf.org/id/draft-wkumari-dcops-l3-vmmobility-00.txt
>>>> > > >
>>>> > > > Thanks,
>>>> > > > Vishwas
>>>> > > > On Tue, Aug 9, 2011 at 2:50 PM, Anoop Ghanwani
>> > <anoop@alumni.duke.edu> wrote:
>>>> > > >
>>>>>>>> > > > >>>>
>>>> > > > (though I think if there was a standard way to map Multicast MAC to
>>>> > > > Multicast IP, they could
>>> > > probably use such a standard mechanism).
>>>>>>>> > > > >>>>
>>>> > > >
>>>> > > > They can do that, but then this imposes requirements on the
>>>> > > > equipment to be able to do multicast forwarding, and even if it does,
>>>> > > > because of pruning requirements the number of groups would be very
>>>> > > > large.  The average data center switch probably won't handle that
>>>> > > > many groups.
>>>> > > >
>>>> > > > On Tue, Aug 9, 2011 at 2:41 PM, Vishwas Manral
>> > <vishwas.ietf@gmail.com> wrote:
>>>> > > > Hi Anoop,
>>>> > > >
>>>> > > > From what I know they do not use Multicast GRE (I hear the extra 4
>>>> > > > bytes in the GRE header is a
>>> > > proprietary extension).
>>>> > > >
>>>> > > > I think a directory based mechanism is what is used (though I think
>>>> > > > if there was a standard way to
>>> > > map Multicast MAC to Multicast IP, they could probably use such a
>> > standard mechanism).
>>>> > > >
>>>> > > > Thanks,
>>>> > > > Vishwas
>>>> > > > On Tue, Aug 9, 2011 at 2:03 PM, Anoop Ghanwani
>> > <anoop@alumni.duke.edu> wrote:
>>>> > > > Hi Vishwas,
>>>> > > >
>>>> > > > How do they get multicast through the network in that case?
>>>> > > > Are they planning to use multicast GRE, or just use directory based
>>>> > > > lookups and not worry about multicast applications for now?
>>>> > > >
>>>> > > > Anoop
>>>> > > >
>>>> > > > On Tue, Aug 9, 2011 at 1:27 PM, Vishwas Manral
>> > <vishwas.ietf@gmail.com> wrote:
>>>> > > > Hi Linda,
>>>> > > >
>>>> > > > The data packets can be tunnelled at the ToR over say a GRE packet
>>>> > > > and the core is a Layer-3 core
>>> > > (except for the downstream ports). So we could have encapsulation/
>>> > > decapsulation of L2 over GRE at the ToR.
>>>> > > >
>>>> > > > The very same thing can be done at the hypervisor layer too, in
>>>> > > > which case the entire DC network
>>> > > would look like a Layer-3 flat network including the ToR to server
>>> > > link and the hypervisor would do the tunneling.
>>>> > > >
>>>> > > > I am not sure if you got the points above or not. I know cloud OS
>>>> > > > companies that provide the service
>>> > > and have big announced customers.
>>>> > > >
>>>> > > > Thanks,
>>>> > > > Vishwas
>>>> > > > On Tue, Aug 9, 2011 at 11:51 AM, Linda Dunbar <dunbar.ll@gmail.com>
>> > wrote:
>>>> > > > Vishwas,
>>>> > > >
>>>> > > > In my mind the bullet 1) in the list refers to ToR switches
>>>> > > > downstream ports (facing servers)
>>> > > running Layer 2 and ToR uplinks ports run IP Layer 3.
>>>> > > >
>>>> > > > Have you seen data center networks with ToR switches downstream
>>>> > > > ports (i.e. facing servers) enabling
>>> > > IP routing, even though the physical links are Ethernet?
>>>> > > > If yes, we should definitely include it in the ARMD draft.
>>>> > > >
>>>> > > > Thanks,
>>>> > > > Linda
>>>> > > > On Tue, Aug 9, 2011 at 12:58 PM, Vishwas Manral
>> > <vishwas.ietf@gmail.com> wrote:
>>>> > > > Hi Linda,
>>>> > > > I am unsure what you mean by this, but:
>>>> > > >   * layer 3 all the way to TOR (Top of Rack switches), We can also
>>>> > > > have a hierarchical network, with the core totally Layer-3 (and
>>>> > > > having separate
>>> > > routing), from the hosts still in a large Layer-3 subnet. Another
>>> > > aspect could be to have a totally
>>> > > Layer-3 network.
>>>> > > >
>>>> > > > The difference between them is the link between the servers and the
>> > ToR.
>>>> > > >
>>>> > > > Thanks,
>>>> > > > Vishwas
>>>> > > > On Tue, Aug 9, 2011 at 10:22 AM, Linda Dunbar <dunbar.ll@gmail.com>
>> > wrote:
>>>> > > > During the 81st IETF ARMD WG discussion, it was suggested that it
>> > is
>>>> > > > necessary to document typical
>>> > > data center network designs so that address resolution scaling issues
>>> > > can be properly described. Many data center operators have expressed
>> > that they can't openly reveal their detailed network designs.
>>> > > Therefore, we only want to document anonymous designs without too
>> > much
>>> > > detail. During the journey of establishing ARMD, we have come across
>> > the following typical data center network designs:
>>>> > > >   * layer 3 all the way to TOR (Top of Rack switches),
>>>> > > >   * large layer 2 with hundreds (or thousands) of ToRs being
>>>> > > > interconnected by Layer 2. This
>>> > > design will have thousands of hosts under the L2/L3 boundary router
>>> > > (s)
>>>> > > >   * CLOS design  with thousands of switches. This design will have
>>>> > > > thousands of hosts under the
>>> > > L2/L3 boundary router(s)
>>>> > > > We have heard that each of the designs above has its own problems.
>>>> > > > ARMD problem statements might
>>> > > need to document DC problems under each typical design.
>>>> > > > Please send feedback to us (either to the armd email list  or to
>> > the
>>>> > > > ARMD chair Benson & Linda) to
>>> > > indicate if we have missed any typical Data Center network designs.
>>>> > > >
>>>> > > > Your contribution can greatly accelerate the progress of ARMD WG.
>>>> > > >
>>>> > > > Thank you very much.
>>>> > > >
>>>> > > > Linda & Benson
>>>> > > >

_______________________________________________ armd mailing list
armd@ietf.org https://www.ietf.org/mailman/listinfo/armd