Re: [armd] how does "draft-sridharan-virtualization-nvgre-00" advertise its external facing hosts' IP addresses to external world?

Vishwas Manral <vishwas.ietf@gmail.com> Fri, 23 September 2011 02:35 UTC

Date: Thu, 22 Sep 2011 19:37:45 -0700
Message-ID: <CAOyVPHQFD_nB8bRjsLU0idihH=qTQMC_Y=Vh9UnUOt3qoxCo7A@mail.gmail.com>
From: Vishwas Manral <vishwas.ietf@gmail.com>
To: Murari Sridharan <muraris@microsoft.com>
Content-Type: multipart/alternative; boundary="0016e64ed514b5d86604ad92b1ff"
Cc: "armd@ietf.org" <armd@ietf.org>
Subject: Re: [armd] how does "draft-sridharan-virtualization-nvgre-00" advertise its external facing hosts' IP addresses to external world?

Murari, think IP mobility. :)

On Thu, Sep 22, 2011 at 5:00 PM, Murari Sridharan <muraris@microsoft.com> wrote:

>  Do you have a scenario in mind?
> ------------------------------
> From: Vishwas Manral
> Sent: 9/22/2011 4:55 PM
>
> To: Murari Sridharan
> Cc: Narasimhan Venkataramaiah; Linda Dunbar; david.black@emc.com;
> armd@ietf.org
> Subject: Re: [armd] how does "draft-sridharan-virtualization-nvgre-00"
> advertise its external facing hosts' IP addresses to external world?
>
>  Hi Murari,
>
> Yes that is what I mean.
>
> Thanks,
> Vishwas
> On Thu, Sep 22, 2011 at 4:50 PM, Murari Sridharan <muraris@microsoft.com> wrote:
>
>>  You mean not an Ethernet frame but some IP payload?
>>
>> *From:* Vishwas Manral [mailto:vishwas.ietf@gmail.com]
>> *Sent:* Thursday, September 22, 2011 4:49 PM
>> *To:* Murari Sridharan
>> *Cc:* Narasimhan Venkataramaiah; Linda Dunbar; david.black@emc.com;
>> armd@ietf.org
>>
>> *Subject:* Re: [armd] how does "draft-sridharan-virtualization-nvgre-00"
>> advertise its external facing hosts' IP addresses to external world?
>>
>> Murari,
>>
>> What I am saying is the inner header should be allowed to be L3.
>>
>> From the diagram you have that does not seem to be the case. Am I missing
>> it totally?
>>
>> Thanks,
>> Vishwas
>>
>> On Thu, Sep 22, 2011 at 4:43 PM, Murari Sridharan <muraris@microsoft.com>
>> wrote:
>>
>> Vishwas, thanks for the feedback; we will definitely consider adding that.
>> I am not sure what you mean by doing L3 instead of L2. We allow any
>> arbitrary virtual topology including L3.
>>
>> Thanks
>>
>> *From:* Vishwas Manral [mailto:vishwas.ietf@gmail.com]
>> *Sent:* Thursday, September 22, 2011 4:19 PM
>> *To:* Narasimhan Venkataramaiah
>> *Cc:* Linda Dunbar; Murari Sridharan; david.black@emc.com; armd@ietf.org
>> *Subject:* Re: [armd] how does "draft-sridharan-virtualization-nvgre-00"
>> advertise its external facing hosts' IP addresses to external world?
>>
>> Hi Simha,
>>
>> I see this as the only difference between VXLAN and the NVGRE solution
>> (besides, of course, that the TNI needs to be parsed in the intermediate
>> device for hashing, and that it uses fewer bytes).
>>
>> I would think you should add it to your draft immediately. With
>> tunneling you consolidate the addresses visible to the core, and by providing
>> a hash mechanism you are providing some level of randomness.
>>
>> The other thing you should look at is L3 (IPv4/IPv6) over NVGRE instead
>> of L2 alone. I guess it would be the same comment for the VXLAN proposal
>> too.
>>
>> Thanks,
>> Vishwas
>>
>> On Thu, Sep 22, 2011 at 4:11 PM, Narasimhan Venkataramaiah <
>> narave@microsoft.com> wrote:
>>
>> The draft mentions exactly this as one use of the reserved 8 bits in
>> Section 4. An NVGRE endpoint could use the 8 bits to further distribute
>> flows belonging to a particular TNI and the switches use all 32 bits to get
>> entropy. One step further would be for the switches to get full entropy from
>> the inner Ethernet frame. I take it that your comment would be to make it
>> explicit in the draft. Right?
>>
>>    One such example could be to use the upper 8 bits of the Key field to
>>    add flow based entropy and tag all the packets from a flow with an
>>    entropy label.
>>
>> Simha
>>
>> *From:* Vishwas Manral [mailto:vishwas.ietf@gmail.com]
>> *Sent:* Thursday, September 22, 2011 4:04 PM
>> *To:* Narasimhan Venkataramaiah
>> *Cc:* Linda Dunbar; Murari Sridharan; david.black@emc.com; armd@ietf.org
>> *Subject:* Re: [armd] how does "draft-sridharan-virtualization-nvgre-00"
>> advertise its external facing hosts' IP addresses to external world?
>>
>> Hi Simha,
>>
>> The main (Standards Track) change in your draft is the addition of TNI.
>>
>> A question I have: a TNI identifies a particular tenant, so all flows
>> from/to a tenant will be hashed to the same path (even with the changes in
>> switches to do hashing to use TNI).
>>
>> Why do you not use the last 8 bits, which you have kept as reserved, to
>> provide randomization so that flows between the same endpoints are hashed
>> onto different paths?
>>
>> Thanks,
>> Vishwas
>>
>> On Sun, Sep 18, 2011 at 11:01 AM, Narasimhan Venkataramaiah <
>> narave@microsoft.com> wrote:
>>
>> The easiest from the point of view of configuration would be to route
>> everything back through the enterprise - not necessarily the optimal from
>> the enterprise point of view. Are you referring to a scenario where the VMs'
>> subnet is split between the cloud and the enterprise? Otherwise I don't see
>> the implication on virtualization as it's no different than getting the
>> traffic routed to the enterprise in the first case.
>>
>> Simha
>>
>> ________________________________________
>> From: armd-bounces@ietf.org [armd-bounces@ietf.org] on behalf of Linda
>> Dunbar [linda.dunbar@huawei.com]
>> Sent: Sunday, September 18, 2011 7:06 AM
>> To: Murari Sridharan; david.black@emc.com; armd@ietf.org
>> Subject: [armd] how does "draft-sridharan-virtualization-nvgre-00"
>> advertise its external facing hosts' IP addresses to external world?
>>
>>
>> Hi Murari,
>>
>> Thank you very much for sharing the presentation.
>>
>> One question:
>>
>> For a host within an Enterprise site which needs to communicate with
>> external peers, the host either uses a public IP address which is visible to
>> external peers or uses a private IP address which is translated to a public
>> address at the Enterprise site's gateway.
>>
>> When this host is moved to a "Cloud data center", will the "Cloud data
>> center" advertise this host's address to external peers? Or will all external
>> peers go through the enterprise's gateway to reach this host, which is no
>> longer residing in the enterprise site?
>>
>> Thanks, Linda
>>
>> > -----Original Message-----
>> > From: armd-bounces@ietf.org [mailto:armd-bounces@ietf.org] On Behalf Of
>> > Murari Sridharan
>> > Sent: Saturday, September 17, 2011 3:02 PM
>> > To: david.black@emc.com; armd@ietf.org
>> > Subject: Re: [armd] soliciting typical network designs for ARMD
>> >
>> > FYI, here is a talk that I gave last week in relation to the nvgre
>> > draft below.
>> > http://channel9.msdn.com/Events/BUILD/BUILD2011/SAC-442T
>> >
>> > Thanks
>> > Murari
>> >
>> > -----Original Message-----
>> > From: armd-bounces@ietf.org [mailto:armd-bounces@ietf.org] On Behalf Of
>> > david.black@emc.com
>> > Sent: Friday, September 16, 2011 6:14 AM
>> > To: armd@ietf.org
>> > Subject: Re: [armd] soliciting typical network designs for ARMD
>> >
>> > And two more drafts on this topic:
>> >
>> > http://www.ietf.org/id/draft-mahalingam-dutt-dcops-vxlan-00.txt
>> > http://www.ietf.org/id/draft-sridharan-virtualization-nvgre-00.txt
>> >
>> > The edge switches could be the software switches in hypervisors.
>> >
>> > Thanks,
>> > --David
>> >
>> >
>> > > -----Original Message-----
>> > > From: armd-bounces@ietf.org [mailto:armd-bounces@ietf.org] On Behalf
>> > > Of Warren Kumari
>> > > Sent: Wednesday, August 31, 2011 3:16 PM
>> > > To: Vishwas Manral
>> > > Cc: armd@ietf.org
>> > > Subject: Re: [armd] soliciting typical network designs for ARMD
>> > >
>> > >
>> > > On Aug 11, 2011, at 11:40 PM, Vishwas Manral wrote:
>> > >
>> > > > Hi Linda/ Anoop,
>> > > >
>> > > > Here is the example of the design I was talking about, as defined
>> > by google.
>> > >
>> > > Just a clarification -- s/as defined by google/as described by
>> > someone
>> > > who happens to work for google/
>> > >
>> > > W
>> > >
>> > > > http://www.ietf.org/id/draft-wkumari-dcops-l3-vmmobility-00.txt
>> > > >
>> > > > Thanks,
>> > > > Vishwas
>> > > > On Tue, Aug 9, 2011 at 2:50 PM, Anoop Ghanwani
>> > <anoop@alumni.duke.edu> wrote:
>> > > >
>> > > > >>>>
>> > > > (though I think if there was a standard way to map Multicast MAC to
>> > > > Multicast IP, they could probably use such a standard mechanism).
>> > > > >>>>
>> > > >
>> > > > They can do that, but then this imposes requirements on the
>> > > > equipment to be able to do multicast forwarding, and even if it does,
>> > > > because of pruning requirements the number of groups would be very
>> > > > large.  The average data center switch probably won't handle that
>> > > > many groups.
>> > > >
>> > > > On Tue, Aug 9, 2011 at 2:41 PM, Vishwas Manral
>> > <vishwas.ietf@gmail.com> wrote:
>> > > > Hi Anoop,
>> > > >
>> > > > From what I know they do not use Multicast GRE (I hear the extra 4
>> > > > bytes in the GRE header is a proprietary extension).
>> > > >
>> > > > I think a directory based mechanism is what is used (though I think
>> > > > if there was a standard way to map Multicast MAC to Multicast IP,
>> > > > they could probably use such a standard mechanism).
>> > > >
>> > > > Thanks,
>> > > > Vishwas
>> > > > On Tue, Aug 9, 2011 at 2:03 PM, Anoop Ghanwani
>> > <anoop@alumni.duke.edu> wrote:
>> > > > Hi Vishwas,
>> > > >
>> > > > How do they get multicast through the network in that case?
>> > > > Are they planning to use multicast GRE, or just use directory based
>> > > > lookups and not worry about multicast applications for now?
>> > > >
>> > > > Anoop
>> > > >
>> > > > On Tue, Aug 9, 2011 at 1:27 PM, Vishwas Manral
>> > <vishwas.ietf@gmail.com> wrote:
>> > > > Hi Linda,
>> > > >
>> > > > The data packets can be tunnelled at the ToR over say a GRE packet
>> > > > and the core is a Layer-3 core
>> > > (except for the downstream ports). So we could have encapsulation/
>> > > decapsulation of L2 over GRE at the ToR.
>> > > >
>> > > > The very same thing can be done at the hypervisor layer too, in
>> > > > which case the entire DC network
>> > > would look like a Layer-3 flat network including the ToR to server
>> > > link and the hypervisor would do the tunneling.
>> > > >
>> > > > I am not sure if you got the points above or not. I know cloud OS
>> > > > companies that provide the service
>> > > and have big announced customers.
>> > > >
>> > > > Thanks,
>> > > > Vishwas
>> > > > On Tue, Aug 9, 2011 at 11:51 AM, Linda Dunbar <dunbar.ll@gmail.com>
>> > wrote:
>> > > > Vishwas,
>> > > >
>> > > > In my mind the bullet 1) in the list refers to ToR switches
>> > > > downstream ports (facing servers)
>> > > running Layer 2 and ToR uplinks ports run IP Layer 3.
>> > > >
>> > > > Have you seen data center networks with ToR switches downstream
>> > > > ports (i.e. facing servers) enabling
>> > > IP routing, even though the physical links are Ethernet?
>> > > > If yes, we should definitely include it in the ARMD draft.
>> > > >
>> > > > Thanks,
>> > > > Linda
>> > > > On Tue, Aug 9, 2011 at 12:58 PM, Vishwas Manral
>> > <vishwas.ietf@gmail.com> wrote:
>> > > > Hi Linda,
>> > > > I am unsure what you mean by this, but:
>> > > >   * layer 3 all the way to TOR (Top of Rack switches), We can also
>> > > > have a hierarchical network, with the core totally Layer-3 (and
>> > > > having separate routing), from the hosts still in a large Layer-3
>> > > > subnet. Another aspect could be to have a totally Layer-3 network.
>> > > >
>> > > > The difference between them is the link between the servers and the
>> > ToR.
>> > > >
>> > > > Thanks,
>> > > > Vishwas
>> > > > On Tue, Aug 9, 2011 at 10:22 AM, Linda Dunbar <dunbar.ll@gmail.com>
>> > wrote:
>> > > > During the 81st IETF ARMD WG discussion, it was suggested that it is
>> > > > necessary to document typical data center network designs so that
>> > > > address resolution scaling issues can be properly described. Many data
>> > > > center operators have expressed that they can't openly reveal their
>> > > > detailed network designs. Therefore, we only want to document anonymous
>> > > > designs without too much detail. During the journey of establishing
>> > > > ARMD, we have come across the following typical data center network
>> > > > designs:
>> > > >   * layer 3 all the way to TOR (Top of Rack switches),
>> > > >   * large layer 2 with hundreds (or thousands) of ToRs being
>> > > >     interconnected by Layer 2. This design will have thousands of hosts
>> > > >     under the L2/L3 boundary router(s)
>> > > >   * CLOS design with thousands of switches. This design will have
>> > > >     thousands of hosts under the L2/L3 boundary router(s)
>> > > > We have heard that each of the designs above has its own problems.
>> > > > ARMD problem statements might need to document DC problems under each
>> > > > typical design.
>> > > > Please send feedback to us (either to the armd email list or to the
>> > > > ARMD chair Benson & Linda) to indicate if we have missed any typical
>> > > > Data Center network designs.
>> > > >
>> > > > Your contribution can greatly accelerate the progress of ARMD WG.
>> > > >
>> > > > Thank you very much.
>> > > >
>> > > > Linda & Benson
>> > > >
>>
>
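
To make the Key-field discussion quoted above concrete, here is a minimal sketch in Python of carrying a 24-bit TNI plus 8 bits of per-flow entropy in the 32-bit GRE Key, and of why a switch that hashes on the TNI alone keeps all of a tenant's flows on one path. It is only an illustration, not text from the draft: the bit layout follows the quoted sentence (entropy in the upper 8 bits), and the function names, the CRC32-based hash, and the toy path selection are all assumptions.

```python
import zlib

TNI_BITS = 24      # tenant network identifier width discussed in the thread
ENTROPY_BITS = 8   # the reserved bits of the 32-bit GRE Key


def flow_entropy(src_ip, dst_ip, proto, src_port, dst_port):
    """Hypothetical helper: fold an inner flow 5-tuple into 8 bits."""
    flow = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(flow) & ((1 << ENTROPY_BITS) - 1)


def pack_key(tni, entropy):
    """Build a 32-bit Key: entropy in the upper 8 bits, TNI in the lower 24.

    The placement is an assumption taken from the quoted draft sentence.
    """
    if not 0 <= tni < (1 << TNI_BITS):
        raise ValueError("TNI must fit in 24 bits")
    if not 0 <= entropy < (1 << ENTROPY_BITS):
        raise ValueError("entropy must fit in 8 bits")
    return (entropy << TNI_BITS) | tni


def unpack_key(key):
    """Split a 32-bit Key back into (tni, entropy)."""
    return key & ((1 << TNI_BITS) - 1), key >> TNI_BITS


def pick_path(key, n_paths, use_entropy_bits=True):
    """Toy stand-in for a switch's ECMP choice based on the Key field."""
    tni, _ = unpack_key(key)
    hashed = key if use_entropy_bits else tni
    return zlib.crc32(hashed.to_bytes(4, "big")) % n_paths


if __name__ == "__main__":
    tni = 0x00ABCD
    flows = [("10.0.0.1", "10.0.0.2", 6, 40000 + i, 80) for i in range(8)]
    keys = [pack_key(tni, flow_entropy(*f)) for f in flows]
    # Hashing on the TNI only: every flow of the tenant lands on one path.
    print({pick_path(k, 4, use_entropy_bits=False) for k in keys})
    # Hashing on all 32 bits: flows may be spread over several paths.
    print({pick_path(k, 4) for k in keys})
```

Because every packet of a flow gets the same entropy value, packets within a flow still follow a single path; only different flows of the same tenant can diverge.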