Re: [armd] soliciting typical network designs for ARMD

Anoop Ghanwani <anoop@alumni.duke.edu> Tue, 09 August 2011 21:49 UTC

Date: Tue, 09 Aug 2011 14:50:07 -0700
From: Anoop Ghanwani <anoop@alumni.duke.edu>
To: Vishwas Manral <vishwas.ietf@gmail.com>
Cc: armd@ietf.org
Subject: Re: [armd] soliciting typical network designs for ARMD

>>>>
(though I think if there were a standard way to map Multicast MAC to
Multicast IP, they could probably use such a standard mechanism).
>>>>

They can do that, but it imposes a requirement on the equipment to be
able to do multicast forwarding, and even if it can, the pruning
requirements mean the number of groups would be very large.  The
average data center switch probably can't handle that many groups.
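
For reference, the standard mapping (RFC 1112) goes the other way,
from multicast IP to multicast MAC, and it is lossy: only the low 23
bits of the group address are copied into the MAC, so 32 IP groups
share each multicast MAC.  A minimal Python sketch of that mapping
(the group addresses are just illustrative):

  import ipaddress

  def mcast_ip_to_mac(group: str) -> str:
      # RFC 1112: the low 23 bits of the IPv4 group address go into
      # the fixed multicast OUI 01:00:5e; the top 5 variable bits of
      # the group ID are simply discarded.
      ip = int(ipaddress.IPv4Address(group))
      assert (ip >> 28) == 0xE, "not an IPv4 multicast address (224/4)"
      low23 = ip & 0x7FFFFF
      return "01:00:5e:%02x:%02x:%02x" % (
          (low23 >> 16) & 0x7F, (low23 >> 8) & 0xFF, low23 & 0xFF)

  # 224.1.1.1 and 225.1.1.1 differ only in the discarded bits, so
  # both print 01:00:5e:01:01:01; a switch keyed on the MAC cannot
  # tell the two groups apart.
  print(mcast_ip_to_mac("224.1.1.1"))
  print(mcast_ip_to_mac("225.1.1.1"))

That ambiguity is why reversing the mapping (MAC to IP) needs extra
state, such as the directory you mention.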

On Tue, Aug 9, 2011 at 2:41 PM, Vishwas Manral <vishwas.ietf@gmail.com> wrote:

> Hi Anoop,
>
> From what I know, they do not use Multicast GRE (I hear the extra 4
> bytes in the GRE header are a proprietary extension).
>
> I think a directory-based mechanism is what is used (though I think
> if there were a standard way to map Multicast MAC to Multicast IP,
> they could probably use such a standard mechanism).
>
> Thanks,
> Vishwas
> On Tue, Aug 9, 2011 at 2:03 PM, Anoop Ghanwani <anoop@alumni.duke.edu> wrote:
>
>> Hi Vishwas,
>>
>> How do they get multicast through the network in that case?
>> Are they planning to use multicast GRE, or just use directory-based
>> lookups and not worry about multicast applications for now?
>>
>> Anoop
>>
>> On Tue, Aug 9, 2011 at 1:27 PM, Vishwas Manral <vishwas.ietf@gmail.com> wrote:
>>
>>> Hi Linda,
>>>
>>> The data packets can be tunnelled at the ToR over, say, GRE, with
>>> the core being a Layer-3 core (except for the downstream ports).
>>> So we could have encapsulation/decapsulation of L2 over GRE at the
>>> ToR.
>>>
>>> The very same thing can be done at the hypervisor layer too, in
>>> which case the entire DC network would look like a flat Layer-3
>>> network, including the ToR-to-server link, and the hypervisor
>>> would do the tunneling.
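>>>
>>> As a rough sketch of that encapsulation (assuming scapy, with all
>>> addresses made up purely for illustration): the ingress ToR or
>>> hypervisor wraps the server's frame in IP/GRE with protocol 0x6558
>>> (Transparent Ethernet Bridging), the Layer-3 core routes only on
>>> the outer header, and the egress endpoint strips it off:
>>>
>>>   from scapy.all import Ether, IP, GRE, Raw
>>>
>>>   # Original L2 frame as sent by a server (inner headers).
>>>   inner = (Ether(src="00:00:00:aa:00:01", dst="00:00:00:aa:00:02")
>>>            / IP(src="10.0.0.1", dst="10.0.0.2") / Raw(b"payload"))
>>>
>>>   # Outer IP/GRE tunnel added by the ingress ToR/hypervisor.
>>>   tunneled = (IP(src="192.0.2.1", dst="192.0.2.2")
>>>               / GRE(proto=0x6558) / inner)
>>>
>>>   # The egress endpoint pops the outer headers to recover the
>>>   # frame unchanged.
>>>   recovered = tunneled[GRE].payload
>>>   assert bytes(recovered) == bytes(inner)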
>>>
>>> I am not sure if you got the points above. I know of cloud OS
>>> companies that provide this service and have publicly announced
>>> large customers.
>>>
>>> Thanks,
>>> Vishwas
>>> On Tue, Aug 9, 2011 at 11:51 AM, Linda Dunbar <dunbar.ll@gmail.com> wrote:
>>>
>>>> Vishwas,
>>>>
>>>> In my mind, bullet 1 in the list refers to the ToR switches'
>>>> downstream ports (facing servers) running Layer 2 while the ToR
>>>> uplink ports run Layer 3 (IP).
>>>>
>>>>
>>>> Have you seen data center networks where the ToR switches'
>>>> downstream ports (i.e., those facing servers) enable IP routing,
>>>> even though the physical links are Ethernet?
>>>> If so, we should definitely include it in the ARMD draft.
>>>>
>>>> Thanks,
>>>> Linda
>>>> On Tue, Aug 9, 2011 at 12:58 PM, Vishwas Manral
>>>> <vishwas.ietf@gmail.com> wrote:
>>>>
>>>>> Hi Linda,
>>>>> I am unsure what you mean by this, but:
>>>>>
>>>>>    1. Layer 3 all the way to the ToR (Top of Rack switches),
>>>>>
>>>>> We can also have a hierarchical network, with the core entirely
>>>>> Layer 3 (with separate routing) and the hosts still in one large
>>>>> Layer-3 subnet. Another option is a totally Layer-3 network.
>>>>>
>>>>> The difference between them is the link between the servers and the
>>>>> ToR.
>>>>>
>>>>> Thanks,
>>>>> Vishwas
>>>>> On Tue, Aug 9, 2011 at 10:22 AM, Linda Dunbar <dunbar.ll@gmail.com> wrote:
>>>>>
>>>>>> During the 81st IETF ARMD WG discussion, it was suggested that
>>>>>> it is necessary to document typical data center network designs
>>>>>> so that address resolution scaling issues can be properly
>>>>>> described. Many data center operators have said that they can't
>>>>>> openly reveal their detailed network designs, so we only want to
>>>>>> document anonymized designs without too much detail. In the
>>>>>> course of establishing ARMD, we have come across the following
>>>>>> typical data center network designs:
>>>>>>
>>>>>>    1. Layer 3 all the way to the ToR (Top of Rack switches);
>>>>>>    2. large Layer 2, with hundreds (or thousands) of ToRs
>>>>>>    interconnected at Layer 2. This design will have thousands
>>>>>>    of hosts under the L2/L3 boundary router(s);
>>>>>>    3. Clos design with thousands of switches. This design will
>>>>>>    also have thousands of hosts under the L2/L3 boundary
>>>>>>    router(s).
>>>>>>
>>>>>> We have heard that each of the designs above has its own
>>>>>> problems. The ARMD problem statement might need to document the
>>>>>> DC problems under each typical design.
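>>>>>>
>>>>>> As a rough illustration of why the large-L2 designs (2 and 3)
>>>>>> stress address resolution (all numbers below are assumptions,
>>>>>> not measurements): ARP requests are broadcast, so every host in
>>>>>> the L2 domain sees every other host's requests.
>>>>>>
>>>>>>   # Back-of-envelope sketch; every value here is assumed.
>>>>>>   hosts = 20_000        # hosts under the L2/L3 boundary
>>>>>>   peers_per_host = 10   # active peers each host resolves
>>>>>>   refresh_s = 60        # assumed ARP cache lifetime (seconds)
>>>>>>
>>>>>>   # Each host re-resolves each peer once per lifetime, and each
>>>>>>   # request is flooded to every host in the domain.
>>>>>>   arp_per_s = hosts * peers_per_host / refresh_s
>>>>>>   print(f"ARP broadcasts/s seen by every host: {arp_per_s:,.0f}")
>>>>>>
>>>>>> That is roughly 3,333 packets/s of ARP at every NIC and vSwitch,
>>>>>> and the L2/L3 boundary router(s) must also answer every request
>>>>>> aimed at the gateway address.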
>>>>>> Please send feedback to us (either on the armd email list or to
>>>>>> the ARMD chairs, Benson & Linda) to indicate whether we have
>>>>>> missed any typical data center network designs.
>>>>>>
>>>>>> Your contribution can greatly accelerate the progress of the
>>>>>> ARMD WG.
>>>>>>
>>>>>> Thank you very much.
>>>>>>
>>>>>> Linda & Benson
>>>>>>
>>>>>