[armd] review of

Anoop Ghanwani <ghanwani@gmail.com> Fri, 04 May 2012 02:21 UTC

As one of the people who had volunteered to review
draft-ietf-armd-problem-statement-02, here are my
comments on the document.  They are mostly editorial
but there are a few minor issues with the content.
Otherwise, the document accurately captures the gist
of the problem.

Anoop

=====================

All sections
==========
For consistency
Data Center -> data center
Access Layer, Access layer -> access layer
Aggregation Layer, Aggregation layer -> aggregation layer
Layer 2, layer 2 -> L2
Layer 3, layer 3 -> L3
Use ARP Request or ARP request consistently (prefer ARP Request)

Section 1
=========
datacenters -> data centers

Section 2
=========
In the definition for VM:
Replace "bare" with "physical, non-virtualized".

In the definition for EoR:
Delete extraneous "network".

Section 3
=========
Change
"poor implementations of loop detection and prevention"
to
"poor implementations of loop detection and prevention or
misconfiguration errors".

Change
"With virtualization, a single physical server can host 10 (or more)
VMs, each having its own IP (and MAC) addresses"
to
"With virtualization, a single physical server can host many VMs,
each having its own network
interfaces with IP and MAC addresses."
[The number is captured in a later sentence.]

"services" is used all over the place, but there is no definition
for services.  Application, however, is defined.  I think what is
meant here is application.

Last paragraph:
These number -> This number.

Below Figure 1:
Delete "Figure 1".

Section 4.1
==========

Change
"The access switches might be placed either on
  top-of-rack (ToR) or at end-of-row (EoR) physical configuration."
to
"The access layer may be implemented by wiring the
servers within a rack to a top-of-rack (ToR) switch or,
less commonly, the servers could be wired directly to an
end-of-row (EoR) switch."

Section 4.4.1
============

For consistency with the following 2 sections
change title from Layer 3 to L3.

"This topology is ideal for scenarios where servers
  attached to a particular access switch generally run applications
  that are are confined to using a single subnet."
I'm not sure I agree with this.  There are many issues
surrounding this, including the capabilities of the devices
in the network, the use of multicast, and the preferences
of the network administrator.

Section 4.4.2
============
"This topology allows for a great deal
   of flexibility as servers attached to one access switch can be re-
   loaded with applications with different IP prefix and VMs can now
   migrate between racks without IP address changes. "
to
"This topology allows a greater level of flexibility as
servers attached to any access switch can be reloaded
with applications and provisioned with IP addresses
from multiple prefixes as needed.  Further, in such
an environment, VMs can migrate between racks without
IP address changes."

Change
"layer 2 traffic are still partitioned by VLANs"
to
"layer 2 traffic is still partitioned using VLANs"

"Even though
   layer 2 traffic are still partitioned by VLANs, the fact that all
   VLANs are enabled on all ports can lead to broadcast traffic on all
   VLANs to traverse all links and ports, which is same effect as one
   big Layer 2 domain. "
I disagree with this because all VLANs would only
need to be provisioned on the aggregation-facing ports.
The disadvantages here are that a lot more broadcast traffic
hits the aggregation layer, that traffic crossing VLAN
boundaries must go all the way to the aggregation switch
even though the source and destination may be on the same
access switch, and that larger ARP tables are required
at the aggregation switches.
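
To make the provisioning point concrete, here is a minimal Python
sketch of broadcast flooding on a single access switch.  The switch
layout, port names, VLAN numbers, and the flood_ports helper are all
hypothetical and not taken from the draft; the model only assumes that
a broadcast is flooded to every port that carries the VLAN, except the
port it arrived on.

```python
def flood_ports(port_vlans, vlan, ingress):
    """Return the ports a broadcast on `vlan` is flooded to (ingress excluded)."""
    return sorted(p for p, vlans in port_vlans.items()
                  if vlan in vlans and p != ingress)

ALL_VLANS = {10, 20, 30}

# Hypothetical access switch: one aggregation-facing uplink, three server ports.
# Case A: every VLAN is enabled on every port (the scenario the draft describes).
everywhere = {"uplink": ALL_VLANS, "s1": ALL_VLANS, "s2": ALL_VLANS, "s3": ALL_VLANS}

# Case B: every VLAN is enabled only on the uplink; each server port
# carries just the VLAN its server actually uses.
uplink_only = {"uplink": ALL_VLANS, "s1": {10}, "s2": {20}, "s3": {10}}

print(flood_ports(everywhere, 10, "s1"))   # ['s2', 's3', 'uplink']
print(flood_ports(uplink_only, 10, "s1"))  # ['s3', 'uplink']
```

In both cases the broadcast still reaches the uplink, which matches
the observation that the aggregation layer sees the extra load; only
in case A does it also reach server ports that have no member in
that VLAN.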

Section 4.4.4
=============
"   There are several approaches regarding how overlay networks can make
   very large layer 2 network scale and enable mobility. "
to
"There are several approaches where overlay networks
can be used to build very large L2 networks to enable VM mobility."

"This can help the data
   center designer to control the size of the L2 domain.  "
It should be clarified that this only applies when L3 overlays
are in use.

"However, the
   Overlay Edge switches/routers which perform the network address
   encapsulation/decapsulation must ultimately perform a L2 address
   resolution and could still potentially face scaling issues at that
   point."
It's not the overlay edge switches that have the scaling
problem; it's the volume of broadcasts that need to be
sent across the core, and that is not helped simply by
using an L3 overlay.

Change
"physical addresses (MAC)"
to
"physical (MAC) addresses"

For consistency, change
"Layer 2 / Layer 3 boundary"
to
"L2/L3 boundary"

Section 4.5
===========

Change
"appropriately sized Access, Aggregation and Core networks"
to
"appropriately sized access, aggregation and core layers"

Change
"Broadly speaking it is desirable"
to
"Broadly speaking, it is desirable"

Section 5
=========
ARP response -> ARP Reply

rerun ARP -> reissue an ARP Request

Section 6
==========

"Thus, whereas all
   nodes must process every ARP query, ND queries are processed only by
   the nodes to which they are intended."
When virtualization is in use, the NIC is often operated
in promiscuous mode, which means that the packet would
be delivered to the hypervisor/vswitch and the filtering
would have to be done there (usually implemented in software),
making the problem almost as bad as with ARP.
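
The following Python sketch illustrates the difference.  It computes
the solicited-node multicast address (RFC 4291) and the corresponding
Ethernet multicast MAC (RFC 2464) that a Neighbor Solicitation is sent
to, and then applies a deliberately simplified NIC filter.  The
function names and the nic_accepts model are assumptions made for
illustration; real NIC and vswitch filtering is more involved.

```python
import ipaddress

def solicited_node_multicast(addr):
    """Solicited-node multicast group: ff02::1:ff00:0/104 plus the low 24 bits."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

def multicast_mac(group):
    """Ethernet MAC for an IPv6 multicast group: 33:33 plus the low 32 bits."""
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)

BROADCAST = "ff:ff:ff:ff:ff:ff"

def nic_accepts(dst_mac, subscribed, promiscuous):
    """Simplified filter: does the NIC pass this frame up to software?"""
    return promiscuous or dst_mac == BROADCAST or dst_mac in subscribed

# Hypothetical target and an unrelated host on the same link.
target = "2001:db8::aa:bbcc"
ns_mac = multicast_mac(solicited_node_multicast(target))
other_host = {multicast_mac(solicited_node_multicast("2001:db8::11:2233"))}

print(ns_mac)                                              # 33:33:ff:aa:bb:cc
print(nic_accepts(ns_mac, other_host, promiscuous=False))  # False: dropped in hardware
print(nic_accepts(ns_mac, other_host, promiscuous=True))   # True: software must filter
print(nic_accepts(BROADCAST, other_host, False))           # True: ARP always gets through
```

With the NIC in promiscuous mode, the ND solicitation for another
node reaches the hypervisor/vswitch just as an ARP broadcast would.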

Section 7.1
=========
ARP query -> ARP Request

Change
   "One common router implementation architecture has ARP processing
   handled"
to
"ARP processing in routers is commonly handled..."

revalidate timer -> aging timer

their ASIC fast paths -> their forwarding ASICs

target's Subnet -> target's subnet

Section 7.2
===========

nodes to which they are intended -> nodes for which they are intended

Section 7.3
===========
ten (or  more) VMs -> many VMs
(in the tens today, but growing rapidly as the number of cores per CPU
increases)