Re: [armd] review of draft-ietf-armd-problem-statement-02
Thomas Narten <narten@us.ibm.com> Fri, 25 May 2012 19:44 UTC
Return-Path: <narten@us.ibm.com>
X-Original-To: armd@ietfa.amsl.com
Delivered-To: armd@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A98F021F87A1 for <armd@ietfa.amsl.com>; Fri, 25 May 2012 12:44:21 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -110.599
X-Spam-Level:
X-Spam-Status: No, score=-110.599 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, RCVD_IN_DNSWL_HI=-8, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id uJ-bmOQm+hZR for <armd@ietfa.amsl.com>; Fri, 25 May 2012 12:44:21 -0700 (PDT)
Received: from e7.ny.us.ibm.com (e7.ny.us.ibm.com [32.97.182.137]) by ietfa.amsl.com (Postfix) with ESMTP id 44D0421F87B6 for <armd@ietf.org>; Fri, 25 May 2012 12:44:19 -0700 (PDT)
Received: from /spool/local by e7.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for <armd@ietf.org> from <narten@us.ibm.com>; Fri, 25 May 2012 15:44:18 -0400
Received: from d01dlp01.pok.ibm.com (9.56.224.56) by e7.ny.us.ibm.com (192.168.1.107) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Fri, 25 May 2012 15:43:45 -0400
Received: from d01relay07.pok.ibm.com (d01relay07.pok.ibm.com [9.56.227.147]) by d01dlp01.pok.ibm.com (Postfix) with ESMTP id D001838C8059 for <armd@ietf.org>; Fri, 25 May 2012 15:43:43 -0400 (EDT)
Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by d01relay07.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q4PJhiLr24510632 for <armd@ietf.org>; Fri, 25 May 2012 15:43:44 -0400
Received: from d01av03.pok.ibm.com (loopback [127.0.0.1]) by d01av03.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q4PJhhMX022251 for <armd@ietf.org>; Fri, 25 May 2012 16:43:44 -0300
Received: from cichlid.raleigh.ibm.com ([9.80.11.36]) by d01av03.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id q4PJhgGX022207 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 25 May 2012 16:43:43 -0300
Received: from cichlid.raleigh.ibm.com (localhost [127.0.0.1]) by cichlid.raleigh.ibm.com (8.14.5/8.12.5) with ESMTP id q4PJhfTb019425; Fri, 25 May 2012 15:43:41 -0400
Message-Id: <201205251943.q4PJhfTb019425@cichlid.raleigh.ibm.com>
To: anoop@alumni.duke.edu
In-reply-to: <CA+-tSzxY2AdMqcOSDDY3A-o+wJj=Ww5FE4btEe1uPgDMbehANA@mail.gmail.com>
References: <CA+-tSzxY2AdMqcOSDDY3A-o+wJj=Ww5FE4btEe1uPgDMbehANA@mail.gmail.com>
Comments: In-reply-to Anoop Ghanwani <ghanwani@gmail.com> message dated "Thu, 03 May 2012 19:21:54 -0700."
Date: Fri, 25 May 2012 15:43:40 -0400
From: Thomas Narten <narten@us.ibm.com>
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 12052519-5806-0000-0000-00001591D53A
Cc: armd@ietf.org
Subject: Re: [armd] review of draft-ietf-armd-problem-statement-02
X-BeenThere: armd@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion of issues associated with large amount of virtual machines being introduced in data centers and virtual hosts introduced by Cloud Computing." <armd.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/armd>, <mailto:armd-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/armd>
List-Post: <mailto:armd@ietf.org>
List-Help: <mailto:armd-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/armd>, <mailto:armd-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 25 May 2012 19:44:21 -0000
Hi Anoop. Thanks for your very detailed review comments. I've adopted
most of them directly. Some questions below.

Anoop Ghanwani <ghanwani@gmail.com> writes:

> Section 4.4.1
> =============
>
> For consistency with the following 2 sections,
> change the title from Layer 3 to L3.
>
> "This topology is ideal for scenarios where servers
> attached to a particular access switch generally run applications
> that are confined to using a single subnet."
>
> I'm not sure I agree with this. There are many issues
> surrounding this, including the capabilities of the devices
> in the network, the use of multicast, and the preferences
> of the network administrator.

I agree that "is ideal" is too strong. How about if I say instead:

   This topology has benefits in scenarios ...

Would that address your concerns?

> "Even though
> layer 2 traffic are still partitioned by VLANs, the fact that all
> VLANs are enabled on all ports can lead to broadcast traffic on all
> VLANs to traverse all links and ports, which is same effect as one
> big Layer 2 domain."
>
> I disagree with this because all VLANs would only
> need to be provisioned on the aggregation-facing ports.
> The disadvantages here are that a lot more broadcast traffic
> hits the aggregation layer; that when we need to cross
> VLAN boundaries, the traffic must go all the way to the
> aggregation switch even though the source and destination
> may be on the same access switch; and the requirement
> for larger ARP tables at the aggregation switches.

I struggled with this for a long while. Is this text any better?:

   <t>
   When the L3 domain only extends to aggregation switches, hosts in
   any of the IP subnets configured on the aggregation switches can be
   reachable via L2 through any access switches if access switches
   enable all the VLANs.
   This topology allows a greater level of flexibility, as servers
   attached to any access switch can be reloaded with applications
   that have been provisioned with IP addresses from multiple prefixes
   as needed. Further, in such an environment, VMs can migrate between
   racks without IP address changes. The drawback of this design,
   however, is that multiple VLANs have to be enabled on all access
   switches and on all access-facing ports of aggregation switches.
   Even though L2 traffic is still partitioned by VLANs, the fact that
   all VLANs are enabled on all ports can lead to broadcast traffic on
   all VLANs traversing all links and ports, which has the same effect
   as one big L2 domain on the access-facing side of the aggregation
   switch. In addition, internal traffic might have to cross L2
   boundaries, resulting in significant ARP/ND load at the aggregation
   switches. This design provides a good tradeoff between flexibility
   and L2 domain size. A moderately sized data center might utilize
   this approach to provide high-availability services at a single
   location.
   </t>

> "However, the
> Overlay Edge switches/routers which perform the network address
> encapsulation/decapsulation must ultimately perform a L2 address
> resolution and could still potentially face scaling issues at that
> point."
>
> It's not the overlay edge switches that have the scaling
> problem, it's the volume of broadcasts that need to be
> sent across the core, and that is not helped simply by
> using an L3 overlay.

I also struggled quite a bit with this comment. Is the following an
improvement?:

   <t>
   A potential problem arises in a large data center when a large
   number of hosts communicate with their peers in different subnets:
   all these hosts send (and receive) data packets via their
   respective L2/L3 boundary nodes, as the traffic flows are generally
   bi-directional. This has the potential to further highlight any
   scaling problems.
   These L2/L3 boundary nodes have to process ARP/ND requests sent
   from originating subnets and resolve physical (MAC) addresses in
   the target subnets for what are generally bi-directional flows.
   Therefore, for maximum flexibility in managing the data center
   workload, it is often desirable to use overlays to place related
   groups of hosts in the same topological subnet and thereby avoid
   the L2/L3 boundary translation. The use of overlays in the data
   center network can be a useful design mechanism to help manage a
   potential bottleneck at the L2/L3 boundary by redefining where that
   boundary exists.
   </t>

> Section 6
> =========
>
> "Thus, whereas all
> nodes must process every ARP query, ND queries are processed only by
> the nodes to which they are intended."
>
> When virtualization is in use, the NIC is often operated
> in promiscuous mode, which means that the packet would
> be delivered to the hypervisor/vswitch and the filtering
> would have to be done there (usually implemented in software),
> making the problem almost as bad as with ARP.

Revised text:

   Thus, whereas all nodes must process every ARP query, ND queries
   are processed only by the nodes to which they are intended. In
   cases where multicast filtering can't effectively be implemented in
   the NIC (e.g., as on hypervisors supporting virtualization),
   filtering would need to be done in software (e.g., in the
   hypervisor's vSwitch).

Thomas
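[Editor's note: the ARP/ND filtering point above comes down to addressing. An ARP request is sent to the Ethernet broadcast address, so every NIC on the segment must accept it, whereas an ND Neighbor Solicitation goes to the target's solicited-node multicast group (RFC 4291), which maps to an Ethernet multicast MAC that only the intended node programs into its receive filter. A minimal Python sketch of that mapping, not part of the original message:]

```python
# Sketch: solicited-node multicast group (RFC 4291, Section 2.7.1) and
# its Ethernet multicast MAC mapping (33:33 + low 32 bits of the group).
# Contrast with ARP, whose requests go to the broadcast MAC below.
import ipaddress

ARP_REQUEST_MAC = "ff:ff:ff:ff:ff:ff"  # every NIC must accept this

def solicited_node_group(target: str) -> ipaddress.IPv6Address:
    """ff02::1:ff00:0/104 plus the low-order 24 bits of the target."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

def ipv6_multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Ethernet mapping: 33:33 followed by the group's low-order 32 bits."""
    low32 = int(group) & 0xFFFFFFFF
    return "33:33:" + ":".join(f"{(low32 >> s) & 0xFF:02x}" for s in (24, 16, 8, 0))

group = solicited_node_group("2001:db8::8:800:200c:417a")
print(group)                      # ff02::1:ff0c:417a
print(ipv6_multicast_mac(group))  # 33:33:ff:0c:41:7a
```

As the Section 6 discussion notes, this filtering only helps when the receive filter is enforced in hardware; a NIC in promiscuous mode delivers everything, pushing the filtering into the hypervisor's vSwitch.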