Re: [armd] [Gen-art] review: draft-ietf-armd-problem-statement-03

"Joel M. Halpern" <> Wed, 29 August 2012 16:07 UTC

Date: Wed, 29 Aug 2012 12:07:48 -0400
From: "Joel M. Halpern" <>
To: Thomas Narten <>
Cc:, "A. Jean Mahoney" <>, "" <>
Subject: Re: [armd] [Gen-art] review: draft-ietf-armd-problem-statement-03

All of the proposed resolutions look very good.  Thank you.

With regard to routers and ARP caches, my concern is that from what I 
saw over the years, common practice did not seem to match the SHOULD from 
the RFCs.  I am a little remote from most implementations at the moment 
(the ones I can check easily are a tiny fraction of the market), so I 
was suggesting that be double-checked.


On 8/29/2012 11:59 AM, Thomas Narten wrote:
> Hi Joel.
> Thanks for the review comments. (And sorry for taking so long to respond!)
> "Joel M. Halpern" <> writes:
>> Major issues:
>>       The use of the term "switch" seems confusing.  I had first assumed
>> that it meant an Ethernet switch (which might have a bit of L3 smarts,
>> or might not.  I was trying not to be picky.)  But then, in section 6.3
>> it refers to "core switches ... are the data center gateways to external
>> networks" which means that those are routers.
> The switch vs. router terminology is tricky.
> 6.3 says:
>     Core switches connect multiple aggregation switches and are the data
>     center gateway(s) to external networks or interconnect to different
>     sets of racks within one data center.
> How about I change that to:
>     Core switches connect multiple aggregation switches and interface
>     with data center gateway(s) to external networks or interconnect to
>     different sets of racks within one data center.
> I know that is just sidestepping this a bit, but Section 6.4 has more
> text about the L2/L3 boundaries in various deployments. This document
> is walking a bit of a tightrope by trying to be general and not too
> specific. If we get too specific, folk start screaming "that's not the
> way my data center looks".
>> Moderate Issue:
>>      The document seems to be interestingly selective in what modern
>> technologies it chooses to mention.  Mostly it seems to be describing
>> problems with data center networks using technology more than 5 years
>> old.  Since that is the widely deployed practice, that is defensible.
> I think this has to do with how the WG was chartered.
>> But then the document chooses to mention new work such as OpenFlow,
>> without mentioning the work IEEE has done on broadcast and multicast
>> containment for data centers.  It seems to me that we need to be
>> consistent, either describing only the widely deployed technology, or
>> including a fair mention of already defined and productized solutions
>> that are not yet widely deployed.
> I'd be fine with taking out the references to OpenFlow. I don't think
> it adds much to the document.
>>       On a related note, the document assumes that multicast NDs are
>> delivered to all nodes, while in practice I believe existing techniques
>> to filter such multicast messages closer to the source are widely
>> deployed.  (Section 5.)
> This paragraph has been significantly revised. The current proposed
> text is:
>          Broadly speaking, from the perspective of address resolution,
>          IPv6's Neighbor Discovery (ND) behaves much like ARP, with a
>          few notable differences. First, ARP uses broadcast, whereas ND
>          uses multicast. Specifically, when querying for a target IP
>          address, ND maps the target address into an IPv6 Solicited
>          Node multicast address. Using multicast rather than broadcast
>          has the benefit that the multicast frames do not necessarily
>          need to be sent to all parts of the network, i.e., only to
>          segments where listeners for the Solicited Node multicast
>          address reside. In the case where multicast frames are
>          delivered to all parts of the network, sending to a multicast
>          address still has the advantage that most (if not all) nodes will
>          filter out the (unwanted) multicast query via filters
>          installed in the NIC rather than burdening host software with
>          the need to process such packets. Thus, whereas all nodes must
>          process every ARP query, ND queries are processed only by the
>          nodes to which they are intended. In cases where multicast
>          filtering can't effectively be implemented in the NIC (e.g.,
>          as on hypervisors supporting virtualization), filtering would
>          need to be done in software (e.g., in the hypervisor's
>          vSwitch).
> Is that better?
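
To make the mapping in the proposed text concrete, here is a minimal sketch of how a target address maps to its Solicited-Node multicast group and to the Ethernet multicast MAC that a NIC filter would match on. The function names are made up for illustration; the mapping rules themselves (low-order 24 bits appended to ff02::1:ff00:0/104, and 33:33 followed by the low-order 32 bits of the group) are the standard ones from RFC 4291 and RFC 2464.

```python
import ipaddress

# Solicited-Node multicast prefix ff02::1:ff00:0/104 (RFC 4291).
SOLICITED_NODE_PREFIX = int(ipaddress.IPv6Address("ff02::1:ff00:0"))


def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Return the Solicited-Node multicast group for a unicast target.

    The group is the fixed /104 prefix plus the low-order 24 bits of the
    target address. (Hypothetical helper name, for illustration only.)
    """
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF
    return ipaddress.IPv6Address(SOLICITED_NODE_PREFIX | low24)


def ethernet_multicast_mac(group: ipaddress.IPv6Address) -> str:
    """Map an IPv6 multicast group to its Ethernet MAC (RFC 2464).

    The MAC is 33:33 followed by the low-order 32 bits of the group.
    """
    low32 = int(group) & 0xFFFFFFFF
    octets = [0x33, 0x33] + list(low32.to_bytes(4, "big"))
    return ":".join(f"{o:02x}" for o in octets)


if __name__ == "__main__":
    group = solicited_node_multicast("2001:db8::1:800:200e:8c6c")
    print(group)                          # ff02::1:ff0e:8c6c
    print(ethernet_multicast_mac(group))  # 33:33:ff:0e:8c:6c
```

Only nodes whose own addresses share the same low-order 24 bits join that group, which is why a NIC-level filter on the derived MAC discards most ND queries before host software sees them.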
>> Minor issues:
>>       I presume that section 6.4.2 which describes needing to enable all
>> VLANs on all aggregation ports is a description of current practice,
>> since it is not a requirement of current technologies, either via VLAN
>> management or orchestration?
> Yes.
>>       Section 6.4.4 seems very odd.  The title is "overlays".  Are there
>> widely deployed overlays?
> I keep hearing yes, but proprietary, so little can be said about them.
>> If so, it would be good to name the
>> technologies being referred to here.  If this is intended to refer to
>> the overlay proposal in IETF and IEEE, I think that the characterization
>> is somewhat misleading, and probably is best simply removed.
> Hmm, I didn't actually write this text. It originally came from
> draft-karir-armd-datacenter-reference-arch, which was merged into the
> problem statement document by the WG.
> I agree this section is kind of fuzzy.
> I'm on the fence about what to do. Are there other opinions?
>>       Is the fifth paragraph of section 7.1 on ARP processing and
>> buffering in the absence of ARP cache entries accurate?  I may well be
>> out of date, but it used to be the case that most routers dropped the
>> packets, and some would buffer 1 packet deep at most.  This description
>> indicates a rather more elaborate behavior.
> RFC 1122 says:
>   ARP Packet Queue
>              The link layer SHOULD save (rather than discard) at least
>              one (the latest) packet of each set of packets destined to
>              the same unresolved IP address, and transmit the saved
>              packet when the address has been resolved.
> RFC 1812 says:
> 3.3.2 Address Resolution Protocol - ARP
>     Routers that implement ARP MUST be compliant and SHOULD be
>     unconditionally compliant with the requirements in [INTRO:2].
>     The link layer MUST NOT report a Destination Unreachable error to IP
>     solely because there is no ARP cache entry for a destination; it
>     SHOULD queue up to a small number of datagrams briefly while
>     performing the ARP request/reply sequence, and reply that the
>     destination is unreachable to one of the queued datagrams only when
>     this proves fruitless.
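
As a rough illustration of the behavior those two passages ask for, the sketch below holds a small, bounded number of packets per unresolved next hop, releases them when resolution succeeds, and hands back one packet for a Destination Unreachable report only when resolution fails. The class and method names are hypothetical and this is not how any particular router implements it; a depth of 1 corresponds to the RFC 1122 minimum of keeping the latest packet.

```python
from collections import deque


class ArpPendingQueue:
    """Bounded per-destination holding queue for packets awaiting ARP.

    Hypothetical sketch: `depth` is the "small number" of datagrams
    queued per unresolved IP address (1 = keep only the latest).
    """

    def __init__(self, depth: int = 1):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.depth = depth
        self.pending = {}  # unresolved IP -> deque of held packets

    def enqueue(self, ip: str, packet: bytes) -> bool:
        """Hold a packet; return True if an ARP request should be sent.

        Only the first packet for an unresolved address triggers a
        request. When the queue is full, the oldest packet is dropped
        so that the latest one is always retained.
        """
        queue = self.pending.get(ip)
        first = queue is None
        if first:
            queue = self.pending[ip] = deque(maxlen=self.depth)
        queue.append(packet)
        return first

    def resolved(self, ip: str) -> list:
        """Resolution succeeded: return held packets, oldest first."""
        return list(self.pending.pop(ip, ()))

    def failed(self, ip: str):
        """Resolution failed: drop the queue and return one packet to
        report as unreachable, or None if nothing was held."""
        queue = self.pending.pop(ip, None)
        return queue[-1] if queue else None
```

With `depth=1` this matches the "buffer one packet deep at most" behavior described in the review comment; whether deployed routers queue at all is the part worth double-checking.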
>>       Given that this document says it is a general document about
>> scaling issues for data centers, I am surprised that the security
>> considerations section does not touch on the increased complexity of
>> segregating subscriber traffic (customer A cannot talk to customer B)
>> when there are very large numbers of customers, and the interaction of
>> this with L2 scope.
> The ARMD WG struggled a bit about scope, and all it was chartered to
> do was a problem statement related to address resolution.
> Looking at the title of the document "Problem Statement for ARMD", I'd
> argue that's not helpful for an RFC given that ARMD will close and
> there is no followup WG planned. How about I change the title to
> something like:
>      Address Resolution Problems in Large Data Center Networks
> I don't want to add other issues like traffic segregation to the
> document at this point. Among other things, the WG really doesn't
> have the energy for this... The intro is pretty clear (IMO) about the
> limited scope of the document.
> Thomas