[OPSAWG] AD review of draft-ietf-opsawg-large-flow-load-balancing
Benoit Claise <bclaise@cisco.com> Tue, 18 February 2014 15:55 UTC
Message-ID: <53038256.1040309@cisco.com>
Date: Tue, 18 Feb 2014 16:55:02 +0100
From: Benoit Claise <bclaise@cisco.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.3.0
MIME-Version: 1.0
To: draft-ietf-opsawg-large-flow-load-balancing@tools.ietf.org
Content-Type: multipart/alternative; boundary="------------060305070101070405060908"
Archived-At: http://mailarchive.ietf.org/arch/msg/opsawg/lxKkxqDEQWkCz5aCLGNpyn2TIpE
Cc: "opsawg@ietf.org" <opsawg@ietf.org>
Subject: [OPSAWG] AD review of draft-ietf-opsawg-large-flow-load-balancing
Dear authors,

Here is my AD review of draft-ietf-opsawg-large-flow-load-balancing.

- Section 1:

    Networks extensively use link aggregation groups (LAG) [802.1AX]
    and equal cost multi-paths (ECMP) [RFC 2991] as techniques for
    capacity scaling. For the problems addressed by this document,
    network traffic can be predominantly categorized into two traffic
    types: long-lived large flows and other flows.
    ...
    This draft describes mechanisms for optimal LAG/ECMP component
    link utilization while using hash-based techniques. The mechanisms
    comprise the following steps -- recognizing _large flows_ in a
    router; and assigning the large flows to specific LAG/ECMP
    component links or redistributing the small flows when a component
    link on the router is congested.

    It is useful to keep in mind that in typical use cases for this
    mechanism the _large flows_ are those that consume a significant
    amount of bandwidth on a link, e.g. greater than 5% of link
    bandwidth. The number of such flows would necessarily be fairly
    small, e.g. on the order of 10's or 100's per LAG/ECMP. In other
    words, the number of _large flows_ is NOT expected to be on the
    order of millions of flows. Examples of such large flows would be
    IPsec tunnels in service provider backbone networks or storage
    backup traffic in data center networks.

3 instances of "large flows": do you mean "long-lived large flows"?
If not, why do you make a distinction between long-lived large flows
and other flows in the first paragraph?
I eventually understood the source of confusion when I read the
terminology section:

    Large flow(s): long-lived large flow(s)

Either use the capitalized term in the Intro section (actually
throughout the doc) so that we understand that the term is defined
somewhere, or make it clear in the intro that
large flow(s) = long-lived large flow(s).

-
    This document presents improved load distribution techniques
    based on the large flow awareness.

Improved compared to what?
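For illustration only (this is not taken from the draft): combining the
5% example quoted above with the terminology definition of a large flow
as long-lived, one possible reading is the check sketched below. The
names FlowRate and large_flows, and the 60-second min_duration_s
default, are assumptions made for this sketch; the quoted text gives
only the 5% figure and does not quantify "long-lived".

```python
from dataclasses import dataclass


@dataclass
class FlowRate:
    flow_id: str       # hypothetical label for the flow (e.g. a 5-tuple)
    rate_bps: float    # average rate measured over the observation window
    duration_s: float  # how long the flow has been observed so far


def large_flows(flows, link_bps, bw_fraction=0.05, min_duration_s=60.0):
    """Return IDs of flows that are both large and long-lived.

    bw_fraction reflects the draft's "greater than 5% of link bandwidth"
    example; min_duration_s is an assumed stand-in for "long-lived".
    """
    threshold_bps = bw_fraction * link_bps
    return [
        f.flow_id
        for f in flows
        if f.rate_bps > threshold_bps and f.duration_s >= min_duration_s
    ]


# Example on a 10 Gb/s component link (all values are made up).
flows = [
    FlowRate("ipsec-tunnel-a", 800e6, 3600),
    FlowRate("web-session", 2e6, 5),
    FlowRate("short-burst", 900e6, 10),
    FlowRate("storage-backup", 1.2e9, 1800),
]
print(large_flows(flows, link_bps=10e9))
```

With these made-up numbers, only the IPsec tunnel and the storage
backup qualify: the short burst exceeds 5% of the link but is not
long-lived, which is exactly the distinction the Introduction should
make explicit.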
- In several places, starting with the title and abstract, you speak
about mechanisms (plural). However, looking at section 4.2, it seems
that you propose a single mechanism? Or maybe you consider 4.1, 4.2,
4.3 as different mechanisms?

-
    Step 3) On receiving the alert about the congested component link,
    the operator, through a central management entity, finds the large
    flows mapped to that component link and the LAG/ECMP group to
    which the component link belongs.

    Step 4) The operator can choose to rebalance the large flows on
    lightly loaded component links of the LAG/ECMP group or
    redistribute the small flows on the congested link to other
    component links of the group. The operator, through a central
    management entity, can choose one of the following actions:

    1) Indicate specific large flows to rebalance;
    2) Have the router decide the best large flows to rebalance;
    3) Have the router redistribute all the small flows on the
       congested link to other component links in the group.

"Indicate specific large flows to rebalance", "through a central
management entity": what you describe is basically traffic
engineering.
On the other hand, for 2) and 3), why do you need a central management
entity?

-
    A number of routers support sampling techniques such as sFlow
    [sFlow-v5, sFlow-LAG], PSAMP [RFC 5475] and NetFlow Sampling
    [RFC 3954]. For the purpose of large flow identification, sampling
    must be enabled on all of the egress ports in the router where
    such measurements are desired.

I don't understand the second sentence.
One way to read this is: sampling must be _enabled_ on all of the
egress ports where such measurements are desired.
OK, this is an obvious statement: if the measurements are desired,
enable them.
Or maybe you want to say: _sampling_ must be enabled on all of the
egress ports where such measurements are desired.
This is a false statement: if you have the choice between sampled and
unsampled measurements, use the unsampled ones.
Or maybe you want to say: sampling must be enabled on _all_ of the
egress ports where such measurements are desired.
This is a false statement: if I have ECMP on 2 links and only one of
them cannot do unsampled measurements, then we should not force
sampling on both links.
You see, I'm confused.
You miss a couple of key messages:
- if unsampled measurements are available, use those.
- egress means where LAG/ECMP are enabled (this is important for the
  paragraph starting with "If egress sampling is not available, ingress
  sampling can suffice since the central management entity use")

-
    If egress sampling is not available, ingress sampling can suffice
    since the central management entity used by the sampling technique
    typically has multi-node visibility and can use the samples from
    an immediately downstream node to make measurements for egress
    traffic at the local node.

It's not clear whether "ingress" means the ingress interface of the
router itself or the ingress interface of the downstream router. A
drawing is required.
Both options are possible:
1. ingress interfaces on the router where LAG/ECMP is initiated
   - flow monitoring must be enabled on all ingress interfaces
   - flow monitoring must have a way to know the egress interfaces
2. ingress interfaces of the downstream router
   - only works for LAG or ECMP single hop
   - ingress interfaces = all components from LAG/ECMP (multiple
     ifIndex, typically)
This entire section 4.3.3 needs some improvements.

- On one side, you wrote "Specific algorithms for placement of large
flows are out of scope of this document." On the other side, "The
following parameters are required the configuration of _this_
feature". It seems contradictory.
It's unclear why you need the following parameter:

    . Imbalance threshold: the difference between the utilization of
      the least utilized and most utilized component links. Expressed
      as a percentage of link speed.

Also, does ECMP/LAG always require equivalent link speed for their
components?

-
    5.2. System Configuration and Identification Parameters

    . IP address: The IP address of a specific router that the feature
      is being configured on, or that the large flow placement is
      being applied to.

    . LAG ID: Identifies the LAG. The LAG ID may be required when
      configuring this feature (to apply a specific set of large flow
      identification parameters to the LAG) and will be required when
      specifying flow placement to achieve the desired rebalancing.

    . Component Link ID: Identifies the component link within a LAG.
      This is required when specifying flow placement to achieve the
      desired rebalancing.

Nothing regarding ECMP?

-
    For high speed links, the etherStatsHighCapacityTable MIB
    [RFC 3273] can be used.

Well, only for Ethernet.

EDITORIAL:

- figure 2

OLD:
    +-----------+ ->     +-----------+
    |           | ->     |           |
    |           | ===>   |           |
    |        (1)|--------|(1)        |
    |           | ->     |           |
    |           | ->     |           |
    |   (R1)    | ->     |    (R2)   |

NEW:
    +-----------+ ->     +-----------+
    |           | ->     |           |
    |           | ===>   |           |
    |        (1)|--------|(1)        |
    |           | ->     |           |
    |           | ->     |           |
    |   (R1)    | ->     |    (R2)   |

- The indentation in section 2 is not correct.

- "For tunneling protocols like GRE, VXLAN, NVGRE, STT, etc.,"
You need to expand and provide references.

- a PBR rule
Expand.

-
OLD:
    +-----------+ ->     +-----------+
    |           | ->     |           |
    |           | ===>   |           |
    |        (1)|--------|(1)        |
    |           |        |           |
    |           | ===>   |           |
    |           | ->     |           |
    |           | ->     |           |
    |   (R1)    | ->     |    (R2)   |
    |        (2)|--------|(2)        |

NEW:
    +-----------+ ->     +-----------+
    |           | ->     |           |
    |           | ===>   |           |
    |        (1)|--------|(1)        |
    |           |        |           |
    |           | ===>   |           |
    |           | ->     |           |
    |           | ->     |           |
    |   (R1)    | ->     |    (R2)   |
    |        (2)|--------|(2)        |

-
OLD: The IPFIX information model [RFC 7011]
NEW: The IPFIX information model [RFC 7012]

Regards, Benoit