[aqm] Review of draft-ietf-aqm-eval-guidelines

Jim Gettys <jg@freedesktop.org> Mon, 13 April 2015 14:38 UTC


Sorry this review is later than I would have liked: I came down with a cold
after the IETF and this review was half done. Hope this helps.

                                           - Jim

Required behavior, and running code

My immediate, overall reaction is that this document prescribes a large set
of tests to be performed, using RFC 2119 MUST/SHOULD/MAY terminology.

However, I doubt that any AQM algorithm has been evaluated against all of
those tests; and without "running code" for such tests, it is unrealistic
to expect that anyone will actually run them. Instead, this seems to be an
"exercise for the AQM implementer" of large magnitude, possibly so large
that no one will actually perform even a fraction of them.

Will that mean that future AQM algorithms get stalled in working groups
because the tests are not done as prescribed? And how will we compare test
results between algorithms if the test code is not available?

And as there is little consequence that I can see to not performing the
tests, it seems unlikely that anyone will actually perform them.
If one cataloged all the tests requested by this document, the total number
would be very large. I wonder if a small number of more aggressive tests
might not be easier.

It seems to me that if, as a community, we really believe in doing such
evaluations, we need a common set of tools for doing them, so that we can
actually intercompare results easily.

We really need *running code* for tests to evaluate AQM's/packet scheduling
algorithms. I fear that this will be an academic exercise without such a
shared code base for testing.

My reaction to this draft is that without such a set of tests, this
document is an academic exercise that will have little impact. I'd much
prefer to see a (smaller) concrete set of tests devised (probably an
elaboration of the data we get from Toke's netperf-wrapper tool).

Packet scheduling

This draft is essentially bereft of any discussion of packet
scheduling/flow isolation, when, in many ways, it makes a bigger difference
than the AQM (mark/drop algorithm only, in the definition of this document).

Section 12 purports to discuss packet scheduling, but it is 5 sentences (<
1/2 page) total out of a document of >20 pages (of actual discussion, not
boilerplate)!

Somehow I suspect it is worth a bit more discussion ;-).

Yet packet scheduling/AQM do interact at some level; I don't think
that it can/should be handled in a separate draft.

For example, one of the wins of flow isolation is that TCP's ack clocking
starts behaving normally again, and ack compression is much less
problematic. One of the surprises of fq_codel was finding that we were able
to use the full bandwidth of the link in both directions simultaneously
(increasing net aggregate bandwidth on a link).

A second example: once you have flow isolation (e.g. fq_codel), it is still
important to understand if the AQM (mark/drop algorithm) is effective, as
bufferbloat can still kill single application performance if the mark/drop
algorithm is ineffective. Knowing what will happen to that single flow's
behavior still matters.

A third example: ironically, TCP RTT unfairness gets worse when you control
the RTT with an effective AQM. Flow queuing helps with this issue, which is
yet another reason scheduling deserves real discussion here.

Time variance

The draft is silent on the topic of time variance behavior of AQM's.

Those of us facing the problems at the edge of the network are amazingly
aware of how rapidly available bandwidth may change, both because radio is
inherently a shared medium, and because of radio propagation.

Other media are also highly variable, if not as commonly as WiFi/cellular;
even switched Ethernet is very variable these days (vlan's have hidden the
actual bandwidth of a cable from view).
I talked to one VOIP provider who tracked their problems down to what vlans
were doing behind his back.

Cable and other systems are also shared/variable media (e.g. PowerBoost on
Comcast, or even day/night diurnal variation).

*The* most exciting day for me while CoDel was being developed was when
Kathy Nichols shared with me some data from a simulation showing CoDel's
behavior when faced with a dramatic (instantaneous, in this case) drop in
bandwidth (10x), and watching it recover to a new operating point.

It seems to me that some sort of test(s) for what happens when bandwidth
changes (in particular, high to low) are in order to evaluate algorithms.

Unfortunately, the last I knew some simulators were not very good at
simulating this sort of behavior (IIRC, Kathy was using ns2, and she had to
kludge something to get this behavior).
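The scenario is easy to state even if simulators struggle with it. As a
sketch of what such a test would measure, here is a toy fluid model of a
FIFO queue with no AQM; the rates, step time, and load are my illustrative
assumptions, not Kathy's simulation:

```python
def queue_delay_trace(arrival_bps, rate_before, rate_after, step_t, total_t, dt=0.01):
    """FIFO fluid model, no AQM: offered load held constant while the
    link rate steps down at step_t. Returns (time, queueing delay) samples."""
    backlog_bits, trace, t = 0.0, [], 0.0
    while t < total_t:
        rate = rate_before if t < step_t else rate_after
        backlog_bits = max(0.0, backlog_bits + (arrival_bps - rate) * dt)
        trace.append((t, backlog_bits / rate))  # delay = backlog / drain rate
        t += dt
    return trace

# 8 Mbit/s of offered load; the link drops from 10 Mbit/s to 1 Mbit/s at t=1s.
trace = queue_delay_trace(8e6, 10e6, 1e6, step_t=1.0, total_t=2.0)
before = max(d for t, d in trace if t < 1.0)   # no standing queue before the step
after = max(d for t, d in trace if t >= 1.0)   # delay climbs without an AQM
# A good AQM should pull the flow back toward the new operating point instead
# of letting delay grow like this; an unstable one may oscillate around it.
```

The interesting question for an evaluation is how fast, and how smoothly,
the AQM under test closes the gap between "after" and the new operating
point.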

Most important (particularly once you have flow queuing) may be whether
the algorithm becomes unstable/oscillates in the face of such changes or
not.  This is still an area I worry about for WiFi, where the RTT of many
paths is roughly comparable to the kind of variance in throughput seen.
Time will tell, I guess; but right now, we have no tools to explore this in
any sort of testing.

Specific comments

Section 1. Introduction

Para 1.
It is a bit strange there is no reference to fq_codel at this date,
particularly since everyone involved in CoDel much prefers fq_codel and we
cannot anticipate anyone using CoDel in preference to fq_codel; it's in
Linux primarily to enable easy testing, rather than expecting anyone to
*actually* use it.

 Flow queuing is a huge win. And fq_codel is shipping/deployed on millions
of devices at this date.  There is an ID in the working group defining
fq_codel.... Having a concrete thing to talk about that does packet
scheduling will help when you get to redoing section 12.

Para 3:
The first sentence is misleading and untrue, as far as I can tell. SLA's
have had little to do with it, at least in the consumer edge where we hurt
the most today.

Bufferbloat has happened because 1) memory became really cheap, 2) people
worship at the altar of the religion that dropping a packet must be bad,
and 3) passing a single-stream TCP performance test has meant that the
minimum size of a drop-tail queue has been at least the BDP of:

o the maximum bandwidth at which the device might ever run, times
o 100 ms, which is roughly a continental delay.

Net result is *minimum* buffering that is often 5-10 or more times sane
buffering.  Then people, thinking that more must be better, used any
leftover memory. Note that this is already much too much for many
applications (music, VOIP; 100 ms is *not* an acceptable amount of latency
for a connection to impose)....
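To make the arithmetic behind that rule of thumb concrete, a small sketch
(the link rate and RTTs are my illustrative assumptions, not numbers from
the draft):

```python
def rule_of_thumb_buffer(bandwidth_bps, rtt_s=0.100):
    """Buffer sized to one bandwidth-delay product: maximum link rate
    times the assumed 100 ms 'continental' RTT, in bytes."""
    return bandwidth_bps * rtt_s / 8

# A 100 Mbit/s link sized by the rule of thumb:
buf = rule_of_thumb_buffer(100e6)          # 1,250,000 bytes
# If the actual path RTT is 10 ms (a nearby CDN), sane buffering is far less:
sane = rule_of_thumb_buffer(100e6, 0.010)  # 125,000 bytes
print(buf / sane)  # -> 10.0
```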

Compounding this has been accounting in packets, rather than bytes or
time.  So people built operating systems to saturate a high speed link with
tiny UDP packets for RPC, without realizing that the buffering was then
another factor of 20 higher for TCP...

Then these numbers (for Ethernet) were copied into the WiFi drivers without
thought. Broadband head ends and modems aren't much better.

As trying to explain the history of how things went so badly wrong is not
helpful, I'd just strike the first sentence starting "In order to meet" and
simply note that buffering has often been ten to 100 or more times larger
than needed.

The interplanetary record at the moment is held by GoGo Inflight, with 760
seconds of buffering measured (Mars is closer for part of its orbit).

Section 2. End to end metrics.

There is no discussion of any "Flow startup time".

For many/most "ants", the key time for good performance is the time to
*start* or *restart an idle* flow.

Not only are there operations such as DNS lookups, TCP opens, SSL
handshakes, DHCP handshakes, etc., but for web pages, the first packet or
so contains the size information required to lay out a page, which is key
to interactive "speed", all without needing any formal traffic
classification.

The biggest single win of flow queuing is minimizing this lost time;
fq_codel goes even a step further by its "sparse flow" optimization, where
the first packet(s) of a new or idle flow are scheduled before flows that
have built a queue. So DNS lookups, TCP opens, the first data packet in an
image, SSL handshakes, all fly through the router.
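The dispatch idea can be sketched in a few lines of Python; the class and
names are mine, a toy model rather than the Linux fq_codel/DRR code:

```python
from collections import deque

class SparseFQ:
    """Toy dispatcher: packets of newly-active ("sparse") flows are served
    before flows that have built a standing queue."""

    def __init__(self):
        self.queues = {}           # flow_id -> deque of packets
        self.new_flows = deque()   # just-activated flows, served first
        self.old_flows = deque()   # flows with a standing backlog

    def enqueue(self, flow_id, pkt):
        q = self.queues.setdefault(flow_id, deque())
        if not q:                  # flow was idle: give it priority
            self.new_flows.append(flow_id)
        q.append(pkt)

    def dequeue(self):
        for active in (self.new_flows, self.old_flows):
            if active:
                fid = active.popleft()
                pkt = self.queues[fid].popleft()
                if self.queues[fid]:           # still backlogged: rotate
                    self.old_flows.append(fid)
                return fid, pkt
        return None

fq = SparseFQ()
for i in range(5):
    fq.enqueue("elephant", f"bulk{i}")   # a flow that has built a queue
fq.dequeue()                             # elephant takes its first turn
fq.enqueue("dns", "query")               # a sparse flow arrives
fid, _ = fq.dequeue()
print(fid)  # -> dns: the lookup jumps ahead of the bulk backlog
```

The real implementation adds byte-accounted deficits and runs CoDel on
each queue, but the priority given to the sparse flow is the point here.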

For web surfing, the flow completion time is relatively irrelevant; what
the user cares about is how long until he can read the screen without it
reflowing under him.

Section 3.2 Buffer Size

The first sentence of this section is pretty useless.  What is the
bandwidth?  What is the delay?  Later tests talk about testing at different
RTTs...  And the whole point of AQMs is to get away from static buffer
sizing.

Instead, when static buffer sizes are set, there needs to be justification
and explanation given, so that other implementers/testers can understand
how/when/if it should be changed for deployment or other testing.

These buffer default sizes are mentioned in the fq_codel draft since home
routers have limited RAM, so having some upper limit on buffer sizes is a
practical implementation limit (and one we'd be happy to do away with
entirely; we just haven't gotten around to it as yet, and the draft
describes the current implementation, rather than mythology).

The buffer size constants on fq_codel are really on the "to fix someday"
list, rather than anything we otherwise care about in the algorithm.  With
more code complexity, we could remove both the # flows and buffer size
limits in fq_codel and make either/both fully dynamic.  So far, it hasn't
seemed worth the effort to make these automatic, particularly before we
understand more about the dynamic behavior in wireless networks, ***and
have better feedback/integration with the device driver layer so that we
might have insight into the information required to make these dynamic***.
Right now, too much information is hidden from the queue management layer
(at least in Linux).

Section 4.3 Unresponsive transport sender

You might note that flow queuing systems help protect against (some) such
unresponsive senders.

Why are you specifying a New Reno flow?  One could argue, possibly
correctly, that Cubic is now the most common CC algorithm.  Some rationale
is in order.  And Cubic is more likely to drive badly bloated connections
to the point of insanity.

Section 5.1 Motivation

Some mention of flow queuing is in order...

5.2 Required Tests

It seems to me that there is a third test: a short RTT (maybe 20 ms), which
is comparable to what you might find when a CDN is nearby.

Section 6. Burst absorption

There is a converse situation to burst absorption: momentary bandwidth
drops. These can occur both in VLAN situations and in wireless systems,
where you can easily have temporary drops in available bandwidth.

Section 6.2: Required tests

Unfortunately, HTTP/1.1 web traffic in the current world is much worse than
the proposed tests: typically it will be N essentially simultaneous
connections, each with an IW10; and the size of most objects is small,
fitting into that IW10 window.  Bursts of hundreds of packets are
commonplace.
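Back-of-envelope arithmetic for such a burst; the connection count and MSS
are my illustrative assumptions, not measurements:

```python
CONNECTIONS = 10   # parallel HTTP/1.1 connections for a sharded page
IW = 10            # initial congestion window in segments (IW10)
MSS = 1460         # bytes per segment

burst_pkts = CONNECTIONS * IW      # 100 packets, nearly back to back
burst_bytes = burst_pkts * MSS     # 146,000 bytes hitting the bottleneck
print(burst_pkts, burst_bytes)
```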

I cover some of this from Google data in my blog:


So the web test is not specified decently: is it 100 packets back to back?
or 10 IW10 connections?

Also, flow completion, while interesting, is *not* the only
interesting/useful metric; getting the first packet(s) of each connection
through allows page layout to occur without reflows and is equally
important.

If you look in the above blog entry, you'll see a bar chart of visiting a
web page N times; one of the wonderful fq_codel results was the lack of
variance in the completion time, since the wrong packets weren't getting
badly delayed/dropped. (The web page completion time bar chart).

7.2.1 Definition of the congestion level

I am nervous about defining congestion in terms of loss rate; when I
measured loss rates in today's bloated edge network, the loss rates were
quite high (several percent) even with single flows (probably because Cubic
was being driven insane by the lack of losses on links with large amounts
of buffering).  So encouraging readers to believe that loss rate always
correlates with congestion level seems fraught with problems to me, as most
will jump to the incorrect conclusion that measuring one gives you the
other.

7.3 Parameter Sensitivity

How would one show stability?  Is there a reference you can give to an
example of such stability analysis?

8.1 Traffic Mix

The test is ill defined...  What is "6 webs" for a test?  Why 5 TCP flows?
Are the flows bidirectional, or unidirectional?

8.2 Bi-directional traffic

It isn't clear that bi-directional traffic is actually required; the second
paragraph seems to contradict the third paragraph.

10. Operator control knobs...

Isn't this really a case of wanting to understand the degenerate behavior
of an algorithm, so that we understand the "worst case" behavior (and try
to see if an algorithm "first, does no harm", even in cases where said
algorithm should have been tuned, or autotuned itself)?

For example, the current default # of hash buckets in fq_codel (which
someday we may make fully auto-tuning), if applied in a situation with so
many flows that there are frequent hash collisions, causes the algorithm to
lose its sparse flow optimization; but it is still going to apply CoDel to
any elephant flows and behave roughly as DRR would.  It should not
"misbehave" and cause bad behavior: it just loses the better, optimal
behavior it would have had if the # of hash buckets had been set to
something reasonable.
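A birthday-style sketch of how often that matters; the flow counts are my
assumptions, with 1024 buckets as the commonly cited Linux default:

```python
def collision_prob(buckets, elephants):
    """P(a new flow hashes into a bucket already occupied by at least one
    of `elephants` backlogged flows), assuming a uniform hash."""
    return 1 - (1 - 1 / buckets) ** elephants

# With 1024 buckets and a handful of elephants, collisions are rare:
print(collision_prob(1024, 10))    # under 1%
# With 1000 backlogged flows, a new sparse flow usually collides:
print(collision_prob(1024, 1000))  # over 60%
```

So at typical home-router flow counts the degenerate case is uncommon, and
even when it hits, the fallback behavior is plain DRR rather than harm.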

This sort of analysis is worth including in an evaluation....

12. Interaction with scheduling

Somehow this section is currently pretty useless, particularly since good
scheduling can be of greater benefit than the details of the mark/drop
algorithm.

I think we can/should do better here.

13.3 Comparing AQM schemes

I think there is a different issue left unmentioned at the moment, which is
the "applicability" of an algorithm to a given situation.

For example, setting the CoDel target in fq_codel below the media access
latency will hurt its behavior.  So twisting that knob is in fact necessary
under some circumstances, and a document should discuss when that may be
needed.

Similarly, the target of 5 ms is pretty arbitrary, and CoDel is in general
not very sensitive to what the target or interval are (except in this case
of setting an unachievable target). The values were chosen after a lot of
simulation by Kathy Nichols. Helping people understand these cases is
important.

Section 13.4 Packet Sizes...

I think there is a simple rule that needs addition to this section: an
AQM/scheduling system "should guarantee progress", or particular flows can
fail entirely under some circumstances. This "liveness" constraint is a
property that all algorithms should provide.