Re: [aqm] Review of draft-ietf-aqm-eval-guidelines

Nicolas Kuhn <nicolas.kuhn@telecom-bretagne.eu> Fri, 19 June 2015 07:57 UTC

Return-Path: <nicolas.kuhn@telecom-bretagne.eu>
X-Original-To: aqm@ietfa.amsl.com
Delivered-To: aqm@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 731431A854D for <aqm@ietfa.amsl.com>; Fri, 19 Jun 2015 00:57:02 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 2.109
X-Spam-Level: **
X-Spam-Status: No, score=2.109 tagged_above=-999 required=5 tests=[BAYES_50=0.8, HELO_EQ_FR=0.35, HTML_MESSAGE=0.001, SPF_PASS=-0.001, URI_TRY_3LD=0.959] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Khix5QOOc12a for <aqm@ietfa.amsl.com>; Fri, 19 Jun 2015 00:56:53 -0700 (PDT)
Received: from zproxy210.enst-bretagne.fr (zproxy210.enst-bretagne.fr [192.108.117.8]) by ietfa.amsl.com (Postfix) with ESMTP id C66511A8709 for <aqm@ietf.org>; Fri, 19 Jun 2015 00:56:51 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1]) by zproxy210.enst-bretagne.fr (Postfix) with ESMTP id C0D59232135; Fri, 19 Jun 2015 09:56:50 +0200 (CEST)
Received: from zproxy210.enst-bretagne.fr ([127.0.0.1]) by localhost (zproxy210.enst-bretagne.fr [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id UjjLuuTh7KDt; Fri, 19 Jun 2015 09:56:48 +0200 (CEST)
Received: from localhost (localhost [127.0.0.1]) by zproxy210.enst-bretagne.fr (Postfix) with ESMTP id E1DCB232258; Fri, 19 Jun 2015 09:56:47 +0200 (CEST)
X-Virus-Scanned: amavisd-new at zproxy210.enst-bretagne.fr
Received: from zproxy210.enst-bretagne.fr ([127.0.0.1]) by localhost (zproxy210.enst-bretagne.fr [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id X6iMYInMhSww; Fri, 19 Jun 2015 09:56:47 +0200 (CEST)
Received: from [IPv6:2001:660:7301:3728:c53b:34ee:4c52:63eb] (passerelle-interne.enst-bretagne.fr [192.108.117.210]) by zproxy210.enst-bretagne.fr (Postfix) with ESMTPSA id 8F6F1232267; Fri, 19 Jun 2015 09:56:47 +0200 (CEST)
Content-Type: multipart/alternative; boundary="Apple-Mail=_796BD730-D0F2-429D-95CC-8F74DAA19C9D"
Mime-Version: 1.0 (Mac OS X Mail 8.0 \(1990.1\))
From: Nicolas Kuhn <nicolas.kuhn@telecom-bretagne.eu>
In-Reply-To: <CAGhGL2Cj44U94UL6W-9OeoyKTm4TE=JWVozCzAp-wr5eN9UETg@mail.gmail.com>
Date: Fri, 19 Jun 2015 09:56:58 +0200
Message-Id: <3345EA5D-005E-42A1-BEF9-F3FD5B2D2D1F@telecom-bretagne.eu>
References: <CAGhGL2Cj44U94UL6W-9OeoyKTm4TE=JWVozCzAp-wr5eN9UETg@mail.gmail.com>
To: Jim Gettys <jg@freedesktop.org>
X-Mailer: Apple Mail (2.1990.1)
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/pS9WlUlK5gExF8Ohu_syUXrNiUk>
Cc: "aqm@ietf.org" <aqm@ietf.org>
Subject: Re: [aqm] Review of draft-ietf-aqm-eval-guidelines
X-BeenThere: aqm@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "Discussion list for active queue management and flow isolation." <aqm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/aqm>, <mailto:aqm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/aqm/>
List-Post: <mailto:aqm@ietf.org>
List-Help: <mailto:aqm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/aqm>, <mailto:aqm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 19 Jun 2015 07:57:02 -0000

Dear Jim, 

We have just posted a *-04 version of the guidelines that incorporates several modifications
based on your comments.

Please have a look inline, where we provide more details on the modifications
and on how we addressed your comments.

> On 13 Apr 2015, at 16:37, Jim Gettys <jg@freedesktop.org> wrote:
> 
> Sorry this review is later than I would have liked: I came down with a cold after the IETF and this review was half done. Hope this helps.
> 

Our turn to be sorry for the delayed answer to your review. 

>                                            - Jim
> 
> 
> Required behavior, and running code
> ============================
> 
> My immediate, overall reaction is that this document prescribes a large set of tests to be performed, using RFC 2119 MUST/SHOULD/MAY terminology.
> 

We think that, after the modifications provided by Gorry, the terminology used is now more accurate.

> However, I doubt if any AQM algorithm has been evaluated to all of those tests; and it is unrealistic without "running code" for such tests that anyone will actually run them; instead, this seems to be an "exercise to the AQM implementer" of large magnitude, possibly so large as that no one will actually perform even a fraction of them.
> 
> Will that mean that future AQM algorithms get stalled in working groups because the tests are not done as prescribed? And how will we intercompare tests between algorithms, with the test code not available?
> 
> And as there is little consequence that I can see to not performing the tests, it seems unlikely that anyone will actually perform them.
> I think if one cataloged all the tests requested by this document that the number of tests is very large. I wonder if a small number of more aggressive tests might not be easier.
> 

In the *-02 version, it was indeed not really clear which scenarios MUST be considered and which may not.
In the *-04 version, we have introduced, at the end of the document, a table that summarises the requirements
for each test. We hope that this is clearer now; we can discuss these requirements to reach a consensus on
what is needed for the working group to standardise a proposal.

> It seems to me that if, as a community, we really believe in doing such evaluations, we need a common set of tools for doing them, so that we can actually easily intercompare results.
> 
> We really need *running code* for tests to evaluate AQM's/packet scheduling algorithms. I fear that this will be an academic exercise without such a shared code base for testing. 
> 
> My reaction to this draft is that without such a set of tests, this document is an academic exercise that will have little impact. I'd much prefer to see a (smaller) concrete set of tests devised (probably an elaboration of the data we get from Toke's netperf-wrapper tool).
>  

We agree that it would be easier to have a common test code; however,
one of the intents of the guidelines is not to be dependent on the platform.
Therefore, providing running code can hardly be done in this context.

> 
> Packet scheduling
> ==============
> 
> This draft is essentially bereft of any discussion of packet scheduling/flow isolation, when, in many ways, it makes a bigger difference than the AQM (mark/drop algorithm only, in the definition of this document).
> 
> Section 12 purports to discuss packet scheduling, but it is 5 sentences (< 1/2 page) total out of a document of >20 pages (of actual discussion, not boilerplate)!
> 
> Somehow I suspect it is worth a bit more discussion ;-).
> 
> Yet packet scheduling/AQM do interact at some level; I don't think 
> that it can/should be handled in a separate draft. 
> 
> For example, one of the wins of flow isolation is that TCP's ack clocking starts behaving normally again, and ack compression is much less problematic. One of the surprises of fq_codel was finding that we were able to use the full bandwidth of the link in both directions simultaneously (increasing net aggregate bandwidth on a link).
> 
> A second example: once you have flow isolation (e.g. fq_codel), it is still important to understand if the AQM (mark/drop algorithm) is effective, as bufferbloat can still kill single application performance if the mark/drop algorithm is ineffective. Knowing what will happen to that single flow's behavior still matters.
> 
> A third example: ironically, TCP RTT unfairness gets worse when you control the RTT with an effective AQM.  Flow queuing helps with this issue, which 


We agree that the packet scheduling aspect is very important. However, 
characterising scheduling algorithms was not the primary goal of these 
guidelines and would require specific scenarios. The proposed scenarios
are specific to marking/dropping algorithms, and we prefer not to introduce
comments on scheduling algorithms here and there in the text. 

We have added the following text in the introduction: 
“
   In order to ascertain whether the WG should undertake standardizing
   an AQM proposal, the WG requires guidelines for assessing AQM
   proposals.  This document provides the necessary characterization
   guidelines.  There may be a debate on whether a scheduling scheme is
   additional to an AQM algorithm or is a part of an AQM algorithm.  The
   rest of this memo refers to AQM as a dropping/marking policy that
   does not feature a scheduling scheme.  This document may be
   complemented with another one on guidelines for assessing combinations
   of packet scheduling and AQM.  We note that such a document will
   inherit all the guidelines from this document plus any additional
   scenarios relevant for packet scheduling such as flow starvation
   evaluation or impact of the number of hash buckets.
“

and the following in the “Interaction with Scheduling” section
“
   This document does not propose guidelines to assess the performance
   of scheduling algorithms.  Indeed, as opposed to characterizing AQM
   schemes, which relates to their capacity to control the queuing
   delay in a queue, characterizing scheduling schemes relates to the
   scheduling itself and its interaction with the AQM scheme.  As one
   example, the scheduler may create sub-queues and the AQM scheme may
   be applied to each of the sub-queues, and/or the AQM could be applied
   to the whole queue.  Also, schedulers, such as FQ-CoDel [FQ-CoDel]
   or FavorQueue [FAVOUR], might introduce flow prioritization.  In
   these cases, specific scenarios should be proposed to ascertain that
   these scheduling schemes not only help in tackling bufferbloat,
   but are also robust under a wide variety of operating conditions.
   This is out of the scope of this document, which focuses on dropping
   and/or marking AQM schemes.
"

> 
> Time variance
> ===========
> 
> The draft is silent on the topic of time variance behavior of AQM's.
> 
> Those of us facing the problems at the edge of the network are amazingly aware of how rapidly available bandwidth may change, both because radio is inherently a shared medium, and because of radio propagation.
> 
> Other media are also highly variable, if not as commonly as WiFi/cellular; even switched Ethernet is very variable these days (vlan's have hidden the actual bandwidth of a cable from view). 
> I talked to one VOIP provider who tracked their problems down to what vlans were doing behind his back.
> 
> Cable and other systems are also shared/variable media (e.g. PowerBoost on Comcast, or even day/night diurnal variation).
> 
> *The* most exciting day for me while CoDel was being developed was when Kathy Nichols shared with me some data from a simulation showing CoDel's behavior when faced with a dramatic (instantaneous, in this case) drop in bandwidth (10x), and watching it recover to a new operating point.
> 
> It seems to me that some sort of test(s) for what happens when bandwidth changes (in particular, high to low) are in order to evaluate algorithms.
> 
> Unfortunately, the last I knew some simulators were not very good at simulating this sort of behavior (IIRC, Kathy was using ns2, and she had to kludge something to get this behavior).
> 
> Most important (particularly once you have flow queuing), may be whether the algorithm may become unstable/oscillate in the face of such changes or not.  This is still an area I worry about for WiFi, where the RTT of many paths is roughly comparable to the kind of variance in throughput seen.  Time will tell, I guess; but right now, we have no tools to explore this in any sort of testing.
> 

We have a specific scenario on time variance (Section 7.2.6, "Varying available capacity"), where we propose the following: 

"
   This scenario can be used to help characterize how the AQM behaves
   and adapts to bandwidth changes.  The experiments are not meant to
   reflect the exact conditions of Wi-Fi environments since its hard to
   design repetitive experiments or accurate simulations for such
   scenarios.

   To emulate varying draining rates, the bottleneck capacity between
   nodes 'Router L' and 'Router R' varies over the course of the
   experiment as follows:

   o  Experiment 1: the capacity varies between two values within a
      large time-scale.  As an example, the following phases may be
      considered: phase I - 100Mbps during 0-20s; phase II - 10Mbps
      during 20-40s; phase I again, and so on.

   o  Experiment 2: the capacity varies between two values within a
      short time-scale.  As an example, the following phases may be
      considered: phase I - 100Mbps during 0-100ms; phase II - 10Mbps
      during 100-200ms; phase I again, and so on.
“

Please let us know if you have specific text suggestions for this scenario.
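
For what it is worth, here is a minimal sketch of how Experiment 1 could be
driven on a Linux testbed; the interface name and the use of tbf as the rate
limiter are assumptions on our side, not part of the draft:

    import subprocess
    import time

    IFACE = "eth0"                  # hypothetical bottleneck egress interface
    PHASES = ["100mbit", "10mbit"]  # phase I / phase II rates
    PHASE_DURATION = 20.0           # seconds (Experiment 1); 0.1 for Experiment 2

    def set_rate(rate, first):
        # Install (first call) or update the token-bucket shaper.  In a real
        # testbed the AQM under test would typically sit below this shaper
        # (e.g. as a child qdisc) rather than be replaced by it.
        verb = "add" if first else "change"
        subprocess.run(["tc", "qdisc", verb, "dev", IFACE, "root", "tbf",
                        "rate", rate, "burst", "32kbit", "latency", "400ms"],
                       check=True)

    first = True
    try:
        while True:
            for rate in PHASES:
                set_rate(rate, first)
                first = False
                time.sleep(PHASE_DURATION)
    except KeyboardInterrupt:
        subprocess.run(["tc", "qdisc", "del", "dev", IFACE, "root"])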

> 
> Specific comments
> ===============
> 
> Section 1. Introduction
> ------------------------------
> 
> Para 1.
> It is a bit strange there is no reference to fq_codel at this date, particularly since everyone involved in CoDel much prefers fq_codel and we cannot anticipate anyone using CoDel in preference to fq_codel; it's in Linux primarily to enable easy testing, rather than expecting anyone to *actually* use it.
> 
>  Flow queuing is a huge win. And fq_codel is shipping/deployed on millions of devices at this date.  There is an ID in the working group defining fq_codel.... Having a concrete thing to talk about that does packet scheduling will help you when you get to redoing section 12.
> 

We agree that flow queuing can be a huge win, but in our opinion it is outside the scope of these guidelines. These guidelines are for "AQM" characterisation and not "AQM and scheduling" characterisation. If the document were to be used to characterise scheduling schemes, that would require additional scenarios. 

> Para 3:
> The first sentence is misleading and untrue, as far as I can tell. SLA's have had little to do with it, at least in the consumer edge where we hurt the most today.
> 
> Bufferbloat has happened as 1) memory became really cheap, 2) people worshiped at the altar of the religion that dropping a packet must be bad, and 3) doing well on a single-stream TCP performance test has meant that the minimum size of a drop-tail queue has been at least the BDP, computed from:
> 
> o the maximum bandwidth the device might ever go
> o 100ms, which is roughly continental delay.
> 
> Net result is *minimum* buffering that is often 5-10 or more times sane buffering.  Then people, thinking that more must be better, used any left over memory. Note that this is already much too much for many applications (music, VOIP; 100ms is *not* an acceptable amount of latency for a connection to impose)....
> 
> Compounding this has been accounting in packets, rather than bytes or time.  So people built operating systems to saturate a high speed link with tiny UDP packets for RPC, without realizing that the buffering was then another factor of 20 higher for TCP...
> 
> Then these numbers (for Ethernet) were copied into the WiFi drivers without thought. Broadband head ends and modems aren't much better.         
> 
> As trying to explain the history of how things went so badly wrong is not helpful, I'd just strike the first sentence starting "In order to meet" and just note that buffering has often been ten-100 or more times larger than needed.
> 
> The Interplanetary record at the moment is held by GoGo Inflight, with 760 seconds of buffering measured (Mars is closer for part of its orbit).
> 
> 

You are right. We have removed the first sentence and reformulate the second sentence as follows:
“
   Bufferbloat [BB2011] is the consequence of deploying large unmanaged
   buffers on the Internet, which has led to an increase in end-to-end
   delay: the buffering has often been measured to be ten or a hundred
   times larger than needed.
"


> Section 2. End to end metrics.
> ----------------------------------------
> 
> There is no discussion of any "Flow startup time".
> 
> For many/most "ants", the key time for good performance is the time to *start* or *restart an idle* flow.
> 
> Not only are these operations such as DNS lookups, TCP opens, SSL handshakes, dhcp handshakes, etc, but for web pages, the first packet or so contains the size information required to lay out a page, which is key to interactive "speed", all without needing any formal traffic classification.
> 
> The biggest single win of flow queuing is minimizing this lost time; fq_codel goes even a step further by its "sparse flow" optimization, where the first packet(s) of a new or idle flow are scheduled before flows that have built a queue. So DNS lookups, TCP opens, the first data packet in an image, SSL handshakes, all fly through the router.
> 
> For web surfing, the flow completion time is relatively irrelevant; what the user cares about is how long until he can read the screen without it reflowing.


We have added a metric on flow start-up time and a comment on web surfing in the flow completion time section, as follows: 

“
   If this metric is used to evaluate the performance of web transfers,
   we propose to consider instead the time needed to download all the
   objects that compose the web page, as this makes more sense in terms
   of user experience than assessing the time needed to download each
   object.

2.2.  Flow start-up time

   The flow start-up time is the time between when the request is sent
   by the client and when the server starts to transmit data.  The
   number of packets dropped by an AQM may seriously affect the waiting
   period during which the data transfer has not started.  This metric
   would specifically focus on operations such as DNS lookups, TCP
   opens or SSL handshakes.

"

> 
> Section 3.2 Buffer Size
> ------------------------------
> 
> The first sentence of this section is pretty useless.  What is the bandwidth?  What is the delay?  later tests talk about testing at different RTT's...  And the whole point of AQM's is to get away from static buffer sizes....
> 
> Instead, when static buffer sizes are set, there needs to be justification and explanation given, so that other implementers/testers can understand how/when/if it should be changed for deployment or other testing.  
> 
> These buffer default sizes are mentioned in the fq_codel draft since home routers have limited RAM, so having some upper limit on buffer sizes is a practical implementation limit (and one we'd be happy to do away with entirely; we just haven't gotten around to it as yet, and the draft describes the current implementation, rather than mythology). 
> 
> The buffer size constants on fq_codel are really on the "to fix someday" list, rather than anything we otherwise care about in the algorithm.  With more code complexity, we could remove both the # flows and buffer size limits in fq_codel and make either/both fully dynamic.  So far, it hasn't seemed worth the effort to make these automatic, particularly before we understand more about the dynamic behavior in wireless networks, ***and have better feedback/integration with the device driver layer so that we might have insight into the information required to make these dynamic***.  Right now, too much information is hidden from the queue management layer (at least in Linux).
> 

Indeed, the whole point of AQMs is to get away from static buffer sizes, but we use the amount of congestion losses to define the congestion levels; being clear about the size of the buffers is therefore essential.
The first sentence may not have been clear enough. We propose to change it to the following: 
“
   The size of the buffers should be carefully chosen, and is to be set
   to the bandwidth-delay product; the bandwidth being the bottleneck
   capacity and the delay the largest RTT in the considered network.
"

> Section 4.3 Unresponsive transport sender
> ---------------------------------------------------------
> 
> You might note that flow queuing systems help protect against (some) such unresponsive senders.
> 
> Why are you specifying a New Reno flow?  One could argue, possibly correctly, that Cubic is now the most common CC algorithm.  Some rationale is in order.  And Cubic is more likely to drive badly bloated connections to the point of insanity.
> 


We consider NewReno a "TCP-friendly" sender and CUBIC an "aggressive" sender. 
We acknowledge that CUBIC is responsible for filling the buffers, but NewReno is still 
used on some servers. 

[ http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6594906 ] 

Also, if AQMs are going to be deployed on the upstream, with many machines using
Windows (and Compound TCP) or OS X (NewReno+SACK), the data senders that will exercise
such AQMs would not only generate CUBIC flows.

> Section 5.1 Motivation
> -----------------------------
> 
> Some mention of flow queuing is in order...
> 
> 5.2 Required Tests
> --------------------------
> 
> It seems to me that there is a third test: a short RTT (maybe 20 ms), which is comparable to what you might find given a CDN is nearby.
> 

The scenario considers RTTs that can be as low as 5 ms. 

> Section 6. Burst absorption
> -------------------------------------
> 
> There is a converse situation to burst absorption: that is momentary bandwidth drops; these can occur both in VLAN situations, but also in wireless systems, where you can easily have temporary drops in available bandwidth.
> 

We think that momentary bandwidth drops are covered by the "Varying available capacity" scenario (Section 7.2.6).
To consider both "burst absorption" and "variable bandwidth", we have modified the text as follows:

“
   The scenario MAY consist of TCP NewReno flows between sender A and
   receiver B.  To better assess the impact of draining rates on the AQM
   behavior, the tester MUST compare its performance with that of drop-
   tail and SHOULD provide a reference document for their proposal
   discussing performance and deployment compared to those of drop-tail.
   Burst traffic, such as presented in Section 6.2, could also be
   considered to assess the impact of varying available capacity on the
   burst absorption of the AQM.
"

> Section 6.2: Required tests
> -------------------------------------
> 
> Unfortunately, HTTP/1.1 Web traffic in the current world is much worse than the proposed tests: typically it will be N essentially simultaneous connections, each with an IW10; and the size of most objects are small, and fit into that IW10 window.  Bursts of hundreds of packets are commonplace.
> 
> I cover some of this from Google data in my blog:
> 
> https://gettys.wordpress.com/2013/07/10/low-latency-requires-smart-queuing-traditional-aqm-is-not-enough/
> 
> So the web test is not specified decently: is it 100 packets back to back? or 10 IW10 connections?
> 
> Also, flow completion, while interesting, is *not* the only interesting/useful metric; getting the first packet(s) of each connection allows page layout to occur without reflows and is equally interesting/useful.
> 
> If you look in the above blog entry, you'll see a bar chart of visiting a web page N times; one of the wonderful fq_codel results was the lack of variance in the completion time, since the wrong packets weren't getting badly delayed/dropped. (The web page completion time bar chart).
> 

Initially, we did not want to go into too much detail about how to generate the traffic. 
However, to address your accurate comment, we propose to add the following:

“
   A new web page download could start after the previous web page
   download is finished.  Each web page could be composed of at least 50
   objects, and the size of each object should be at least 1 kB.  Six
   parallel TCP connections SHOULD be generated to download the objects,
   each parallel connection having an initial congestion window set to
   10 packets.
"

> 7.2.1 Definition of the congestion level
> ---------------------------------------------------
> 
> I am nervous about defining congestion in terms of loss rate; when I measured loss rates in today's bloated edge network, the loss rates were quite high (several percent) even with single flows (probably because Cubic was being driven insane by the lack of losses on links with large amounts of buffering).  So encouraging readers to believe that loss rates always correlate with congestion level seems fraught with problems to me, as most will jump to the incorrect conclusion that measuring one gives you the other.
> 

We acknowledge that measuring the loss rate does not give you an accurate estimation of the congestion. 
However, in order to quantify what we mean by "mild", "medium" and "heavy" congestion, we had to provide
some numbers. For the sake of clarity, we will add the following: 

“
   These guidelines use the loss rate to define the different congestion
   levels, but they do not stipulate that, in other circumstances,
   measuring the congestion level gives an accurate estimation of the
   loss rate, or vice versa.
"

> 7.3 Parameter Sensitivity
> ---------------------------------
> 
> How would one show stability?  Is there a reference you can give to an example of such stability analysis?
> 

The goal of this section is that AQM proposals should come with justifications for how the 
parameters have been set and for the effect of changing these values. 

> 8.1 Traffic Mix
> -------------------
> 
> The test is ill defined...  What is "6 webs" for a test?  Why 5 TCP flows? are the flows bi-directional, or unidirectional?
> 

This test is just a recommendation, and the tester should feel free to use any other combination of these flows. 
Unless otherwise specified, the flows are unidirectional.


> 8.2 Bi-directional traffic
> ------------------------------
> 
> It isn't clear that bi-directional traffic is actually required; the second paragraph seems to contradict the third paragraph.
> 

This section has been updated as follows: 

“
   Control packets such as DNS requests/responses, TCP SYNs/ACKs are
   small, but their loss can severely impact the application
   performance.  The scenario proposed in this section will help in
   assessing whether the introduction of an AQM scheme increases the
   loss probability of these important packets.

   For this scenario, traffic MUST be generated in both downlink and
   uplink, such as defined in Section 3.1.  These guidelines RECOMMEND
   to consider a mild congestion level and the traffic presented in
   Section 7.2.2 in both directions.  In this case, the metrics reported
   MUST be the same as in Section 7.2 for each direction.

   The traffic mix presented in Section 8.1 MAY also be generated in
   both directions.
"

> 10. Operator control knobs...
> ---------------------------------------
> 
> Isn't this really a case of wanting to understand the degenerate behavior of an algorithm, so that we understand the "worst case" behavior (and try to see if an algorithm "first, does no harm", even in cases where said algorithm should have been tuned (or autotuned itself))?
> 
> For example, the current # of hash buckets default in fq_codel (that someday, we may make fully auto-tuning), if applied in a situation with many flows so there are frequent hash collisions causes the algorithm to lose its sparse flow optimization, but it is still going to apply CoDel to any elephant flows and behave roughly as DRR would; it should not "misbehave" and cause bad behavior: just not the better optimal behavior if the # of hash buckets had been set to something reasonable.
> 
> This sort of analysis is worth including in an evaluation….

We have adapted the text as follows:
“ 
   AQM proposals SHOULD describe the parameters that control the
   macroscopic AQM behavior, and identify any parameters that require
   tuning to operational conditions.  It could also be interesting to
   discuss that, even if an AQM scheme may not adequately auto-tune
   its parameters, the resulting performance may not be optimal, but
   should be close to something reasonable.
"

> 
> 12. Interaction with scheduling
> ----------------------------------------
> 
> Somehow this section is currently pretty useless, particularly since good scheduling can be of greater benefit than the details of the mark/drop algorithm.
> 
> I think we can/should do better here.

This section has been updated: it is now "13.3.  Assessing the interaction between AQM and scheduling" and contains the same text that we quoted above in our response to your "Packet scheduling" comment.

> 
> 13.3 Comparing AQM schemes
> ------------------------------------------
> 
> I think there is a different issue left unmentioned at the moment, which is the "applicability" of an algorithm to a given situation.
> 
> For example, setting the CoDel target in fq_codel below the media access latency will hurt its behavior.  So twisting that knob is in fact necessary under some circumstances, and a document should discuss when that may be necessary. 
> 
> Similarly, the target of 5ms is pretty arbitrary, and CoDel is in general not very sensitive to what the target or interval are (except in this case of setting an unachievable target).  The values were chosen after a lot of simulation by Kathy Nichols. Helping people understand these cases is important.
> 

We think that providing such discussions would be important in the “Operator Control and Auto-tuning” section. 
To reflect that, we propose to add the following: 
“
   If there are any fixed parameters within the AQM, their setting
   SHOULD be discussed and justified.
"

> Section 13.4 Packet Sizes...
> --------------------------------------
> 
> I think there is a simple rule that needs to be added to this section: an AQM/scheduling system "should guarantee progress", or particular flows can fail entirely under some circumstances. This "liveness" constraint is a property that all algorithms should provide.
> 
> 

This section mainly refers to the "Recommendations" document. We believe that if there is anything to be said 
on that topic, it should rather be added to the "Recommendations" document.

Thanks a lot for your feedback;
we hope to have addressed most of your comments. 

Kind regards, 

The authors. 

> _______________________________________________
> aqm mailing list
> aqm@ietf.org
> https://www.ietf.org/mailman/listinfo/aqm