Re: [aqm] I-D Action: draft-ietf-aqm-codel-03.txt

Michael Menth <menth@uni-tuebingen.de> Thu, 07 July 2016 12:34 UTC

Return-Path: <menth@uni-tuebingen.de>
X-Original-To: aqm@ietfa.amsl.com
Delivered-To: aqm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C18D912D752 for <aqm@ietfa.amsl.com>; Thu, 7 Jul 2016 05:34:19 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.026
X-Spam-Level:
X-Spam-Status: No, score=-4.026 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_LOW=-0.7, RP_MATCHES_RCVD=-1.426] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id rbboItL-4vxn for <aqm@ietfa.amsl.com>; Thu, 7 Jul 2016 05:34:15 -0700 (PDT)
Received: from mx03.uni-tuebingen.de (mx03.uni-tuebingen.de [134.2.5.213]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 1BD1812D688 for <aqm@ietf.org>; Thu, 7 Jul 2016 05:34:14 -0700 (PDT)
Received: from [134.2.11.131] (chaos.informatik.uni-tuebingen.de [134.2.11.131]) by mx03.uni-tuebingen.de (Postfix) with ESMTPSA id A988D782A7; Thu, 7 Jul 2016 14:34:12 +0200 (CEST)
To: "Goltsman, Polina" <polina.goltsman@student.kit.edu>, Andrew Mcgregor <andrewmcgr@google.com>
References: <20160316000053.5163.88345.idtracker@ietfa.amsl.com> <56EEC35B.3000007@mti-systems.com> <56F0635A.7060004@student.kit.edu> <CAPRuP3k9NuTpK7WsRfOymJbTSup5aY7RgUtEDd-wsapXJxbKVg@mail.gmail.com> <1467892048445.64588@student.kit.edu>
From: Michael Menth <menth@uni-tuebingen.de>
Message-ID: <577E4C53.9030909@uni-tuebingen.de>
Date: Thu, 07 Jul 2016 14:34:27 +0200
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2
MIME-Version: 1.0
In-Reply-To: <1467892048445.64588@student.kit.edu>
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/aqm/x-9I0htiR7YetDe4fbsjJDbLnRo>
Cc: Fabian Schwarzkopf <f.schwarzkopf@yahoo.de>, Wesley Eddy <wes@mti-systems.com>, Dave Taht <dave.taht@gmail.com>, Jana Iyengar <jri@google.com>, "Bless, Roland (TM)" <roland.bless@kit.edu>, "aqm@ietf.org" <aqm@ietf.org>
Subject: Re: [aqm] I-D Action: draft-ietf-aqm-codel-03.txt
X-BeenThere: aqm@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussion list for active queue management and flow isolation." <aqm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/aqm>, <mailto:aqm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/aqm/>
List-Post: <mailto:aqm@ietf.org>
List-Help: <mailto:aqm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/aqm>, <mailto:aqm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Jul 2016 12:34:20 -0000

Dear Polina, dear all,

Am 07.07.2016 um 13:49 schrieb Goltsman, Polina:
> Dear Andrew, Dear Wesley,
> 
> 
> First of all, sorry for the long waiting time. I still haven't processed
> "nits" but I have 2 major issues.
> 
> 
> *Issue 1: *This issue is still related to the reentering dropping state.
> 
> 
> As of the latest I-D, the "reentering dropping state"  appears in form
> of one sentence in the text and a small code block in the pseudo-code.
> 
> 
> The text is on page 13:
> 
>>Additional logic prevents re-
>>entering the dropping state too soon after exiting it and resumes the
>>dropping state at a recent control level, if one exists.
> 
> The comment to the pseudo-code on page 20:
> 
> 
>            // If min went above target close to when it last went
>            // below, assume that the drop rate that controlled the
>            // queue on the last cycle is a good starting point to
>            // control it now. ('drop_next' will be at most 'interval'
>            // later than the time of the last drop so 'now - drop_next'
>            // is a good approximation of the time from the last drop
>            // until now.) *Implementations vary slightly here; this is // the Linux version, which
> is more widely deployed and // tested.
> 
> *
> 
> 
> Or summarized: this issue is not addressed in text, which instead just
> mentions that the issue exists, there are several versions of this logic
> and here is one of them.
> 
> 
> As it reads now, this "reentering" should be a tiny insignificant part
> of the algorithm that (1) doesn't happen in "most" of the cases, and (2)
> when it does happen, it will only "slightly" affect "performance" (with
> proper definitions of "most", "slightly", and "performance").
> 
> 
> Is it really so? 
> 
> 
> Based on our evaluations, with pure CoDel (without FQ-CoDel),
> "reentering" is actually a common case. I think Dave and Toke should
> have more experimental results to answer this question. (I included Dave
> in CC)


We also studied this issue about one or two years ago. The algorithm
controlling the reentering of the dropping mode may have a significant
impact on control variables, drop rates, drop pattern depending on the
traffic. It can also impact utilization. Most interesting is probably
Fig. 6 on page 7 in
https://atlas.informatik.uni-tuebingen.de/~menth/papers/Menth16e.pdf

Kind regards,

Michael



> 
> 
> If it is so, I still think that the "common case" should be at the very
> least explained in text too.
> 
> 
> *Issue 2:* The "initial drop spacing" is inconsistent
> 
> 
> Regarding the "initial drop spacing" there are following lines in the
> document.
> 
> 
> between pages 11 and 12:
> 
>>and so the initial drop spacing 
>>SHOULD be set to the estimator's *interval plus twice the target*
>>(i.e., initial drop spacing = 1.1 * interval)
> 
> page 12:
> 
>>the initial next
>>drop spacing is intended to be long enough to give the endpoints time
>>to react to the single drop so SHOULD be set to a value *of 1.1 times >the interval*
> 
> page 16:
> 
> 
>>and the initial drop
>>spacing is also set to *interval*.
> 
> and it is also set to*one interval* in pseudo-code, which I believe is
> this line:
> 
> 
>> drop_next_ = control_law(now, count_);
> 
> 
> Note: I think there was an email on CoDel mailing list
> (https://lists.bufferbloat.net/pipermail/codel/) about this issue.
> 
> 
> 
> ------------------------------------------------------------------------
> *From:* Andrew Mcgregor <andrewmcgr@google.com>
> *Sent:* Friday, June 3, 2016 3:50 AM
> *To:* Goltsman, Polina
> *Cc:* Wesley Eddy; aqm@ietf.org; Jana Iyengar
> *Subject:* Re: [aqm] I-D Action: draft-ietf-aqm-codel-03.txt
>  
> Thank you Polina.
> 
> We have incorporated most of these (sometimes with slightly different
> solutions than you suggested), and a new draft will be out very
> shortly.  I felt that by far the most important was the different way
> the Linux implementation addressed re-entry to the dropping state, and
> the pseudocode now does exactly that.
> 
> Andrew
> 
> On 22 March 2016 at 08:10, Polina Goltsman
> <polina.goltsman@student.kit.edu
> <mailto:polina.goltsman@student.kit.edu>> wrote:
> 
>     Dear Wesley, Dear All,
> 
>     First of all our feedback regarding different "re-entering dropping
>     state" in the document and in the Linux implementation
>     (http://www.ietf.org/mail-archive/web/aqm/current/msg01686.html) was
>     not addressed.
> 
>     As FQ-CoDel  relies on CoDel, this issue is also (partly) relevant
>     for the FQ-CoDel document. In the introduction FQ-CoDel references
>     ns-3 and Linux implementations where the first one uses the
>     re-entering logic from the CoDel document while the second from
>     CoDel Linux implementation. The algorithm that has seen widespread
>     testing according to Section 7 is (I suppose) the Linux version. Is
>     this situation acceptable for an algorithm specification?
> 
>     /[since this comment was supposed to be sent before the end of 2015,
>     feel free to (silently) ignore it]/
>     Second, unlike Rasool Al-Saadi (see
>     http://www.ietf.org/mail-archive/web/aqm/current/msg01693.html) I do
>     not like the document. Although I agree that the pseudocode is
>     sufficient to create a working implementation, however, in my
>     opinion, the rest of the document makes implementing CoDeL more
>     confusing (at least without reading [CODEL2012] first). Is it normal
>     for a RFC, which, as I assume, should primarily contain an algorithm
>     specification to contain the algorithm specification ONLY in form of
>     pseudocode?  
>     These are two items that I found the most confusing:
>     (1) section 3 introduces estimator, setpoint, and control loop which
>     are not clearly distinguishable in pseudocode. It would be nice if
>     section 4 explained how the three entities transition into the
>     routines in pseudocode.
>     (2) IMHO, there is some missing explanations. For example, the
>     document never says how exactly "bad"/persistent queue is
>     determined. The document says in Section 3.1 that the minimum is
>     tracked, but it never says that persistent queue is when the minimum
>     is above setpoint/target for at least an interval. As another
>     example, Section 3.3 could say how the controller looks like exactly.
> 
>     I also have some small comments and nits regarding the latest
>     version (-03):
> 
>     ---
> 
>     in the whole document there are several places of using "we" to
>     (supposedly) refer to the authors. According to Bob Briscoe's
>     message
>     (http://www.ietf.org/mail-archive/web/aqm/current/msg01805.html),
>     "we" do not necessary refer to the authors in a rfc. Are these "we"s ok?
> 
>     ---
> 
>     abstract:
> 
>     This document describes a general *framework* called CoDel 
> 
>     IMHO the word "framework" is too overloaded and too broad. It might
>     be better to call CoDel an AQM or something that has "queue" in it.
>     Other ways (e.g., http://queue.acm.org/detail.cfm?id=2839461)
>     probably also use it to control a queue.
> 
>     ---
> 
>     section 1:
> 
>     (nit)
>        Despite this
>        awareness, the problem has only gotten worse as Moore's Law growth in
>        memory density fueled an exponential increase in buffer pool size.
>        Efforts to deploy AQM have been frustrated by difficult configuration
>        and negative impact on network utilization.  *This problem,* recently
>        christened "bufferbloat", [TSV2011] [BB2011] has become increasingly
>        important throughout the Internet but particularly at the consumer
>        edge.
> 
>     it reads like "this problem" refers to "efforts to deploy AQM" and
>     not to "increase in buffer pool size"
> 
>     ---
> 
>     (nit)
> 
>        To understand
>        queue management, it is critical to understand the difference between
>        the necessary, useful *"good" queue*, and the counterproductive *"bad" queue*.
> 
>     strictly speaking "good" and "bad" queue were not defined either
>     before or in this sentence (it is explained in section 3 or in
>     [CODEL2012])
> 
>     ---
> 
>     (nit)
> 
>        o  treat "good queue" and "bad queue" differently, that is, keep
>           delay low while permitting necessary bursts of traffic
> 
>     (1) see above
>     (2) I would say that the goal is "to keep delay low while ..." and
>     to treat good/bad queue differently is a mechanism used to achieve
>     the goal
> 
>     ---
> 
>     section 2:
> 
>     (nit)
>           In this document, *the characters ">>" preceding an indented
>     line(s)*
>        indicates a compliance requirement statement using the key words
>        listed above.  This convention aids reviewers in quickly identifying
>        or finding the explicit compliance requirements of this RFC.
> 
>     I maybe missing something, but I can't find any use of "characters
>     ">>"" anywhere except this line in any (txt,html, pdf) of the versions.
> 
>     ---
> 
>     section 3:
> 
>     generally IMHO it would be nice to introduce /interval/ and /target/
>     so it is more obvious that these are parameters (or some other kind
>     of variables). Interval is more or less obvious (although imho
>     should be more explicit). The target is first introduced at the end
>     of section 3.2 as:
> 
>        This results in a particularly simple form for the
>        setpoint: the ideal range for the permitted standing queue is between
>        5% and 10% of the TCP connection's RTT.  Thus *target* is simply 5% of
>        the interval of section 3.1
>     <https://tools.ietf.org/html/draft-ietf-aqm-codel-03#section-3.1>.
> 
>     It was not previously defined that /target/ is a parameter with a
>     meaning of /setpoint/. I don't think that this is clear for people
>     who do not know what target is in advance.
> 
>     [It seems that it was intended to introduce /interval/ and /target/
>     parameters in section 4]
> 
>     ---
> 
>        Instead of averages
>        *we recommend* tracking the minimum sojourn time, then, 
> 
>     "we recommend" or "CoDeL does"? According to the last paragraph in
>     Section 3.1, tracking minimum is a core part of CoDeL:
> 
>        These two innovations, use of sojourn time as observed values *and the local minimum as the statistic to monitor queue congestion* are key to
>        CoDel's estimator building block.
> 
>     ---
> 
>        Since the peak queue delay*is simply f r*, power is solely a function
> 
>     is it f/r ?
> 
>     ---
>     (nit)
> 
>        The only remaining building block needed for a basic AQM is a
>        'control loop' algorithm to effectively drive the queueing system
>        from any *'persistent queue above target'* state to a state where the
>        *persistent queue is below target*.
> 
>     It would be nice to explicitly state that the "persistent queue
>     above target" is indication of "bad" queue/congestion somewhere
>     above or in this line. I think that these terms are used in
>     [codel2012] but were not used before in the document. See also the
>     next item.
> 
>     ---
>     In section 3.3 there are three "states":
> 
>       *
> 
>         'persistent queue above target' state
> 
>       *
> 
>         'has persistent queue' state
> 
>       *
> 
>          dropping state
> 
>     It would be nice to a) use the same one b) explain how CoDeL gets
>     in/out of there
> 
>     ---
> 
>     (probably nit)
> 
>        When
>        *the minimum sojourn time first crosses the target* and *CoDel drops a packet*, the earliest the controller could see the effect of the drop
> 
>     from current sentence it could be inferred that the two events occur
>     at the same time which is not the case. Maybe " when persistent
>     queue above target is detected" would be better.
> 
>     ---
> 
>        variation is less than the target, and so the *initial drop spacing SHOULD be set to the estimator's interval plus
>     twice the target (i.e., initial drop spacing = 1.1 * interval)* to ensure that the
> 
>     this contradicts the pseudocode:
> 
>        dodequeue_result codel_queue_t::dodequeue(time_t now)
> 
>     	...
>     	first_above_time_ = now + interval_;
> 
>     section 4:
> 
>        To keep from
>        making drops when it would starve the output link, CoDel makes
>        another check before dropping to see if at least an MTU worth of
>        bytes remains in the buffer. If not, the packet SHOULD NOT be
>        dropped and, *currently*, CoDel exits the drop state. 
> 
>     what is the meaning of "currently" ? in the current experimental
>     version?
> 
>     ---
>     (nit)
> 
>        In practice, this value is not known or measured *(*though see
>        Section 6.2
>     <https://tools.ietf.org/html/draft-ietf-aqm-codel-03#section-6.2> for an application where interval is measured.
> 
>     missing closing bracket
> 
>     Section 5 / pseudocode: (all nits)
> 
>     The variables are introduced as first_time_above, dropping, but are
>     used as first_time_above_, dropping_ (with underscores at end). 
> 
>     ---
> 
>     *Packet** CoDelQueue::dequeue()
> 
>     a pointer to current packet was introduced as packet_t
> 
>     *--- double* now = clock();;
> 
>     time units were supposed to be of type time_t
> 
>     *--- dodequeueResult* r = dodequeue(now);
> 
>     compare with: typedef struct { ... } *dodequeue_result* below
>     (present in two routines)
> 
>     ---
> 
>     dodequeueResult r = {*NULL, queue_t::dequeue() *};
> 
>     shouldn't it be {queue_t::dequeue(), false} ?
> 
> 
>     Best Regards,
>     Polina
> 
> 
>     * note the reply by Dave Taht:
>     http://www.ietf.org/mail-archive/web/aqm/current/msg01687.html
> 
>     On 03/20/2016 04:35 PM, Wesley Eddy wrote:
>>     It looks like the WGLC feedback on the document body is
>>     incorporated, so this is good.
>>
>>     Is there a reason to stay with Informational and not Experimental
>>     like we've done with PIE an d FQ-CoDel?
>>
>>     Also, idnits has some problems with the references that should be
>>     fixed (e.g. "NATAL2010" is probably supposed to be "NETAL2010"):
>>     https://tools.ietf.org/idnits?url=https://tools.ietf.org/id/draft-ietf-aqm-codel-03.txt
> 
> 
>     _______________________________________________
>     aqm mailing list
>     aqm@ietf.org <mailto:aqm@ietf.org>
>     https://www.ietf.org/mailman/listinfo/aqm
> 
> 
> 
> 
> -- 
> Andrew McGregor | SRE | andrewmcgr@google.com
> <mailto:andrewmcgr@google.com> | +61 4 1071 2221
> 
> 
> _______________________________________________
> aqm mailing list
> aqm@ietf.org
> https://www.ietf.org/mailman/listinfo/aqm
> 

-- 
Prof. Dr. habil. Michael Menth
University of Tuebingen
Faculty of Science
Department of Computer Science
Chair of Communication Networks
Sand 13, 72076 Tuebingen, Germany
phone: (+49)-7071/29-70505
fax: (+49)-7071/29-5220
mailto:menth@uni-tuebingen.de
http://kn.inf.uni-tuebingen.de