Re: [aqm] I-D Action: draft-ietf-aqm-codel-03.txt

"Goltsman, Polina" <polina.goltsman@student.kit.edu> Thu, 07 July 2016 11:49 UTC

Return-Path: <polina.goltsman@student.kit.edu>
X-Original-To: aqm@ietfa.amsl.com
Delivered-To: aqm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 6021312D7B1 for <aqm@ietfa.amsl.com>; Thu, 7 Jul 2016 04:49:50 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -5.625
X-Spam-Level:
X-Spam-Status: No, score=-5.625 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_MED=-2.3, RP_MATCHES_RCVD=-1.426] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cUCp8_lc0GDE for <aqm@ietfa.amsl.com>; Thu, 7 Jul 2016 04:49:46 -0700 (PDT)
Received: from scc-mailout-kit-02.scc.kit.edu (scc-mailout-kit-02.scc.kit.edu [IPv6:2a00:1398:9:f712::810d:e752]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id D88C012D0C7 for <aqm@ietf.org>; Thu, 7 Jul 2016 04:49:45 -0700 (PDT)
Received: from kit-msx-28.kit.edu ([2a00:1398:9:f612::28]) by scc-mailout-kit-02.scc.kit.edu with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA384:256) (envelope-from <polina.goltsman@student.kit.edu>) id 1bL7oK-0004cj-UY; Thu, 07 Jul 2016 13:49:43 +0200
Received: from kit-msx-31.kit.edu (2a00:1398:9:f612::31) by KIT-MSX-28.kit.edu (2a00:1398:9:f612::28) with Microsoft SMTP Server (TLS) id 15.0.1178.4; Thu, 7 Jul 2016 13:49:38 +0200
Received: from kit-msx-31.kit.edu ([fe80::281f:cf9a:1283:1d90]) by kit-msx-31.kit.edu ([fe80::281f:cf9a:1283:1d90%16]) with mapi id 15.00.1178.000; Thu, 7 Jul 2016 13:49:38 +0200
From: "Goltsman, Polina" <polina.goltsman@student.kit.edu>
To: Andrew Mcgregor <andrewmcgr@google.com>
Thread-Topic: [aqm] I-D Action: draft-ietf-aqm-codel-03.txt
Thread-Index: AQHRfxbvhSr+gsN4dEOPal6Mw2ErA59ibXCAgAIArACAcubuAIA2HfLC
Date: Thu, 07 Jul 2016 11:49:38 +0000
Message-ID: <1467892048445.64588@student.kit.edu>
References: <20160316000053.5163.88345.idtracker@ietfa.amsl.com> <56EEC35B.3000007@mti-systems.com> <56F0635A.7060004@student.kit.edu>, <CAPRuP3k9NuTpK7WsRfOymJbTSup5aY7RgUtEDd-wsapXJxbKVg@mail.gmail.com>
In-Reply-To: <CAPRuP3k9NuTpK7WsRfOymJbTSup5aY7RgUtEDd-wsapXJxbKVg@mail.gmail.com>
Accept-Language: en-US, de-DE
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-ms-exchange-transport-fromentityheader: Hosted
x-originating-ip: [2a00:1398:2:4006:527b:9dff:fee2:e7f7]
Content-Type: multipart/alternative; boundary="_000_146789204844564588studentkitedu_"
MIME-Version: 1.0
Archived-At: <https://mailarchive.ietf.org/arch/msg/aqm/ENA1VZmcFVXCJWrMIbmey4wAYnw>
Cc: Wesley Eddy <wes@mti-systems.com>, Jana Iyengar <jri@google.com>, "Bless, Roland (TM)" <roland.bless@kit.edu>, "aqm@ietf.org" <aqm@ietf.org>, Dave Taht <dave.taht@gmail.com>
Subject: Re: [aqm] I-D Action: draft-ietf-aqm-codel-03.txt
X-BeenThere: aqm@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussion list for active queue management and flow isolation." <aqm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/aqm>, <mailto:aqm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/aqm/>
List-Post: <mailto:aqm@ietf.org>
List-Help: <mailto:aqm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/aqm>, <mailto:aqm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 07 Jul 2016 11:49:50 -0000

Dear Andrew, Dear Wesley,


First of all, sorry for the long waiting time. I still haven't processed "nits" but I have 2 major issues.


Issue 1: This issue is still related to the reentering dropping state.


As of the latest I-D, the "reentering dropping state"  appears in form of one sentence in the text and a small code block in the pseudo-code.


The text is on page 13:

>Additional logic prevents re-
>entering the dropping state too soon after exiting it and resumes the
>dropping state at a recent control level, if one exists.

The comment to the pseudo-code on page 20:


           // If min went above target close to when it last went
           // below, assume that the drop rate that controlled the
           // queue on the last cycle is a good starting point to
           // control it now. ('drop_next' will be at most 'interval'
           // later than the time of the last drop so 'now - drop_next'
           // is a good approximation of the time from the last drop
           // until now.) Implementations vary slightly here; this is
           // the Linux version, which is more widely deployed and
           // tested.



Or summarized: this issue is not addressed in text, which instead just mentions that the issue exists, there are several versions of this logic and here is one of them.


As it reads now, this "reentering" should be a tiny insignificant part of the algorithm that (1) doesn't happen in "most" of the cases, and (2) when it does happen, it will only "slightly" affect "performance" (with proper definitions of "most", "slightly", and "performance").


Is it really so?


Based on our evaluations, with pure CoDel (without FQ-CoDel), "reentering" is actually a common case. I think Dave and Toke should have more experimental results to answer this question. (I included Dave in CC)


If it is so, I still think that the "common case" should be at the very least explained in text too.


Issue 2: The "initial drop spacing" is inconsistent


Regarding the "initial drop spacing" there are following lines in the document.


between pages 11 and 12:

>and so the initial drop spacing
>SHOULD be set to the estimator's interval plus twice the target
>(i.e., initial drop spacing = 1.1 * interval)

page 12:

>the initial next
>drop spacing is intended to be long enough to give the endpoints time
>to react to the single drop so SHOULD be set to a value of 1.1 times
>the interval

page 16:


>and the initial drop
>spacing is also set to interval.

and it is also set to one interval in pseudo-code, which I believe is this line:


> drop_next_ = control_law(now, count_);


Note: I think there was an email on CoDel mailing list (https://lists.bufferbloat.net/pipermail/codel/) about this issue.



________________________________
From: Andrew Mcgregor <andrewmcgr@google.com>
Sent: Friday, June 3, 2016 3:50 AM
To: Goltsman, Polina
Cc: Wesley Eddy; aqm@ietf.org; Jana Iyengar
Subject: Re: [aqm] I-D Action: draft-ietf-aqm-codel-03.txt

Thank you Polina.

We have incorporated most of these (sometimes with slightly different solutions than you suggested), and a new draft will be out very shortly.  I felt that by far the most important was the different way the Linux implementation addressed re-entry to the dropping state, and the pseudocode now does exactly that.

Andrew

On 22 March 2016 at 08:10, Polina Goltsman <polina.goltsman@student.kit.edu<mailto:polina.goltsman@student.kit.edu>> wrote:
Dear Wesley, Dear All,

First of all our feedback regarding different "re-entering dropping state" in the document and in the Linux implementation (http://www.ietf.org/mail-archive/web/aqm/current/msg01686.html) was not addressed.

As FQ-CoDel  relies on CoDel, this issue is also (partly) relevant for the FQ-CoDel document. In the introduction FQ-CoDel references ns-3 and Linux implementations where the first one uses the re-entering logic from the CoDel document while the second from CoDel Linux implementation. The algorithm that has seen widespread testing according to Section 7 is (I suppose) the Linux version. Is this situation acceptable for an algorithm specification?

[since this comment was supposed to be sent before the end of 2015, feel free to (silently) ignore it]
Second, unlike Rasool Al-Saadi (see http://www.ietf.org/mail-archive/web/aqm/current/msg01693.html) I do not like the document. Although I agree that the pseudocode is sufficient to create a working implementation, however, in my opinion, the rest of the document makes implementing CoDeL more confusing (at least without reading [CODEL2012] first). Is it normal for a RFC, which, as I assume, should primarily contain an algorithm specification to contain the algorithm specification ONLY in form of pseudocode?
These are two items that I found the most confusing:
(1) section 3 introduces estimator, setpoint, and control loop which are not clearly distinguishable in pseudocode. It would be nice if section 4 explained how the three entities transition into the routines in pseudocode.
(2) IMHO, there is some missing explanations. For example, the document never says how exactly "bad"/persistent queue is determined. The document says in Section 3.1 that the minimum is tracked, but it never says that persistent queue is when the minimum is above setpoint/target for at least an interval. As another example, Section 3.3 could say how the controller looks like exactly.

I also have some small comments and nits regarding the latest version (-03):

---

in the whole document there are several places of using "we" to (supposedly) refer to the authors. According to Bob Briscoe's message (http://www.ietf.org/mail-archive/web/aqm/current/msg01805.html), "we" do not necessary refer to the authors in a rfc. Are these "we"s ok?

---

abstract:

This document describes a general framework called CoDel

IMHO the word "framework" is too overloaded and too broad. It might be better to call CoDel an AQM or something that has "queue" in it. Other ways (e.g., http://queue.acm.org/detail.cfm?id=2839461) probably also use it to control a queue.

---

section 1:

(nit)
   Despite this
   awareness, the problem has only gotten worse as Moore's Law growth in
   memory density fueled an exponential increase in buffer pool size.
   Efforts to deploy AQM have been frustrated by difficult configuration
   and negative impact on network utilization.  This problem, recently
   christened "bufferbloat", [TSV2011] [BB2011] has become increasingly
   important throughout the Internet but particularly at the consumer
   edge.

it reads like "this problem" refers to "efforts to deploy AQM" and not to "increase in buffer pool size"

---

(nit)

   To understand
   queue management, it is critical to understand the difference between
   the necessary, useful "good" queue, and the counterproductive "bad"
   queue.

strictly speaking "good" and "bad" queue were not defined either before or in this sentence (it is explained in section 3 or in [CODEL2012])

---

(nit)

   o  treat "good queue" and "bad queue" differently, that is, keep
      delay low while permitting necessary bursts of traffic

(1) see above
(2) I would say that the goal is "to keep delay low while ..." and to treat good/bad queue differently is a mechanism used to achieve the goal

---

section 2:

(nit)
      In this document, the characters ">>" preceding an indented line(s)
   indicates a compliance requirement statement using the key words
   listed above.  This convention aids reviewers in quickly identifying
   or finding the explicit compliance requirements of this RFC.

I maybe missing something, but I can't find any use of "characters ">>"" anywhere except this line in any (txt,html, pdf) of the versions.

---

section 3:

generally IMHO it would be nice to introduce interval and target so it is more obvious that these are parameters (or some other kind of variables). Interval is more or less obvious (although imho should be more explicit). The target is first introduced at the end of section 3.2 as:


   This results in a particularly simple form for the
   setpoint: the ideal range for the permitted standing queue is between
   5% and 10% of the TCP connection's RTT.  Thus target is simply 5% of
   the interval of section 3.1<https://tools.ietf.org/html/draft-ietf-aqm-codel-03#section-3.1>.


It was not previously defined that target is a parameter with a meaning of setpoint. I don't think that this is clear for people who do not know what target is in advance.

[It seems that it was intended to introduce interval and target parameters in section 4]

---


   Instead of averages
   we recommend tracking the minimum sojourn time, then,

"we recommend" or "CoDeL does"? According to the last paragraph in Section 3.1, tracking minimum is a core part of CoDeL:

   These two innovations, use of sojourn time as observed values and the
   local minimum as the statistic to monitor queue congestion are key to
   CoDel's estimator building block.

---

   Since the peak queue delay is simply f r, power is solely a function

is it f/r ?

---
(nit)

   The only remaining building block needed for a basic AQM is a
   'control loop' algorithm to effectively drive the queueing system
   from any 'persistent queue above target' state to a state where the
   persistent queue is below target.

It would be nice to explicitly state that the "persistent queue above target" is indication of "bad" queue/congestion somewhere above or in this line. I think that these terms are used in [codel2012] but were not used before in the document. See also the next item.

---
In section 3.3 there are three "states":

  *

'persistent queue above target' state

  *

'has persistent queue' state

  *

 dropping state

It would be nice to a) use the same one b) explain how CoDeL gets in/out of there

---

(probably nit)

   When
   the minimum sojourn time first crosses the target and CoDel drops a
   packet, the earliest the controller could see the effect of the drop

from current sentence it could be inferred that the two events occur at the same time which is not the case. Maybe " when persistent queue above target is detected" would be better.

---

   variation is less than the target, and so the initial drop spacing
   SHOULD be set to the estimator's interval plus twice the target
   (i.e., initial drop spacing = 1.1 * interval) to ensure that the

this contradicts the pseudocode:

   dodequeue_result codel_queue_t::dodequeue(time_t now)

        ...
        first_above_time_ = now + interval_;

section 4:


   To keep from
   making drops when it would starve the output link, CoDel makes
   another check before dropping to see if at least an MTU worth of
   bytes remains in the buffer. If not, the packet SHOULD NOT be
   dropped and, currently, CoDel exits the drop state.

what is the meaning of "currently" ? in the current experimental version?

---
(nit)

   In practice, this value is not known or measured (though see
   Section 6.2<https://tools.ietf.org/html/draft-ietf-aqm-codel-03#section-6.2> for an application where interval is measured.

missing closing bracket

Section 5 / pseudocode: (all nits)

The variables are introduced as first_time_above, dropping, but are used as first_time_above_, dropping_ (with underscores at end).

---

Packet* CoDelQueue::dequeue()


a pointer to current packet was introduced as packet_t

---

double now = clock();;


time units were supposed to be of type time_t

---

dodequeueResult r = dodequeue(now);

compare with: typedef struct { ... } dodequeue_result below (present in two routines)

---

dodequeueResult r = { NULL, queue_t::dequeue() };

shouldn't it be {queue_t::dequeue(), false} ?


Best Regards,
Polina


* note the reply by Dave Taht: http://www.ietf.org/mail-archive/web/aqm/current/msg01687.html

On 03/20/2016 04:35 PM, Wesley Eddy wrote:
It looks like the WGLC feedback on the document body is incorporated, so this is good.

Is there a reason to stay with Informational and not Experimental like we've done with PIE an d FQ-CoDel?

Also, idnits has some problems with the references that should be fixed (e.g. "NATAL2010" is probably supposed to be "NETAL2010"):
https://tools.ietf.org/idnits?url=https://tools.ietf.org/id/draft-ietf-aqm-codel-03.txt


_______________________________________________
aqm mailing list
aqm@ietf.org<mailto:aqm@ietf.org>
https://www.ietf.org/mailman/listinfo/aqm




--
Andrew McGregor | SRE | andrewmcgr@google.com<mailto:andrewmcgr@google.com> | +61 4 1071 2221