Re: [aqm] CoDel's control law that determines drop frequency

Andrew Mcgregor <andrewmcgr@google.com> Wed, 30 September 2015 00:42 UTC

Return-Path: <andrewmcgr@google.com>
X-Original-To: aqm@ietfa.amsl.com
Delivered-To: aqm@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2A01B1B5687 for <aqm@ietfa.amsl.com>; Tue, 29 Sep 2015 17:42:10 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 1.312
X-Spam-Level: *
X-Spam-Status: No, score=1.312 tagged_above=-999 required=5 tests=[BAYES_50=0.8, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FM_FORGED_GMAIL=0.622, HTML_MESSAGE=0.001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id davGzmJiW8Um for <aqm@ietfa.amsl.com>; Tue, 29 Sep 2015 17:42:07 -0700 (PDT)
Received: from mail-la0-x233.google.com (mail-la0-x233.google.com [IPv6:2a00:1450:4010:c03::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 253781B5685 for <aqm@ietf.org>; Tue, 29 Sep 2015 17:42:07 -0700 (PDT)
Received: by lahh2 with SMTP id h2so28074337lah.0 for <aqm@ietf.org>; Tue, 29 Sep 2015 17:42:05 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=NQrnkPEfyzK3gJX6tKKierQW8mjyZnVs6EnGWoPW+6U=; b=Y0xj7yjGBQSQnMqw5IEDP6p9c2zWhoe32B85xR7BsIJVZ/cpJpRIL+ViGUGZ/dDRsD jarbjmlRm0Kk5W1hoDd8DboHKVmT5mEd1XRaRuksbIjhfY8Ai7Wp+kMG2uWecsPUfS3h Z1D00yiVMFQlI2ekzMWp9iaZtwJ2MtAlW9rm2cpTeUyScvWqWnsZVOYG2aRSzMLJhm6c MX4aa7tPuC3p06RvUOrauVlds/O92oW6kWoPgKOW6NVhooLExgIbzyXB7MZs+m0OkaN4 2aFSxUu8BoQ9YwHeJ2ociLDF/f3HhnejVJSS/TyR4OGwjSzb6TjtFCC1RTUgTV0hVFcM jlMg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=NQrnkPEfyzK3gJX6tKKierQW8mjyZnVs6EnGWoPW+6U=; b=ReeYjODSBdQWJhhHWHbrdmketJVmI+VGLUiaiGD+cShHjm7nIIgSHdn1m6cwUDeHjZ 17Vv6r17lWJDB/YVj/fceQJbEwH5SLR/5CPnpbOelGxrxSth0cercE1l6ZsnuLUZUN1y OzVw+aI9nN/Yw2SyVjNaV1AvwV6ZbDAXJtmoX7+tk1HE4ST6Z2jrxUK4CYsXhwy7rTa3 oZ9awClCOTPefWl+BitG/m6hnzfrRbfQM6TAsSbRa7uBYxC0OqiDMdtST3iLg+KlCsWO VNNXa7AjUOXbxMIRf51y3xBVkRakMafaCZEVvTlDv3kr73QN2ZIZqZylAKklJj0xjort RqGA==
X-Gm-Message-State: ALoCoQkO/IT6LnnH/x3JcGRUTvPgZXkMNBLdiNHBXqLpe/Ms7FOmVWs+tVQOSKQ3ouaPLjwUeXG9
MIME-Version: 1.0
X-Received: by 10.152.23.199 with SMTP id o7mr189304laf.76.1443573724561; Tue, 29 Sep 2015 17:42:04 -0700 (PDT)
Received: by 10.114.79.34 with HTTP; Tue, 29 Sep 2015 17:42:04 -0700 (PDT)
In-Reply-To: <56045CA8.2060103@bobbriscoe.net>
References: <201311122230.rACMUBmH003536@bagheera.jungle.bt.co.uk> <87wpzfpbd3.fsf@alrua-karlstad.karlstad.toke.dk> <56045CA8.2060103@bobbriscoe.net>
Date: Wed, 30 Sep 2015 10:42:04 +1000
Message-ID: <CAPRuP3mmg_-uxmtLUXprCmPyLSUuUA7t2dRZpDs_mwtnTgrSQA@mail.gmail.com>
From: Andrew Mcgregor <andrewmcgr@google.com>
To: Bob Briscoe <ietf@bobbriscoe.net>
Content-Type: multipart/alternative; boundary=089e0158c95a0f9f2c0520ec331d
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/k_Dr0anee_pEEGgE38flw0WJnLM>
Cc: AQM IETF list <aqm@ietf.org>, Van Jacobson <vanj@google.com>, Kathleen Nichols <nichols@pollere.com>
Subject: Re: [aqm] CoDel's control law that determines drop frequency
X-BeenThere: aqm@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "Discussion list for active queue management and flow isolation." <aqm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/aqm>, <mailto:aqm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/aqm/>
List-Post: <mailto:aqm@ietf.org>
List-Help: <mailto:aqm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/aqm>, <mailto:aqm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 30 Sep 2015 00:42:10 -0000

Hmm, that's really interesting.

Most interesting is that my understanding is that the control law was
intended to deal with aggregates of mostly TCP-like traffic, and that an
overload of unresponsive traffic wasn't much of a goal; this seems like
vaguely reasonable behaviour, I suppose, given that pathological situation.

But I don't have a way to derive the control law from first principles at
this time (I haven't been working on that for a long time now).

On 25 September 2015 at 06:27, Bob Briscoe <ietf@bobbriscoe.net> wrote:

> Toke,
>
> Having originally whinged that no-one ever responded to my original 2013
> posting, now it's my turn to be embarrassed for having missed your
> interesting response for over 3 months.
>
> Cool that the analysis proves correct in practice - always nice.
>
> The question is still open whether this was the intention, and if so why
> this particular control law was intended.
> I would rather we started from a statement of what the control law ought
> to do, then derive it.
>
> Andrew McGregor said he would have a go at this question some time ago...
> Andrew?
>
>
> Bob
>
>
>
> On 07/06/15 20:27, Toke Høiland-Jørgensen wrote:
>
> Hi Bob
>
> Apologies for reviving this ancient thread; been meaning to get around
> to it sooner, but well... better late than never I suppose.
>
> (Web link to your original mail, in case Message-ID referencing breaks:https://www.ietf.org/mail-archive/web/aqm/current/msg00376.html ).
>
> Having recently had a need to understand CoDel's behaviour in more
> detail, your analysis popped out of wherever it's been hiding in the
> back of my mind and presented itself as maybe a good place to start. :)
>
> So anyhow, I'm going to skip the initial assertions in your email and
> focus on the analysis:
>
>
> Here's my working (pls check it - I may have made mistakes)
> _________________
> For brevity, I'll define some briefer variable names:
>         interval =      I [s]
>         next_drop =     D [s]
>         packet-rate =   R [pkt/s]
>         count =         n [pkt]
>
> >From the CoDel control law code:
>         D(n) = I / sqrt(n)
> And the instantaneous drop probability is:
>         p(n) = 1/( R * D(n) )
>
> Then the slope of the rise in drop probability with time is:
>         Delta p / Delta t       = [p(n+1) - p(n)] / D(n)
>                                 = [1/D(n+1) - 1/D(n)] / [ R * D(n) ]
>                                 = sqrt(n) * [sqrt(n+1) - sqrt(n)] / [R*I*I]
>                                 = [ sqrt(n(n+1)) - n ] / R*I^2
>
> I couldn't find anything wrong with the derivation. I'm not entirely
> sure that I think it makes sense to speak about an "instantaneous drop
> probability" for an algorithm that is not probabilistic in nature.
> However, interpreting p(n) as "the fraction of packets dropped over the
> interval from D(n) to D(n+1)" makes sense, I guess, and for this
> analysis that works.
>
>
> At count = 1, the numerator starts at sqrt(2)-1 = 0.414.
> Amd as n increases, it rapidly tends to 1/2.
>
> So CoDel's rate of increase of drop probability with time is nearly constant (it
> is always between 0.414 and 0.5) and it rapidly approaches 0.5 after a few
> drops, tending towards:
>         dp/dt = 1/(2*R*I^2)
>
> This constant increase clearly has very little to do with the square-root law of
> TCP Reno.
>
> In the above formula, drop probability increases inversely proportional to the
> packet rate. For instance, with I = 100ms and 1500B packets
> at 10Mb/s =>    R = 833 pkt/s =>        dp/dt = 6.0% /s
> at 100Mb/s =>   R = 8333 pkt/s =>       dp/dt = 0.6% /s
>
> I also tried to test this. I configured CoDel (on a Linux 4.0 box) on
> 1Mbps, 2Mbps and 10Mbps links with interval settings of 1 second and
> 500ms, and a total packet limit of 100k packets. This was to make it
> deliberately slower to react (so the change in drop probability is more
> visible), and to make sure no packets are dropped from queue overflow.
>
> I then sent an unresponsive UDP stream over the link at 110% of the link
> capacity (as passed to Iperf, so approximately), and collected the
> output of `tc -s qdisc` every 0.2 seconds.
>
> The attached plot is of 'pkts dropped / (pkts sent + pkts dropped)' in a
> 2-second sliding window over the duration of the test (the plot is also
> available here:https://kau.toke.dk/ietf/codel-drop-rate/codel-drop-rate.svg ).
>
> I've included linear trend lines from the initial time to the point of
> maximum drop probability, and as is apparent from the plot, got quite a
> good fit (r>0.99 for all six data sets). The legend includes the slopes
> of the linear fits for each of the data sets, which are not too far from
> what your analysis predicts (and I'm guessing the difference can be
> attributed to the difference in exact packet rates, but I haven't
> checked).
>
> The Flent data files with the qdisc stats over time (readable by the
> newest git version of Flent), as well as the Python script I used to
> create the graph are available here: https://kau.toke.dk/ietf/codel-drop-rate/
>
> So, in short: It seems that CoDel's "drop rate" does increase linearly
> in the presence of a persistent queue, and that the rate of increase
> depends on both the interval and the link rate.
>
> Now, I'll refrain from commenting on whether or not this is "bad", or
> indeed if it is contrary to design. It was surprising to me at least, so
> I thought I'd share my findings, in the hope that someone would either
> find them useful or tell me how they're wrong (or both!). :)
>
> -Toke
>
>
>
> _______________________________________________
> aqm mailing listaqm@ietf.orghttps://www.ietf.org/mailman/listinfo/aqm
>
>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
>
>
> _______________________________________________
> aqm mailing list
> aqm@ietf.org
> https://www.ietf.org/mailman/listinfo/aqm
>
>


-- 
Andrew McGregor | SRE | andrewmcgr@google.com | +61 4 1071 2221