Re: [aqm] CoDel's control law that determines drop frequency

Dave Taht <dave.taht@gmail.com> Wed, 30 September 2015 09:00 UTC

Return-Path: <dave.taht@gmail.com>
X-Original-To: aqm@ietfa.amsl.com
Delivered-To: aqm@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D6B0B1A21C3 for <aqm@ietfa.amsl.com>; Wed, 30 Sep 2015 02:00:10 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2
X-Spam-Level:
X-Spam-Status: No, score=-2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, SPF_PASS=-0.001] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id wLlY2LZJejLw for <aqm@ietfa.amsl.com>; Wed, 30 Sep 2015 02:00:03 -0700 (PDT)
Received: from mail-oi0-x234.google.com (mail-oi0-x234.google.com [IPv6:2607:f8b0:4003:c06::234]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id A76B01A21C2 for <aqm@ietf.org>; Wed, 30 Sep 2015 02:00:03 -0700 (PDT)
Received: by oiww128 with SMTP id w128so18584160oiw.2 for <aqm@ietf.org>; Wed, 30 Sep 2015 02:00:03 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=GWw30l4D9P72ieZErKMj3AIlc+Ll+A3LMFUAq9hEQpM=; b=VyEsWR1CPXFjycxEg5xl5jxNBdchDbdICQhDJnqHwEFNjPEADRI1uaBdnhf1dlybr7 HXg2ypjUJDi6vMVeyn/GME079foWWhitHDraSSRLLwMd/v0CV8Oydy61zVkB3BIlVj4/ 7xG2z42WVm2PNUR5sZLBLQBVVFNxJtEYrrOJmpQWRT8U4Hb5jBZ1ekhan9NNK5n/nKs6 UDCRLoi1+VlZpxFR9EEAeKu4INSc8QWYd+zwYvaqorndnVOYY41tNEkvm/O4z12e/g7B gFaa+klj34u223FlOTzVJ+RJqvXlOdLBH87+M/oga+QIVQMoCW5b/8S/4bAxnUSlWnWp NndA==
MIME-Version: 1.0
X-Received: by 10.202.203.194 with SMTP id b185mr1432058oig.104.1443603603036; Wed, 30 Sep 2015 02:00:03 -0700 (PDT)
Received: by 10.202.108.212 with HTTP; Wed, 30 Sep 2015 02:00:02 -0700 (PDT)
In-Reply-To: <560BA261.6020206@bobbriscoe.net>
References: <201311122230.rACMUBmH003536@bagheera.jungle.bt.co.uk> <87wpzfpbd3.fsf@alrua-karlstad.karlstad.toke.dk> <56045CA8.2060103@bobbriscoe.net> <CAPRuP3mmg_-uxmtLUXprCmPyLSUuUA7t2dRZpDs_mwtnTgrSQA@mail.gmail.com> <560BA261.6020206@bobbriscoe.net>
Date: Wed, 30 Sep 2015 02:00:02 -0700
Message-ID: <CAA93jw6zdQa1BGDhKajUzORXkCsSCygMPhAsyFfgMqas-w+Xfg@mail.gmail.com>
From: Dave Taht <dave.taht@gmail.com>
To: Bob Briscoe <ietf@bobbriscoe.net>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/E3mTXz3hAEkwaX5fB6SieeIHr0Q>
Cc: Andrew Mcgregor <andrewmcgr@google.com>, Kathleen Nichols <nichols@pollere.com>, AQM IETF list <aqm@ietf.org>, Van Jacobson <vanj@google.com>
Subject: Re: [aqm] CoDel's control law that determines drop frequency
X-BeenThere: aqm@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "Discussion list for active queue management and flow isolation." <aqm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/aqm>, <mailto:aqm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/aqm/>
List-Post: <mailto:aqm@ietf.org>
List-Help: <mailto:aqm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/aqm>, <mailto:aqm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 30 Sep 2015 09:00:11 -0000

On Wed, Sep 30, 2015 at 1:50 AM, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> Andrew,
>
> I am also not so interested in an AQM dealing directly with unresponsive
> traffic - I prefer to keep policing and AQM as separately deployable
> functions, because AQM should be policy-neutral, whereas policing inherently
> involves policy.
>
> My concern was merely that CoDel's linear increase in drop probability can
> take a long time to reach where it intends to get to. I would have thought
> some form of exponential increase, or at least super-linear, would have been
> more responsive to changing traffic conditions. I.e., rather than have to
> answer the question "how quickly should drop probability increase?", make it
> increase increasingly quickly.
>
> Early on, Rong Pan showed that it takes CoDel ages to bring high load under
> control. I think this linear increase is the reason.

cake uses a better curve for codel, but we still need to do more
testing in the lab.

http://www.bufferbloat.net/projects/codel/wiki/CakeTechnical

>
> Bob
>
>
>
> On 30/09/15 01:42, Andrew Mcgregor wrote:
>
> Hmm, that's really interesting.
>
> Most interesting is that my understanding is that the control law was
> intended to deal with aggregates of mostly TCP-like traffic, and that an
> overload of unresponsive traffic wasn't much of a goal; this seems like
> vaguely reasonable behaviour, I suppose, given that pathological situation.
>
> But I don't have a way to derive the control law from first principles at
> this time (I haven't been working on that for a long time now).
>
> On 25 September 2015 at 06:27, Bob Briscoe <ietf@bobbriscoe.net> wrote:
>>
>> Toke,
>>
>> Having originally whinged that no-one ever responded to my original 2013
>> posting, now it's my turn to be embarrassed for having missed your
>> interesting response for over 3 months.
>>
>> Cool that the analysis proves correct in practice - always nice.
>>
>> The question is still open whether this was the intention, and if so why
>> this particular control law was intended.
>> I would rather we started from a statement of what the control law ought
>> to do, then derive it.
>>
>> Andrew McGregor said he would have a go at this question some time ago...
>> Andrew?
>>
>>
>> Bob
>>
>>
>>
>> On 07/06/15 20:27, Toke Høiland-Jørgensen wrote:
>>
>> Hi Bob
>>
>> Apologies for reviving this ancient thread; been meaning to get around
>> to it sooner, but well... better late than never I suppose.
>>
>> (Web link to your original mail, in case Message-ID referencing breaks:
>> https://www.ietf.org/mail-archive/web/aqm/current/msg00376.html ).
>>
>> Having recently had a need to understand CoDel's behaviour in more
>> detail, your analysis popped out of wherever it's been hiding in the
>> back of my mind and presented itself as maybe a good place to start. :)
>>
>> So anyhow, I'm going to skip the initial assertions in your email and
>> focus on the analysis:
>>
>> Here's my working (pls check it - I may have made mistakes)
>> _________________
>> For brevity, I'll define some briefer variable names:
>>         interval =      I [s]
>>         next_drop =     D [s]
>>         packet-rate =   R [pkt/s]
>>         count =         n [pkt]
>>
>> >From the CoDel control law code:
>>         D(n) = I / sqrt(n)
>> And the instantaneous drop probability is:
>>         p(n) = 1/( R * D(n) )
>>
>> Then the slope of the rise in drop probability with time is:
>>         Delta p / Delta t       = [p(n+1) - p(n)] / D(n)
>>                                 = [1/D(n+1) - 1/D(n)] / [ R * D(n) ]
>>                                 = sqrt(n) * [sqrt(n+1) - sqrt(n)] /
>> [R*I*I]
>>                                 = [ sqrt(n(n+1)) - n ] / R*I^2
>>
>> I couldn't find anything wrong with the derivation. I'm not entirely
>> sure that I think it makes sense to speak about an "instantaneous drop
>> probability" for an algorithm that is not probabilistic in nature.
>> However, interpreting p(n) as "the fraction of packets dropped over the
>> interval from D(n) to D(n+1)" makes sense, I guess, and for this
>> analysis that works.
>>
>> At count = 1, the numerator starts at sqrt(2)-1 = 0.414.
>> Amd as n increases, it rapidly tends to 1/2.
>>
>> So CoDel's rate of increase of drop probability with time is nearly
>> constant (it
>> is always between 0.414 and 0.5) and it rapidly approaches 0.5 after a few
>> drops, tending towards:
>>         dp/dt = 1/(2*R*I^2)
>>
>> This constant increase clearly has very little to do with the square-root
>> law of
>> TCP Reno.
>>
>> In the above formula, drop probability increases inversely proportional to
>> the
>> packet rate. For instance, with I = 100ms and 1500B packets
>> at 10Mb/s =>    R = 833 pkt/s =>        dp/dt = 6.0% /s
>> at 100Mb/s =>   R = 8333 pkt/s =>       dp/dt = 0.6% /s
>>
>> I also tried to test this. I configured CoDel (on a Linux 4.0 box) on
>> 1Mbps, 2Mbps and 10Mbps links with interval settings of 1 second and
>> 500ms, and a total packet limit of 100k packets. This was to make it
>> deliberately slower to react (so the change in drop probability is more
>> visible), and to make sure no packets are dropped from queue overflow.
>>
>> I then sent an unresponsive UDP stream over the link at 110% of the link
>> capacity (as passed to Iperf, so approximately), and collected the
>> output of `tc -s qdisc` every 0.2 seconds.
>>
>> The attached plot is of 'pkts dropped / (pkts sent + pkts dropped)' in a
>> 2-second sliding window over the duration of the test (the plot is also
>> available here:
>> https://kau.toke.dk/ietf/codel-drop-rate/codel-drop-rate.svg ).
>>
>> I've included linear trend lines from the initial time to the point of
>> maximum drop probability, and as is apparent from the plot, got quite a
>> good fit (r>0.99 for all six data sets). The legend includes the slopes
>> of the linear fits for each of the data sets, which are not too far from
>> what your analysis predicts (and I'm guessing the difference can be
>> attributed to the difference in exact packet rates, but I haven't
>> checked).
>>
>> The Flent data files with the qdisc stats over time (readable by the
>> newest git version of Flent), as well as the Python script I used to
>> create the graph are available here:
>> https://kau.toke.dk/ietf/codel-drop-rate/
>>
>> So, in short: It seems that CoDel's "drop rate" does increase linearly
>> in the presence of a persistent queue, and that the rate of increase
>> depends on both the interval and the link rate.
>>
>> Now, I'll refrain from commenting on whether or not this is "bad", or
>> indeed if it is contrary to design. It was surprising to me at least, so
>> I thought I'd share my findings, in the hope that someone would either
>> find them useful or tell me how they're wrong (or both!). :)
>>
>> -Toke
>>
>>
>>
>> _______________________________________________
>> aqm mailing list
>> aqm@ietf.org
>> https://www.ietf.org/mailman/listinfo/aqm
>>
>>
>> --
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/
>>
>>
>> _______________________________________________
>> aqm mailing list
>> aqm@ietf.org
>> https://www.ietf.org/mailman/listinfo/aqm
>>
>
>
>
> --
> Andrew McGregor | SRE | andrewmcgr@google.com | +61 4 1071 2221
>
>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
>
>
> _______________________________________________
> aqm mailing list
> aqm@ietf.org
> https://www.ietf.org/mailman/listinfo/aqm
>



-- 
Dave Täht
Do you want faster, better, wifi? https://www.patreon.com/dtaht