Re: [aqm] CoDel's control law that determines drop frequency

Bob Briscoe <ietf@bobbriscoe.net> Wed, 30 September 2015 08:50 UTC

Return-Path: <ietf@bobbriscoe.net>
X-Original-To: aqm@ietfa.amsl.com
Delivered-To: aqm@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id EFADC1A21A9 for <aqm@ietfa.amsl.com>; Wed, 30 Sep 2015 01:50:49 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.6
X-Spam-Level:
X-Spam-Status: No, score=-2.6 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8JF9dBKMwys8 for <aqm@ietfa.amsl.com>; Wed, 30 Sep 2015 01:50:46 -0700 (PDT)
Received: from server.dnsblock1.com (server.dnsblock1.com [85.13.236.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 863791A21A6 for <aqm@ietf.org>; Wed, 30 Sep 2015 01:50:45 -0700 (PDT)
Received: from 8.37.199.146.dyn.plus.net ([146.199.37.8]:49145 helo=[192.168.0.15]) by server.dnsblock1.com with esmtpsa (TLSv1.2:DHE-RSA-AES128-SHA:128) (Exim 4.85) (envelope-from <ietf@bobbriscoe.net>) id 1ZhD62-0006yg-JN; Wed, 30 Sep 2015 09:50:42 +0100
To: Andrew Mcgregor <andrewmcgr@google.com>
References: <201311122230.rACMUBmH003536@bagheera.jungle.bt.co.uk> <87wpzfpbd3.fsf@alrua-karlstad.karlstad.toke.dk> <56045CA8.2060103@bobbriscoe.net> <CAPRuP3mmg_-uxmtLUXprCmPyLSUuUA7t2dRZpDs_mwtnTgrSQA@mail.gmail.com>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <560BA261.6020206@bobbriscoe.net>
Date: Wed, 30 Sep 2015 09:50:41 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0
MIME-Version: 1.0
In-Reply-To: <CAPRuP3mmg_-uxmtLUXprCmPyLSUuUA7t2dRZpDs_mwtnTgrSQA@mail.gmail.com>
Content-Type: multipart/alternative; boundary="------------040005030000050107060706"
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: Primary Hostname - server.dnsblock1.com
X-AntiAbuse: Original Domain - ietf.org
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - bobbriscoe.net
X-Get-Message-Sender-Via: server.dnsblock1.com: authenticated_id: in@bobbriscoe.net
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/kNQ-rs1prS_jsqxctGvW_j8sE-Q>
Cc: AQM IETF list <aqm@ietf.org>, Van Jacobson <vanj@google.com>, Kathleen Nichols <nichols@pollere.com>
Subject: Re: [aqm] CoDel's control law that determines drop frequency
X-BeenThere: aqm@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "Discussion list for active queue management and flow isolation." <aqm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/aqm>, <mailto:aqm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/aqm/>
List-Post: <mailto:aqm@ietf.org>
List-Help: <mailto:aqm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/aqm>, <mailto:aqm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 30 Sep 2015 08:50:50 -0000

Andrew,

I am also not so interested in an AQM dealing directly with unresponsive 
traffic - I prefer to keep policing and AQM as separately deployable 
functions, because AQM should be policy-neutral, whereas policing 
inherently involves policy.

My concern was merely that CoDel's linear increase in drop probability 
can take a long time to reach where it intends to get to. I would have 
thought some form of exponential increase, or at least super-linear, 
would have been more responsive to changing traffic conditions. I.e., 
rather than have to answer the question "how quickly should drop 
probability increase?", make it increase increasingly quickly.

Early on, Rong Pan showed that it takes CoDel ages to bring high load 
under control. I think this linear increase is the reason.


Bob


On 30/09/15 01:42, Andrew Mcgregor wrote:
> Hmm, that's really interesting.
>
> Most interesting is that my understanding is that the control law was 
> intended to deal with aggregates of mostly TCP-like traffic, and that 
> an overload of unresponsive traffic wasn't much of a goal; this seems 
> like vaguely reasonable behaviour, I suppose, given that pathological 
> situation.
>
> But I don't have a way to derive the control law from first principles 
> at this time (I haven't been working on that for a long time now).
>
> On 25 September 2015 at 06:27, Bob Briscoe <ietf@bobbriscoe.net 
> <mailto:ietf@bobbriscoe.net>> wrote:
>
>     Toke,
>
>     Having originally whinged that no-one ever responded to my
>     original 2013 posting, now it's my turn to be embarrassed for
>     having missed your interesting response for over 3 months.
>
>     Cool that the analysis proves correct in practice - always nice.
>
>     The question is still open whether this was the intention, and if
>     so why this particular control law was intended.
>     I would rather we started from a statement of what the control law
>     ought to do, then derive it.
>
>     Andrew McGregor said he would have a go at this question some time
>     ago... Andrew?
>
>
>     Bob
>
>
>
>     On 07/06/15 20:27, Toke Høiland-Jørgensen wrote:
>>     Hi Bob
>>
>>     Apologies for reviving this ancient thread; been meaning to get around
>>     to it sooner, but well... better late than never I suppose.
>>
>>     (Web link to your original mail, in case Message-ID referencing breaks:
>>     https://www.ietf.org/mail-archive/web/aqm/current/msg00376.html  ).
>>
>>     Having recently had a need to understand CoDel's behaviour in more
>>     detail, your analysis popped out of wherever it's been hiding in the
>>     back of my mind and presented itself as maybe a good place to start. :)
>>
>>     So anyhow, I'm going to skip the initial assertions in your email and
>>     focus on the analysis:
>>
>>>     Here's my working (pls check it - I may have made mistakes)
>>>     _________________
>>>     For brevity, I'll define some briefer variable names:
>>>              interval =      I [s]
>>>              next_drop =     D [s]
>>>              packet-rate =   R [pkt/s]
>>>              count =         n [pkt]
>>>
>>>     >From the CoDel control law code:
>>>              D(n) = I / sqrt(n)
>>>     And the instantaneous drop probability is:
>>>              p(n) = 1/( R * D(n) )
>>>
>>>     Then the slope of the rise in drop probability with time is:
>>>              Delta p / Delta t       = [p(n+1) - p(n)] / D(n)
>>>                                      = [1/D(n+1) - 1/D(n)] / [ R * D(n) ]
>>>                                      = sqrt(n) * [sqrt(n+1) - sqrt(n)] / [R*I*I]
>>>                                      = [ sqrt(n(n+1)) - n ] / R*I^2
>>     I couldn't find anything wrong with the derivation. I'm not entirely
>>     sure that I think it makes sense to speak about an "instantaneous drop
>>     probability" for an algorithm that is not probabilistic in nature.
>>     However, interpreting p(n) as "the fraction of packets dropped over the
>>     interval from D(n) to D(n+1)" makes sense, I guess, and for this
>>     analysis that works.
>>
>>>     At count = 1, the numerator starts at sqrt(2)-1 = 0.414.
>>>     Amd as n increases, it rapidly tends to 1/2.
>>>
>>>     So CoDel's rate of increase of drop probability with time is nearly constant (it
>>>     is always between 0.414 and 0.5) and it rapidly approaches 0.5 after a few
>>>     drops, tending towards:
>>>              dp/dt = 1/(2*R*I^2)
>>>
>>>     This constant increase clearly has very little to do with the square-root law of
>>>     TCP Reno.
>>>
>>>     In the above formula, drop probability increases inversely proportional to the
>>>     packet rate. For instance, with I = 100ms and 1500B packets
>>>     at 10Mb/s =>    R = 833 pkt/s =>        dp/dt = 6.0% /s
>>>     at 100Mb/s =>   R = 8333 pkt/s =>       dp/dt = 0.6% /s
>>     I also tried to test this. I configured CoDel (on a Linux 4.0 box) on
>>     1Mbps, 2Mbps and 10Mbps links with interval settings of 1 second and
>>     500ms, and a total packet limit of 100k packets. This was to make it
>>     deliberately slower to react (so the change in drop probability is more
>>     visible), and to make sure no packets are dropped from queue overflow.
>>
>>     I then sent an unresponsive UDP stream over the link at 110% of the link
>>     capacity (as passed to Iperf, so approximately), and collected the
>>     output of `tc -s qdisc` every 0.2 seconds.
>>
>>     The attached plot is of 'pkts dropped / (pkts sent + pkts dropped)' in a
>>     2-second sliding window over the duration of the test (the plot is also
>>     available here:
>>     https://kau.toke.dk/ietf/codel-drop-rate/codel-drop-rate.svg  ).
>>
>>     I've included linear trend lines from the initial time to the point of
>>     maximum drop probability, and as is apparent from the plot, got quite a
>>     good fit (r>0.99 for all six data sets). The legend includes the slopes
>>     of the linear fits for each of the data sets, which are not too far from
>>     what your analysis predicts (and I'm guessing the difference can be
>>     attributed to the difference in exact packet rates, but I haven't
>>     checked).
>>
>>     The Flent data files with the qdisc stats over time (readable by the
>>     newest git version of Flent), as well as the Python script I used to
>>     create the graph are available here:https://kau.toke.dk/ietf/codel-drop-rate/
>>
>>     So, in short: It seems that CoDel's "drop rate" does increase linearly
>>     in the presence of a persistent queue, and that the rate of increase
>>     depends on both the interval and the link rate.
>>
>>     Now, I'll refrain from commenting on whether or not this is "bad", or
>>     indeed if it is contrary to design. It was surprising to me at least, so
>>     I thought I'd share my findings, in the hope that someone would either
>>     find them useful or tell me how they're wrong (or both!). :)
>>
>>     -Toke
>>
>>
>>     _______________________________________________
>>     aqm mailing list
>>     aqm@ietf.org <mailto:aqm@ietf.org>
>>     https://www.ietf.org/mailman/listinfo/aqm
>
>     -- 
>     ________________________________________________________________
>     Bob Briscoehttp://bobbriscoe.net/
>
>
>     _______________________________________________
>     aqm mailing list
>     aqm@ietf.org <mailto:aqm@ietf.org>
>     https://www.ietf.org/mailman/listinfo/aqm
>
>
>
>
> -- 
> Andrew McGregor | SRE |andrewmcgr@google.com 
> <mailto:andrewmcgr@google.com> | +61 4 1071 2221

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/