Re: [aqm] CoDel's control law that determines drop frequency

Andrew Mcgregor <andrewmcgr@google.com> Sun, 25 October 2015 22:44 UTC

Return-Path: <andrewmcgr@google.com>
X-Original-To: aqm@ietfa.amsl.com
Delivered-To: aqm@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 89E4F1B3306 for <aqm@ietfa.amsl.com>; Sun, 25 Oct 2015 15:44:56 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 2.669
X-Spam-Level: **
X-Spam-Status: No, score=2.669 tagged_above=-999 required=5 tests=[BAYES_50=0.8, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FB_ROLLER_IS_T=1.357, FM_FORGED_GMAIL=0.622, HTML_MESSAGE=0.001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id HH7pdvW4hUjT for <aqm@ietfa.amsl.com>; Sun, 25 Oct 2015 15:44:52 -0700 (PDT)
Received: from mail-lf0-x234.google.com (mail-lf0-x234.google.com [IPv6:2a00:1450:4010:c07::234]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id A972F1B3305 for <aqm@ietf.org>; Sun, 25 Oct 2015 15:44:51 -0700 (PDT)
Received: by lfbn126 with SMTP id n126so94787236lfb.2 for <aqm@ietf.org>; Sun, 25 Oct 2015 15:44:49 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=6tXxJ2HXqRpTn8Xb6J7c+hshBCO9Sjcyrs6f3kOHUyI=; b=geBuvJsNTz8OOuEGIYL5ButHeDBHL0YpVTed4roEIxI1snQ8SYHaFNgE3FuZPDo+kG WddLJJQ31R77rzEmFk9H21gYegJJeHZsOPImOy7UHnREZwL+RizDFMQWyHwCPv3UvXfe 6sEyNfAxiUxV5U+L3/wwgxQxt+iEzNSybUOUTOa1fz4ZMhwlNrjDhpBRSshrLP8fn9NH pSLqpHCKTW1IkWn2/HE5cvWISw/RAlgAC09eUETWNTDAD4K39eqsMnhcRcwxRdFjWwF8 orPBc4yTvc4qUuEqLFTHaNTliz74l5j+ATg0AjzdIE5KGO9LnGgpK52Zzk6/kbJVkChL OiIQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=6tXxJ2HXqRpTn8Xb6J7c+hshBCO9Sjcyrs6f3kOHUyI=; b=kLNWUU0dvJVRLV2wcyruqFEQDTqVgRDsa1zC+szyEojabF4BBELrmwCEq8Wd1IMbfr YDbofTzHy0tBsLF5MBkumZpX56MR11Zwd9BNe3ekOD80HAZsc62AqIBG+6LNSg7BZNt9 orvbci8yN4BR/MNrn7vUzyscH9YRIWk4TTXq7+6U03HNpF7tXNP15/UyF8n1RwXx0wcX a2rzeomPiehRS2F0sA3tLBrEFML4eiDR1unwv3p98XCCz5PmcYNyyAmk5q/nqIkACKwH 9ZHVE4AlSzZM2Ihhcfr1t+IvV3Xk4XtIi/ithCUWhtggq58ha7USlfTOUsaSlB8Bve8E hJUQ==
X-Gm-Message-State: ALoCoQmgfKP7+fV04vd5ffPbqcMCO4WVCItedliw4w6+Nvx34rvgPU8pR+OTnOTKV78JHOyCngTr
MIME-Version: 1.0
X-Received: by 10.112.126.70 with SMTP id mw6mr15826637lbb.81.1445813089471; Sun, 25 Oct 2015 15:44:49 -0700 (PDT)
Received: by 10.114.183.135 with HTTP; Sun, 25 Oct 2015 15:44:49 -0700 (PDT)
In-Reply-To: <E8355113905631478EFF04F5AA706E9830D62275@wtl-exchp-2.sandvine.com>
References: <201311122230.rACMUBmH003536@bagheera.jungle.bt.co.uk> <87wpzfpbd3.fsf@alrua-karlstad.karlstad.toke.dk> <56045CA8.2060103@bobbriscoe.net> <CAPRuP3mmg_-uxmtLUXprCmPyLSUuUA7t2dRZpDs_mwtnTgrSQA@mail.gmail.com> <560BA261.6020206@bobbriscoe.net> <CAA93jw6zdQa1BGDhKajUzORXkCsSCygMPhAsyFfgMqas-w+Xfg@mail.gmail.com> <560BDED0.1080402@bobbriscoe.net> <E8355113905631478EFF04F5AA706E9830D62275@wtl-exchp-2.sandvine.com>
Date: Mon, 26 Oct 2015 09:44:49 +1100
Message-ID: <CAPRuP3kOGVgfKaea9V3RtYjzcC1F0GDMheZKNhjrHzgZKC-0bQ@mail.gmail.com>
From: Andrew Mcgregor <andrewmcgr@google.com>
To: Dave Dolson <ddolson@sandvine.com>
Content-Type: multipart/alternative; boundary=001a11c36e489c642a0522f5974d
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/QhDAWDs9P7ugsk79bEC5TMdHz2c>
Cc: Kathleen Nichols <nichols@pollere.com>, Bob Briscoe <ietf@bobbriscoe.net>, Dave Taht <dave.taht@gmail.com>, Van Jacobson <vanj@google.com>, AQM IETF list <aqm@ietf.org>
Subject: Re: [aqm] CoDel's control law that determines drop frequency
X-BeenThere: aqm@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "Discussion list for active queue management and flow isolation." <aqm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/aqm>, <mailto:aqm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/aqm/>
List-Post: <mailto:aqm@ietf.org>
List-Help: <mailto:aqm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/aqm>, <mailto:aqm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sun, 25 Oct 2015 22:44:56 -0000

CoDel does have the form of a controller; drop rate (not probability) is a
function of sojourn time (not queue size) and history, encoded in the state
variables.

Now, I don't take it as proven that the particular form of the controller
is the best we could do, but making it a rate and based on sojourn time are
clear wins. Yes, you can use size as a proxy for sojourn time if your link
really has a constant bit rate, but not even ethernet is exactly CBR in
practice (and in some hardware situations, knowing the size is much more
expensive than measuring sojourn; the opposite can also apply).  Yes, you
can use probability as a proxy for rate if you are doing a hardware
implementation or otherwise doing the rate the way CoDel does is hard.

PIE is closely related; the controller is the biggest difference, and it
uses both those proxies because it was aimed at a situation where there was
hardware support for one set of choices.

On 23 October 2015 at 07:07, Dave Dolson <ddolson@sandvine.com> wrote:

> Catching up...
>
> Bob said:
> > I meant to say in my mail to Andrew that the rate at which the control
> > law increases the dropping frequency ought to depend on how far the
> > queue is from target, or even how quickly it is diverging from target.
> > I've always said that this control law should not be just a hard-coded
> > thing that has no dependency on how the queue is moving.
>
> In other words, make it a PID controller, not just a PI controller?
> (Or is it currently just "I" ?)
> With consideration of the derivative term (how quickly the queue
> size is moving away from the target), a faster deviation from target
> queue size would cause a faster response.
>
>
> I find it unfortunate that neither CODEL nor PIE algorithms are formulated
> as a
> controller. It's difficult to reason about.
> (They *are* controllers, just not explained that way.)
> IMHO, an explanation should begin with a linear controller,
> with drop probability as a function of queue size,
> and then explanations of why non-linear modifications were necessary.
>
>
> -----Original Message-----
> From: aqm [mailto:aqm-bounces@ietf.org] On Behalf Of Bob Briscoe
> Sent: Wednesday, September 30, 2015 9:09 AM
> To: Dave Taht; Andrew Mcgregor
> Cc: Kathleen Nichols; AQM IETF list; Van Jacobson
> Subject: Re: [aqm] CoDel's control law that determines drop frequency
>
> Dave,
>
> It says the point at which cake enters drop state depends on how quickly
> the queue is growing.
> But I don't see anything on the page you referenced about a better
> curve. I looked under AQM and under other possibly relevant headings.
>
> I meant to say in my mail to Andrew that the rate at which the control
> law increases the dropping frequency ought to depend on how far the
> queue is from target, or even how quickly it is diverging from target.
> I've always said that this control law should not be just a hard-coded
> thing that has no dependency on how the queue is moving.
>
> Also below I've restated a couple of points I made back in 2013 when I
> first posted this.
>
> On 12/11/13 22:30, Bob Briscoe wrote:
> > 5/ My analysis also shows that the rate of increase in drop probablity
> > is inversely proportional to the link rate. I don't believe this is
> > desirable, as it will require tuning for the expected number of flows
> > on the link. However it is probably OK for variable rate links.
> >
> > 6/ The rate of increase of drop probability is also inversely
> > proportional to the square of 'interval'. I suggest a new constant is
> > defined for the control law, rather than relating the control law to
> > interval in this way. Otherwise, when anyone tunes interval, they will
> > get undesirable changes to the control law (e.g. reducing interval 10x
> > [to 10ms] would make the control law 100x more sluggish).
>
> This last point is particularly important. If the control law is still
> to be hard-coded, it should relate to another variable RTT_ave, not
> interval. TO decide when to enter dropping state, interval should be set
> to the likely worst-case RTT (in the low hundreds of ms). But, If
> hard-coded, the dynamics of the control law will be correct most often
> if it depends on the /most likely/ RTT, which these days with CDNs is
> perhaps 10-25ms (depending on population density in a country).
>
> There's another reason for having two separate variables, not just
> 'interval' for both. I've said the following before somewhere as well:
> For ECN marking, 'interval' can be set much lower (or even zero) because
> being on a hair-trigger with ECN marks doesn't matter, whereas it does
> for drops.{Note 1} Then CoDel will signal ECN without the sluggish delay
> it has to have before it starts dropping. However, once it has entered
> dropping state, the ECN control law still needs to be based on the same
> RTT_ave as for drop, because the control law is trying to mimic the
> dynamics of the congestion control, which RFC3168 says should be the
> same for ECN as for drop.{Note 2}
>
>
> Bob
>
> {Note 1}: The faster dynamics won't affect the long-term stable rate
> between an ECN and a non-ECN flow, because by definition long-term
> stable means the dynamics have settled.
> {Note 2}: As suggested in draft-briscoe-aqm-dualq-coupled-00, if the
> congestion control were radically different, the packets would probably
> need another identifier to distinguish them from 'classic' ECN.
>
> On 30/09/15 10:00, Dave Taht wrote:
> > On Wed, Sep 30, 2015 at 1:50 AM, Bob Briscoe <ietf@bobbriscoe.net>
> wrote:
> >> Andrew,
> >>
> >> I am also not so interested in an AQM dealing directly with unresponsive
> >> traffic - I prefer to keep policing and AQM as separately deployable
> >> functions, because AQM should be policy-neutral, whereas policing
> inherently
> >> involves policy.
> >>
> >> My concern was merely that CoDel's linear increase in drop probability
> can
> >> take a long time to reach where it intends to get to. I would have
> thought
> >> some form of exponential increase, or at least super-linear, would have
> been
> >> more responsive to changing traffic conditions. I.e., rather than have
> to
> >> answer the question "how quickly should drop probability increase?",
> make it
> >> increase increasingly quickly.
> >>
> >> Early on, Rong Pan showed that it takes CoDel ages to bring high load
> under
> >> control. I think this linear increase is the reason.
> > cake uses a better curve for codel, but we still need to do more
> > testing in the lab.
> >
> > http://www.bufferbloat.net/projects/codel/wiki/CakeTechnical
> >
> >> Bob
> >>
> >>
> >>
> >> On 30/09/15 01:42, Andrew Mcgregor wrote:
> >>
> >> Hmm, that's really interesting.
> >>
> >> Most interesting is that my understanding is that the control law was
> >> intended to deal with aggregates of mostly TCP-like traffic, and that an
> >> overload of unresponsive traffic wasn't much of a goal; this seems like
> >> vaguely reasonable behaviour, I suppose, given that pathological
> situation.
> >>
> >> But I don't have a way to derive the control law from first principles
> at
> >> this time (I haven't been working on that for a long time now).
> >>
> >> On 25 September 2015 at 06:27, Bob Briscoe <ietf@bobbriscoe.net> wrote:
> >>> Toke,
> >>>
> >>> Having originally whinged that no-one ever responded to my original
> 2013
> >>> posting, now it's my turn to be embarrassed for having missed your
> >>> interesting response for over 3 months.
> >>>
> >>> Cool that the analysis proves correct in practice - always nice.
> >>>
> >>> The question is still open whether this was the intention, and if so
> why
> >>> this particular control law was intended.
> >>> I would rather we started from a statement of what the control law
> ought
> >>> to do, then derive it.
> >>>
> >>> Andrew McGregor said he would have a go at this question some time
> ago...
> >>> Andrew?
> >>>
> >>>
> >>> Bob
> >>>
> >>>
> >>>
> >>> On 07/06/15 20:27, Toke Høiland-Jørgensen wrote:
> >>>
> >>> Hi Bob
> >>>
> >>> Apologies for reviving this ancient thread; been meaning to get around
> >>> to it sooner, but well... better late than never I suppose.
> >>>
> >>> (Web link to your original mail, in case Message-ID referencing breaks:
> >>> https://www.ietf.org/mail-archive/web/aqm/current/msg00376.html ).
> >>>
> >>> Having recently had a need to understand CoDel's behaviour in more
> >>> detail, your analysis popped out of wherever it's been hiding in the
> >>> back of my mind and presented itself as maybe a good place to start. :)
> >>>
> >>> So anyhow, I'm going to skip the initial assertions in your email and
> >>> focus on the analysis:
> >>>
> >>> Here's my working (pls check it - I may have made mistakes)
> >>> _________________
> >>> For brevity, I'll define some briefer variable names:
> >>>          interval =      I [s]
> >>>          next_drop =     D [s]
> >>>          packet-rate =   R [pkt/s]
> >>>          count =         n [pkt]
> >>>
> >>> >From the CoDel control law code:
> >>>          D(n) = I / sqrt(n)
> >>> And the instantaneous drop probability is:
> >>>          p(n) = 1/( R * D(n) )
> >>>
> >>> Then the slope of the rise in drop probability with time is:
> >>>          Delta p / Delta t       = [p(n+1) - p(n)] / D(n)
> >>>                                  = [1/D(n+1) - 1/D(n)] / [ R * D(n) ]
> >>>                                  = sqrt(n) * [sqrt(n+1) - sqrt(n)] /
> >>> [R*I*I]
> >>>                                  = [ sqrt(n(n+1)) - n ] / R*I^2
> >>>
> >>> I couldn't find anything wrong with the derivation. I'm not entirely
> >>> sure that I think it makes sense to speak about an "instantaneous drop
> >>> probability" for an algorithm that is not probabilistic in nature.
> >>> However, interpreting p(n) as "the fraction of packets dropped over the
> >>> interval from D(n) to D(n+1)" makes sense, I guess, and for this
> >>> analysis that works.
> >>>
> >>> At count = 1, the numerator starts at sqrt(2)-1 = 0.414.
> >>> Amd as n increases, it rapidly tends to 1/2.
> >>>
> >>> So CoDel's rate of increase of drop probability with time is nearly
> >>> constant (it
> >>> is always between 0.414 and 0.5) and it rapidly approaches 0.5 after a
> few
> >>> drops, tending towards:
> >>>          dp/dt = 1/(2*R*I^2)
> >>>
> >>> This constant increase clearly has very little to do with the
> square-root
> >>> law of
> >>> TCP Reno.
> >>>
> >>> In the above formula, drop probability increases inversely
> proportional to
> >>> the
> >>> packet rate. For instance, with I = 100ms and 1500B packets
> >>> at 10Mb/s =>    R = 833 pkt/s =>        dp/dt = 6.0% /s
> >>> at 100Mb/s =>   R = 8333 pkt/s =>       dp/dt = 0.6% /s
> >>>
> >>> I also tried to test this. I configured CoDel (on a Linux 4.0 box) on
> >>> 1Mbps, 2Mbps and 10Mbps links with interval settings of 1 second and
> >>> 500ms, and a total packet limit of 100k packets. This was to make it
> >>> deliberately slower to react (so the change in drop probability is more
> >>> visible), and to make sure no packets are dropped from queue overflow.
> >>>
> >>> I then sent an unresponsive UDP stream over the link at 110% of the
> link
> >>> capacity (as passed to Iperf, so approximately), and collected the
> >>> output of `tc -s qdisc` every 0.2 seconds.
> >>>
> >>> The attached plot is of 'pkts dropped / (pkts sent + pkts dropped)' in
> a
> >>> 2-second sliding window over the duration of the test (the plot is also
> >>> available here:
> >>> https://kau.toke.dk/ietf/codel-drop-rate/codel-drop-rate.svg ).
> >>>
> >>> I've included linear trend lines from the initial time to the point of
> >>> maximum drop probability, and as is apparent from the plot, got quite a
> >>> good fit (r>0.99 for all six data sets). The legend includes the slopes
> >>> of the linear fits for each of the data sets, which are not too far
> from
> >>> what your analysis predicts (and I'm guessing the difference can be
> >>> attributed to the difference in exact packet rates, but I haven't
> >>> checked).
> >>>
> >>> The Flent data files with the qdisc stats over time (readable by the
> >>> newest git version of Flent), as well as the Python script I used to
> >>> create the graph are available here:
> >>> https://kau.toke.dk/ietf/codel-drop-rate/
> >>>
> >>> So, in short: It seems that CoDel's "drop rate" does increase linearly
> >>> in the presence of a persistent queue, and that the rate of increase
> >>> depends on both the interval and the link rate.
> >>>
> >>> Now, I'll refrain from commenting on whether or not this is "bad", or
> >>> indeed if it is contrary to design. It was surprising to me at least,
> so
> >>> I thought I'd share my findings, in the hope that someone would either
> >>> find them useful or tell me how they're wrong (or both!). :)
> >>>
> >>> -Toke
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> aqm mailing list
> >>> aqm@ietf.org
> >>> https://www.ietf.org/mailman/listinfo/aqm
> >>>
> >>>
> >>> --
> >>> ________________________________________________________________
> >>> Bob Briscoe                               http://bobbriscoe.net/
> >>>
> >>>
> >>> _______________________________________________
> >>> aqm mailing list
> >>> aqm@ietf.org
> >>> https://www.ietf.org/mailman/listinfo/aqm
> >>>
> >>
> >>
> >> --
> >> Andrew McGregor | SRE | andrewmcgr@google.com | +61 4 1071 2221
> >>
> >>
> >> --
> >> ________________________________________________________________
> >> Bob Briscoe                               http://bobbriscoe.net/
> >>
> >>
> >> _______________________________________________
> >> aqm mailing list
> >> aqm@ietf.org
> >> https://www.ietf.org/mailman/listinfo/aqm
> >>
> >
> >
>
> --
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
>
> _______________________________________________
> aqm mailing list
> aqm@ietf.org
> https://www.ietf.org/mailman/listinfo/aqm
>



-- 
Andrew McGregor | SRE | andrewmcgr@google.com | +61 4 1071 2221