[aqm] CoDel's control law that determines drop frequency

Bob Briscoe <bob.briscoe@bt.com> Tue, 12 November 2013 22:30 UTC

Return-Path: <bob.briscoe@bt.com>
X-Original-To: aqm@ietfa.amsl.com
Delivered-To: aqm@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id CEA5F11E810B for <aqm@ietfa.amsl.com>; Tue, 12 Nov 2013 14:30:30 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.349
X-Spam-Level:
X-Spam-Status: No, score=-3.349 tagged_above=-999 required=5 tests=[AWL=0.250, BAYES_00=-2.599, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id WbE6pkQOWvGw for <aqm@ietfa.amsl.com>; Tue, 12 Nov 2013 14:30:25 -0800 (PST)
Received: from hubrelay-by-04.bt.com (hubrelay-by-04.bt.com [62.7.242.140]) by ietfa.amsl.com (Postfix) with ESMTP id 2286921E8095 for <aqm@ietf.org>; Tue, 12 Nov 2013 14:30:24 -0800 (PST)
Received: from EVMHR71-UKRD.domain1.systemhost.net (10.36.3.109) by EVMHR04-UKBR.bt.com (10.216.161.36) with Microsoft SMTP Server (TLS) id 8.3.297.1; Tue, 12 Nov 2013 22:30:18 +0000
Received: from EPHR01-UKIP.domain1.systemhost.net (147.149.196.177) by EVMHR71-UKRD.domain1.systemhost.net (10.36.3.109) with Microsoft SMTP Server (TLS) id 8.3.279.1; Tue, 12 Nov 2013 22:30:22 +0000
Received: from bagheera.jungle.bt.co.uk (132.146.168.158) by EPHR01-UKIP.domain1.systemhost.net (147.149.196.177) with Microsoft SMTP Server id 14.2.347.0; Tue, 12 Nov 2013 22:30:12 +0000
Received: from BTP075694.jungle.bt.co.uk ([10.215.131.145]) by bagheera.jungle.bt.co.uk (8.13.5/8.12.8) with ESMTP id rACMUBmH003536; Tue, 12 Nov 2013 22:30:11 GMT
Message-ID: <201311122230.rACMUBmH003536@bagheera.jungle.bt.co.uk>
X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9
Date: Tue, 12 Nov 2013 22:30:10 +0000
To: <nichols@pollere.com>, <vanj@google.com>
From: Bob Briscoe <bob.briscoe@bt.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed
X-Scanned-By: MIMEDefang 2.56 on 132.146.168.158
Cc: AQM IETF list <aqm@ietf.org>
Subject: [aqm] CoDel's control law that determines drop frequency
X-BeenThere: aqm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Discussion list for active queue management and flow isolation." <aqm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/aqm>, <mailto:aqm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/aqm>
List-Post: <mailto:aqm@ietf.org>
List-Help: <mailto:aqm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/aqm>, <mailto:aqm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 12 Nov 2013 22:30:31 -0000

Kathie, Van,

In <http://tools.ietf.org/html/draft-nichols-tsvwg-codel-01> (and in 
the ACM queue paper), you say CoDel's control law was designed around 
TCP Reno's formula relating packet drop probability to throughput. Quoting:

"  The next
    drop time is decreased in inverse proportion to the square root of
    the number of drops since the dropping state was entered, using the
    well-known relationship of drop rate to throughput to get a linear
    change in throughput. [REDL1998, MACTCP1997]
"
1/ It's questionable whether it's desirable to design an AQM around 
any particular transport algorithm. However, let's leave this point 
aside for now...

2/ It's questionable whether it's reasonable to design an AQM around 
NewReno's square-root law that is known not to scale, which means 
NewReno is being superceded by algorithms like Cubic that move away 
from the square root. Specifically if the window W is proportional to 
1/p^b, then for:
         NewReno,        b = 0.5
         Cubic,          b = 0.75
         Compound,       b = 0.8
         HSTCP,          b = 0.835
         DCTCP,          b = 1
         Relentless,     b = 1
         Scalable TCP,   b = 1
However, let's leave this point aside too for now...

3/ Back in Apr'13, for myself, I plotted how the CoDel algorithm 
increases drop probability with time, and it looked pretty linear to 
me, not a square root law. I've just got round to doing the analysis, 
and indeed it tends very quickly to a linear law with time. So, 
Codel's control law will actually cause Reno's throughput to fall 
with the square root of time, not linearly.

4/ It seems to be a waste of cycles to implement CoDel's control law 
using a square-root. Given it results in a linear increase in drop 
frequency with time, it should be possible to design a much simpler algorithm.

5/ My analysis also shows that the rate of increase in drop 
probablity is inversely proportional to the link rate. I don't 
believe this is desirable, as it will require tuning for the expected 
number of flows on the link. However it is probably OK for variable rate links.

6/ The rate of increase of drop probability is also inversely 
proportional to the square of 'interval'. I suggest a new constant is 
defined for the control law, rather than relating the control law to 
interval in this way. Otherwise, when anyone tunes interval, they 
will get undesirable changes to the control law (e.g. reducing 
interval 10x [to 10ms] would make the control law 100x more sluggish).

7/ We'll need to debate how the control law should behave in the 
presence of unresponsive flows, so this analysis should at least 
support that discussion.


Here's my working (pls check it - I may have made mistakes)
_________________
For brevity, I'll define some briefer variable names:
         interval =      I [s]
         next_drop =     D [s]
         packet-rate =   R [pkt/s]
         count =         n [pkt]

 From the CoDel control law code:
         D(n) = I / sqrt(n)
And the instantaneous drop probability is:
         p(n) = 1/( R * D(n) )

Then the slope of the rise in drop probability with time is:
         Delta p / Delta t       = [p(n+1) - p(n)] / D(n)
                                 = [1/D(n+1) - 1/D(n)] / [ R * D(n) ]
                                 = sqrt(n) * [sqrt(n+1) - sqrt(n)] / [R*I*I]
                                 = [ sqrt(n(n+1)) - n ] / R*I^2

At count = 1, the numerator starts at sqrt(2)-1 = 0.414.
Amd as n increases, it rapidly tends to 1/2.

So CoDel's rate of increase of drop probability with time is nearly 
constant (it is always between 0.414 and 0.5) and it rapidly 
approaches 0.5 after a few drops, tending towards:
         dp/dt = 1/(2*R*I^2)

This constant increase clearly has very little to do with the 
square-root law of TCP Reno.

In the above formula, drop probability increases inversely 
proportional to the packet rate. For instance, with I = 100ms and 1500B packets
at 10Mb/s =>    R = 833 pkt/s =>        dp/dt = 6.0% /s
at 100Mb/s =>   R = 8333 pkt/s =>       dp/dt = 0.6% /s

If CoDel was designed for scalable transports like DCTCP, and it was 
designed for the same number of flows on any size link, this scaling 
with link rate would make sense. But there are two reasons for an AQM 
to have to scale with link rate:

a) Variable rate link
You would probably want to design for about the same number of flows 
no matter how the link rate varies, so the control law could work for 
variable rate links (assuming scalable transports, which we are 
gradually tending towards).

b) Deploying CoDel on links in different parts of a network
At any one time, we tend to design a network assuming there will be 
more flows on faster links. So, in order to remain as responsive on 
faster links, CoDel's control law will need to be tuned for the 
expected number of flows before it is deployed on different size links.


Cheers



Bob


________________________________________________________________
Bob Briscoe,                                                  BT