Re: [aqm] [tsvwg] Immediate ECN: Autotuning AQM for RTT

Bob Briscoe <bob.briscoe@bt.com> Tue, 12 November 2013 22:20 UTC

To: Greg White <g.white@CableLabs.com>
Cc: tsv-area IETF list <tsv-area@ietf.org>, AQM IETF list <aqm@ietf.org>, tsvwg IETF list <tsvwg@ietf.org>

Greg, inline...

At 18:57 11/11/2013, Greg White wrote:


>On 11/11/13, 9:02 AM, "Bob Briscoe" <bob.briscoe@bt.com> wrote:
>
> >Greg,
> >
> >At 06:54 09/11/2013, Greg White wrote:
> >>This is very interesting work.  There are a lot of unanswered questions
> >>about ecn / no-ecn coexistence and differential treatment in an AQM, and
> >>this could provide some answers.
> >>
> >>To those who groaned that ECN was not included in DOCSIS 3.1, read these
> >>slides (and Naeem Khademi's).
> >
> >Indeed. However, I think it would be safe to recommend that ECN
> >support should at least be implemented and separately configurable,
> >then it can be turned on or off by operators later.
> >
> >I'm aware that this doubles the amount of configuration, but we've
> >had some success already (with RED) in relating all the ECN
> >parameters to the drop parameters by a formula, so hopefully the
> >vendor could configure the ECN parameters automatically based on the
> >drop parameters.
>
>I don't have a lot of confidence that I can recommend something to be
>burned into silicon at this point (and doing it in software is a
>non-starter).  The "turned on or off by operators later" would be a given,
>but if implemented based on what we know now, I don't have a lot of
>confidence that the switch would be ever set to the "turned on" state.  I
>can't recommend that vendors add gates for a feature that my intuition
>tells me would never be used.  If separately configurable means something
>that would increase the probability of the feature being used, then that's
>great, but what specifically are we talking about?  It could be a lot of
>gates.
>
>Also, you've suggested *not* differentiating between "classic ECN" and
>"immediate ECN" in the ECT flags, but this is just a suggestion at this
>point.  Again, not something I feel confident about taking for granted.
>We'll continue to track this discussion though.

Understood.

Given the cable industry has decided to move on 
this early schedule, we can now do the necessary 
research so that competing access technologies can properly exploit ECN ;)


more...

> >>Bob, CoDel uses "interval" both as a hold-off for the first packet drop
> >>and as the numerator in the invsqrt drop scheduling.  Setting interval = 0
> >>would result in ECN being signaled on *every* ECN capable packet when the
> >>sojourn time is above threshold.    This jibes with some of your charts
> >>for RED, but others show a ramp up in mark probability rather than a step
> >>function. Could you clarify?
> >
> >We've only looked at WRED in detail, because it was much more
> >interesting (to us) to reconfigure existing implementations than have
> >to wait for new code to be implemented and tested.
> >
> >The suggestions for PIE and CoDel are just conceptual at this stage -
> >we've done no implementation of this idea with either. (I said this
> >verbally when presenting the slides, but I should have put it in
> >writing too). Please read my suggestions for PIE and CoDel in this light.
> >
> >I'm not surprised that CoDel derives other parameters from 'interval'
> >that should have been declared and set separately. Andrew McGregor
> >also pointed out to me that CoDel sets threshold = 0.2*interval, so
> >threshold would have to be declared separately as well. This starts
> >to reveal just how many magic numbers there really are in the CoDel
> >algorithm.
>
>
>In the ns2 CoDel code that we've used (from Kathie), there are two
>independent parameters, threshold and interval, and the defaults are 5ms
>for threshold and 100ms for interval, so I don't know where the "threshold
>= 0.2*interval" is coming from (maybe it is different in the current linux
>or openWRT distros).

Sorry, although it turns out I was wrong anyway (see below), I meant to say:
         0.05 * interval
         (= interval / 20)
I was working from memory and I knew there was a 2 in there somewhere.

Andrew McGregor said at the mic in tsvwg last 
Friday that threshold was defined relative to 
interval, and I remembered it from when I first 
looked at the CoDel code. However, I've just 
checked and even Andrew's ns3 port of CoDel sets 
threshold to 5ms, independent of the setting of 
interval. So consider this a memory throw-back on everyone's part and ignore it.


>To be more clear about my earlier statement, CoDel actually uses
>"interval" to control a single aspect of the drop policy, however, the
>code (and descriptions of it) imply that there are two different
>functions, and my comments above were written in that vein. However,
>here's how I would describe it more succinctly:  the "next drop time" is
>always set according to interval/sqrt(count).  In the specific case of
>sojourn time crossing above threshold from below, count happens to equal
>1, so the first drop is set at one interval in the future.  This is what
>is sometimes described as the "hold off" period.

OK. From an admittedly hasty scan through Andrew 
McGregor's port of CoDel to ns3 (not the Linux 
code), it seems to introduce this hold-off time 
in the ControlLaw() function, but it only calls 
ControlLaw() from the ShouldDrop() function if 
the queue has remained above threshold for duration interval.

That implies that CoDel holds off signalling 
anything for 2 * interval, ie 200ms for the 
default setting of interval = 100ms.
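
To make that arithmetic concrete, here is a minimal sketch of this 
reading of the code. It is not the ns3 implementation: signal_times() 
is a hypothetical helper, and it assumes the queue crosses threshold 
at time 0 and then stays above it, with the first signal scheduled one 
control-law step after the interval-long wait.

```python
import math


def signal_times(interval, n, start=0.0):
    """Times of the first n signals under the reading described above.

    Assumption: the queue crosses threshold at `start` and never falls
    back below it, so the control law keeps running.
    """
    # The queue must remain above threshold for one interval before the
    # control law is consulted at all.
    t = start + interval
    times = []
    for count in range(1, n + 1):
        # Control-law spacing: interval / sqrt(count); count = 1 gives a
        # full extra interval before the first signal.
        t += interval / math.sqrt(count)
        times.append(t)
    return times


# With the default interval of 100 ms, the first entry is 2 * interval.
schedule = signal_times(0.100, 4)
```

Later entries get closer together, because the spacing shrinks as 
interval / sqrt(count).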

>That said, there are some "magic numbers" in the algorithm, specifically
>the part that regulates very low drop probabilities (e.g. 8*interval,
>count *= 0.9844), but that's about it.

And the following magic numbers, of course:
* threshold [5ms]
* the power used in the control law [0.5 in the sqrt function]


>I'm still thinking about how to achieve differential treatment of marking
>vs dropping in CoDel in a logical way.

That would only be possible if CoDel's control 
law for drop was based on some understandable 
logic in the first place. I've started another 
thread on that, based on analysis I've just done of CoDel's control law.

A naive answer would be to create a second 
variable for the interval used in ControlLaw(), 
let's call it I.  Then set only interval = 0, leaving I unchanged.

But that would still leave a signalling delay of 
I. As you say, CoDel's control law for increasing 
the dropping frequency is also based on the assumption that I = RTT = 100ms.
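
A minimal sketch of that split, to show why it does not remove the 
delay. SplitIntervals and its field names are hypothetical, not taken 
from any CoDel code; it only assumes that the wait above threshold and 
the first control-law step add up as described earlier.

```python
import math
from dataclasses import dataclass


@dataclass
class SplitIntervals:
    # Hypothetical parameter split; names are for illustration only.
    interval: float          # wait above threshold before the control law runs
    control_interval: float  # "I": numerator of I / sqrt(count)

    def first_signal_delay(self):
        # count is 1 for the first signal, so the control law adds a full I.
        return self.interval + self.control_interval / math.sqrt(1)


# Drop keeps both at 100 ms; ECN sets only interval = 0.
drop = SplitIntervals(interval=0.100, control_interval=0.100)
ecn = SplitIntervals(interval=0.0, control_interval=0.100)
```

Under these assumptions ecn.first_signal_delay() is still I, which is 
the residual signalling delay mentioned above.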

By my analysis (in the other email), the 
rationale for CoDel's current control law seems 
incorrect, which is probably why we're finding 
it's tough to base any new thinking on it.

> >
> >
> >>Setting max_burst = 0 in PIE would not result in the step function
> >>behavior.
> >
> >It's not meant to result in a step-function.
> >
> >In the WRED example, it solely avoids the delay of queue averaging,
> >so that once the /instantaneous/ queue exceeds min_thresh it marks
> >with increasing probability (not intended to be a step).
> >
> >Similarly with PIE, the formula:
> >         p = p + alpha*(est_del-target_del) + beta*(est_del-est_del_old);
> >would still gradually increase the probability of drop (not a step
> >function), but it would start to do so as soon as the queue exceeded
> >target_del, rather than waiting for max_burst.
> >
> >Is that what you meant?
>
>
>Yes, that is what I meant.
>
>However, it seemed to me that what you were proposing was to move the
>intelligence to the transport, and keep the queue simple. In that regard,
>would a very simple threshold:  always mark when instantaneous queue
>exceeds the threshold, never mark when it doesn't, be an acceptable way to
>do things?
>
>This would be easily implementable, and could be straightforwardly
>combined (I think) with any AQM that is controlling drops (including
>CoDel).

A simple step threshold would be fine if all 
traffic used ECN. DCTCP proves that such a really 
dumb AQM works well if complemented by a smarter 
TCP (which proved to me that our problem is with TCP).

But I don't think it's so easy to combine such a 
simple AQM for ECN traffic, with a complex AQM like CoDel for drop.
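
Mechanically, the combination is trivial to write down, as in the 
sketch below. handle_packet() and drop_aqm are hypothetical names, and 
"queue" can be any instantaneous measure (bytes, packets or delay) in 
the same units as the threshold. The sketch says nothing about whether 
the two classes of traffic starve each other, which is the hard part.

```python
def handle_packet(queue, threshold, ect, drop_aqm):
    """Decide what to do with one packet.

    queue     -- instantaneous queue measure on arrival
    threshold -- step threshold for ECN marking, same units as queue
    ect       -- True if the packet is ECN-capable
    drop_aqm  -- callable(queue) -> bool; stands in for whatever AQM
                 governs drop for non-ECN packets (an assumption here)
    """
    if ect:
        # Step function: always mark above the threshold, never below.
        return "mark" if queue > threshold else "forward"
    # Non-ECN packets are left entirely to the drop AQM.
    return "drop" if drop_aqm(queue) else "forward"


# Example drop AQM stand-in: a fixed, much higher tail threshold.
def simple_tail(queue):
    return queue > 50
```

For example, handle_packet(10, 5, True, simple_tail) marks, while the 
same queue with ect=False is forwarded untouched by simple_tail.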

The logic I started from was:
* If starvation is going to happen, it will happen over time
* So we should be able to design two AQMs (for 
ECN and drop) that will not cause each other to 
starve, by only considering where they eventually 
converge to - solely in the presence of stable 
long-running flows. We can elide dynamics, like 
smoothing, that disappear over time.
* CoDel looks for the min queuing delay over 
interval, so we can elide that part of its 
behaviour. But the control law continues to 
increase drop with time, irrespective of how the 
queue is growing (as long as it is greater than 
threshold). Then, when drop is high enough, 
assuming responsive flows, the CoDel queue will 
fall below threshold and switch back to 
non-dropping mode. Then the cycle will repeat. 
It's not easy to elide away behaviour that 
stabilises by continually switching between two 
discrete modes, rather than stabilising in one mode.

This is what the short-term memory within CoDel 
does - if it returns to dropping mode soon enough 
after leaving it, it starts dropping from where 
it left off. So it's actually continually 
switching in and out of dropping mode, which 
makes it much harder to think about analytically. 
And much harder to design a complementary ECN behaviour for it.
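
The mode switching can be sketched as a two-state machine with a 
memory of the last count. This is an illustration only: DroppingMode 
is a hypothetical class, the 8 * interval memory window is an 
assumption based on the figure Greg quotes above, and "resume from 
where it left off" is modelled literally, which real implementations 
may not do.

```python
import math
from dataclasses import dataclass


@dataclass
class DroppingMode:
    interval: float = 0.100      # seconds
    memory: float = 0.800        # assumed window: 8 * interval
    count: int = 0               # signals so far in the current run
    dropping: bool = False
    left_at: float = -math.inf   # when dropping mode was last left

    def enter(self, now):
        """Enter dropping mode; return True if the old count was resumed."""
        resumed = self.count > 0 and (now - self.left_at) < self.memory
        if not resumed:
            self.count = 1       # fresh start
        self.dropping = True
        return resumed

    def next_gap(self):
        """Control-law spacing to the next signal."""
        return self.interval / math.sqrt(self.count)

    def signal(self):
        """Record one drop (or mark) while in dropping mode."""
        self.count += 1

    def leave(self, now):
        """Queue fell below threshold: switch back to non-dropping mode."""
        self.dropping = False
        self.left_at = now
```

Re-entering shortly after leave() keeps the higher count, so 
next_gap() starts out short; re-entering after the memory window has 
expired starts again from a full interval.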


________
I prefer the principle: "Design for 
Verifiability", which is why I always have an 
allergic reaction to CoDel. See John Doyle's 
paper below, which uses TCP and AQM design as a 
case study for how to use and abuse this principle:

Doyle, J.C., Carlson, J., Low, S.H., Paganini, 
F., Vinnicombe, G., Willinger, W., Hickey, J., 
Parrilo, P. & Vandenberghe, L., "Robustness and 
the Internet: Theoretical Foundations," Caltech Draft Paper (March 2002)
<http://netlab.caltech.edu/pub/papers/RIPartII.pdf>

The IRTF network complexity research group is 
trying to make this stuff understandable by us mere IETFers.



Bob


>-Greg
>
>
> >
> >
> >Bob
> >
> >
> >>-Greg
> >>
> >>
> >>On 11/7/13, 1:03 PM, "Bob Briscoe" <bob.briscoe@bt.com> wrote:
> >>
> >> >Folks,
> >> >
> >> >"Immediate ECN" slides:
> >> ><http://bobbriscoe.net/presents/1311ietf/1311tsvarea-iecn.pptx>
> >> ><http://bobbriscoe.net/presents/1311ietf/1311tsvarea-iecn.pdf>
> >> >
> >> >PS. This talk fell off the end of the TSVAREA agenda. It's mostly
> >> >relevant to AQM, but I didn't originally bring it to AQM, because it
> >> >affects 3 wgs: tsvwg, aqm & tcpm.
> >> >
> >> >In the AQM wg, there was dismay about CableLabs not including
> >> >anything about ECN in DOCSIS3.1. This talk is about AQM dynamics; and
> >> >how ECN can take out the 100ms of delay that CoDel and PIE introduce
> >> >- it's essentially about auto-tuning for RTT.
> >> >
> >> >It gives an interim recommendation for hardware designers that there
> >> >should be a second instance of the AQM algo for ECN packets so that
> >> >it can be configured with different parameters (think of WRED instead
> >> >of RED).
> >> >
> >> >Specifically, for ECN packets:
> >> >interval = 0 (for CoDel)
> >> >max_burst = 0 (for PIE)
> >> >
> >> >
> >> >Bob
> >> >
> >> >PS. We have a paper under submission, which we can supply on request.
> >> >We plan to document this in the IETF too.
> >> >
> >> >
> >> >
> >> >
> >> >________________________________________________________________
> >> >Bob Briscoe,                                                  BT
> >
> >________________________________________________________________
> >Bob Briscoe,                                                  BT
> >
>

________________________________________________________________
Bob Briscoe,                                                  BT