Re: [tcpPrague] [aqm] L4S status update

Bob Briscoe <> Tue, 29 November 2016 00:45 UTC

To: "Bless, Roland (TM)" <>
References: <> <> <> <>
From: Bob Briscoe <>
Message-ID: <>
Date: Tue, 29 Nov 2016 00:45:26 +0000
Cc: TCP Prague List <>, tcpm IETF list <>, AQM IETF list <>, tsvwg IETF list <>
Subject: Re: [tcpPrague] [aqm] L4S status update
List-Id: "To coordinate implementation and standardisation of TCP Prague across platforms. TCP Prague will be an evolution of DCTCP designed to live alongside other TCP variants and derivatives." <>


On 24/11/16 20:48, Bless, Roland (TM) wrote:
> Hi Bob,
> see comments inline, please.
> Am 22.11.2016 um 20:09 schrieb Bob Briscoe:
>> I share your concern about cc-specific AQMs. But that is not a good
>> characterization of what we're doing.
> Yep, but it's part of what you're proposing.
>> On the current Internet, everything is meant to be somewhat "friendly"
>> to the original TCP cc (now spec'd in RFC5681). All sorts of cc's work
>> alongside that, with slightly different "fairness" properties, and only
>> one AQM is needed to cover them all. Nonetheless, Reno is the "lamest",
>> so everyone has to try to "Do (not much) Harm" to the lamest.
> Yes, and that's actually a deployment problem we agree on. Congestion
> control (CC) innovation is obstructed by the old loss-based CC so some
> separation mechanism would be good to leave room for low-delay CCs.
>> *Is any AQM CC-neutral?*
>> Note rule 5 <> in the
>> AQM Guidelines [RFC7567]
>>        "AQM algorithms SHOULD NOT interpret specific transport protocol
>> behaviors."
>> In general, the advice in that section is sound, but I don't think we
>> realized at that time just how subtle this issue is.
> I think that AQMs could in fact be designed to be CC-neutral to a large
> extent, but every CC probably reacts differently to the drop/ECN signals.
> So the dropping strategy implemented by the AQM will always cause a
> CC-specific reaction. Some AQMs therefore may achieve better results if
> they are tuned to specific CC behavior (but my e2e argument would also
> apply here), however, right now it's not that easy to change AQM
> implementations, esp. those implemented in hardware.
I am particularly worried about embedding fq in the Internet. That is 
far worse than embedding a subtly different performance improvement for 
certain congestion controls. With fq, the network determines the precise 
departure time of each packet, completely overriding the host's choice, 
without any understanding of what the application is trying to achieve.

Even worse, in Jan 2017, I am told that fq_CoDel will become hard coded 
into the Linux WiFi drivers, without even a framework to dynamically 
load any alternative(s). Of course, we can add such a framework, but we 
are seeing Linux become the next major middlebox problem. It might be 
excusable if there were not sound alternatives available... but there are.
>> Since then, I discovered that the autotuning parameter table in the PIE
>> algorithm is designed very precisely around the 1/sqrt(p) rule of Reno
>> (see Fig 5 in the PI^2 paper <>).
>> Similarly, the sqrt control law in Codel claims to be dependent on Reno
>> {Note 1}.
>> The point is that these AQMs still work fine with Cubic, Compound,
>> Westwood, etc, because all these ccs were designed to interwork with
>> Reno. {Note 2}
>> The idea of L4S (and specifically the DualQ Coupled AQM and the L4S ID
>> spec) is to enable a shift to a completely different "norm", but still
>> coexist with all the 'Classic' cc's that coexisted around the old
>> "norm". The new norm is intended to be just as fuzzy as the old norm
>> {Note 3}. The idea is two fuzzy clouds of congestion controls, around an
>> old and a new norm that are related together.
> Yes, and I support that basic idea to introduce network support for
> separating flows with different CC schemes. However, I'd like to have a
> solution that doesn't build in a specific coupling law, otherwise we
> will end up with having either TCP friendly or DCTCP friendly CC schemes.
And your proposal is...?

Also, what exactly is your rationale for not wanting a coupling law? It 
feels nice not to tie anything down. However, when hosts have fuzziness 
around both norms, additional fuzziness between the norms would just add 
an unnecessary dimension of uncertainty for no benefit.

As I just explained to Mario, the choice of "inversely proportional to
marking probability" as the principle behind the new L4S space was not
arbitrary:

     Congestion signals per round trip
         = probability of congestion signal * packets per round trip
         = p * W
where W is the window, and p is the marking probability.

L4S is defined for
     W ~= k/p
where k is a constant.

So, L4S congestion signals per round trip = p * W
                                          ~= k

That is, L4S is defined so that congestion signals per round trip is 
invariant as flow rate scales. That's an extremely important property.
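
As a rough illustration of the scaling argument above, the following Python sketch compares congestion signals per round trip (p * W) for a scalable control with W ~= k/p against a Reno-like control. The Reno model W ~= 1.22/sqrt(p) is the usual textbook approximation, and the value k = 2 is picked purely for illustration; both are assumptions and neither is taken from this message.

```python
import math

# Assumed constants for illustration only (not taken from the message above):
RENO_CONST = math.sqrt(3.0 / 2.0)  # ~1.22 in the textbook Reno model W ~= 1.22 / sqrt(p)
K = 2.0                            # hypothetical value of the constant k in W ~= k / p


def reno_window(p: float) -> float:
    """Steady-state window (packets) under the Reno approximation."""
    return RENO_CONST / math.sqrt(p)


def scalable_window(p: float, k: float = K) -> float:
    """Steady-state window (packets) for a scalable control, W ~= k / p."""
    return k / p


def signals_per_rtt(p: float, window: float) -> float:
    """Congestion signals per round trip = p * W."""
    return p * window


if __name__ == "__main__":
    # Lower p corresponds to a larger window, i.e. a higher flow rate.
    for p in (0.1, 0.01, 0.001):
        w_s, w_r = scalable_window(p), reno_window(p)
        print(f"p={p:<6} scalable: W={w_s:8.1f} signals/RTT={signals_per_rtt(p, w_s):.3f}   "
              f"Reno: W={w_r:6.1f} signals/RTT={signals_per_rtt(p, w_r):.3f}")
```

Under these assumptions the scalable control sees k signals per round trip at every rate, whereas the Reno model gives 1.22 * sqrt(p), so its signals become sparser as the window grows.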

>> *BBR*
>> I believe BBR attempts to be 'friendly' to loss-based flows when
>> competing in the same queue. But it's still research, and we don't yet
>> know how good it is at that in all scenarios, although we do have code
> I'm not sure how friendly/fair BBR is actually to other CCs,
See Koen's recent post.
> but L4S is also still research...
Well, L4S is more a space in which research can flourish. And also a 
space where there's an initial CC (DCTCP) already deployed in 3 OSs, 
with a lot of deployment experience in controlled environments, and it's 
showing pretty cool results over the Internet, even though DCTCP wasn't 
designed for the Internet.

I call L4S an incrementally deployable clean slate.
>> to test now. Given BBR currently sets Not-ECT, it would classify itself
>> into the Classic queue of a DualQ AQM, and if it coexists with Reno it
>> /should/ coexist with L4S traffic in the other queue. See Koen's recent
>> posting
>> <>
>> about this.
> Hmm, from what I understood so far BBR reacts differently to packet loss
> than Reno/Cubic etc. So that coupling by dropping probability may not
> work correctly in this case, because BBR will not react according to
> 1/sqrt(p) (cf. the FAQ from the BBR slide set presented at IETF97,
Well, it's BBR's problem to show that it can interwork with the existing 
Internet first (and I am not including L4S in "the existing Internet" yet).
>> There would be nothing to stop someone designing a variant of BBR that
>> coexisted in the L4S queue with Scalable CCs like DCTCP (the point being
>> that if the bottleneck was not DualQ it would keep delay low and if
>> the bottleneck was DualQ it would benefit even more from the lower
>> queuing delay there). However, it would have to be a bit more careful
>> about its whole round trip of queue probing, to avoid increasing the
>> delay in the L4S queue. You'll see that I suggested to Neal Cardwell
>> that they consider probing with a few packets rather than a whole
>> window, e.g. the chirping
>> <> technique that Mirja
>> and I looked into back in 2010 was designed to find the same knee
>> between rate increase and delay increase, with far fewer packets. I
>> thought of a better way of using chirping a few weeks ago, so I will be
>> returning to that too.
> That would be interesting...
>> *Specs*
>> There is no statement that all L4S cc's MUST adhere to a 1/p rule. The
>> L4S ID draft says:
>>    "The inverse proportionality requirement above is worded
>>     as a 'SHOULD' rather than a 'MUST' to allow reasonable flexibility
>>     when defining these specifications."
> I found two MUSTs in these contexts:
> draft-briscoe-tsvwg-aqm-dualq-coupled-00:
>     In order to prevent
>     starvation of Classic traffic by scalable L4S traffic (e.g.  DCTCP)
>     the drop probability of Classic traffic MUST be proportional to the
>     square of the marking probability of L4S traffic, In other words, the
>     power to which p_L is raised in Eqn. (1) MUST be 2.
> 2.5:
>     The likelihood that an AQM drops a Not-ECT Classic packet (p_C) MUST
>     be roughly proportional to the square of the likelihood that it would
>     have marked it if it had been an L4S packet (p_L).  That is
>        p_C ~= (p_L / k)^2
They are deliberate MUSTs, linking together the meaning of the two 
congestion signals that a network queue applies. Neither constrains 
hosts; the SHOULD in the host behaviour is deliberately more liberal to 

The network is given freedom to flex the constant of proportionality (k 
in the above equation), but not the long-term scaling exponent (the 
square). That allows networks and equipment vendors to differentiate 
themselves, without compromising long term scaling.

Given hosts have flex around both "norms", if the network flexed the 
scaling relationship between the "norms" as well, it would add another 
dimension of uncertainty to the system, with no benefit.
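
To make the role of the exponent concrete, here is a minimal Python sketch of the coupling p_C = (p_L / k)^2 quoted above. It assumes one Reno-like Classic flow (W ~= 1.22/sqrt(p_C)) and one scalable L4S flow (W ~= 2/p_L) with equal round-trip times; those two window models, their constants, and the function names are illustrative assumptions, not text from the drafts.

```python
import math

# Assumed steady-state models (illustrative, equal RTTs, one flow of each kind):
RENO_CONST = math.sqrt(3.0 / 2.0)  # Classic: W_C ~= 1.22 / sqrt(p_C)
SCALABLE_CONST = 2.0               # L4S:     W_L ~= 2 / p_L


def classic_drop_prob(p_l: float, k: float = 2.0, exponent: float = 2.0) -> float:
    """Coupling between queues: p_C = (p_L / k) ** exponent (the drafts fix exponent = 2)."""
    return (p_l / k) ** exponent


def window_ratio(p_l: float, k: float = 2.0, exponent: float = 2.0) -> float:
    """Ratio W_C / W_L of Classic to L4S window at L4S marking probability p_L."""
    p_c = classic_drop_prob(p_l, k, exponent)
    w_classic = RENO_CONST / math.sqrt(p_c)
    w_l4s = SCALABLE_CONST / p_l
    return w_classic / w_l4s


if __name__ == "__main__":
    for p_l in (0.2, 0.02, 0.002):
        print(f"p_L={p_l:<6} square law: {window_ratio(p_l):.3f}   "
              f"linear coupling (hypothetical): {window_ratio(p_l, exponent=1.0):.3f}")
```

In this model the square law gives a ratio of 1.22 * k / 2 at every marking level, so changing k only shifts the constant, while any other exponent makes the ratio drift as load (and hence p_L) changes.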

>> I hope that 'SHOULD' is fuzzy enough - I suspect adding more words would
>> make it less fuzzy. But I would welcome wording to make it even more
>> fuzzy if you would like to engage in wordsmithing.
>> Nonetheless, we are trying to steer a path between a rock and a hard
>> place. Because, to shift to much calmer waters beyond the rocks, we
>> have to define some number to relate L4S to Classic. I am wary because
>> when ECN was specified, there were attempts to define ECN as different
>> to drop. However, ECN originally ended up "the same as drop" because
>> no-one could muster enough backing behind any particular number to
>> relate the two, so the number '1' won by default (ie. ECN = 1 * drop^1 ).
> That's a different discussion.
Well, no. Think of it as push-back against your earlier sentence: "I'd 
like to have a solution that doesn't build in a specific coupling law".

We are proposing the square coupling law at the IETF precisely because, 
without a standard, no-one would be able to design a CC and be able to 
say anything about how it shares out capacity.
>> Bob
>> {Note 1} I have never got a good answer to my questions on aqm@ietf as
>> to why a sqrt that controls the shrinkage of the spacing between dropped
>> packets has something to do with the steady state law of Reno,
>> particularly because the law leads to linear growth in p over time.
> Yep. However, according to our experiments Codel's dropping law seems to
> achieve a quite reasonable loss desynchronization.
>> {Note 2} Actually, I don't believe PIE would work that well with Cubic
>> at very high BDP, once it was far from Reno-friendly, but that's only
>> intuition from the stability analysis, not from actual testing.
> We tested PIE at 10GB/s and it worked with Cubic similarly as at
> low speeds.
>> {Note 3} Indeed, at the moment, when DCTCP is on its own in the L4S
>> queue of the DualQ AQM as coded now, it hits up against a step
>> threshold, which makes it behave as 1/p^2, not 1/p. For now, that's just
>> because we didn't want to change too much about DCTCP at one time. But
>> it's also got some nice properties. This will all need to be discussed
>> as the DualQ AQM is specified more deeply.
> So probably it would be good to find out what is the common denominator
> of _new_ L4S-like CCs (not only considering DCTCP) and what is a common
> denominator for CCs in the other class.
The text proposal to answer this is in the ecn-l4s-id draft.

Pls feel free to suggest specific text improvements.


> Regards,
>   Roland
>> On 21/11/16 14:27, Bless, Roland (TM) wrote:
>>> Hi Bob and all,
>>> see below.
>>> Am 01.11.2016 um 01:02 schrieb Bob Briscoe:
>>>> A few people have been working away to specify and document all the
>>>> aspects of the new Low Latency, Low Loss, Scalable throughput (L4S)
>>>> service, which held a successful BoF in Berlin. As the decision was to
>>>> try to work across multiple WGs, I thought it would be useful to give ...
>>> Thanks for putting this together.
>>>>    * Dual Queue Coupled AQM
>>>>        o With Curvy RED for Linux (access available shortly)
>>>>        o With PI2 for Linux <> [*UPDATED*]
>>> I'll repeat my concerns that I already expressed at the L4S BOF in Berlin:
>>> While I agree that we probably need to separate low-delay congestion
>>> control schemes from traditional "queue-filling" congestion schemes,
>>> I strongly suggest to avoid putting a congestion control-specific
>>> coupling scheme into the network (a classic case for applying the
>>> "end-to-end arguments in system design").
>>> The current Dual queue coupled AQM proposal has got a coupling based on
>>> a congestion control specific dropping law p_C=(p_L/2)². So if
>>> congestion control schemes change then this coupling needs to be
>>> adapted. For example, the currently proposed scheme may fail if the
>>> vast majority of TCP traffic is using BBR or some other forthcoming
>>> CC scheme instead of Cubic, Reno, Compound etc. The same applies to
>>> draft-briscoe-tsvwg-ecn-l4s-id, section 2.5, where the dropping
>>> likelihood is defined.
>>> Regards,
>>>   Roland

Bob Briscoe