Re: [tcpPrague] [tsvwg] ecn-l4s-id: Proposed Changed to Normative Classic ECN detection Text

Christian Huitema <> Sun, 01 November 2020 19:48 UTC

On 11/1/2020 9:00 AM, Jonathan Morton wrote:
>> On 1 Nov, 2020, at 3:07 am, Christian Huitema <> wrote:
>> I am reading the L4S ECN-AQM proposal with an eye on responding to it in an implementation of QUIC, and I have a couple of questions regarding use of ECN marking with QUIC. 
>> The document does not mention QUIC, yet QUIC is already used in a large fraction of Internet traffic. QUIC does specify support for ECN, and QUIC acknowledgements may carry counts of each category of ECN marks received from the peer -- three counters for ECT(0), ECT(1) and CE. In theory, QUIC implementations could take advantage of L4S -- in fact, at least one implementation already supports a DCTCP-like CC. Is there interest in specifying L4S for QUIC?
>> My next question regards the interaction of the proposed L4S ECN-AQM with CC algorithms like BBR that attempt to discover the bottleneck packet rate for the connection, and use pacing to send packets at that rate. I observed that BBR is never mentioned in the draft, yet BBR is used in a sizeable part of Internet traffic. Do we have data on how a non-L4S aware implementation of BBR interacts with the proposed L4S AQM?
>> My last question regards potential use of ECT(1) marking. Most current implementations set ECT(0), but setting ECT(1) instead is trivial. This should elicit an L4S compatible response in L4S-AQM, and the BBR implementation might be modified to use the signals as part of the bottleneck bandwidth tracking. But there is a small issue there. With BBR, QUIC packets are supposedly paced at just under the bottleneck rate, except during "probe" periods in which they probe for 1 RTT at a slightly higher rate. The L4S AQM might degenerate into a form of ON-OFF control -- no feedback at all most of the time, then a bunch of CE marks if the probe rate exceeds the bottleneck bandwidth. Has anyone experimented with that?
> In my view, the biggest question you should be asking is how QUIC will distinguish between CE marks applied by an L4S AQM on the path, and those applied by an RFC-3168 AQM on the path.  The latter will treat ECT(1) marked packets the same as ECT(0), and expects the same multiplicative-decrease congestion response, but L4S expects a qualitatively different linear response.

Yes, that's a significant issue. As it is, nodes can only work with L4S
on specific paths that are somehow guaranteed to not mix the "classic"
and "L4S" behavior. For now, the only way I would implement it is by
adding an "l4s" option that is off by default and only turned on when
experimenting with L4S behavior. If we want any large-scale deployment,
we have to get an IETF consensus on the evolution of ECN -- something
prescriptive, not just the "may experiment" of RFC 8311. Currently,
implementations can only treat CE marks as an indication that they are
probably sending faster than the network allows.
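
To make the difference between the two expected responses concrete, here
is a minimal Python sketch of a sender gated by such an "l4s" option. It
is an illustration under stated assumptions, not code from any QUIC
stack: the class, field and method names are hypothetical, the classic
branch halves the window as RFC 3168 expects, and the L4S branch uses a
DCTCP-style proportional reduction with the 1/16 gain suggested in
RFC 8257.

```python
from dataclasses import dataclass


@dataclass
class EcnResponder:
    """Hypothetical per-connection state; all names are illustrative."""
    cwnd: float            # congestion window, in packets
    l4s: bool = False      # the experimental "l4s" option, off unless enabled
    alpha: float = 0.0     # smoothed fraction of CE-marked packets
    gain: float = 1 / 16   # EWMA gain (assumed, as suggested for DCTCP)

    def on_round_trip(self, acked: int, ce_marked: int) -> float:
        """Update cwnd once per round trip from the CE counts in ACKs."""
        if acked <= 0:
            return self.cwnd
        fraction = ce_marked / acked
        if self.l4s:
            # Scalable response: track the marking fraction and reduce
            # the window in proportion to it.
            self.alpha += self.gain * (fraction - self.alpha)
            if ce_marked > 0:
                self.cwnd *= 1 - self.alpha / 2
        elif ce_marked > 0:
            # Classic response: any CE mark in the round trip is treated
            # like a loss, so the window is halved.
            self.cwnd *= 0.5
        self.cwnd = max(self.cwnd, 2.0)  # keep a small minimum window
        return self.cwnd
```

With one CE mark in ten acknowledged packets, the classic branch halves
the window while the L4S branch barely moves it, which is why applying
the L4S response behind an RFC 3168 AQM would be a problem.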

> You should probably not do any substantial work on integrating L4S into QUIC until you have a good answer to that question.  Alternative approaches to high-fidelity congestion control exist which resolve that particular conundrum easily.  In particular, the ECN field feedback mechanism in QUIC can also support SCE.

Yes, QUIC implementations could indeed support SCE. But they could not
support both SCE and L4S, and that's a big bummer.
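
The conflict comes from ECT(1) being used in two incompatible ways: with
SCE the sender marks ECT(0) and the network rewrites it to ECT(1) as a
mild congestion signal, while with L4S the sender itself sets ECT(1) as
an identifier. A rough Python sketch of how a sender might read the
three ECN counters reported in QUIC acknowledgements under each
convention follows. The type and function names are hypothetical, and
validation of the counts is left out.

```python
from typing import NamedTuple


class EcnCounts(NamedTuple):
    """Cumulative ECN counts reported by the peer (illustrative type)."""
    ect0: int
    ect1: int
    ce: int


def interpret(prev: EcnCounts, cur: EcnCounts, sent_codepoint: str) -> dict:
    """Turn two successive reports into signals for the sender.

    sent_codepoint is the codepoint the sender set on its own packets:
    "ECT(0)" for an SCE-style sender, "ECT(1)" for an L4S-style sender.
    """
    d_ect0 = cur.ect0 - prev.ect0
    d_ect1 = cur.ect1 - prev.ect1
    d_ce = cur.ce - prev.ce
    if sent_codepoint == "ECT(0)":
        # SCE reading: packets that arrived as ECT(1) were re-marked by
        # the network, so their count is a congestion signal.
        return {"sce_marks": d_ect1, "ce_marks": d_ce, "unmarked": d_ect0}
    if sent_codepoint == "ECT(1)":
        # L4S reading: ECT(1) is what the sender transmitted, so its
        # count carries no congestion information; only CE does.
        return {"sce_marks": 0, "ce_marks": d_ce, "unmarked": d_ect1}
    raise ValueError(f"unexpected codepoint: {sent_codepoint}")
```

The same ECT(1) counter is either a signal or plain echo depending on
which convention the sender chose, so a connection has to pick one.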

I wonder whether there is a way to unify the L4S and SCE behaviors.
Transport protocol developers are unlikely to widely develop either
option unless the IETF comes to a consensus. This is a shame, because
ECN-based congestion control does not have the ambiguity of loss-based
CC. ECN marks are clearly signals from the network, so the application
does not need to build logic to differentiate between congestion-induced
losses and random losses due to, for example, a radio event. Failing
that ECN unification, CC algorithms end up relying heavily on bandwidth
and delay measurements, but that is sub-optimal because some radio links
do have variable bandwidth and variable delays.

By the way, I do think that L4S makes too many hypotheses about the CC.
I wish the AQM was based on intrinsic properties, such as "this flow is
using more than its fair share of bandwidth" or "this link is congested
and the transport should back off" rather than reasoning as if
implementations were still using TCP-RENO, which neither Linux nor
Windows nor Chrome does.

-- Christian Huitema