[tcpPrague] L4S and BBR (was: [iccrg] ecn-l4s-id: Proposed Changed to Normative Classic ECN detection Text)

Bob Briscoe <ietf@bobbriscoe.net> Sun, 01 November 2020 13:31 UTC

To: Christian Huitema <huitema@huitema.net>, tsvwg IETF list <tsvwg@ietf.org>
Cc: iccrg IRTF list <iccrg@irtf.org>, TCP Prague List <tcpPrague@ietf.org>, "De Schepper, Koen (Koen)" <koen.de_schepper@nokia.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpprague/3NbyrtbuRVq5c2KRIqekXaaoV-Q>

Christian,

I've changed the subject line given it's no longer appropriate.
See inline tagged [BB]...

On 01/11/2020 01:07, Christian Huitema wrote:
>
> I am reading the L4S ECN-AQM proposal with an eye on responding to it 
> in an implementation of QUIC,
>

[BB] That's good news.

> and I have a couple of questions regarding use of ECN marking with QUIC.
>
> The document does not mention QUIC, yet QUIC is already used in a 
> large fraction of Internet traffic. QUIC does specify support for ECN, 
> and QUIC acknowledgements may carry counts of each category of ECN 
> marks received from the peer -- three counters for ECT(0), ECT(1) and 
> CE. In theory, QUIC implementations could take advantage of L4S -- in 
> fact, at least one implementation supports DC-TCP like CC already. Is 
> there interest in specifying L4S for QUIC?
>

[BB] Certainly.
That said, when I search for the string "QUIC" in the L4S ECN-AQM 
proposal, I do find it (assuming you mean draft-ietf-tsvwg-aqm-dualq-coupled).
But you won't find much else about QUIC there, 'cos that doc is about 
the AQM.

There are three main L4S docs.
You need to be reading the doc with the requirements for L4S transports: 
draft-ietf-tsvwg-ecn-l4s-id 
<https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id>
And the L4S architecture: draft-ietf-tsvwg-l4s-arch 
<https://tools.ietf.org/html/draft-ietf-tsvwg-l4s-arch>
Both of which explain that QUIC already provides the necessary feedback.
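
To illustrate that feedback: because a QUIC ACK can carry cumulative 
ECT(0), ECT(1) and CE counts, a sender can difference two successive 
ACKs to get the fraction of newly acknowledged packets that were CE 
marked. The following is only a minimal sketch. The names EcnCounts, 
marked_fraction and update_alpha are hypothetical, and the EWMA gain of 
1/16 is assumed from DCTCP, not taken from any of the L4S drafts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EcnCounts:
    """Cumulative ECN counts from one QUIC ACK (hypothetical type)."""
    ect0: int
    ect1: int
    ce: int


def marked_fraction(prev: EcnCounts, cur: EcnCounts) -> float:
    """Fraction of packets newly reported between two ACKs that were CE marked."""
    d_ect0 = cur.ect0 - prev.ect0
    d_ect1 = cur.ect1 - prev.ect1
    d_ce = cur.ce - prev.ce
    if min(d_ect0, d_ect1, d_ce) < 0:
        raise ValueError("cumulative ECN counts must not decrease")
    total = d_ect0 + d_ect1 + d_ce
    return d_ce / total if total else 0.0


def update_alpha(alpha: float, fraction: float, gain: float = 1 / 16) -> float:
    """DCTCP-style moving average of the marked fraction (gain is an assumption)."""
    return (1 - gain) * alpha + gain * fraction


if __name__ == "__main__":
    previous = EcnCounts(ect0=0, ect1=10, ce=0)
    current = EcnCounts(ect0=0, ect1=17, ce=3)
    f = marked_fraction(previous, current)
    print(f, update_alpha(0.0, f))
```

A DCTCP-like control would then reduce its window in proportion to the 
smoothed value, but the exact response is up to the congestion control.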

Quentin De Coninck wrote an initial implementation of a "QUIC Prague" 
congestion control based on your picoQUIC code. It can be accessed via 
the L4S landing page here:
     https://riteproject.eu/dctth/#code
He started it at an IETF hackathon. I don't think it's been maintained 
since it was first written, but it might be a start.


> My next question regards the interaction of the proposed L4S ECN-AQM 
> with CC algorithms like BBR that attempt to discover the bottleneck 
> packet rate for the connection, and use pacing to send packets at that 
> rate. I observed that BBR is never mentioned in the draft, yet BBR is 
> used in a sizeable part of Internet traffic. Do we have data on how a 
> non-L4S aware implementation of BBR interacts with the proposed L4S AQM?
>

[BB] Again, if you're looking at the L4S AQM doc, you won't find much 
about CC.
For BBR, take a look at section 5.2 of l4s-arch 
<https://tools.ietf.org/html/draft-ietf-tsvwg-l4s-arch-07#section-5.2>.

When BBRv2 was released it included initial support for L4S ECN. Neal 
and Yuchung presented BBRv2's L4S ECN queue delay results when they 
first talked about BBRv2 in iccrg. I'm sure more work is needed on the 
L4S ECN part, but I guess that will rise in priority once ISPs start 
deploying L4S AQMs (e.g. Greg White reported that cable implementations 
were being interop tested some time ago, and Nokia released an L4S WiFi 
AP a while back).

So, if there is an L4S AQM at the bottleneck, the resulting ECN signals 
would make BBRv2 morph into an L4S CC.

However, your question is about how a non-L4S aware BBR interacts with 
an L4S AQM.
* For that, I'll have to take 'BBR' to mean something like BBRv2, but 
without any support for ECN - i.e. without the L4S ECN part.
* And just to be super-precise, I'll take the 'L4S AQM' to mean the 
DualQ Coupled AQM (not an FQ one).
Then the answer is, I think, only a few tests of that combination have 
been done. Am I right, Olga, Asad, Olivier, Neal, Yuchung, anyone? I 
suspect this mainly because there doesn't seem to be a reason why the 
L4S ECN part of BBRv2 would be disabled.

I can say what I suspect would happen in what I think is your 
hypothetical case. The DualQ Coupled AQM design relies on a flow that is 
classified into the Classic queue (e.g. the non-L4S aware BBRv2 you ask 
about) responding to loss and/or delay in a way that would be friendly 
to Reno. The v2 changes to BBR were intended to be more friendly to 
other flows, in particular the response to >1% loss. Then, these non-L4S 
aware BBRv2 flows that you posit should co-exist with L4S flows in some 
way. But whether they would coexist well isn't something I've looked 
into, 'cos I think it's rather a hypothetical question (if I've 
understood you correctly).
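
For reference, the reliance on a Reno-friendly response comes from the 
way the DualQ Coupled AQM derives its two signals from one internal 
base probability: the Classic drop/mark probability is the square of 
it, while the ECN marking probability coupled across to the L4S queue 
is linear in it. A minimal sketch follows, assuming the coupling 
factor k = 2 recommended in draft-ietf-tsvwg-aqm-dualq-coupled. The 
function name and the clamping at 1.0 are illustrative only, and the 
L4S queue's own native marking is left out.

```python
def coupled_probabilities(p_base: float, k: float = 2.0) -> tuple[float, float]:
    """Return (p_classic, p_coupled) for a given internal base probability.

    p_classic is applied to the Classic queue as drop or Classic ECN marking.
    p_coupled is the marking probability coupled across to the L4S queue.
    """
    if not 0.0 <= p_base <= 1.0:
        raise ValueError("p_base must be within [0, 1]")
    p_classic = p_base ** 2            # squared: suits a Reno-like 1/sqrt(p) response
    p_coupled = min(k * p_base, 1.0)   # linear: suits a scalable 1/p response
    return p_classic, p_coupled


if __name__ == "__main__":
    print(coupled_probabilities(0.1))
```

A Classic flow that does not respond to p_classic roughly as Reno would 
upsets the balance that this squaring is meant to produce.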

Before BBRv2, when BBRv1 had no response to loss (and no ECN support), I 
can say that it usually starved L4S flows over the same DualQ AQM for 
the same reason it usually starved Cubic and other flows in a FIFO. But 
hopefully that is in the past now.


> My last question regards potential use of ECT(1) marking. Most current 
> implementations set ECT(0), but setting ECT(1) instead is trivial. 
> This should elicit an L4S compatible response in L4S-AQM, and the BBR 
> implementation might be modified to use the signals as part of the 
> bottleneck bandwidth tracking. But there is a small issue there. With 
> BBR, QUIC packets are supposedly paced at just under the bottleneck 
> rate, except during "probe" periods in which they probe for 1 RTT at a 
> slightly higher rate. The L4S AQM might degenerate into a form of ON-OFF 
> control -- no feedback at all most of the time, then a bunch of CE 
> marks if the probe rate exceeds the bottleneck bandwidth. Has anyone 
> experimented with that?
>

[BB] Yes, once the IETF assigns the codepoint, BBRv2 will be 'allowed' 
to send packets over the Internet as ECT(1). Then, yes, an L4S DualQ 
Coupled AQM will classify BBRv2 packets into the L4S queue. This should 
have a very shallow ECN marking threshold (500us - 1ms), so even if the 
flow (whether QUIC or TCP) is flying just under the available capacity, 
bunching of packets means it is unlikely to completely avoid ECN marking 
between probes. If it could avoid ECN marking, you are right that it 
would get a bump of ECN marks during the probe. I haven't studied the 
code, but when it experiences ECN marking I believe it switches into an 
L4S ECN mode for a while, and uses ECN rather than delay probing to 
track available capacity. I assume it switches back to BBR's delay 
probing mode if it gets no ECN for a while (e.g. the bottleneck might 
have moved). But I haven't looked at BBRv2's ECN behaviour in detail.
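
The point about bunching can be illustrated with a toy model: a FIFO 
drained at a fixed rate, with a simple step marking threshold on 
queuing delay. This is only a sketch under stated assumptions. The 
function names are hypothetical, the 120us service time assumes 1500B 
packets at 100Mb/s, the arrival patterns are invented for illustration, 
and a real L4S AQM need not use a pure step.

```python
def queue_delays_us(arrivals_us: list[int], service_us: int) -> list[int]:
    """Queuing delay of each packet in a FIFO served at a fixed rate."""
    delays = []
    free_at = 0  # time at which the link next becomes idle
    for t in arrivals_us:
        start = max(t, free_at)
        delays.append(start - t)
        free_at = start + service_us
    return delays


def step_mark(delays_us: list[int], threshold_us: int = 1000) -> list[bool]:
    """Mark CE on every packet whose queuing delay exceeds the threshold."""
    return [d > threshold_us for d in delays_us]


if __name__ == "__main__":
    service = 120  # us per packet (assumed: 1500B at 100Mb/s)

    # Perfectly paced just under capacity: one packet every 125us.
    paced = [i * 125 for i in range(20)]
    # Same average rate, but arriving in bunches of 10 every 1250us.
    bunched = [(i // 10) * 1250 for i in range(20)]

    print(sum(step_mark(queue_delays_us(paced, service))))
    print(sum(step_mark(queue_delays_us(bunched, service), 1000)))
    print(sum(step_mark(queue_delays_us(bunched, service), 500)))
```

In this model the perfectly paced flow sees no marks, whereas the 
bunched flow at the same average rate is marked in every bunch, and 
more so as the threshold is lowered from 1ms to 500us.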

I'm sure someone on one of the lists that this is posted to will be able 
to answer tho.

HTH


Bob


> -- Christian Huitema
>
>
> On 10/31/2020 2:54 AM, Bob Briscoe wrote:
>> Folks,
>>
>> The co-authors of ECN L4S ID have been reviewing the correctness of 
>> the normative 'Prague' requirements.
>>     See 
>> https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id-10#section-4.3
>> This is the second of 2 emails, about 2 of the requirements that we 
>> think ought to be reworded a little.
>>
>> If you agree with the rationale, but think the new wording still 
>> doesn't fully capture the requirement, pls suggest sthg better.
>> If you disagree with the rationale, pls discuss.
>>
>> 4.3.  Prerequisite Congestion Response
>> ...
>> CURRENT:
>>
>>     o  A scalable congestion control MUST react to ECN marking from a
>>        non-L4S but ECN-capable bottleneck in a way that will coexist with
>>        a TCP Reno congestion control [RFC5681 <https://tools.ietf.org/html/rfc5681>]
>>        (see Appendix A.1.4 <https://tools.ietf.org/html/draft-ietf-tsvwg-ecn-l4s-id-10#appendix-A.1.4>
>>        for rationale).
>>
>>        Note that a scalable congestion control is not expected to change
>>        to setting ECT(0) while it falls back to coexist with Reno.
>>   
>> PROPOSED:
>>    o  A scalable congestion control MUST implement monitoring in order
>>       to detect a likely non-L4S but ECN-capable AQM at the bottleneck.
>>       On detection of a likely ECN-capable bottleneck it SHOULD be
>>       capable (dependent on configuration) of automatically adapting its
>>       congestion response to coexist with TCP Reno congestion controls
>>       [RFC5681] (see Appendix A.1.4 for rationale and a referenced
>>       algorithm).
>>
>>       Note that a scalable congestion control is not expected to change
>>       to setting ECT(0) while it falls back to coexist with Reno.
>>
>> RATIONALE:
>> 1/ The requirement as currently written says what an omniscient 
>> sender MUST do. So there's an implied requirement that a sender MUST 
>> be omniscient, which is of course impossible.
>> 2/ The requirement needs to be recast to require a sender to aim to 
>> be as knowledgeable as possible. Then, what it does as a result needs 
>> to take into account the a priori likelihood of there being a non-L4S 
>> bottleneck present.
>> 3/ This includes the possibility that the operator of the host knows 
>> that the network it serves has not deployed any single queue classic 
>> ECN AQM (e.g. in a CDN case they're doing out of band testing, or 
>> they've asked the ISP). So we've included the possibility of 
>> fall-back being disabled by configuration.
>> 4/ Nonetheless, as has been pointed out on the list, there is still a 
>> possibility that there is a Classic ECN AQM somewhere else on the 
>> path (to continue the CDN example, perhaps beyond the ISP in a home 
>> network). The 'MUST monitor' requirement still stands to ensure the 
>> operator doesn't miss these cases.
>> 5/ Then, if the server operators have disabled fall-back for their 
>> deployment, they can reconsider their policy or at least do more 
>> focused testing if they are frequently detecting a single-queue 
>> Classic ECN AQM.
>>
>> Items 3-5 are the "react via management" model that I've talked about 
>> on the list, given the unfairness doesn't amount to starvation, and 
>> it is possible that the prevalence of the problem is very low.
>>
>>
>> Finally, after the bullet list of requirements in section 4.3, (which 
>> are prerequisites for setting the ECT1 codepoint), we propose to add 
>> the following requirement, as suggested on the tsvwg list:
>>
>>       To participate in the L4S experiment, a scalable congestion control
>>       MUST be capable of being replaced by a Classic congestion control
>>       (by application and by administrative control). A Classic congestion
>>       control will not tag its packets with the ECT(1) codepoint.
>>
>> Cheers
>>
>>
>> Bob
>>
>>
>> -- 
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/
>>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/