Re: [tsvwg] start of WGLC on L4S drafts

Bob Briscoe <ietf@bobbriscoe.net> Tue, 09 November 2021 16:40 UTC

From: Bob Briscoe <ietf@bobbriscoe.net>
To: Stuart Cheshire <cheshire=40apple.com@dmarc.ietf.org>, "tsvwg@ietf.org" <tsvwg@ietf.org>
References: <7dd8896c-4cd8-9819-1f2a-e427b453d5f8@mti-systems.com> <844986F2-D085-4743-8C6A-8689369CA7F6@apple.com> <77ec5b2b-0dee-347d-63ac-c5c5d0ab2e7d@bobbriscoe.net>
Message-ID: <0348c7f4-da92-4777-7a42-515499f864c8@bobbriscoe.net>
Date: Tue, 09 Nov 2021 16:40:12 +0000
In-Reply-To: <77ec5b2b-0dee-347d-63ac-c5c5d0ab2e7d@bobbriscoe.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/FiWPi3WrIf9mYSth2AFSD3OrUXM>
Subject: Re: [tsvwg] start of WGLC on L4S drafts

Stuart, see [BB2] below

On 09/11/2021 00:14, Bob Briscoe wrote:
> Stuart,
>
> On 27/08/2021 11:48, Stuart Cheshire wrote:
>> On 29 Jul 2021, at 09:18, Wesley Eddy <wes@mti-systems.com> wrote:
>>
>>> This message is starting a combined working group last call on 3 of 
>>> the L4S drafts:
>>>
>>> - Architecture: 
>>> https://datatracker.ietf.org/doc/draft-ietf-tsvwg-l4s-arch/
>>>
>>> - DualQ: 
>>> https://datatracker.ietf.org/doc/draft-ietf-tsvwg-aqm-dualq-coupled/
>>>
>>> - ECN ID: https://datatracker.ietf.org/doc/draft-ietf-tsvwg-ecn-l4s-id/
>> These are exceptionally high quality documents. The insights are 
>> profound, and the writing is excellent. I have a few minor 
>> observations, and one more major one.
>>
>> =====
>>
>> <https://datatracker.ietf.org/doc/draft-ietf-tsvwg-l4s-arch/>
>>
>> I find use of reference-as-noun makes some sentences hard to parse 
>> and understand. It assumes every reader has every RFC memorized and 
>> unconsciously treats every RFC number as a noun. To see the text as a 
>> normal reader would, assume that instead of [RFC1234] all references 
>> are written numerically, like [1]. Or, even better, as a little 
>> superscript number, as they were back in the days when we still had 
>> typesetting.
>>
>> For example, what does this mean, on page 8: “To enable L4S, the 
>> standards track¹ has had to be updated...”
>
> [BB] I agree with you. But I ran out of time to find and convert all 
> these. It's on my ToDo list for the next rev.

[BB2] I've now scanned through and found some occurrences.

To check that my sensitivity gauge for this is about right, below I've 
given some examples of where I did (in bold) and didn't change the text. 
I found only 2 cases that I thought needed changing. You'll see 
there are cases where I still use citations as nouns - I don't think that 
is precisely the problem - it's more about making sure to use a natural 
language moniker when the citation is integral to the meaning of the 
sentence:

"[I-D.ietf-tsvwg-aqm-dualq-coupled] gives a full explanation of the 
DualQ Coupled AQM framework.  A specific marking algorithm is not 
mandated for L4S AQMs.  Appendices of [I-D.ietf-tsvwg-aqm-dualq-coupled] 
give non-normative examples"

"*The L4S identifier spec. *[I-D.ietf-tsvwg-ecn-l4s-id] concludes that 
all alternatives involve compromises,..."

"To enable L4S, *the standards track Classic ECN spec.* [RFC3168] has 
had to be updated to allow L4S packets to depart from the 'equivalent to 
drop' constraint. [RFC8311] is a standards track update to relax 
specific requirements in RFC 3168 (and certain other standards track 
RFCs), which clears the way for the experimental changes proposed for 
L4S. [RFC8311] also reclassifies the original experimental assignment of 
the ECT(1) codepoint as an ECN nonce [RFC3540] as historic."

"[I-D.ietf-tsvwg-ecn-l4s-id] specifies that ECT(1) is used as the 
identifier to classify L4S packets into a separate treatment from 
Classic packets."

"Appendix B of [I-D.ietf-tsvwg-ecn-l4s-id] explains why five unlikely 
eventualities"

"The policy goal of the marking could be to differentiate flow rates 
(e.g. [Nadas20], which requires additional signalling of a per-flow 
'value'), or to equalize flow-rates (perhaps in a similar way to Approx 
Fair CoDel [AFCD], [I-D.morton-tsvwg-codel-approx-fair],..."

I adopt this stance myself, so I have no problem being pulled up on it. 
When I review papers, I point out cases where citations are discussed 
without identifying the citation with some natural language (though that 
isn't necessary where the citation is incidental to comprehension). Indeed, 
when I was arguing for the use of 'Classic ECN', I argued against 
calling it RFC 3168 ECN for similar reasons - the RFC numbers that one 
person deals with day to day aren't the same as those another does.

Please confirm these are OK.
Cheers


Bob

>
>>
>> -- 
>>
>> Page 7 defines Reno-friendly:
>>
>>     Reno-friendly:  The subset of Classic traffic that is friendly to 
>> the
>>        standard Reno congestion control defined for TCP in [RFC5681].
>>
>> The term “friendly” is defined in terms of “friendly”, which seems a 
>> bit circular. What does “friendly” mean precisely?
> [BB] I've added:
>
>     The TFRC spec. [RFC5348] indirectly implies that 'friendly' is 
> defined as "generally within a factor of two of the sending rate of a 
> TCP flow under the same conditions".
>
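
For what it's worth, the yardstick behind "within a factor of two" is the 
TCP reference rate in Section 3.1 of [RFC5348]. A minimal sketch of that 
equation (my illustration, using the usual simplifications b = 1 and 
t_RTO = 4*R):

    #include <math.h>

    /* RFC 5348 TCP reference rate (bytes/s):
     * s = segment size (bytes), R = round-trip time (s), p = loss event rate */
    double tcp_reference_rate(double s, double R, double p)
    {
        double b = 1.0;           /* packets acked per ACK (RFC 5348 recommends 1) */
        double t_rto = 4.0 * R;   /* simplified retransmission timeout */
        return s / (R * sqrt(2.0 * b * p / 3.0) +
                    t_rto * 3.0 * sqrt(3.0 * b * p / 8.0) *
                    p * (1.0 + 32.0 * p * p));
    }

A flow is then 'friendly' if its long-run rate stays within roughly a factor 
of two of this value under the same s, R and p.
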
>
>>
>> And what is “Classic traffic”? Traffic that uses Classic Congestion 
>> Control?
>>
>>     Classic Congestion Control:  A congestion control behaviour that can
>>        co-exist with standard Reno [RFC5681] without causing
>>        significantly negative impact on its flow rate [RFC5033].
>>
>> If Classic traffic does not negatively impact Reno, and Reno-friendly 
>> traffic is friendly to Reno, what’s the difference? Why is 
>> Reno-friendly a subset of Classic traffic? What is an example of 
>> traffic that is Classic (does not negatively impact Reno) but is not 
>> Reno-friendly?
>
> [BB] Coexist includes non-congestion controlled stuff like DNS, and 
> CCs that coexist "without causing significantly negative impact" [RFC5033]. 
> I'm not going to attempt to define significant here. When RFCs like 
> RFC 5033 were written, I believe the chosen wording was to 
> deliberately dodge anything precise. Although the authors of the UDP 
> Usage Guidelines [RFC8085] took it upon themselves to state another 
> numeric ratio, albeit vaguely: "an order of magnitude".
>
> I think the above additional definition of friendly should answer all 
> three of your questions. I hope it is now clearer that friendly is 
> tighter than coexist.
> Indeed, in terms of tightness of constraint: Reno-compatible > 
> Reno-friendly > Classic
> (I'm silently substituting 'Reno' for 'TCP' here)
>
> Bear in mind that I have to grit my teeth when explaining all this, 
> given I've written a reasonably highly cited paper saying the metric 
> of flow-rate fairness is a load of tosh.
>
>>
>> -- 
>>
>> Page 17 mentions self-inflicted queuing delay:
>>
>>        1.  It might seem that self-inflicted queuing delay within a per-
>>            flow queue should not be counted, because if the delay wasn't
>>            in the network it would just shift to the sender. However,
>>            modern adaptive applications, e.g. HTTP/2 [RFC7540] or some
>>            interactive media applications (see Section 6.1), can keep 
>> low
>>            latency objects at the front of their local send queue by
>>            shuffling priorities of other objects dependent on the
>>            progress of other transfers.  They cannot shuffle objects 
>> once
>>            they have released them into the network.
>>
>> It might be helpful here to mention the TCP_NOTSENT_LOWAT socket option.
>>
>> If you want a reference, something like this might do:
>>
>> <https://blog.cloudflare.com/http-2-prioritization-with-nginx/>
>
> [BB] I would like to, but the RFC Editor kicks out non-archival 
> references. I've put it in for now, and we'll see if we can find 
> something more archival later.
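
For readers who want the concrete knob: a minimal sketch of the socket option 
Stuart mentions (TCP_NOTSENT_LOWAT exists on both Linux and macOS; the 16 KB 
threshold below is just an illustrative value):

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    /* Cap how much not-yet-sent data the kernel will buffer, so the
     * application keeps control of object priority until the last moment. */
    static int limit_unsent(int fd)
    {
        int lowat = 16 * 1024;
        return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT,
                          &lowat, sizeof lowat);
    }
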
>
>>
>> -- 
>>
>> Page 22:
>>
>>     For any one L4S flow to work, it requires 3 parts to have been
>>     deployed.
>>
>> I *think* the three parts you mean are: client, bottleneck queue, and 
>> server.
>>
>> But I’m not sure. It would be good for this text to specify what the 
>> 3 parts are.
>
> [BB] Good point. Added:
>
> "...deployed: i) the congestion control at the sender; ii) the AQM at 
> the bottleneck; and iii) older transports (namely TCP) need upgraded 
> receiver feedback too."
>
>>
>> -- 
>>
>> Page 27:
>>
>>     the L4S service relies on self-constraint
>>
>> I think you meant:
>>
>>     the L4S service relies on self-restraint
>
> [BB] Sry, brain-fart (2 occurrences of self-restraint already elsewhere)
>
> Thank you, continuing...
>
>>
>> =====
>>
>> <https://datatracker.ietf.org/doc/draft-ietf-tsvwg-ecn-l4s-id/>
>>
>> Page 6:
>>
>>     Even with a perfectly tuned AQM, the additional
>>     queuing delay will be of the same order as the underlying speed-of-
>>     light delay across the network.
>>
>> I know what you’re saying here, but some readers might miss the 
>> significance. How about adding some words at the end:
>>
>>     Even with a perfectly tuned AQM, the additional
>>     queuing delay will be of the same order as the underlying speed-of-
>>     light delay across the network, thereby doubling the total 
>> round-trip time.
>
> [BB] Thx. I've also added 'roughly' to doubling.
>
>>
>> -- 
>>
>> Page 16:
>>
>>     {ToDo: Not-ECT / ECT(0) ?}.
>>
>> Seems like this bit is not quite ready for IETF Last Call.
>
> [BB] See list discussion since - Not-ECT now selected.
> I noticed that one myself soon after I'd posted it, but before I'd 
> read the numerous reviews that pointed that omission out.
>
>>
>> -- 
>>
>> Page 20, 'Safe' Unresponsive Traffic
>>
>>     The above section requires unresponsive traffic to be 'safe' to mix
>>     with L4S traffic.  Ideally this means that the sender never sends 
>> any
>>     sequence of packets at a rate that exceeds the available capacity of
>>     the bottleneck link.  However, typically an unresponsive transport
>>     does not even know the bottleneck capacity of the path, let alone 
>> its
>>     available capacity.  Nonetheless, an application can be considered
>>     safe enough if it paces packets out (not necessarily completely
>>     regularly) such that its maximum instantaneous rate from packet to
>>     packet stays well below a typical broadband access rate.
>>
>>     This is a vague but useful definition, because many low latency
>>     applications of interest, such as DNS, voice, game sync packets, 
>> RPC,
>>     ACKs, keep-alives, could match this description.
>>
>> This makes me very nervous.
>>
>> I agree that it is vague, but I don’t think it’s useful.
>>
>> A bunch of people doing voice calls over a shared slow Internet 
>> connection (think beach café on some remote island) could easily 
>> overwhelm the available capacity, and their clients should react by 
>> selecting a lower-rate voice codec, not by driving the network into 
>> massive packet loss so that nothing works. Conversely, if spare 
>> capacity is available, the clients should react by selecting a 
>> higher-rate voice codec to make use of that extra capacity and give 
>> better voice quality. I don’t see any good reason why voice clients, 
>> or video games, should not adapt to give the best user experience 
>> possible on a given network capacity. And that adaptation includes 
>> adapting down as well as adapting up. A voice call that adapts down 
>> (and works) gives a better user experience than a voice call that 
>> drives the network into congestion collapse (and completely fails).
>
> [BB] Totally agree. This was meant to be explained by the mention of 
> available capacity in "...does not even know the bottleneck capacity 
> of the path, let alone its available capacity."
> I've attempted to remedy this by adding the following at the end of 
> the subsection about "'Safe' Unresponsive Traffic":
>
> "Low rate streams such as voice and game sync packets, might not use 
> continuously adapting ECN-based congestion control, but they ought to 
> at least use a 'circuit-breaker' style of congestion response 
> [RFC8083]. Then, if the volume of traffic from unresponsive 
> applications is high enough to overload the link, this will at least 
> protect the capacity available to responsive applications. However, 
> queuing delay in the L queue will probably rise to that controlled by 
> the Classic (drop-based) AQM. If a network operator considers that 
> such self-restraint is not enough, it might want to police the L queue 
> (see Section 8.2 of [I-D.ietf-tsvwg-l4s-arch])."
>
> You didn't give an indication of what you wanted me to do. Does this 
> help?
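
To make "paces packets out ... well below a typical broadband access rate" a 
little more concrete, here is a minimal sketch (mine, not draft text) of the 
kind of sender-side pacing the draft has in mind; budget_bps is an assumed 
per-application budget chosen well under the access rate:

    #include <stddef.h>
    #include <stdint.h>

    /* Earliest time the next packet may be sent so that the instantaneous
     * rate between consecutive packets never exceeds budget_bps. */
    static uint64_t next_send_time_us(uint64_t last_send_us,
                                      size_t pkt_bytes, double budget_bps)
    {
        double gap_s = (pkt_bytes * 8.0) / budget_bps;
        return last_send_us + (uint64_t)(gap_s * 1e6);
    }
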
>
> BTW, I resisted the temptation to also cite RFC7893, which favourably 
> compares the ITU-recommended termination of voice calls due to high 
> loss with TCP's congestion response, and goes on to calculate how 
> large a pseudowire of multiple voice circuits can be before it is 
> worse than TCP's response.
>
>
>>
>> -- 
>>
>> Page 20:
>>
>>     An operator that excludes traffic carrying the L4S identifier from
>>     L4S treatment MUST NOT treat such traffic as if it carries the 
>> ECT(0)
>>     codepoint, which could confuse the sender.
>>
>> This seems like a disincentive for client applications to use the L4S 
>> identifier, because they might actually get lower latency and lower 
>> loss by simply sticking with ECT(0). (More on this below.)
>
> [BB] I don't know whether you read the same ambiguity in this that 
> Pete Heist pointed out in his review: 'operator' implied all the 
> operator's nodes, not just this one. It now says:
>
> "A network node that supports L4S but excludes certain traffic 
> carrying the L4S identifier from L4S treatment MUST NOT treat ..."
>
> It was a perhaps lame attempt not to add more nodes that treat ECT(1) 
> the same as ECT(0) during deployment of L4S.
> Admittedly, it was written on the assumption that the node had 
> previously not supported any form of ECN at all, in which case this 
> wouldn't be a regression. I guess that assumption isn't 100% likely.
>
> I'll leave it as it is for now, but open to changing it after more 
> discussion on the list.
>
>>
>> I didn’t find any mention of what these documents assume a Classic 
>> (e.g., FQ-CoDel) bottleneck will do with an ECT(1) packet. I think it 
>> is assumed that an FQ-CoDel bottleneck will treat ECT(1) exactly the 
>> same as ECT(0), but the documents don’t say so.
>
> [BB] Since the recent patch to Linux FQ_CoDel, there's now text on an 
> ECT(0/1) distinction in a couple of places in l4s-arch (search for 
> 'FQ-CoDel').
>
> more...
>
>>
>> =====
>>
>> Final Observation:
>>
>> With our work slaying the Bufferbloat beast almost complete now, I 
>> started thinking about what advice we should be giving to developers 
>> at next year’s Apple Worldwide Developer Conference (WWDC). Should 
>> their apps use ECT(0) for lower latency and lower loss? FQ-CoDel with 
>> ECN is deployed in some places, and increasing. With many home 
>> gateways today you just have to turn it on. Or should their apps use 
>> ECT(1) for even lower latency? L4S is not widely deployed, so ECT(1) 
>> gives zero immediate benefit to the developer using it, but possibly 
>> it might be the better long-term direction. As a developer I know 
>> what I’d pick: “I’ll use ECT(0) now for the immediate benefit. Once 
>> L4S overtakes FQ-CoDel let me know and I may switch.” But if all the 
>> app developers are using ECT(0), L4S will never overtake FQ-CoDel, 
>> because there will be no reason for anyone to ever deploy L4S with 
>> ECT(1) if all the apps are using ECT(0). If all the apps are using 
>> ECT(0) to get FQ-CoDel behaviour, then home gateway vendors will 
>> implement FQ-CoDel with ECT(0) because that’s what shows a measurable 
>> customer benefit immediately.
>>
>> So, let’s think about the pros and cons of an app developer choosing 
>> ECT(0) with Classic Congestion Control vs. ECT(1) with Scalable 
>> Congestion Control.
>>
>> Marking packets ECT(0) instead of “Not ECT” will get the benefit of 
>> lower loss and therefore lower retransmission delays, on both Classic 
>> FQ-CoDel bottlenecks and newer L4S bottlenecks. These congestion 
>> signals however are coarse, so the sender will need to make large 
>> sawtooth rate swings in response to a CE mark, so it will never get 
>> truly low delay and high throughput consistently. It will either 
>> over-fill the queue and get higher delay, or under-fill the queue and 
>> get lower throughput.
>>
>> Marking packets ECT(1) will get the same benefit of lower loss and 
>> therefore lower retransmission delays. In addition, if the bottleneck 
>> is actually L4S, the sender also gets the benefit of fine congestion 
>> signals, allowing more moderate rate adjustments, keeping the queue 
>> consistently short while keeping throughput high.
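
For concreteness, a rough sketch of the contrast being described here, 
assuming a DCTCP-style scalable response as in RFC 8257 (this is an 
illustration, not text from the drafts):

    /* Classic response: any CE mark (or loss) triggers a large reduction. */
    double classic_cwnd_after_ce(double cwnd)
    {
        return cwnd * 0.5;                 /* Reno-style halving */
    }

    /* Scalable response: reduce in proportion to the fraction of packets
     * marked over the last RTT, so adjustments are small and frequent. */
    static double alpha;                   /* EWMA of the marked fraction */

    double scalable_cwnd_per_rtt(double cwnd, double marked_fraction)
    {
        const double g = 1.0 / 16.0;       /* gain recommended in RFC 8257 */
        alpha += g * (marked_fraction - alpha);
        return cwnd * (1.0 - alpha / 2.0);
    }
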
>>
>> However, if the bottleneck is actually FQ-CoDel, and a single CE mark 
>> is supposed to produce a large rate reduction, but the sender makes 
>> only a modest rate reduction, then the sender will continue to grow 
>> the bottleneck queue. Over time the FQ-CoDel algorithm will slowly 
>> get more persistent with its CE marks until eventually the sender 
>> makes a sufficient rate correction. In the meantime the bottleneck 
>> queue will have grown much worse than it would have done with a 
>> classic sender that reduced its congestion window more aggressively 
>> at the first CE mark.
>>
>> So, an L4S ECT(1) sender hitting a Classic FQ-CoDel bottleneck 
>> actually gets worse queueing delay than a Classic ECT(0) sender. This 
>> is a serious disincentive to client adoption of ECT(1) with Scalable 
>> Congestion Control.
>>
>> I don’t think heuristics to detect the bottleneck queue type can 
>> react fast enough to help.
>
> [BB] See below....
>
>>
>> The bottleneck queue for a path can change rapidly.
>>
>> Consider a customer with 100Mb/s ISP service and a Wi-Fi Access Point 
>> in their home.
>>
>> When the customer is standing close to their Wi-Fi Access Point their 
>> Wi-Fi rate may be 300Mb/s, so their downstream bottleneck link is the 
>> ISP service coming in to their house. Suppose they have an 
>> enlightened ISP that already implements L4S Dual Queue at the 
>> downstream link. The customer’s L4S-dependent client software works 
>> great.
>>
>> When the customer moves a little further from their Wi-Fi Access 
>> Point their Wi-Fi rate may drop to 50Mb/s, so their downstream 
>> bottleneck link is now their Wi-Fi Access Point. Suppose their Wi-Fi 
>> Access Point implements Classic FQ-CoDel but not L4S. The customer’s 
>> L4S-dependent client software now works even worse than simple 
>> loss-based congestion control. Instead of making a drastic reduction 
>> in response to a CE mark, it makes a small reduction and the queue 
>> (and delay) keep growing.
>>
>> As this user walks around their house (or as other people walk around 
>> them) their downstream bottleneck link is going to flip flop between 
>> their ISP’s L4S downstream queue, and their Wi-Fi Access Point’s 
>> Classic FQ-CoDel queue.
>>
>> This suggests that we need a way for client software to use L4S in a 
>> way that takes advantage of an L4S bottleneck when available, but 
>> does no worse than Classic ECN when the bottleneck happens to be 
>> Classic ECN instead of L4S.
>
> [BB] The Classic ECN AQM detection and fallback algo in TCP Prague 
> (currently default off) responds to a switch of AQM within several 
> tens of round trips. There are too many false positives in the current 
> algo, but I'm just pushing back on the question of speed. If we worked 
> on this more (which I'd like to), the way I'm thinking the false 
> positives could be removed shouldn't slow down the response.
>
> To see how fast it reacts, visit:
>     https://l4s.net/ecn-fbk/results_v2.2/full_heatmap/
> Scroll down to the 'Switch AQMs' heatmaps,
> select any scenario and click on any 'Classic ECN Variable' link, 
> which shows plots of the variable that controls adaptation of the 
> CC between scalable and Classic.
> You can see how fast it switches between its extremes - when it does.
>
> There's an explanation of the experiments on the pages, but I can walk 
> you (or anyone) through it if it's all too demanding.
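
In case it helps to see the shape of such a heuristic, here is a hypothetical 
sketch; it is NOT the actual TCP Prague algorithm, just an illustration of the 
general idea of blending the congestion response according to an estimate of 
whether the bottleneck AQM looks Classic (a deep standing queue accompanying 
CE marks) or L4S (a shallow marking threshold):

    static double classic_est;          /* 0 = looks like L4S, 1 = looks Classic */

    /* Returns the multiplicative-decrease factor to apply on a CE mark. */
    double md_factor_on_ce(double srtt, double min_rtt)
    {
        const double g = 0.05;          /* illustrative EWMA gain */
        double deep_queue = (srtt > 2.0 * min_rtt) ? 1.0 : 0.0;
        classic_est += g * (deep_queue - classic_est);
        /* Blend between a gentle scalable step and the Classic halving. */
        return (1.0 - classic_est) * 0.9 + classic_est * 0.5;
    }
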
>
> I'll leave the rest of your email as part of the ongoing discussion on 
> this. Suffice it to say, your point about not creating a deployment 
> dilemma is important, and I had already taken it on board when you 
> first hinted at this dilemma a couple of IETFs ago. As you know, 
> we're working on making sure this is not a dilemma that app developers 
> will have to face.
>
> Thank you again for picking up some important points in your review.
> I will now post another rev to l4s-arch (-14), and finally post a rev 
> containing all the backed up changes to ecn-l4s-id (-22).
>
>
> Bob
>
>>
>> Conclusion:
>>
>> Because of the reasons explained eloquently in the documents, queues 
>> are going to be shared. If traffic wants to be in an ultra-low 
>> latency shared queue, it needs to signal its agreement to play nicely 
>> with others and not be bursty (with associated penalty if it is 
>> caught misbehaving). This means we need three input code points:
>>
>> (a) I am not ECT (please drop my packets after the queue gets fairly 
>> big)
>> (b) I understand Classic ECN (please mark after the queue gets fairly 
>> big)
>> (c) I understand Low-Latency ECN (please mark my packets when the 
>> queue starts to grow even slightly, and I promise to reduce my rate 
>> promptly)
>>
>> On the output side what signals do we need?
>>
>> Input (a) does not have a congestion signal output (a dropped packet 
>> has no header fields)
>> Input (b) can have output (d) below
>> Input (c) can have output (d) or (e) below
>>
>> (d) Do drastic congestion window reduction
>> (e) Do moderate congestion window reduction
>>
>> This means we need to encode five values. That doesn’t fit in a 
>> two-bit field. One of the values has to go somewhere else. Maybe a 
>> DiffServ codepoint for input code point (c)?
>>
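(For reference, the arithmetic here comes down to the ECN field in the IP 
header having only two bits, i.e. four codepoints [RFC3168]:

    enum ecn_codepoint {
        ECN_NOT_ECT = 0x0,   /* Not ECN-Capable Transport              */
        ECN_ECT_1   = 0x1,   /* ECT(1) - the L4S identifier            */
        ECN_ECT_0   = 0x2,   /* ECT(0) - Classic ECN-capable transport */
        ECN_CE      = 0x3    /* Congestion Experienced                 */
    };

so five distinct meanings cannot all be carried in that field alone.)
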
>> I know this conclusion won’t make people happy. It doesn’t make me 
>> happy, but I don’t see any way around it.
>>
>> We need a way for Scalable Congestion Control senders to tag their 
>> packets in a way that tells the network: “I understand L4S. If my 
>> bottleneck queue at this instant in time happens to be a dumb FIFO 
>> whose queue is full, then this packet will get dropped and I will do 
>> a drastic congestion window reduction in response. If my bottleneck 
>> queue at this instant in time happens to support Classic FQ-CoDel and 
>> it has determined that it has a standing queue then this packet will 
>> be marked (d) and I will do a drastic congestion window reduction in 
>> response. If my bottleneck queue at this instant in time happens to 
>> support L4S and it has a moderate queue then this packet will be 
>> marked (e) and I will do a moderate congestion window reduction in 
>> response.”
>>
>> Stuart Cheshire
>>
>
> [BB] I think this is the critical point. Certainly, our Clss
>
>
> [BB]
>
>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/