Re: [tsvwg] Prague requirements survey

Sebastian Moeller <moeller0@gmx.de> Sun, 18 April 2021 11:20 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <AM8PR07MB7476F513E7A6551F27DC7295B94A9@AM8PR07MB7476.eurprd07.prod.outlook.com>
Date: Sun, 18 Apr 2021 13:19:48 +0200
Cc: Vidhi Goel <vidhi_goel@apple.com>, tsvwg IETF list <tsvwg@ietf.org>
To: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/B9q7s7qqG-wwI8k2KgfxYmPrODM>

Hi Koen,

more below prefixed [SM].

TL;DR: TCP Prague and the dual queue coupled AQM each independently increase the observable RTT-bias compared to both the status quo ("dumb" FIFOs) and the state of the art (mostly FQ or approximate FQ AQMs); neither supports the claim that RTT-bias can and should be addressed in end-points only.


> On Apr 18, 2021, at 12:11, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
> Hi Sebastian, Vidhi,
> 
> Some background:
> 
> RTT (in)dependence is an end-point property, that can best be corrected in the endpoints.

	[SM] History disagrees with you: there exist zero general RTT-fairness solutions that work purely from endpoints, but there are several solutions that work at the AQM level. I have posted references to plots showing that explicitly. So if you maintain this is best fixed from the end-points, please post references demonstrating that fact.
	As I explained before, RTT-unfairness happens as an effect of interactions of the control-loops of different RTT flows INSIDE the bottleneck buffer, so unless you can fix the RTT differences, fixing it in the management of the bottleneck buffer is the second best (but more realistic) approach. If you disagree, again, please cite references of end-point only solutions that are actually palatable for internet scale RTTs (from 1ms (FTTH) to 500ms (geostationary satellites)). And no, TCP Prague's approach of playing dumb seems not acceptable if the corrected RTT is set to a more realistic 100ms or a realistic worst case of 500ms. If you disagree, I am waiting for TCP Prague's default to change to >> 25ms.


> It could be solved in the network if:
> - the network can identify flows and schedule then accordingly: FQ_x

	[SM] I agree, a control system that needs to arbitrate between different entities, needs to identify those entities.

> - the endpoints add more info (RTT) in the packet headers: RCP, XCP, ...

	[SM] Well, even without that additional information, state of the art AQMs already do a much better job at RTT-fairness than L4S's reference AQM; heck, even a dumb FIFO seems superior here. So I agree that having per flow RTT information would theoretically be helpful and good to have, but it is by no means necessary to do better than L4S (given that even a FIFO does better).


> - the network can identify flows and adapts the marking/drop to let flows converge to the same throughput: CSAQM, ...
> - the network can identify flows and "estimate/guess" their RTT and adapt the marking/drop probability: I'm sure you will find research...
> - ...

	[SM] Again, for a perfect solution all of that seems to apply, but as the comparison with a FIFO, which does nothing of that sort, demonstrates, it is well possible to do better than L4S even without all that trickery and sophistication. Let's aim for good enough instead of perfect, and we see that the dual queue coupled AQM is simply defective in the RTT-bias department, and ameliorating that is possible without having to change end-point behavior.


> 
> So unless we further expand headers or make it the responsibility to identify and schedule flows, measure rates or guess RTTs in the network, we better solve the problem in the entity that causes it. Assuming from here we restrict us to the latter:

	[SM] Wrong root-cause and hence wrong solution. This looks also like an attempt to ignore that L4S noticeably increases RTT-bias at the AQM level, and trying to push the fix for that AQM deficiency into the end-points seems unduly onerous.

> 
> Sebastian, as you mentioned, there has been a lot of research in the past, which resulted in a lot of solutions.

	[SM] Except we have no generally applicable solution that actually got deployed over the internet (okay, for some RTT/rate regimes CUBIC does better than, say, Reno, but all of that is rather minor). So I agree there have been attempts at solutions, sure, but that is not the same as actual solutions, sorry.


> The problem is not that these don't work, but that converging at a lower rate for the lower end RTTs requires a clean slate reset of all endpoints, as nobody would start doing that unilaterally.

	[SM] Which means these are not general solutions, sorry. But that is expected for end-point only solutions: these try to paper over the issue from the wrong place in the network, and hence can not solve the issue. And e.g. having all end-points agree to use an identical RTT independent cwnd-growth function is just that, trying to paper over what happens in the bottleneck buffers.

> L4S is a clean slate starting point where we set new rules. During the first Prague meeting there was a lot of enthusiasm and proposals just because of this new opportunity. So if we want to introduce RTT independence, this is the moment, and all this previous research can now be used and be deployed.

	[SM] Again, you are not doing that at all. All you defaulted TCP Prague to do is to paper over the dual queue AQM's sub-optimal sharing behavior at short RTTs. If one repeated Pete's test with RTTs of 26 ms and 176 ms (Pete's 10/160 shifted to just fall outside of TCP Prague's "fixed" RTT range), one would still see considerably increased RTT bias compared to a FIFO "solution". I fail to see how you made any significant dent in RTT-bias at all, sorry.
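
	To make the 10/160 versus 26/176 point concrete, here is a minimal sketch. It is not taken from the TCP Prague code; it only models the "RTT = max(25ms, pathRTT)" behavior as described in this thread, and the helper names are hypothetical. The ratio of effective RTTs is used as a crude proxy for how much of the RTT difference the end-point is still exposed to, not as a throughput prediction.

```python
# Illustrative only: models the "pretend RTT = max(25 ms, path RTT)" behavior
# described in this thread, not the actual TCP Prague implementation.
REF_RTT_MS = 25.0  # default reference RTT as discussed in this thread


def effective_rtt_ms(path_rtt_ms, ref_rtt_ms=REF_RTT_MS):
    """RTT the window growth is assumed to act on (hypothetical helper)."""
    return max(ref_rtt_ms, path_rtt_ms)


def effective_rtt_ratio(short_ms, long_ms, ref_rtt_ms=REF_RTT_MS):
    """Ratio of the two flows' effective RTTs; 1.0 would mean fully equalized."""
    return effective_rtt_ms(long_ms, ref_rtt_ms) / effective_rtt_ms(short_ms, ref_rtt_ms)


if __name__ == "__main__":
    for short_ms, long_ms in [(10, 160), (26, 176)]:
        raw = long_ms / short_ms
        eff = effective_rtt_ratio(short_ms, long_ms)
        print(f"{short_ms}/{long_ms} ms: raw ratio {raw:.2f}, effective ratio {eff:.2f}")
```

	Under this assumption the 10/160 pair is only partly equalized (the short flow is lifted to 25 ms, the long flow is untouched), and the 26/176 pair is not changed at all, because both RTTs already exceed the 25 ms reference.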


> 
> As L4S removed the queue completely when competing with Classic over a DualQ and limits it to 1ms when not, the previous role of the (large) queue to middle out RTT unfairness completely disappears.

	[SM] Yes, you keep repeating that argument, but I keep telling you: since you are the party that inflicts ~20ms queueing delay on the classic queue, you can not blame the resulting unfairness on RTT-bias that needs to be fixed in the end-points. You broke it, you need to fix it.
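
	A small sketch of why the queueing-delay difference matters most at short path RTTs. The ~1 ms and ~20 ms figures are the approximate queue delays quoted in this thread; treating the effective RTT as simply base RTT plus queueing delay is a simplifying assumption, and the function name is made up for illustration.

```python
# Illustrative only: effective RTT is modeled as base path RTT plus the
# approximate queueing delay of the queue the flow sits in.
LL_QUEUE_MS = 1.0        # approximate low-latency queue delay quoted in the thread
CLASSIC_QUEUE_MS = 20.0  # approximate classic queue delay quoted in the thread


def queue_induced_rtt_ratio(base_rtt_ms):
    """Classic-queue effective RTT divided by LL-queue effective RTT (hypothetical)."""
    return (base_rtt_ms + CLASSIC_QUEUE_MS) / (base_rtt_ms + LL_QUEUE_MS)


if __name__ == "__main__":
    for base in (0.0, 5.0, 25.0, 100.0):
        print(f"base RTT {base:5.1f} ms -> effective RTT ratio {queue_induced_rtt_ratio(base):.2f}")
```

	With a base RTT near zero the two queues differ by a factor of 20 in effective RTT, while at a 100 ms base RTT the ratio drops to roughly 1.19; in this simple model the imbalance is introduced by the two queue delay targets, not by the path.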

> So even in an L4S-only world, the unfairness would be unsustainable and needs a solution.

	[SM] Yes, this needs a solution, but I do not think that you actually found one, given that TCP Prague over an L4S AQM shows stronger self RTT bias than CUBIC over an L4S AQM (see pairs 1 and 4, as well as 5 and 8 in https://camo.githubusercontent.com/0ca81a2fabe48e8fce0f98f8b8347c79d27340684fe0791a3ee6685cf4cdb02e/687474703a2f2f7363652e646e736d67722e6e65742f726573756c74732f6c34732d323032302d31312d3131543132303030302d66696e616c2f73312d6368617274732f727474666169725f63635f71646973635f31306d735f3136306d732e737667). This is one more data point for my hypothesis that trying to fix that in the end-points is a fool's errand. You are making things worse, and yet you market that under increasing RTT-independence, oh the irony.


> As this also solves the DualQ created imbalance is part of the total concept (which otherwise could only be solved by setting both queues to the same RTT target, defying the purpose of the 2 queues and DualQ at all).

	[SM] Please show data from a 26 ms CUBIC flow competing with a 176 ms TCP Prague flow in an L4S AQM (as well as the opposite pair), as well as pairs of CUBIC and TCP Prague flows at 26 and 176 ms competing with themselves (all with the default 25ms value for TCP Prague and a path RTT << 1 ms). I predict that L4S' RTT-bias is still going to be noticeably worse than the reference FIFO. If you manage to improve upon the default FIFO we can keep talking.


> 
> Then RTT independence, means that we need to converge to a (more) equal rate when RTTs are different. This means that based on the marking signal we need to agree on a common marking rate (which automatically emerges when a marker marks packets with equal probability and all flows have an equal rate).

	[SM] As said before, that is a possible solution feasible in the end-points, but it is equivalent to trading away temporal fidelity in the response dynamics, which seems opposite to the goal behind high-frequency congestion signaling.


> If such a marking rate in marks per second is defined, it can automatically be translated in a reference RTT when taking the marks per RTT of an existing congestion control into account.

	[SM] Can you share data on how well this would work if the reference RTT is shorter than the true path RTT of a flow? My intuition tells me this is going to fail unless refRTT >= pRTT, but I am happy to be convinced otherwise by data.


> This is not a trick, or hack, it is just a result of the concept (RTT independence). We can discuss a reference in marks per second (marking rate) or in terms of a Reference RTT (DCTCP AIMD converging to a fixed 2 marks per RTT being a good reference base CC for that purpose). So "pretending" to be a 25ms is exactly what we do if we set the Reference RTT to 25ms, or set the marking rate to 80 marks per second.

	[SM] But you are not doing that by default in TCP Prague; you simply set the cwnd growth dynamics to assume RTT = max(25ms, pathRTT), which is not what you claim above. Not a vote of confidence in your own solution, sorry.
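
	For reference, the arithmetic linking a reference RTT to a marking rate can be sketched as follows, assuming the steady state of 2 marks per RTT that the quoted text attributes to DCTCP-style AIMD. The function names are hypothetical, and the 100 ms and 500 ms rows are only there to show how the numbers move for the larger reference RTTs mentioned earlier in this mail.

```python
# Illustrative arithmetic only: assumes a steady state of 2 marks per RTT,
# as stated in the quoted text; not derived from any implementation.
MARKS_PER_RTT = 2


def marks_per_second(ref_rtt_ms, marks_per_rtt=MARKS_PER_RTT):
    """Marking rate implied by a reference RTT given in milliseconds."""
    return marks_per_rtt * 1000 / ref_rtt_ms


def mean_mark_spacing_ms(ref_rtt_ms, marks_per_rtt=MARKS_PER_RTT):
    """Average time between marks if they were evenly distributed."""
    return ref_rtt_ms / marks_per_rtt


if __name__ == "__main__":
    for ref in (25, 100, 500):
        print(f"refRTT {ref} ms: {marks_per_second(ref):.1f} marks/s, "
              f"one mark every {mean_mark_spacing_ms(ref):.1f} ms on average")
```

	A 25 ms reference RTT gives 80 marks per second; raising the reference RTT to 100 ms would cut that to 20 marks per second, which is the loss of signaling frequency referred to above.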


> For evenly distributed marks this would be around 12.5ms per mark, which would be a very frequent signal for converging to a fair rate. 25ms also is a useful number, as it is a practical lower limit for Classic RTTs on congested links, and require no changes for the Classic flows.

	[SM] Except for the fact that the delay target of the classic AQM is a configurable number (e.g. in the dual queue coupled AQM, see https://github.com/L4STeam/iproute2/blob/master/tc/q_dualpi2.c). For this to make sense, the potentially user-configurable number in the end-point protocol needs to be kept in lock step with the operator-configurable number of all L4S AQMs along the path... This does not appear to be a robust and reliable engineering solution to me, sorry, especially since there is no easy way for end-points to measure the value from the network. But I have made this argument before. IFF you believe this to be the way forward, you would at least need to make these numbers configurable only at compile time to make it reasonably likely that they actually match.


> It is of course an additional opportunity for Classic CCs to also increase the rate for higher RTTs (but not necessarily for lower RTTs, although this wouldn't hurt them much, as under congestion they wouldn't see lower RTTs on the Internet anyway).

	[SM] Assuming all bottlenecks are L4S-compliant, which is not going to happen any time soon (if at all). So we are now talking about two separate code paths depending on the estimated behaviour of the bottleneck along the path. Not sure how realistic any such change is going to be for the duration of the L4S experiment.


> CuBic was designed to be a fairness/performance compromise to Reno on the longer RTTs.

	[SM] Yes, and it does so, mildly. Note how CUBIC did not fix RTT-bias for good.

> If it would have been acceptable at that time, they would have set the compromise more towards the performance.
> Today most traffic comes from nearby datacenters,

	[SM] Often, but not always. We need to make sure that longer RTT flows will not be starved by those common short RTT flows, otherwise we will considerably decrease the range of the reasonably reachable internet. I am not convinced that e.g. European gamers playing by choice on an east coast server will be happy if their siblings' video streaming kills their gaming session's fidelity by virtue of coming from close-by CDNs. Short RTT flows will always have an edge by virtue of their tighter control loops; no need to make that worse.


> making most traffic experiencing less than 50ms latency, so I believe setting a reference RTT of 25ms also for higher RTTs would be completely acceptable today.

	[SM] NO, as the consequence of that choice is strangling/starving longer RTT flows. This confirms my hypothesis that L4S really is just an exercise in building a short-RTT priority/fast-lane, but I do not believe that the internet as a whole was waiting for that, sorry.


Best Regards
	Sebastian


> As a final remark, converging to a steady state rate in the past was always seen as a property of a single mechanism (AIMD of +1 and /2 for Reno, and +cubic(t) and *0.7 for Cubic). I believe we are past simple single ACK response mechanisms (see BBR, ...)
> where models based on measurements and different states adapt the response and selects appropriate mechanism. When we detect we are out of steady state (0% or 100% marking for a while), the selected mechanism can be RTT dependent (getting up to speed, avoiding latency, ...), once back in sync with the steady marking rate, the RTT independent response can be selected (whatever the mechanism is).
> 
> Hope this clarifies.
> 
> Koen.
> 
> -----Original Message-----
> From: Vidhi Goel <vidhi_goel@apple.com> 
> Sent: Sunday, April 18, 2021 4:48 AM
> To: Sebastian Moeller <moeller0@gmx.de>
> Cc: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; tsvwg IETF list <tsvwg@ietf.org>
> Subject: Re: [tsvwg] Prague requirements survey
> 
> Hi Sebastian,
> 
>>> I think the simple proposal in Linux (which you already know) is a good starting point. A
>> 
>> 	Are we talking about the same Linux proposal here (https://github.com/L4STeam/linux/commit/a2ef76f8da1c9d1b13fa941f55607f3e60d4112e)? Where TCP Prague is instructed to basically behave like ("pretend") there was a fixed lower bound RTT when growing the congestion window? IMHO that is not a good starting point for fixing RTT-bias, given that over the internet RTTs easily range from low single to low triple digit ms. Unless we are prepared to set, say, 100 ms as our lower bound RTT, this approach will not work for the internet, but once we do that we are giving up one of the advantages that high frequency congestion signaling is promising: faster reactions to congestion signals.
>> 
>> 	I want to note that TCP Prague by default only tries to equalize RTTs up to 25 ms, which indicates that not even its developers consider it a generic solution (or they believe that 25ms is a "magic" RTT on the internet). I also note that the root-cause for adding that feature to TCP Prague was/is the fact that the dual queue coupled AQM failed to properly share capacity between its two queues at low RTTs. This failure is rationalized as being an effect of RTT-bias caused by the difference in queueing delay between the two queues (~1ms for the LL queue, ~20ms for the classic queue) and the proposed solution is to make TCP Prague not grow its congestion window faster than a 25 ms RTT flow. In other words this is not meaningfully addressing RTT-bias, but is fixing a deficiency in L4S's reference AQM*.
> 
> Yes, we are talking about the same proposal.
> At the time I read the Linux Prague proposal, I didn’t realize the rationale behind it and now I understand it better with your reasoning. I agree that we should not fix RTT bias which is purely created by the L4S dual queue.
> 
>>> s a community, we might come up with more heuristics / tunable parameters to handle edge cases.
>> 
>> 	Sorry, for the last decades people have worked on RTT-bias and no generic solution based solely on end-point actions has been found. I am not saying that this is impossible, but it is quite unlikely that this is easy enough for the community to come up with a solution. And TCP Prague is IMHO not a promising contender for a generic solution.
>> 	IMHO, the problem is that the issue is not caused by the endpoints in the first place, but by the interaction of control loops of different "fidelity"/reaction times in bottleneck buffers. This can easily be seen in that a properly configured TCP flow can approach bottleneck capacity when run as the sole flow over a bottleneck, but will be suppressed if competing with TCP flows of shorter RTT in the same bottleneck. It hence seems clear that management of the bottleneck is at least as important to counter RTT-bias as the endpoints' control loops. The L4S approach of relegating the issue solely to the endpoints/protocols to fix, instead of also making the AQM part of the solution, strikes me as short-sighted, especially in the light of deployment of an AQM being one of the core pillars of the L4S design.
> The problem of RTT-unfairness arises from different ACK clocking speeds based on RTT. If the propagation delay is different for two flows, then there is nothing that an AQM can do. OTOH, if the propagation delay is the same for two flows, and it is really the buffering (queuing) delay that is causing RTT unfairness, then I agree with you that we should solve this problem at the bottleneck.
> 
> I believe you are concerned about the latter scenario and yes in this case, we should not try to solve the RTT bias at the endpoint as that could be counter productive to what we are trying to achieve with scalable congestion controllers.
> 
> Thanks,
> Vidhi
> 
>> On Apr 16, 2021, at 3:45 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> 
>> Hi Vidhi,
>> 
>> 
>>> On Apr 16, 2021, at 23:21, Vidhi Goel <vidhi_goel@apple.com> wrote:
>>> 
>>> Hi Sebastian,
>>> 
>>>> If this is easy to implement, could you please propose a description of such a solution to the mailing list? As far as I can tell RTT-bias has been a topic of research for decades and still no general solution has been presented, so I am quite interested to learn more about this comment. Even if the response is something like "for the expected range of RTTs from 1ms to 20 ms, a solution like TCP Prague's: pretend all RTTs are 20ms", I am quite interested in Apple's thoughts.
>>> 
>>> I think the simple proposal in Linux (which you already know) is a good starting point. A
>> 
>> 	Are we talking about the same Linux proposal here (https://github.com/L4STeam/linux/commit/a2ef76f8da1c9d1b13fa941f55607f3e60d4112e)? Where TCP Prague is instructed to basically behave like ("pretend") there was a fixed lower bound RTT when growing the congestion window? IMHO that is not a good starting point for fixing RTT-bias, given that over the internet RTTs easily range from low single to low triple digit ms. Unless we are prepared to set, say, 100 ms as our lower bound RTT, this approach will not work for the internet, but once we do that we are giving up one of the advantages that high frequency congestion signaling is promising: faster reactions to congestion signals.
>> 
>> 	I want to note that TCP Prague by default only tries to equalize RTTs up to 25 ms, which indicates that not even its developers consider it a generic solution (or they believe that 25ms is a "magic" RTT on the internet). I also note that the root-cause for adding that feature to TCP Prague was/is the fact that the dual queue coupled AQM failed to properly share capacity between its two queues at low RTTs. This failure is rationalized as being an effect of RTT-bias caused by the difference in queueing delay between the two queues (~1ms for the LL queue, ~20ms for the classic queue) and the proposed solution is to make TCP Prague not grow its congestion window faster than a 25 ms RTT flow. In other words this is not meaningfully addressing RTT-bias, but is fixing a deficiency in L4S's reference AQM*.
>> 
>>> s a community, we might come up with more heuristics / tunable parameters to handle edge cases.
>> 
>> 	Sorry, for the last decades people have worked on RTT-bias and no generic solution based solely on end-point actions has been found. I am not saying that this is impossible, but it is quite unlikely that this is easy enough for the community to come up with a solution. And TCP Prague is IMHO not a promising contender for a generic solution.
>> 	IMHO, the problem is that the issue is not caused by the endpoints in the first place, but by the interaction of control loops of different "fidelity"/reaction times in bottleneck buffers. This can easily be seen in that a properly configured TCP flow can approach bottleneck capacity when run as the sole flow over a bottleneck, but will be suppressed if competing with TCP flows of shorter RTT in the same bottleneck. It hence seems clear that management of the bottleneck is at least as important to counter RTT-bias as the endpoints' control loops. The L4S approach of relegating the issue solely to the endpoints/protocols to fix, instead of also making the AQM part of the solution, strikes me as short-sighted, especially in the light of deployment of an AQM being one of the core pillars of the L4S design.
>> 
>>> https://l4steam.github.io/PragueReqs/Linux_TCP_Prague_L4S_requirements_Compliance_and_Objections.pdf
>> 
>> Best Regards
>> 	Sebastian
>> 
>> *) And doing so before actual deployment, at a point in time when that AQM could actually still be fixed for good.
>> 
>> 
>>> 
>>> Thanks,
>>> Vidhi
>>> 
>>>> On Apr 16, 2021, at 7:16 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>> 
>>>> Hi Koen,
>>>> 
>>>> Thanks.
>>>> 
>>>> Here is a question for Apple though:
>>>> 
>>>> "5. Reduce RTT dependence (A1.5)
>>>> Section 4.3: A scalable congestion control MUST eliminate RTT bias as much as possible in the range between the minimum likely RTT and typical RTTs expected in the intended deployment scenario.
>>>> Apple's comment:
>>>> Again, agreed with the rationale behind this and the MUST compliance. This might be easy to implement as well based on heuristics but will require thorough testing."
>>>> 
>>>> 
>>>> If this is easy to implement, could you please propose a description of such a solution to the mailing list? As far as I can tell RTT-bias has been a topic of research for decades and still no general solution has been presented, so I am quite interested to learn more about this comment. Even if the response is something like "for the expected range of RTTs from 1ms to 20 ms, a solution like TCP Prague's: pretend all RTTs are 20ms", I am quite interested in Apple's thoughts.
>>>> 
>>>> Best Regards
>>>> 	Sebastian
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On Apr 16, 2021, at 14:52, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> An update on the survey is available. We received an additional input from Apple which we could publicly share (thanks Vidhi for providing this input). I also updated the consolidated view v2 (available on https://github.com/L4STeam/l4steam.github.io#prague-requirements-compliance).
>>>>> 
>>>>> I believe it is strongly in line with the previous survey conclusions as presented in last tsvwg. One main additional feedback was on “7. Measuring Reordering Tolerance in Time Units”. There was disagreement that using time only and not packet count is a foolproof solution. As far as I understand the objection is to the current wording that a time based mechanism is the only/sufficient way to assure this.
>>>>> 
>>>>> The objective of this requirement is to allow a certain level of reordering for L4S traffic (actually avoid delaying packets in the network to guarantee correct order of packet delivery). I personally could support wording that expresses the core of the requirement, and not limit the text to one mechanism, which would allow alternative/more robust implementations. The requirement could be expressed as something like: “a scalable congestion control SHOULD  be resilient to reordering over an (adaptive) (time?) interval, which scales with / adapts to throughput, as opposed to counting only in (fixed) units of packets (as in the 3 DupACK rule of RFC 5681 TCP), which is not scalable”. Let’s further discuss here on the list what could be for all parties an acceptable wording.
>>>>> 
>>>>> Thanks,
>>>>> Koen.
>>>>> 
>>>>> 
>>>>> From: De Schepper, Koen (Nokia - BE/Antwerp) 
>>>>> Sent: Sunday, March 7, 2021 1:57 AM
>>>>> To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; tsvwg IETF list <tsvwg@ietf.org>
>>>>> Cc: Bob Briscoe <ietf@bobbriscoe.net>
>>>>> Subject: RE: Prague requirements survey
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> The details of the consolidated view of all feedback received is available and can be found via following link: https://l4steam.github.io/PragueReqs/Prague_requirements_consolidated.pdf
>>>>> 
>>>>> The only strong objections were against the “MUST document” requirements, which will be removed from the next version of the draft. Some clarifications were asked and (will be) added.
>>>>> For 2 requirements a big consensus was that they should be developed and evolved as needed during the experiment.
>>>>> All other requirements had already implementations and if not, were seen feasible/realizable and were planned to be implemented.
>>>>> 
>>>>> We will present an overview during the meeting.
>>>>> 
>>>>> Regards,
>>>>> Koen.
>>>>> 
>>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>>>> Sent: Wednesday, March 3, 2021 2:20 PM
>>>>> To: tsvwg IETF list <tsvwg@ietf.org>
>>>>> Subject: Re: [tsvwg] Prague requirements survey
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> We have received several surveys privately, for which I tried to get the approval for sharing those on the overview page: l4steam.github.io | L4S-related experiments and companion website
>>>>> 
>>>>> Thanks to NVIDIA for sharing their view and feedback for their GeforceNow congestion control. Their feedback was added to the above overview about a week ago. As we didn’t get the explicit approval for the others, we will share and present a consolidated view of all feedback received later and during the meeting.
>>>>> 
>>>>> Note: pdf versions are now also available on the above page for easier reading.
>>>>> 
>>>>> Koen.
>>>>> 
>>>>> 
>>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>>>> Sent: Monday, February 8, 2021 2:37 PM
>>>>> To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>; tsvwg IETF list <tsvwg@ietf.org>
>>>>> Subject: Re: [tsvwg] Prague requirements survey
>>>>> 
>>>>> Hi Ingemar,
>>>>> 
>>>>> Thanks for your contributions. I linked your doc to the https://l4steam.github.io/#prague-requirements-compliance web page (and will do so for others).
>>>>> 
>>>>> I didn’t see any issues or objections mentioned to the current requirements as specified in the draft. Does this mean you think they are all reasonable, valid and feasible?
>>>>> 
>>>>> Interesting observation (related to the performance optimization topic 1) that for the control packets “RTCP is likely not using ECT(1)”. Why is this not likely? I assume this will impact the performance? Do we need to recommend the use of ECT(1) on RTCP packets in the draft?
>>>>> 
>>>>> Thanks,
>>>>> Koen.
>>>>> 
>>>>> From: Ingemar Johansson S <ingemar.s.johansson@ericsson.com> 
>>>>> Sent: Monday, February 8, 2021 10:59 AM
>>>>> To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; tsvwg IETF list <tsvwg@ietf.org>
>>>>> Cc: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
>>>>> Subject: RE: Prague requirements survey
>>>>> 
>>>>> Hi
>>>>> Please find attached (hopefully) a Prague requirements survey applied to SCReAM (RFC8298 std + running code)
>>>>> 
>>>>> Regards
>>>>> Ingemar
>>>>> 
>>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>>>> Sent: den 6 februari 2021 23:20
>>>>> To: tsvwg IETF list <tsvwg@ietf.org>
>>>>> Subject: [tsvwg] Prague requirements survey
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> To get a better understanding on the level of consensus on the Prague requirements, we prepared an overview document listing the L4S-ID draft requirements specific to the CC (wider Prague requirements), as a questionnaire towards potential CC developers. If you are developing or have developed an L4S congestion control, you can describe the status of your ongoing development in the second last column. If you cannot share status, or plan-to/would implement an L4S CC, you can list what you would want to support (see feasible). In the last column you can put any description/limitations/remarks/explanations related to evaluations, implementations and/or plans (will implement or will not implement). Any expected or experienced issues and any objections/disagreements to the requirement can be explained and colored appropriately.
>>>>> 
>>>>> The document can be found on following link: https://raw.githubusercontent.com/L4STeam/l4steam.github.io/master/PragueReqs/Prague_requirements_Compliance_and_Objections_template.docx
>>>>> 
>>>>> As an example I filled it for the Linux TCP-Prague implementation on following link: https://l4steam.github.io/PragueReqs/Prague_requirements_Compliance_and_Objections_Linux_TCP-Prague.docx
>>>>> 
>>>>> Please send your filled document to the list (Not sure if an attachment will work, so I assume you also need to store it somewhere and send a link to it, or send to me directly).
>>>>> 
>>>>> We hope to collect many answers, understanding the position of the different (potential) implementers and come faster to consensus.
>>>>> 
>>>>> Thanks,
>>>>> Koen.
>>>> 
>>> 
>> 
>