Re: [tsvwg] Prague requirements survey

Vidhi Goel <vidhi_goel@apple.com> Sun, 18 April 2021 02:48 UTC

From: Vidhi Goel <vidhi_goel@apple.com>
In-reply-to: <BB1A6362-FB51-471A-BF50-18C882C303E5@gmx.de>
Date: Sat, 17 Apr 2021 19:48:27 -0700
Cc: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>, tsvwg IETF list <tsvwg@ietf.org>
To: Sebastian Moeller <moeller0@gmx.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/mVKFf3pSLsrzUqK8kdrCx9I8R7U>
Subject: Re: [tsvwg] Prague requirements survey

Hi Sebastian,

>> I think the simple proposal in Linux (which you already know) is a good starting point.
> 
> 	Are we talking about the same Linux proposal here (https://github.com/L4STeam/linux/commit/a2ef76f8da1c9d1b13fa941f55607f3e60d4112e)? Where TCP Prague is instructed to basically behave as if ("pretend") there were a fixed lower-bound RTT when growing the congestion window? IMHO that is not a good starting point for fixing RTT-bias, given that over the internet RTTs easily range from low single-digit to low triple-digit ms. Unless we are prepared to set, say, 100 ms as our lower-bound RTT, this approach will not work for the internet, but once we do that we are giving up one of the advantages that high-frequency congestion signaling promises: faster reactions to congestion signals.
> 
> 	I want to note that TCP Prague by default only tries to equalize RTTs up to 25 ms, which indicates that not even its developers consider it a generic solution (or they believe that 25 ms is a "magic" RTT on the internet). I also note that the root cause for adding that feature to TCP Prague was/is the fact that the dual queue coupled AQM failed to properly share capacity between its two queues at low RTTs. This failure is rationalized as being an effect of RTT-bias caused by the difference in queueing delay between the two queues (~1 ms for the LL queue, ~20 ms for the classic queue), and the proposed solution is to make TCP Prague not grow its congestion window faster than a 25 ms RTT flow. In other words, this is not meaningfully addressing RTT-bias, but is fixing a deficiency in L4S's reference AQM*.

Yes, we are talking about the same proposal.
When I first read the Linux Prague proposal, I didn’t realize the rationale behind it; your explanation makes it clearer. I agree that we should not fix RTT bias that is created purely by the L4S dual queue.
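
To make the mechanism under discussion concrete, here is a minimal sketch of a fixed lower-bound ("virtual") RTT applied to additive increase. It is not the code from the Linux commit: the function names and the Reno-like baseline of one segment per RTT are assumptions for illustration, and only the 25 ms default comes from the discussion above.

```python
RTT_FLOOR_S = 0.025  # 25 ms default mentioned in the thread


def increase_per_rtt(rtt_s, rtt_floor_s=RTT_FLOOR_S):
    """Segments added to cwnd per real RTT (hypothetical helper).

    A flow below the floor adds less than one segment per RTT, so that
    its sending rate ramps as if its RTT were the floor value.
    """
    virtual_rtt_s = max(rtt_s, rtt_floor_s)
    return (rtt_s / virtual_rtt_s) ** 2


def rate_ramp(rtt_s, rtt_floor_s=RTT_FLOOR_S):
    """Growth of the sending rate, in segments/s per second.

    rate = cwnd / rtt, and there are 1 / rtt round trips per second,
    so the rate grows by increase_per_rtt / rtt**2 every second.
    """
    return increase_per_rtt(rtt_s, rtt_floor_s) / rtt_s ** 2


if __name__ == "__main__":
    for rtt_ms in (5, 25, 100):
        ramp = rate_ramp(rtt_ms / 1000.0)
        print(f"RTT {rtt_ms:3d} ms: rate ramps at {ramp:.0f} segments/s per second")
```

Under these assumptions the 5 ms and 25 ms flows ramp at the same speed (about 1600 segments/s per second), while the 100 ms flow ramps at 100, which matches your point that the floor only equalizes flows below it.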

>> As a community, we might come up with more heuristics / tunable parameters to handle edge cases.
> 
> 	Sorry, for the last decades people have worked on RTT-bias and no generic solution based solely on end-point actions has been found. I am not saying that this is impossible, but it is quite unlikely that this is easy enough for the community to come up with a solution. And TCP Prague is IMHO not a promising contender for a generic solution.
> 	IMHO, the problem is that the issue is not caused by the endpoints in the first place, but by the interaction of control loops of different "fidelity"/reaction times in bottleneck buffers. This can easily be seen in that a properly configured TCP flow can approach bottleneck capacity when run as the sole flow over a bottleneck, but will be suppressed if competing with TCP flows of shorter RTT in the same bottleneck. It hence seems clear that management of the bottleneck is at least as important to counter RTT-bias as the endpoints' control loops. The L4S approach of relegating the issue solely to the endpoints/protocols to fix, instead of also making the AQM part of the solution, strikes me as short-sighted, especially in light of deployment of an AQM being one of the core pillars of the L4S design.

The problem of RTT-unfairness arises from different ACK clocking speeds based on RTT. If the propagation delay is different for two flows, then there is nothing that an AQM can do. OTOH, if the propagation delay is the same for two flows, and it is really the buffering (queuing) delay that is causing RTT unfairness, then I agree with you that we should solve this problem at the bottleneck.

I believe you are concerned about the latter scenario and yes, in this case, we should not try to solve the RTT bias at the endpoint, as that could be counterproductive to what we are trying to achieve with scalable congestion controllers.
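
As a rough illustration of the ACK-clocking effect, the sketch below runs Reno-like flows with different fixed propagation RTTs through one bottleneck. The model is a simplification and not taken from any draft: there is no queue, every flow halves its window whenever the offered load exceeds capacity, and the function name and default parameters are illustrative.

```python
def shared_bottleneck_throughput(rtts_s, capacity_pps=10_000.0,
                                 duration_s=60.0, dt=0.001):
    """Average throughput (packets/s) of each flow in a toy AIMD model.

    Each flow adds one packet to cwnd per own RTT (ACK clocking) and all
    flows halve cwnd when the summed sending rate exceeds the capacity.
    """
    cwnd = [1.0] * len(rtts_s)
    delivered = [0.0] * len(rtts_s)
    for _ in range(int(duration_s / dt)):
        rates = [w / rtt for w, rtt in zip(cwnd, rtts_s)]
        for i, rate in enumerate(rates):
            delivered[i] += rate * dt
        if sum(rates) > capacity_pps:
            # Shared congestion signal: multiplicative decrease for all.
            cwnd = [max(1.0, w / 2.0) for w in cwnd]
        else:
            # Additive increase: dt / rtt packets per time step.
            cwnd = [w + dt / rtt for w, rtt in zip(cwnd, rtts_s)]
    return [d / duration_s for d in delivered]


if __name__ == "__main__":
    short, long_ = shared_bottleneck_throughput([0.010, 0.040])
    print(f"10 ms flow: {short:.0f} pkt/s, 40 ms flow: {long_:.0f} pkt/s, "
          f"ratio {short / long_:.1f}")
```

With RTTs of 10 ms and 40 ms, this model gives the short flow roughly 16 times the throughput of the long one (the square of the RTT ratio), even though both flows see exactly the same congestion signals.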

Thanks,
Vidhi

> On Apr 16, 2021, at 3:45 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
> Hi Vidhi,
> 
> 
>> On Apr 16, 2021, at 23:21, Vidhi Goel <vidhi_goel@apple.com> wrote:
>> 
>> Hi Sebastian,
>> 
>>> If this is easy to implement, could you please propose a description of such a solution to the mailing list? As far as I can tell, RTT-bias has been a topic of research for decades and still no general solution has been presented, so I am quite interested to learn more about this comment. Even if the response is something like "for the expected range of RTTs from 1 ms to 20 ms, a solution like TCP Prague's: pretend all RTTs are 20 ms", I am quite interested in Apple's thoughts.
>> 
>> I think the simple proposal in Linux (which you already know) is a good starting point.
> 
> 	Are we talking about the same Linux proposal here (https://github.com/L4STeam/linux/commit/a2ef76f8da1c9d1b13fa941f55607f3e60d4112e)? Where TCP Prague is instructed to basically behave as if ("pretend") there were a fixed lower-bound RTT when growing the congestion window? IMHO that is not a good starting point for fixing RTT-bias, given that over the internet RTTs easily range from low single-digit to low triple-digit ms. Unless we are prepared to set, say, 100 ms as our lower-bound RTT, this approach will not work for the internet, but once we do that we are giving up one of the advantages that high-frequency congestion signaling promises: faster reactions to congestion signals.
> 
> 	I want to note that TCP Prague by default only tries to equalize RTTs up to 25 ms, which indicates that not even its developers consider it a generic solution (or they believe that 25 ms is a "magic" RTT on the internet). I also note that the root cause for adding that feature to TCP Prague was/is the fact that the dual queue coupled AQM failed to properly share capacity between its two queues at low RTTs. This failure is rationalized as being an effect of RTT-bias caused by the difference in queueing delay between the two queues (~1 ms for the LL queue, ~20 ms for the classic queue), and the proposed solution is to make TCP Prague not grow its congestion window faster than a 25 ms RTT flow. In other words, this is not meaningfully addressing RTT-bias, but is fixing a deficiency in L4S's reference AQM*.
> 
>> As a community, we might come up with more heuristics / tunable parameters to handle edge cases.
> 
> 	Sorry, for the last decades people have worked on RTT-bias and no generic solution based solely on end-point actions has been found. I am not saying that this is impossible, but it is quite unlikely that this is easy enough for the community to come up with a solution. And TCP Prague is IMHO not a promising contender for a generic solution.
> 	IMHO, the problem is that the issue is not caused by the endpoints in the first place, but by the interaction of control loops of different "fidelity"/reaction times in bottleneck buffers. This can easily be seen in that a properly configured TCP flow can approach bottleneck capacity when run as the sole flow over a bottleneck, but will be suppressed if competing with TCP flows of shorter RTT in the same bottleneck. It hence seems clear that management of the bottleneck is at least as important to counter RTT-bias as the endpoints' control loops. The L4S approach of relegating the issue solely to the endpoints/protocols to fix, instead of also making the AQM part of the solution, strikes me as short-sighted, especially in light of deployment of an AQM being one of the core pillars of the L4S design.
> 
>> https://l4steam.github.io/PragueReqs/Linux_TCP_Prague_L4S_requirements_Compliance_and_Objections.pdf
> 
> Best Regards
> 	Sebastian
> 
> *) And doing so before actual deployment, at a point in time when that AQM could actually still be fixed for good.
> 
> 
>> 
>> Thanks,
>> Vidhi
>> 
>>> On Apr 16, 2021, at 7:16 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> 
>>> Hi Koen,
>>> 
>>> Thanks.
>>> 
>>> Here is a question for Apple though:
>>> 
>>> "5. Reduce RTT dependence (A1.5)
>>> Section 4.3: A scalable congestion control MUST eliminate RTT bias as much as possible in the range between the minimum likely RTT and typical RTTs expected in the intended deployment scenario.
>>> Apple's comment:
>>> Again, agreed with the rationale behind this and the MUST compliance. This might be easy to implement as well based on heuristics but will require thorough testing."
>>> 
>>> 
>>> If this is easy to implement, could you please propose a description of such a solution to the mailing list? As far as I can tell, RTT-bias has been a topic of research for decades and still no general solution has been presented, so I am quite interested to learn more about this comment. Even if the response is something like "for the expected range of RTTs from 1 ms to 20 ms, a solution like TCP Prague's: pretend all RTTs are 20 ms", I am quite interested in Apple's thoughts.
>>> 
>>> Best Regards
>>> 	Sebastian
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On Apr 16, 2021, at 14:52, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> An update on the survey is available. We received additional input from Apple which we could publicly share (thanks Vidhi for providing this input). I also updated the consolidated view v2 (available on https://github.com/L4STeam/l4steam.github.io#prague-requirements-compliance).
>>>> 
>>>> I believe it is strongly in line with the previous survey conclusions as presented in last tsvwg. One main additional feedback was on “7. Measuring Reordering Tolerance in Time Units”. There was disagreement that using time only and not packet count is a foolproof solution. As far as I understand the objection is to the current wording that a time based mechanism is the only/sufficient way to assure this.
>>>> 
>>>> The objective of this requirement is to allow a certain level of reordering for L4S traffic (actually, to avoid delaying packets in the network to guarantee correct order of packet delivery). I personally could support wording that expresses the core of the requirement and does not limit the text to one mechanism, which would allow alternative/more robust implementations. The requirement could be expressed as something like: “a scalable congestion control SHOULD be resilient to reordering over an (adaptive) (time?) interval, which scales with / adapts to throughput, as opposed to counting only in (fixed) units of packets (as in the 3 DupACK rule of RFC 5681 TCP), which is not scalable”. Let’s discuss further here on the list what wording would be acceptable to all parties.
>>>> 
>>>> Thanks,
>>>> Koen.
>>>> 
>>>> 
>>>> From: De Schepper, Koen (Nokia - BE/Antwerp) 
>>>> Sent: Sunday, March 7, 2021 1:57 AM
>>>> To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; tsvwg IETF list <tsvwg@ietf.org>
>>>> Cc: Bob Briscoe <ietf@bobbriscoe.net>
>>>> Subject: RE: Prague requirements survey
>>>> 
>>>> Hi all,
>>>> 
>>>> The details of the consolidated view of all feedback received are available and can be found via the following link: https://l4steam.github.io/PragueReqs/Prague_requirements_consolidated.pdf
>>>> 
>>>> The only strong objections were against the “MUST document” requirements, which will be removed from the next version of the draft. Some clarifications were asked for and (will be) added.
>>>> For two requirements, there was broad consensus that they should be developed and evolved as needed during the experiment.
>>>> All other requirements already had implementations or, if not, were seen as feasible/realizable and were planned to be implemented.
>>>> 
>>>> We will present an overview during the meeting.
>>>> 
>>>> Regards,
>>>> Koen.
>>>> 
>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>>> Sent: Wednesday, March 3, 2021 2:20 PM
>>>> To: tsvwg IETF list <tsvwg@ietf.org>
>>>> Subject: Re: [tsvwg] Prague requirements survey
>>>> 
>>>> Hi all,
>>>> 
>>>> We have received several surveys privately, for which I tried to get approval for sharing them on the overview page: l4steam.github.io (L4S-related experiments and companion website)
>>>> 
>>>> Thanks to NVIDIA for sharing their view and feedback for their GeforceNow congestion control. Their feedback was added to the above overview about a week ago. As we didn’t get the explicit approval for the others, we will share and present a consolidated view of all feedback received later and during the meeting.
>>>> 
>>>> Note: pdf versions are now also available on the above page for easier reading.
>>>> 
>>>> Koen.
>>>> 
>>>> 
>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>>> Sent: Monday, February 8, 2021 2:37 PM
>>>> To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>; tsvwg IETF list <tsvwg@ietf.org>
>>>> Subject: Re: [tsvwg] Prague requirements survey
>>>> 
>>>> Hi Ingemar,
>>>> 
>>>> Thanks for your contributions. I linked your doc to the https://l4steam.github.io/#prague-requirements-compliance web page (and will do so for others).
>>>> 
>>>> I didn’t see any issues or objections mentioned to the current requirements as specified in the draft. Does this mean you think they are all reasonable, valid and feasible?
>>>> 
>>>> Interesting observation (related to the performance optimization topic 1) that for the control packets “RTCP is likely not using ECT(1)”. Why is this not likely? I assume this will impact the performance? Do we need to recommend the use of ECT(1) on RTCP packets in the draft?
>>>> 
>>>> Thanks,
>>>> Koen.
>>>> 
>>>> From: Ingemar Johansson S <ingemar.s.johansson@ericsson.com> 
>>>> Sent: Monday, February 8, 2021 10:59 AM
>>>> To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; tsvwg IETF list <tsvwg@ietf.org>
>>>> Cc: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
>>>> Subject: RE: Prague requirements survey
>>>> 
>>>> Hi
>>>> Please find attached (hopefully) a Prague requirements survey applied to SCReAM (RFC8298 std + running code)
>>>> 
>>>> Regards
>>>> Ingemar
>>>> 
>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>>> Sent: den 6 februari 2021 23:20
>>>> To: tsvwg IETF list <tsvwg@ietf.org>
>>>> Subject: [tsvwg] Prague requirements survey
>>>> 
>>>> Hi all,
>>>> 
>>>> To get a better understanding of the level of consensus on the Prague requirements, we prepared an overview document listing the L4S-ID draft requirements specific to the CC (wider Prague requirements), as a questionnaire for potential CC developers. If you are developing or have developed an L4S congestion control, you can describe the status of your ongoing development in the second-to-last column. If you cannot share status, or plan to/would implement an L4S CC, you can list what you would want to support (see feasible). In the last column you can put any description/limitations/remarks/explanations related to evaluations, implementations and/or plans (will implement or will not implement). Any expected or experienced issues and any objections/disagreements to the requirement can be explained and colored appropriately.
>>>> 
>>>> The document can be found at the following link: https://raw.githubusercontent.com/L4STeam/l4steam.github.io/master/PragueReqs/Prague_requirements_Compliance_and_Objections_template.docx
>>>> 
>>>> As an example, I filled it in for the Linux TCP-Prague implementation at the following link: https://l4steam.github.io/PragueReqs/Prague_requirements_Compliance_and_Objections_Linux_TCP-Prague.docx
>>>> 
>>>> Please send your filled document to the list (Not sure if an attachment will work, so I assume you also need to store it somewhere and send a link to it, or send to me directly).
>>>> 
>>>> We hope to collect many answers, understand the positions of the different (potential) implementers, and come to consensus faster.
>>>> 
>>>> Thanks,
>>>> Koen.
>>> 
>> 
>