Re: [tsvwg] new tests of L4S RTT fairness and intra-flow latency: defaults ready for testing

Sebastian Moeller <moeller0@gmx.de> Wed, 18 November 2020 13:57 UTC

From: Sebastian Moeller <moeller0@gmx.de>
Message-Id: <5CEA0C9D-6ED7-495D-8697-0D761EA33F55@gmx.de>
Date: Wed, 18 Nov 2020 14:57:19 +0100
In-Reply-To: <AM8PR07MB74761B7A75E727A7830E72A6B9E10@AM8PR07MB7476.eurprd07.prod.outlook.com>
Cc: Jonathan Morton <chromatix99@gmail.com>, tsvwg IETF list <tsvwg@ietf.org>
To: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>
References: <AM8PR07MB7476081896E0A1C4897FFBA3B9E20@AM8PR07MB7476.eurprd07.prod.outlook.com> <811A76DD-3D48-43D3-A962-3F15AE9E858B@gmail.com> <AM8PR07MB74761B7A75E727A7830E72A6B9E10@AM8PR07MB7476.eurprd07.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/TptTbpMMoH7CrGnPX-4toyR73zM>
Subject: Re: [tsvwg] new tests of L4S RTT fairness and intra-flow latency: defaults ready for testing
Precedence: list

Hi Koen,



> On Nov 18, 2020, at 12:56, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
> Hi Jonathan, Pete, Sebastian,
>  
> Thanks for showing this case clearly again. The reason Prague and DCTCP behave worse at longer RTTs is that the smallest-RTT flow induces a 1-RTT-on/X-RTT-off burst pattern on the step threshold of the L4S queue, which is known to trigger the r=2/(p^2*RTT) DCTCP response (as defined in the original DCTCP paper). L4S started when we found that DCTCP behaves as r=2/(p*RTT), so the square on p disappears, when it gets a continuous, per-RTT-stable marking rate from a Classic AQM (when it is coupled). Any burstiness in between will result in a value somewhere between those two equations.
>  
> So when the 1-RTT-on/X-RTT-off bursts of the smallest-RTT flow become a steady per-RTT marking rate for the biggest-RTT flow, the latter will have, in the worst case, an extra 1/p disadvantage. The second paper described this behavior as a 1/RTT^2 dependency, but I think it is more related to this effect. Neal Cardwell recently suggested looking into the more fluid per-packet implementation to eliminate the RTT-dependent behavior, which I thought was not relevant as it “only” gets the 1/RTT^2 down to 1/RTT, and so is not a solution for the 1/RTT^0 objective of RTT independence.

	[SM] I do not buy this rationale at all. Going from 1/RTT^2 to 1/RTT would be a tremendous reduction in RTT dependence, too big to argue that you left it on the floor because you aimed for 1/RTT^0 (and failed). On the other hand, that approach would explain a lot of what I object to in L4S.
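
To put numbers on the two response functions quoted above, here is a minimal sketch. It takes r=2/(p^2*RTT) and r=2/(p*RTT) exactly as written in Koen's message (p as marking probability, RTT in seconds, r in packets per second). The function names and the sample values are made up for illustration, and it assumes both flows see the same p, which a real bottleneck need not guarantee.

```python
def rate_bursty_marking(p: float, rtt_s: float) -> float:
    """r = 2/(p^2 * RTT): response quoted for on/off (step-threshold) marking."""
    return 2.0 / (p * p * rtt_s)


def rate_steady_marking(p: float, rtt_s: float) -> float:
    """r = 2/(p * RTT): response quoted for a steady per-RTT marking rate."""
    return 2.0 / (p * rtt_s)


def worst_case_ratio(p: float, rtt_short_s: float, rtt_long_s: float) -> float:
    """Rate of the short-RTT flow (bursty marking) divided by the rate of the
    long-RTT flow (steady marking), assuming both see the same p.

    Algebraically this is (rtt_long / rtt_short) * (1 / p): the plain RTT
    ratio times the extra 1/p factor described in the quoted text.
    """
    return rate_bursty_marking(p, rtt_short_s) / rate_steady_marking(p, rtt_long_s)


if __name__ == "__main__":
    # Illustrative values only: p = 10 %, RTTs of 10 ms and 160 ms.
    p, short, long_ = 0.10, 0.010, 0.160
    print("RTT ratio alone :", long_ / short)
    print("with 1/p factor :", worst_case_ratio(p, short, long_))
```

The smaller p is, the larger the extra 1/p factor becomes, so in this model the penalty for the long-RTT flow grows as marking gets lighter.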


> But it would actually solve this “Too big rate disadvantage for long >80ms RTTs” problem. So it is currently to be tried and investigated further. Further contributions are appreciated there as well (from anyone).
>  
> As I understood, there is an expectation to have a CC that can simply be enabled and autotunes to all conditions. As a quick workaround to make Prague a good choice across the full range, we are considering switching to Not-ECT or ECT(0) and Reno when the RTT is >80ms.

	[SM] That is an open admission that L4S really is just a fast-priority-lane design. Thanks for confirming that though. It does severely reduce the "low latency for all" claim on which L4S is still marketed.

> Even better, but with a bigger code impact, would be to switch to Cubic in that case. Olivier can push the fallback to Reno if RTT>80ms, which for now would solve this issue until any of the better solutions is available.

	[SM] It is exactly this piling up of hacks to solve individual undesirable outcomes one by one that leaves me unconvinced that the core "coupled AQMs" idea is actually suitable for the real internet.
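
For concreteness, the proposed workaround (fall back to Reno when RTT>80ms) could look roughly like the sketch below. This is not the TCP Prague code. `CcMode`, `select_mode`, and the constant name are hypothetical, and the choice of ECT(0) rather than Not-ECT for the fallback is arbitrary, since the quoted message leaves that open. The only facts taken from the thread are the 80 ms threshold and the Reno fallback; that L4S traffic is marked ECT(1) is standard L4S behaviour.

```python
from dataclasses import dataclass

# 80 ms threshold taken from the quoted message; the name is hypothetical.
RTT_FALLBACK_THRESHOLD_S = 0.080


@dataclass(frozen=True)
class CcMode:
    """Hypothetical pairing of congestion-control algorithm and ECN codepoint."""
    algorithm: str
    ecn_codepoint: str


def select_mode(srtt_s: float) -> CcMode:
    """Pick the sender behaviour from the smoothed RTT (seconds).

    Above the threshold the sender gives up L4S treatment and behaves as a
    Classic flow; otherwise it stays in the L4S (ECT(1)) class.
    """
    if srtt_s > RTT_FALLBACK_THRESHOLD_S:
        # Assumption: ECT(0) chosen here; the message says "Not-ECT or ECT(0)".
        return CcMode(algorithm="reno", ecn_codepoint="ECT(0)")
    return CcMode(algorithm="prague", ecn_codepoint="ECT(1)")


if __name__ == "__main__":
    for rtt_ms in (10, 80, 81, 160):
        print(rtt_ms, "ms ->", select_mode(rtt_ms / 1000.0))
```

The sketch makes the consequence visible: any flow whose RTT exceeds the threshold no longer carries ECT(1), so it lands in the Classic queue of a DualPI2 bottleneck. A real implementation would also need some hysteresis around the threshold, which is not modelled here.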

Regards
	Sebastian

>  
> Regards,
> Koen.
>  
>  
> From: Jonathan Morton <chromatix99@gmail.com> 
> Sent: Wednesday, November 18, 2020 1:28 AM
> To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>
> Cc: Lars Eggert <lars@eggert.org>; Ingemar Johansson S <ingemar.s.johansson@ericsson.com>; tsvwg IETF list <tsvwg@ietf.org>
> Subject: Re: [tsvwg] new tests of L4S RTT fairness and intra-flow latency: defaults ready for testing
>  
> On 17 Nov, 2020, at 3:32 pm, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
> The RTT-independence was implemented, available and demonstrated several meetings ago already and, as presented, works very well according to our tests. The following parameters are now set as default, so can be tested out of the box:
> 
> All Prague flows with an RTT below 25ms will now converge to the same rate, independent of their real base RTT. This means that flows with a bigger RTT than 25 ms will never have to compete against smaller than 25ms RTT flows. 
> 
> Now the defaults are set, I'm looking forward to independent evaluations.
>  
> Since our tests are quite well automated, we were able to run a subset of them (all at 50Mbps) against the new defaults this evening.
>  
> I'll give you credit: there is some improvement in some of the tests.  However, we could still draw most of the same conclusions from the new data as we did from last week's data; the big-picture problems are still present and in some cases have actually deteriorated.
>  
> I'll focus on two major concerns in particular:
>  
> 1: Prague outcompetes CUBIC in DualPI2, at a common baseline RTT.  This only stops being true when the BDP is large enough for Prague to have difficulty growing to steady state in a reasonable amount of time.
>  
> With the new code, the Jain's index improves from .823 to .987 at 10ms (the advantage in both cases being to Prague), but actually worsens from .880 to .838 at 20ms, and from .936 to .890 at 80ms.  All of these are sampled after allowing two minutes for the flows to converge to steady-state.
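
For readers who want to translate the Jain's index values above into throughput ratios, here is a minimal sketch. `jain_index` is the standard definition, (sum x)^2 / (n * sum x^2). `two_flow_ratio` inverts it for exactly two flows, which is an assumption about the test setup (one Prague flow against one CUBIC flow); both function names are made up here.

```python
import math


def jain_index(rates):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), in (0, 1]."""
    n = len(rates)
    total = sum(rates)
    squares = sum(x * x for x in rates)
    return (total * total) / (n * squares)


def two_flow_ratio(j: float) -> float:
    """Throughput ratio (>= 1) of two flows whose Jain's index is j.

    With r = x1/x2, J = (1 + r)^2 / (2 * (1 + r^2)). Solving the quadratic
    (2J - 1) r^2 - 2 r + (2J - 1) = 0 for the root r >= 1 gives the result.
    """
    if not 0.5 < j <= 1.0:
        raise ValueError("for two flows, Jain's index must be in (0.5, 1]")
    a = 2.0 * j - 1.0
    return (1.0 + math.sqrt(1.0 - a * a)) / a


if __name__ == "__main__":
    # Index values reported in the message above.
    for j in (0.823, 0.987, 0.880, 0.838, 0.936, 0.890):
        print(f"J = {j:.3f} -> throughput ratio {two_flow_ratio(j):.2f}:1")
```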
>  
> 2: Prague vs Prague competition on differing RTTs.
>  
> Here is Figure 3 from the test report we recently posted, followed by an equivalent chart generated from the new data this evening.  Let's play spot the difference:
>  
> [inline images not preserved; see attachments image001.png and image002.png]
>  
> I can say that the throughput ratio for Prague vs Prague via DualPI2 is, in fact, slightly improved in the new data, but it is still significantly worse even than the 16:1 ratio expected from the baseline RTTs at identical average cwnd.  In a similar test with 80ms versus 20ms RTTs, the two Prague flows also have more than the expected 4:1 throughput ratio.  I don't have an immediate explanation for that.
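
The "expected" ratios mentioned above follow from rate = cwnd/RTT: at identical average cwnd, throughput is inversely proportional to RTT. A minimal sketch of that arithmetic follows. The `exponent` parameter is added only to contrast the 1/RTT^2, 1/RTT and 1/RTT^0 dependencies discussed earlier in the thread. The function name is hypothetical, and the 10 ms / 160 ms pair is an illustrative assumption for a 16x RTT spread, not a value restated in this message.

```python
def expected_throughput_ratio(rtt_short_ms: float, rtt_long_ms: float,
                              exponent: float = 1.0) -> float:
    """Throughput of the short-RTT flow divided by that of the long-RTT flow,
    assuming rate is proportional to 1 / RTT**exponent.

    exponent=1 is the "identical average cwnd" baseline (rate = cwnd / RTT),
    exponent=2 models a 1/RTT^2 dependency, exponent=0 is RTT independence.
    """
    return (rtt_long_ms / rtt_short_ms) ** exponent


if __name__ == "__main__":
    pairs = [(20, 80), (10, 160)]  # second pair is an assumed 16x example
    for short, long_ in pairs:
        for k in (0, 1, 2):
            ratio = expected_throughput_ratio(short, long_, exponent=k)
            print(f"{short} ms vs {long_} ms, 1/RTT^{k}: {ratio:g}:1")
```

A measured ratio above the exponent-1 value, as reported for both RTT pairs, means the flows are not holding equal average cwnd: the long-RTT flow ends up with the smaller window as well as the longer RTT.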
>  
> Notice that with both the old and new code, CodelAF gets very close to parity in throughput with the same traffic load, and that even through DualPI2, a pair of CUBIC flows is closer to parity than a pair of Prague flows.  That is not, overall, an improvement in RTT independence from switching to TCP Prague and/or DualPI2.
>  
> However, we did find an improvement in fairness, compared to the older code, when comparing 20ms vs 10ms Prague flows.  That's what you were going for, wasn't it?  A shame that, in achieving that singular success, so many other things are left unresolved.
>  
> I'm sure we will have the opportunity to run more tests on your future efforts.  For the moment, with limited time on our hands, this will have to do.
>  
>  - Jonathan Morton

Attachment: image001.png
Attachment: image002.png
