Re: [tsvwg] These L4S issues reported are not show stoppers

Sebastian Moeller <moeller0@gmx.de> Mon, 18 November 2019 14:13 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\))
From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <AM4PR07MB3459A1508B3D289BC8AE345DB94D0@AM4PR07MB3459.eurprd07.prod.outlook.com>
Date: Mon, 18 Nov 2019 15:13:09 +0100
Cc: "Holland, Jake" <jholland@akamai.com>, Ingemar Johansson S <ingemar.s.johansson=40ericsson.com@dmarc.ietf.org>, "tsvwg@ietf.org" <tsvwg@ietf.org>, "gorry@erg.abdn.ac.uk" <gorry@erg.abdn.ac.uk>, Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <1FD4EBCE-C566-49E3-97CC-925B6F2C5F36@gmx.de>
References: <AM4PR07MB3459A1508B3D289BC8AE345DB94D0@AM4PR07MB3459.eurprd07.prod.outlook.com>
To: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/wr3Mo9BUKa-bpbUAIkMGy25yFpA>
Subject: Re: [tsvwg] These L4S issues reported are not show stoppers
Precedence: list

Dear Koen,

> On Nov 18, 2019, at 12:29, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> 
> Hi SCE'ers,
> 
>>> unproven L4S technology
> The Network part of the L4S technology didn't change since the L4S BoF. It is based on the theoretical interactions between Scalable CCs (say DCTCP) and Classic CCs (say Reno), and was already extensively (not claiming exhaustive as everybody makes mistakes) verified with experiments to both detect under which conditions it works and which conditions it doesn't or could show unwanted issues. Following the good design rule of keeping the network implementation as simple as possible and higher protocol layer header agnostic, these issues were decided to be solved by the endpoints, and this is why both safety and performance improvement requirements were defined in the drafts from the beginning (BoF), also known as the TCP-Prague requirements. 

	[SM] Requirements that TCP Prague currently does not seem to meet, no?


> 
>>> I think we've seen strong evidence that L4S may still contain show-stopping problems.
> Correct me if I'm wrong, but most of the problems that were recently labeled as "show-stoppers" are not new, and only motivating the existence of the related TCP-Prague requirement or known limitations valid for any low latency architecture. I think it is good that L4S gets evaluated and challenged, but I think it is incorrect to label issues immediately as "show stoppers".
> 
> A summary of the so called "show-stopping problems" I picked up:
> 
> - a burst lasting 4 seconds in a cascade of a bufferbloated FIFO and a slightly lower FQ-CoDel bottleneck: originated in a bug in an alpha version of our TCP-Prague implementation,

	[SM] Which is a strong indicator that TCP Prague has not seen sufficient testing to merit wider roll-out into the internet, no? So why the rush to get L4S into experimental RFC status, since it obviously is not fully baked yet?


> and that was amplified by a wrong assumption of an FQ_CoDel implementation, that overload protection (reverting to drop) is not needed,

	[SM] ??? As far as I can tell, fq_codel does resort to drop on overload. Could you please specify exactly what you are referring to?

> as in an FQ there is isolation between flows and a non-responsive flow will only hurt itself. Usually it does, but in this particular setup where the FQ_CoDel implementation tried to protect other flows from the missing AQM in the preceding FIFO, it was clearly a missing FQ_CoDel feature.

	[SM] That is quite an extreme interpretation of TCP Prague's failure to meet the Prague requirements to properly respect RFC 3168 AQMs. My subjective take on this is that the L4S components need to coexist with the existing internet, unless fixing something is realistically possible.

> Dropping packets would immediately trigger the Classical congestion response and avoid the reported "show-stopping" effect.

	[SM] As would disallowing ECN for ECT(1) flows.
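
The two remedies named in this exchange (reverting to drop on overload, and not honoring ECN for ECT(1) packets) can be contrasted in a small sketch. This is a deliberately simplified decision function, not the fq_codel or CoDel control law. The thresholds `target_ms` and `overload_ms` and the flag `treat_ect1_as_not_ect` are hypothetical names chosen for illustration only.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    MARK_CE = "mark_ce"
    DROP = "drop"


# ECN field codepoints as defined in RFC 3168.
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def congestion_signal(ecn_field, sojourn_ms, target_ms=5.0,
                      overload_ms=100.0, treat_ect1_as_not_ect=False):
    """Pick a congestion signal for one packet (simplified, hypothetical AQM).

    sojourn_ms is the time the packet spent in the queue. Both thresholds
    are illustrative assumptions, not values from any real implementation.
    """
    if sojourn_ms <= target_ms:
        return Action.FORWARD  # no congestion to signal

    ecn_capable = ecn_field in (ECT0, ECT1, CE)
    if treat_ect1_as_not_ect and ecn_field == ECT1:
        ecn_capable = False  # option: do not honor ECN for ECT(1) flows

    if not ecn_capable or sojourn_ms > overload_ms:
        return Action.DROP  # drop forces the classic congestion response
    return Action.MARK_CE


if __name__ == "__main__":
    print(congestion_signal(ECT1, 50.0))
    print(congestion_signal(ECT1, 500.0))
    print(congestion_signal(ECT1, 50.0, treat_ect1_as_not_ect=True))
```

In this sketch either remedy turns the signal for an ECT(1) packet into a drop; the difference is whether that happens only beyond the overload threshold or for every congestion signal.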

> 
> - high unfairness between flows when the base RTT is 0ms: Due to the large difference between the experienced RTTs of both flows, the RTT dependence gives the L4S flow a 10 times higher throughput. This RTT dependence, which we love and hate, has been the normal mode of operation on the Internet for the last 40 years. I am personally a big promoter of the "Less RTT dependent" TCP-Prague requirement, while others argue it is not even necessary, as it is "Normal" and accepted Internet behavior and part of the advantages of using L4S. I think extreme cases such as the 0ms one show the importance of this requirement to cover at least these extreme cases.

	[SM] You seem to misunderstand the issue I raised, so let me try again: L4S introduces a new, supposedly equitable sharing system between L4S and "normal TCP" flows that fails to do exactly that: share fairly between the two categories it sorts all packets into (even on one and the same path). IMHO this is a failure independent of the root cause of the behavior.
	In addition, the AQM L4S selected as reference artificially increases the RTT of the normal queue flows (by selecting a high RTT target of 15* ms without properly considering the consequences that choice has on queue sharing behavior in your coupled design), and then it is argued that, because of the inflated RTT, unfair bandwidth distribution is acceptable given TCP's known RTT dependence. I would cautiously argue that it might be better to employ an AQM that comes with less obvious failure modes... instead of employing such forced logic.


*) On the tested path that demonstrated this dualq shortcoming, with an RTT < 1ms, PIE actually would only need a sub-millisecond target, so no matter how you slice and dice it, it seems unconvincing to first burden the normal queue with a massively oversized latency target and then take this as an allowance for giving the L4S queue an unfair bandwidth advantage. Now, I am not an engineer, so this behavior might be acceptable here, but that should be made explicit in both the arch and the dualq drafts... and I would like to see people here actually ACKing or NACKing that behavior as acceptable for the wider internet.
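
To make the RTT argument concrete, here is a rough sketch that assumes throughput scales purely as 1/RTT and that each queue adds a fixed delay on top of the base RTT. The 15 ms value is the classic-queue target discussed above; the 1 ms value for the L4S queue is an assumption taken from the marking threshold mentioned elsewhere in this thread. The model is only meant to show how quickly the imbalance shrinks as the base RTT grows, not to reproduce any measured result.

```python
def throughput_ratio(base_rtt_ms, classic_queue_ms=15.0, l4s_queue_ms=1.0):
    """L4S-to-classic throughput ratio under a pure 1/RTT rate model.

    Both queue delays are illustrative assumptions; real flows also
    depend on the congestion controller and the coupling in the AQM.
    """
    classic_rtt = base_rtt_ms + classic_queue_ms
    l4s_rtt = base_rtt_ms + l4s_queue_ms
    return classic_rtt / l4s_rtt


if __name__ == "__main__":
    for base in (0.0, 5.0, 20.0, 100.0):
        ratio = throughput_ratio(base)
        print(f"base RTT {base:5.1f} ms -> L4S/classic ratio {ratio:.2f}")
```

With a sub-millisecond target on the classic queue (for example `classic_queue_ms=1.0`), the same model gives a ratio of 1 at every base RTT, which is the point of the footnote.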


Best Regards
	Sebastian


> 
> - lower throughput when traffic is passing bursty links: From a congestion control point of view, it is possible to lower the queuing latency below 1ms. Maybe we did not state clearly enough that this is not the "real world" end-to-end latency that can always be achieved. There are many other sources of latency (serialization time, speed of light...) that add to the end-to-end latency, but which don't prevent achieving the additional 1ms "queuing" delay. But there are many real world sources that do limit what can be achieved. If the serialization time of a packet is longer than 1ms (when the rate drops below 12Mbps), it is a mistake to mark packets at 1ms delay (the Linux DualPI2 does not mark below 2 packets in queue). If packets are waiting to be aggregated and sent in a burst, it is a mistake to consider this waiting time as "queuing" delay and mark them based on this time. If network technology on your path or at the sender aggregates packets over longer times than 1ms and bursts them out at a larger rate that creates a larger queuing delay than 1ms in the smallest path throughput bottleneck, it is a mistake to put the marking threshold in those low throughput paths below the expected burst size. In any of these mistake cases, low latency CC traffic will lose throughput. If you see lower throughput for L4S, it is due to additional L4S marking on top of the coupled marking, typically caused by a bursty source. By the way, these are not L4S or L4S codepoint specific; they are also valid for SCE and even DCTCP in datacenters and for delay based congestion controls that would want to avoid 1ms extra delay. Another solution for this is to improve the aggregation and MAC mechanisms in the related link technologies or pace packets out at a lower burst rate (e.g., in your WiFi access point). Low latency CC is raising the bar for link layer technologies. It will take time for the lower layers to adapt to the new TCP behavior.
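
The 12 Mbps figure in the paragraph above follows from simple arithmetic if one assumes a 1500-byte packet (the packet size is an assumption here; the quoted text does not state it). A minimal sketch of that calculation, plus a rough upper bound on the delay a burst adds at a slower bottleneck when it arrives all at once:

```python
def serialization_ms(packet_bytes, rate_mbps):
    """Time in ms to put one packet on the wire at the given link rate."""
    bits = packet_bytes * 8
    bits_per_ms = rate_mbps * 1000.0  # 1 Mbit/s equals 1000 bits per ms
    return bits / bits_per_ms


def burst_drain_ms(packets, packet_bytes, bottleneck_mbps):
    """Time in ms for the bottleneck to drain a burst that arrives at once.

    This is an upper bound: it ignores that a real burst arrives over a
    finite time while the bottleneck is already transmitting.
    """
    return packets * serialization_ms(packet_bytes, bottleneck_mbps)


if __name__ == "__main__":
    # 1500 bytes at 12 Mbit/s takes 1 ms, matching the quoted threshold.
    print(serialization_ms(1500, 12.0))
    # A hypothetical 10-packet aggregate hitting a 100 Mbit/s bottleneck.
    print(burst_drain_ms(10, 1500, 100.0))
```

Under these assumptions, any link slower than 12 Mbit/s already exceeds a 1 ms threshold with a single full-size packet, and a modest aggregate can exceed it even on a much faster bottleneck.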
> 
> Next to the TCP-Prague requirements, I think a way forward is to also explicitly document in the L4S drafts, the limitations or "pitfalls" (if not already). Even if they are not L4S specific, I agree, it is important to set the correct expectations and to clearly inform people that want to deploy or reproduce experiments, that these are known and unavoidable limitations. This way we can move on and focus on finding "real" show-stoppers and specifically in this context on finding "real" differentiation between L4S and SCE.
> 
> Before I forget 😉, one more issue:
> - unfairness to classic TCP when sharing a Classic ECN AQM: I think this is the real differentiator between L4S and SCE. L4S has "covered" this as a TCP-Prague requirement. Agreed, it is a bit like putting the hot potato into the congestion control developers' basket, but that is where we need to solve it. I think the debate should be around this issue only at this stage. The question is how much we need to compromise if this TCP-Prague requirement is not sufficiently resolved (and which level of sufficiency is expected), and what we need to compromise if we select SCE on the other hand... I think it is important to have a future facing vision here.
> 
> Regards,
> Koen.
> 
> 
> 
> -----Original Message-----
> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Holland, Jake
> Sent: Monday, November 18, 2019 1:46 AM
> To: Ingemar Johansson S <ingemar.s.johansson=40ericsson.com@dmarc.ietf.org>; tsvwg@ietf.org
> Cc: gorry@erg.abdn.ac.uk; Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
> Subject: Re: [tsvwg] Requesting TSVWG adoption of SCE draft-morton-tsvwg-sce
> 
> Hi Ingemar,
> 
> If fragmenting the space will prevent other SDOs from prematurely adopting the unproven L4S technology, that seems like exactly the right thing to do at this stage.
> 
> I think we've seen strong evidence that L4S may still contain show-stopping problems.  Also that we have not yet seen strong evidence that the problems stemming from the ambiguity in the L4S signaling design can be fixed.
> 
> This carries a demonstrated potential for breaking existing ECN deployments by under-responding to the already widely-deployed congestion feedback systems.
> 
> Certainly L4S's implementation was demonstrated to contain an issue that would have wrecked the latency of existing ECN deployments, and it had not previously been detected, despite the years of lab evaluation and repeated requests from reviewers to test such scenarios earlier.
> 
> Although a fix was found for the specific initially-demonstrated case, no fix has yet been demonstrated for what looks to be a very similar issue occurring with staggered flow startup, which can't be attributed to a wrong alpha starting value:
> https://trac.ietf.org/trac/tsvwg/ticket/17#comment:8
> 
> The proposed pseudocode fix (with up to 5 tuning parameters, IIRC) may or may not be able to address this for specific cases, and it may or may not be possible to discover a set of tuning values that can address a wide range of conditions, but it seems appropriate to have some skepticism, at least until demonstration of successful operation under a wide range of conditions, given the history of such proposals.  This suggests that we do the opposite of encouraging other SDOs to move broadly forward with L4S at this time.
> 
> I share your concern that we might lose the codepoint (and the low latency functionality), and I acknowledge that a persistently fragmented space introduces a risk that it never happens, or takes an extra decade.
> 
> But the risk that concerns me even more is if L4S gets rolled out and then these kinds of issues are discovered in production, after other SDOs have prematurely standardized on this experiment, and it therefore gets shut off with prejudice against future solutions.
> That outcome also would lose the use of the codepoint, probably even more permanently.
> 
> (Or even worse: if it does not get shut off in spite of the problems it causes, which loses even the low-ish latency solutions we already have, and adds to the congestion control aggression arms race.)
> 
> IMO, those would be even worse outcomes than a somewhat delayed adoption of a fully vetted system (or at least one that can't break existing deployed networks).
> 
> Best regards,
> Jake
> 
> PS: I still don't understand why the gains available through the use of regular AQM (especially with ECN) have not been more widely adopted by the other SDOs that would want to make use of L4S.
> 
> It seems possible already to reduce the application-visible delay spikes from ~200ms to ~20ms (provided that no overly aggressive competing traffic improperly ignores the feedback, or that flow-queuing or other queue protection mechanisms are more widely deployed to prevent excessive damage from aggressive flows to less aggressive competing flows).
> 
> I wonder if whatever would drive SDOs to start using L4S maybe could instead be leveraged to drive adoption of the much more well-proven existing ECN solutions, which at least already have a lot of endpoint support deployed.
> 
> The endpoint support is a critical component to making this useful, and I see no reason to believe it'll be any quicker than the existing regular ECN was.  I'd even expect less so, since the behavior is much more complicated and hard to test.
> 
> PPS: I agree it would be interesting to see paced chirping solutions to help do better than slow start, and to quickly grow when new capacity opens on-path.  But I'll point out that's not specific to L4S, but rather should have application for any CC that can avoid pushing the network until queue overflow, which to me likely includes regular ECN-enabled Reno or Cubic, as well as BBR.
> 
> However, as yet another unproven TBD, I'll suggest it's not very useful as a strong influence on this debate, in spite of the early demos using L4S.  Regardless of the ultimate low latency solution, that part will need further development and might not work.
> 
>
