Re: [tsvwg] Adoption call for draft-white-tsvwg-l4sops

From: Jonathan Morton <chromatix99@gmail.com>
Date: Mon, 15 Mar 2021 12:35:17 +0200
References: <e9da704b-7705-baf9-a82c-39d4fe4e7ef1@erg.abdn.ac.uk> <d21192b20f1f40da3ffc8203083ab8a690b0cc9d.camel@heistp.net>
To: "tsvwg@ietf.org" <tsvwg@ietf.org>
In-Reply-To: <d21192b20f1f40da3ffc8203083ab8a690b0cc9d.camel@heistp.net>
Message-Id: <4B131032-C527-4E7E-8ACE-657814C4F18F@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/81XOMpDBr1iXyRYZ9qvrAupst88>
Subject: Re: [tsvwg] Adoption call for draft-white-tsvwg-l4sops - to conclude 24th March 2021
Precedence: list

I do not think this document is ready for adoption in its current form. Let me explain why, and suggest some ways it could be improved.

L4S has a fundamental incompatibility with conventional AIMD traffic in the presence of RFC-3168 ECN AQMs, just like DCTCP upon which it was based. L4S therefore requires mitigations to ensure that the harm caused by this incompatibility is minimised to an acceptable level. Since the harm is primarily caused to "innocent bystanders" rather than "involved participants" or "interested observers", the acceptable level of harm and risk is especially low, and the mitigations need to be correspondingly robust.

However, robust mitigations are not what l4s-ops currently describes. Most of the measures described fall into three categories:

1: Reliance on detecting an RFC-3168 AQM and disabling the L4S behaviour, using heuristics that have not yet been shown to work reliably, even under lab conditions. Such a heuristic cannot be relied upon until that has been demonstrated. A previous attempt at implementing such a heuristic was unsuccessful and is now disabled by default in the reference implementation. Hence, the reliability of such a heuristic would necessarily be a subject of the experiment, not the primary safeguard.

2: Requirements placed upon "innocent bystanders" to avoid the harm, mostly by reconfiguring, replacing, or disabling their RFC-3168 AQMs (sometimes in an RFC-ignorant manner). This is obviously unworkable, since by definition "innocent bystanders" are unaware of the experiment and, even if made aware, have no interest in doing work to accommodate it.

3: Recommendation to deploy L4S hosts on networks that have been prepared to receive it. This is a step in the right direction, but it is not accompanied by a corresponding requirement to *contain* L4S traffic to each prepared network. Without such a requirement, it would be very easy for L4S hosts on different networks, each of which may have been prepared, to communicate over a path between those networks that has *not* been prepared, and on which the risk of disrupting bystander traffic therefore exists.

It is perhaps noteworthy that gaps in the second and third classes of mitigation are proposed to be covered by the first class of mitigation. I also note that there is still an assertion in the text that RFC-3168 AQMs are "rare", which is refuted by recent data. Finally, in the context of a CDN-ISP pairing for an experimental deployment, the ISP subscribers' LANs and WLANs are technically separate networks that would be difficult to "prepare" for L4S in advance; it would be wise to consider the ramifications of that.

I note in passing that a modification of tunnel encapsulation semantics is also proposed. Given that tunnel implementations are more diverse than RFC-3168 AQM implementations, I consider this unlikely to be practical, though I haven't studied in detail whether it would be effective if achieved.

I am currently aware of four theoretical methods of robustly mitigating the risk posed by L4S. I think that l4s-ops would be considerably improved by proposing that at least one of them be employed as a prerequisite to the L4S experiment actually taking place:

1: Develop, implement, demonstrate, and open for scrutiny an RFC-3168 detection heuristic that is reliable and prompt enough to serve as a primary safeguard for the experiment. In my opinion this will be difficult and will take significant time, but is not impossible to achieve.

2: Deprecate RFC-3168, or amend it to remove drop-equivalent marking of ECT(1) packets, and require the removal of all unmodified ECN AQMs from the Internet. This is unlikely to get much support given the increasing deployment rates of RFC-3168 AQMs at the present time. In any case it would take a very long time to eliminate existing RFC-3168 AQM deployments at Internet scale, so I consider this impractical.

3: Explicitly contain L4S traffic to networks that have been prepared or designated for the experiment. That could be done by marking all L4S traffic with a designated DSCP at origin, and blocking traffic carrying that DSCP from traversing border gateways into unprepared networks. This has the effect of making users and administrators of these networks at least "interested observers" and isolating L4S traffic from "innocent bystanders". Within the designated networks, observing the practical interactions between L4S and conventional traffic would be part of the experiment.

4: Redesign L4S to shift the risk burden away from "innocent bystanders". The most obvious way to do so is to implement unambiguous signalling by the network, so that the receiver knows for certain whether it is receiving congestion signals from an RFC-3168 AQM requesting an immediate MD response, or from an AQM of the new type requesting a new type of response. The risk of performance trouble is then restricted to network nodes that produce the new signals and transport endpoints that understand them - in other words, to the relatively small number of "involved participants" who have the knowledge and incentive to study the problem and find solutions. The incentives are thus aligned correctly and risks are not "externalised".

The SCE proposal does exactly that, in a manner that is totally transparent to existing RFC-3168 endpoints and middleboxes. It becomes practical, for example, to use a Differentiated Services Code Point to differentiate a low-latency service onto a second bearer and provide a single-queue SCE AQM there, while providing a single-queue RFC-3168 AQM (without SCE) on the primary bearer. Because of the unambiguous signalling, SCE traffic missing the DSCP would still compete on equal terms with conventional traffic, instead of dominating it or being dominated.
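To make the point about unambiguous signalling concrete, here is a minimal sketch of a sender choosing its response from the ECN codepoint alone. The codepoint values are those of RFC 3168. Everything else is an assumption for illustration: that the new, softer signal is carried in the ECT(1) codepoint, the size of the soft backoff (SCE_BACKOFF), the window floor (MIN_CWND), and the function name next_cwnd. The feedback path from receiver to sender is omitted; this is not taken from any SCE or L4S implementation.

```python
from enum import IntEnum


class Ecn(IntEnum):
    """Two-bit ECN field values as defined in RFC 3168."""
    NOT_ECT = 0b00
    ECT_1 = 0b01  # ASSUMPTION: carries the new "some congestion" signal
    ECT_0 = 0b10
    CE = 0b11     # conventional RFC-3168 congestion signal


MD_FACTOR = 0.5       # conventional multiplicative decrease
SCE_BACKOFF = 0.0625  # hypothetical fraction removed per new-style mark
MIN_CWND = 2.0        # hypothetical floor, in segments


def next_cwnd(cwnd: float, signal: Ecn) -> float:
    """Return the congestion window after one acknowledged packet."""
    if signal == Ecn.CE:
        # Unambiguously an RFC-3168 style signal: immediate MD response.
        return max(MIN_CWND, cwnd * MD_FACTOR)
    if signal == Ecn.ECT_1:
        # Unambiguously the new signal: small, proportionate reduction.
        return max(MIN_CWND, cwnd * (1.0 - SCE_BACKOFF))
    # No congestion signal: additive increase of about one segment per RTT.
    return cwnd + 1.0 / cwnd


if __name__ == "__main__":
    for sig in (Ecn.ECT_0, Ecn.ECT_1, Ecn.CE):
        print(sig.name, next_cwnd(100.0, sig))
```

Because the two signals use different codepoints, an endpoint that does not understand the new one still halves its window on CE exactly as RFC 3168 expects, which is why any performance risk stays with the endpoints that opt in.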

I realise that this last method is not strictly in scope for the l4s-ops draft (and that mentions of SCE tend to raise hackles among L4S proponents), but I include it because it appears to be the most robust mitigation method available. It also has the advantage of running code being available to try it out immediately.
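For comparison, the containment described in method 3 above can also be sketched in a few lines: a border gateway forwards everything except packets carrying the experiment's DSCP towards a network that has not been prepared. The DSCP value is an arbitrary placeholder (no codepoint has been designated for this purpose), and the names Packet, L4S_EXPERIMENT_DSCP and border_allows are hypothetical.

```python
from dataclasses import dataclass

# ASSUMPTION: arbitrary placeholder; no DSCP is designated for this purpose.
L4S_EXPERIMENT_DSCP = 0b000011


@dataclass(frozen=True)
class Packet:
    dscp: int  # 6-bit Differentiated Services Code Point
    ecn: int   # 2-bit ECN field


def border_allows(pkt: Packet, egress_prepared: bool) -> bool:
    """Decide whether a border gateway forwards pkt to the next network.

    Traffic without the experiment DSCP is never affected. Traffic with it
    is forwarded only into networks designated for the experiment.
    """
    if pkt.dscp != L4S_EXPERIMENT_DSCP:
        return True
    return egress_prepared


if __name__ == "__main__":
    marked = Packet(dscp=L4S_EXPERIMENT_DSCP, ecn=0b01)
    plain = Packet(dscp=0, ecn=0b10)
    print(border_allows(marked, egress_prepared=False))  # blocked at border
    print(border_allows(plain, egress_prepared=False))   # unaffected
```

The check keys on the DSCP rather than on the ECN field, so conventional ECN traffic crossing the same border is not touched.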

I am not hugely optimistic that the l4s-ops draft will incorporate the above advice before the adoption call ends. But unless and until it does, my position is that it SHOULD NOT be adopted.

- Jonathan Morton
