Re: [tsvwg] Adoption call for draft-white-tsvwg-l4sops - to conclude 24th March 2021

Jonathan Morton <chromatix99@gmail.com> Mon, 15 March 2021 10:35 UTC

I do not think this document is ready for adoption in its current form.  Let me explain why, and suggest some ways it could be improved.

L4S has a fundamental incompatibility with conventional AIMD traffic in the presence of RFC-3168 ECN AQMs, just like DCTCP upon which it was based.  L4S therefore requires mitigations to ensure that the harm caused by this incompatibility is minimised to an acceptable level.  Since the harm is primarily caused to "innocent bystanders" rather than "involved participants" or "interested observers", the acceptable level of harm and risk is especially low, and the mitigations need to be correspondingly robust.

However, robust mitigations are not what l4s-ops currently describes.  Most of the measures described fall into three categories:

1: Reliance on detecting an RFC-3168 AQM and disabling the L4S behaviour, using heuristics that have not yet been shown to work reliably, even under lab conditions.  It is impossible to state that such a heuristic can be relied upon until such a showing has been made.  A previous attempt at implementing such a heuristic was unsuccessful and is now disabled by default in the reference implementation.  Hence, the reliability of such a heuristic would necessarily be a subject of the experiment, not the primary safeguard.

2: Requirements placed upon "innocent bystanders" to avoid the harm, mostly by reconfiguring, replacing, or disabling their RFC-3168 AQMs (sometimes in an RFC-ignorant manner).  This is obviously unworkable, since by definition "innocent bystanders" are unaware of the experiment, and even if made aware, are uninterested in doing work to accommodate it.

3: Recommendation to deploy L4S hosts on networks that have been prepared to receive it.  This is a step in the right direction, but it is not accompanied by a corresponding requirement to *contain* L4S traffic to each prepared network.  Without such a requirement, it would be very easy for L4S hosts on different networks, each of which may individually have been prepared, to communicate over a path between those networks that has *not* been prepared, and upon which the risk of disrupting bystander traffic therefore exists.

It is perhaps noteworthy that gaps in the second and third classes of mitigation are proposed to be covered by the first class of mitigation.  I also note that there is still an assertion in the text that RFC-3168 AQMs are "rare", which is refuted by recent data.  Finally, in the context of a CDN-ISP pairing for an experimental deployment, the ISP subscribers' LANs and WLANs are technically separate networks that would be difficult to "prepare" for L4S in advance; it would be wise to consider the ramifications of that.

I note in passing that a modification of tunnel encapsulation semantics is also proposed.  Given that tunnel implementations are more diverse than RFC-3168 AQM implementations, I consider this unlikely to be practical, though I haven't studied in detail whether it would be effective if achieved.


I am currently aware of four theoretical methods of robustly mitigating the risk posed by L4S.  I think that l4s-ops would be considerably improved by proposing that at least one of them be employed as a prerequisite to the L4S experiment actually taking place:

1: Develop, implement, demonstrate, and open for scrutiny an RFC-3168 detection heuristic that is reliable and prompt enough to serve as a primary safeguard for the experiment.  In my opinion this will be difficult and will take significant time, but is not impossible to achieve.

2: Deprecate RFC-3168, or amend it to remove drop-equivalent marking of ECT(1) packets, and require the removal of all unmodified ECN AQMs from the Internet.  This is unlikely to get much support given the increasing deployment rates of RFC-3168 AQMs at the present time.  In any case it would take a very long time to eliminate existing RFC-3168 AQM deployments at Internet scale, so I consider this impractical.

3: Explicitly contain L4S traffic to networks that have been prepared or designated for the experiment.  That could be done by marking all L4S traffic with a designated DSCP at origin, and blocking traffic carrying that DSCP from traversing border gateways into unprepared networks.  This has the effect of making users and administrators of these networks at least "interested observers" and isolating L4S traffic from "innocent bystanders".  Within the designated networks, observing the practical interactions between L4S and conventional traffic would be part of the experiment.

4: Redesign L4S to shift the risk burden away from "innocent bystanders".  The most obvious way to do so is to implement unambiguous signalling by the network, so that the receiver knows for certain whether it is receiving congestion signals from an RFC-3168 AQM requesting an immediate MD response, or from an AQM of the new type requesting a new type of response.  The risk of performance trouble is then restricted to network nodes that produce the new signals and transport endpoints that understand them - in other words, to the relatively small number of "involved participants" who have the knowledge and incentive to study the problem and find solutions.  The incentives are thus aligned correctly and risks are not "externalised".

The SCE proposal does exactly that, in a manner that is totally transparent to existing RFC-3168 endpoints and middleboxes.  It becomes practical, for example, to use a Differentiated Services Code Point to differentiate a low-latency service onto a second bearer and provide a single-queue SCE AQM there, while providing a single-queue RFC-3168 AQM (without SCE) on the primary bearer.  Because of the unambiguous signalling, SCE traffic missing the DSCP would still compete on equal terms with conventional traffic, instead of dominating it or being dominated.
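
To make "unambiguous signalling" concrete, here is a minimal sketch of how a receiver could tell the two kinds of signal apart from the two-bit ECN field alone.  It assumes the codepoint assignment used by the SCE drafts as I understand them: CE keeps its RFC-3168 meaning, and the ECT(1) codepoint is what the new type of AQM applies to ECT(0) packets.  The function names and the string labels are hypothetical, chosen only for illustration.

```python
from enum import Enum


class Ecn(Enum):
    """Two-bit ECN field values as defined in RFC 3168."""
    NOT_ECT = 0b00
    ECT_1 = 0b01  # assumption: treated as the new (SCE) signal
    ECT_0 = 0b10
    CE = 0b11


def ecn_field(tos_byte: int) -> Ecn:
    """Extract the ECN field from the low two bits of the TOS / Traffic Class byte."""
    return Ecn(tos_byte & 0b11)


def receiver_response(tos_byte: int) -> str:
    """Classify the congestion signal carried by one received packet.

    Hypothetical labels:
      "md"   - RFC-3168 signal, requesting an immediate MD response
      "new"  - signal from an AQM of the new type, requesting the new response
      "none" - no congestion signal on this packet
    """
    ecn = ecn_field(tos_byte)
    if ecn is Ecn.CE:
        return "md"
    if ecn is Ecn.ECT_1:
        return "new"
    return "none"
```

The point of the sketch is that neither branch depends on guessing what kind of AQM is on the path; the codepoint itself says which response is being requested.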

I realise that this last method is not strictly in scope for the l4s-ops draft (and that mentions of SCE tend to raise hackles among L4S proponents), but I include it because it appears to be the most robust mitigation method available.  It also has the advantage of running code being available to try it out immediately.
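
Returning to the third method above, the containment rule can also be stated very compactly.  The following is a minimal sketch of the decision a border gateway would make, not a description of any existing implementation.  The DSCP value is an arbitrary placeholder (no codepoint has been designated for this purpose), and the function and parameter names are hypothetical.

```python
# Placeholder only: no DSCP has actually been designated for L4S containment.
L4S_EXPERIMENT_DSCP = 0b101011


def dscp(tos_byte: int) -> int:
    """Extract the six-bit DSCP from the TOS / Traffic Class byte."""
    return (tos_byte >> 2) & 0x3F


def may_cross_border(tos_byte: int, next_network_prepared: bool) -> bool:
    """Decide whether a border gateway should forward this packet.

    Packets carrying the designated DSCP are dropped when the network on
    the far side has not been prepared or designated for the experiment.
    All other packets are forwarded unchanged.
    """
    if dscp(tos_byte) == L4S_EXPERIMENT_DSCP and not next_network_prepared:
        return False
    return True
```

Note that the rule only works if all L4S traffic is marked with the DSCP at origin; an unmarked ECT(1) packet passes the check regardless, which is why the marking requirement and the blocking requirement have to be imposed together.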


I am not hugely optimistic that the l4s-ops draft will incorporate the above advice before the adoption call ends.  But unless and until it does, my position is that it SHOULD NOT be adopted.

 - Jonathan Morton