Re: [tsvwg] new tests of L4S RTT fairness and intra-flow latency

Jonathan Morton <chromatix99@gmail.com> Sun, 15 November 2020 17:13 UTC

From: Jonathan Morton <chromatix99@gmail.com>
In-Reply-To: <MN2PR19MB4045BC0869B633F8EB11155583E40@MN2PR19MB4045.namprd19.prod.outlook.com>
Date: Sun, 15 Nov 2020 19:13:41 +0200
Cc: Sebastian Moeller <moeller0@gmx.de>, Pete Heist <pete@heistp.net>, tsvwg IETF list <tsvwg@ietf.org>
To: "Black, David" <David.Black@dell.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/R9DIeOQ52cMrcMQT1IMjP80Fn7A>
Subject: Re: [tsvwg] new tests of L4S RTT fairness and intra-flow latency

> On 15 Nov, 2020, at 3:26 am, Black, David <David.Black@dell.com> wrote:
> 
> To date, the 'Prague L4S Requirements' (Appendix A of draft-ietf-tsvwg-ecn-l4s-id) have been strongly associated with TCP Prague.  That association ought to be teased apart so that the resulting L4S scalable congestion control requirements provide a reasonable design space that can include a number of other congestion control designs - in addition to what's been discussed, e.g., SCReAM, it would be useful to better understand what it would take for implementations of protocols such as DCTCP and BBR (e.g., for QUIC) to meet those requirements.

Along these lines, let's look at the components of L4S:

1: The "architecture" involves a qualitative, revolutionary alteration to the meaning of a CE mark (relative to that established in RFC-3168 and revised quantitatively in RFC-8511).  In essence, it introduces AIAD semantics where AIMD was previously expected.

2: A classifier codepoint - ECT(1) - asserts to the network that this alternative CE semantic is in force on a given flow.  It is important to note that existing networks do not expect this signal and are thus incapable of interpreting it correctly; they assume that an ECT(1) packet should be treated the same as an ECT(0) packet, because that is what RFC-3168 specifies.

3: AccECN is a TCP extension that provides precise, reliable feedback of the density of CE marks on TCP flows.  This is deemed necessary because the AIAD steady-state has multiple CE marks per RTT, rather than multiple RTTs per CE mark in the AIMD steady-state.

4: TCP Prague, as the reference congestion-control algorithm, implements the classifier codepoint at origin when AccECN is negotiated, and exhibits a DCTCP-like (AIAD) response to CE feedback, and a NewReno-like (AIMD) response to packet loss.

5: DualPI2, as the reference L4S-aware AQM and qdisc, uses the classifier codepoint to implement what amounts to a PHB, differentiating L4S traffic from "classic" conventional traffic.  This PHB both implements the required change in AQM behaviour and provides an effective prioritisation of L4S traffic.  According to the designers, it explicitly avoids maintaining per-flow state.

6: Operational guidelines, which - as far as we can tell - boil down to "run L4S only across specially prepared networks and functionally isolate it from the conventional Internet".
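
To make the contrast in components #1 and #3 concrete, here is a minimal sketch of the two kinds of sender response to CE, with one update per RTT.  The function names are hypothetical, and the model is a simplification of RFC-3168, RFC-8511 (ABE, beta = 0.8) and RFC-8257 (DCTCP, g = 1/16); it is not the code of TCP Prague or of any real stack.

```python
MIN_CWND = 2.0  # assumed floor on the congestion window, in segments


def classic_ecn_response(cwnd, ce_seen, beta=0.5):
    """RFC-3168 style: any CE mark in an RTT triggers one multiplicative
    decrease. beta=0.5 is the Reno-like halving; beta=0.8 models RFC-8511."""
    if ce_seen:
        return max(cwnd * beta, MIN_CWND)
    return cwnd + 1.0  # additive increase of one segment per RTT


def dctcp_style_response(cwnd, alpha, marked_fraction, g=1.0 / 16):
    """DCTCP style: the reduction scales with a moving average (alpha) of the
    fraction of packets that carried CE during the last RTT."""
    alpha = (1 - g) * alpha + g * marked_fraction
    if marked_fraction > 0:
        cwnd = max(cwnd * (1 - alpha / 2), MIN_CWND)
    else:
        cwnd += 1.0
    return cwnd, alpha


if __name__ == "__main__":
    # One RTT in which CE marks are seen, starting from cwnd = 100.
    print(classic_ecn_response(100.0, True))            # RFC-3168 halving
    print(classic_ecn_response(100.0, True, beta=0.8))  # RFC-8511 (ABE)
    print(dctcp_style_response(100.0, 0.0, 1.0))        # DCTCP-style, alpha starts at 0
```

Under these assumptions, the classic sender gives up half (or a fifth) of its window for an RTT containing a CE mark, while the DCTCP-style sender gives up only a few percent until alpha has built up over many marked RTTs.  A bottleneck that marks sparsely, as RFC-3168 AQMs expect to, therefore sees much less back-off from the latter.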

The collection of components above, in the forms thus far demonstrated, fails to achieve both #4 and #5 of the "Prague requirements", namely "Fall back to Reno-friendly congestion control on classic ECN bottlenecks" and "Reduce RTT dependence".  We demonstrated that through test results some months ago, and again in the test results released just now.  We have not explicitly tested for achievement of Requirement #6, since we believe that to be less vital to interoperability with the existing Internet.

If we're going to get specific, I believe the failure to achieve Requirement #4 is the fault of Component #1 in the above list, while the failure to achieve Requirement #5 is the fault of Component #5.  Meanwhile, Component #2 is blocking progress on an alternative design which has a better chance of success.  I think these conclusions are supported by the failure of attempts to address these shortcomings by modifying Component #4, and by the introduction of Component #6 after the discussion in March rather than proceeding with technical refinements to the other components.

> My overall take on the requirements is that in 20/20 hindsight, some of them were overly optimistic, and hence need to be backed off/toned down/broadened to encompass what is reasonable in "running code" well beyond TCP Prague. That sort of collision between interesting ideas and network realities is not an unheard-of scenario in IETF, so I hesitate to view the need for changes to these requirements as evidence that the original ideas were inherently defective, as I've seen far more dramatic changes, e.g., some number of years ago, the first design of iSCSI login was elegant ... and resulted in implementations that did not interoperate, resulting in a complete redesign.


From a systems engineering perspective, it is possible to both see how the design of L4S was developed iteratively starting from DCTCP, and to recognise that this was an inherently flawed approach.  It was well known from the start that DCTCP did not interoperate with AIMD transports at common bottlenecks, but after the dust has cleared, the basic mechanism of DCTCP is the one thing that has been held constant in L4S' design - almost dogmatically, in fact.

The result is a system that, like DCTCP, can only be deployed into networks that have been specially prepared for it, and fails to interoperate with AIMD traffic without specific assistance from L4S-aware in-network components.  That is not really the sort of progress that justifies so much of the WG's time and attention, regardless of how desirable a move towards high-fidelity congestion control is.

So I fear that a substantial redesign of L4S is indeed called for, if it is to actually succeed.  This has to start with component #1 in the above list - the redefinition of the semantics of a CE mark.  Restore CE to RFC-8511 semantics, and most of the severe interoperability problems we have been talking about for the past two years simply disappear; only networks trying to take advantage of the new behaviour need even be aware that it exists.

I need hardly remind the WG that I and my colleagues have been presenting an alternative design which does exactly that.  We'll be happy to discuss it further as and when appropriate.

 - Jonathan Morton