Re: [tsvwg] ECN encapsulation draft - proposed resolution

Jonathan Morton <chromatix99@gmail.com> Tue, 22 June 2021 19:38 UTC

> On 22 Jun, 2021, at 5:02 pm, Markku Kojo <kojo@cs.helsinki.fi> wrote:

Up front, let me say that I mainly agree with your approach to this.

> And, if the number of flows is increased, then the time between the marks decreases. So, we need to be careful in understanding what we are modelling. Otherwise, if we just play with the formulae, the outcome is just crap.

So, rather than playing with formulae, I organised an empirical test over the weekend to see what the effects really are.  That accounts for the long pause before replying.

	https://sce.dnsmgr.net/results/mss-tests/

The test involves three flows with the same CC algorithm running through a common single-queue, single-instance AQM.  The AQM is either Codel, as an exemplar of time-domain marking, or PIE, as an exemplar of packet-mode marking, both with ECN enabled.  The test was repeated for Reno and CUBIC, as exemplars of standards-track CC, and DCTCP as an exemplar of 1/p response.  The path parameters are 50 Mbps throughput and 80 ms RTT, which is sufficient to keep CUBIC out of "Reno compatibility" mode.

The three flows differ in terms of packet and segment size.  Flow 1 is a conventional bulk flow using a full 1460-byte MSS.  Flow 2 reduces this to a 730-byte MSS.  Flow 3 uses a 1460-byte MSS, but goes through a path with reduced MTU (with PMTUD disabled), causing fragmentation into two packets per segment, so it has the same packet rate at the AQM as Flow 2.  Fragment reassembly is as per RFC-3168.
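For readers less familiar with the reassembly rule, here is a minimal sketch of my reading of RFC-3168 section 5.3: a CE mark on any fragment carries over to the reassembled packet, unless another fragment is Not-ECT, in which case the packet is dropped.  The function name and the choice to return the first fragment's codepoint when no fragment carries CE are illustrative assumptions, not something taken from the test setup.

```python
# ECN codepoints as defined in RFC 3168 (two low-order bits of the TOS byte).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def reassembled_ecn(fragment_ecn):
    """Return the ECN codepoint of the reassembled packet, or None if dropped.

    fragment_ecn: list of ECN codepoints, one per fragment.
    Simplified reading of RFC 3168 section 5.3 (hypothetical helper).
    """
    if not fragment_ecn:
        raise ValueError("at least one fragment is required")
    if CE in fragment_ecn:
        # A congestion indication must not be lost, but CE must not be
        # set on a packet that had a Not-ECT fragment: drop instead.
        if NOT_ECT in fragment_ecn:
            return None
        return CE
    # Assumption for this sketch: with no CE present, keep the first
    # fragment's codepoint.
    return fragment_ecn[0]


print(reassembled_ecn([ECT0, CE]))    # one marked fragment marks the segment
print(reassembled_ecn([ECT0, ECT0]))  # unmarked segment stays ECT(0)
```

Under this rule, one marked fragment out of two marks the whole 1460-byte segment, which is why Flow 3 sees marks at the packet rate rather than the segment rate.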

A well-conditioned congestion system should give each flow roughly equal application goodput on average.  Some allowance can be made for bandwidth lost to headers in the smaller packets, but this is a small effect.  But what do we actually get?
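To put a rough number on the header allowance, the sketch below computes the payload share of the bytes on the wire for each flow.  It assumes IPv4 with 20-byte IP and TCP headers, no options, and ignores link-layer framing; those figures are my assumptions rather than measured values from the test.

```python
IP_HDR = 20   # assumed IPv4 header without options
TCP_HDR = 20  # assumed TCP header without options


def payload_efficiency(mss, fragments=1):
    """Fraction of IP-layer bytes that are TCP payload.

    Each fragment carries its own IP header; the TCP header appears once.
    """
    wire_bytes = mss + TCP_HDR + IP_HDR * fragments
    return mss / wire_bytes


flows = {
    "Flow 1 (1460 MSS)": payload_efficiency(1460),
    "Flow 2 (730 MSS)": payload_efficiency(730),
    "Flow 3 (1460 MSS, 2 fragments)": payload_efficiency(1460, fragments=2),
}
for name, eff in flows.items():
    print(f"{name}: {eff:.1%}")
```

With these assumptions the three flows differ by only a few percent in payload efficiency, far less than a factor of two.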

The theoretically ideal case is DCTCP with time-domain marking, where the expected goodput is exactly equal for all three flows.  The actual observed goodput is not quite equal, but reasonably close and very consistent over a 10-minute period:

	https://sce.dnsmgr.net/results/mss-tests/dctcp-fq_codel-plot.html

But if we apply packet-mode marking, the smaller packets are markedly disadvantaged compared to larger ones, approximately in proportion to the average packet size, and again this is consistent for 10 minutes straight:

	https://sce.dnsmgr.net/results/mss-tests/dctcp-pie-plot.html

I do not have running code with which to test a "marked bytes preserving" reassembly rule.  However, such a rule could only possibly influence Flow 3, as that is the only one that's fragmented.  In theory it would be elevated to the level of Flow 1, leaving Flow 2 as the only disadvantaged flow.
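The reasoning for packet-mode marking can be sketched with an idealised steady-state model.  It assumes every packet is marked independently with the same probability p, that the sender's window in segments is proportional to 1/q where q is the per-segment marking probability, and that all flows share the same RTT.  The "byte_preserving" rule is modelled simply as q = p; this is my assumption about what such a rule would achieve, not tested code.

```python
def segment_mark_prob(p, packets_per_segment, rule="rfc3168"):
    """Per-segment marking probability seen by the sender (idealised)."""
    if rule == "rfc3168":
        # Segment is marked if any of its fragments is marked.
        return 1.0 - (1.0 - p) ** packets_per_segment
    if rule == "byte_preserving":
        # Assumed: the fraction of marked bytes is preserved, so q = p.
        return p
    raise ValueError(f"unknown rule: {rule}")


def relative_goodput(mss, packets_per_segment, p, rule="rfc3168"):
    """Goodput up to a constant factor: MSS times a window of 1/q segments."""
    return mss / segment_mark_prob(p, packets_per_segment, rule)


p = 0.01  # illustrative per-packet marking probability
base = relative_goodput(1460, 1, p)
print("Flow 2 / Flow 1:", relative_goodput(730, 1, p) / base)
print("Flow 3 / Flow 1 (RFC-3168):", relative_goodput(1460, 2, p) / base)
print("Flow 3 / Flow 1 (byte-preserving):",
      relative_goodput(1460, 2, p, "byte_preserving") / base)
```

In this model Flows 2 and 3 both land at about half of Flow 1 under RFC-3168 reassembly, and only Flow 3 recovers under the byte-preserving assumption.  It says nothing about time-domain marking, which needs a different model.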

You might say that using small MSS is not reasonable for capacity-seeking traffic.  But recall what Paul Vixie pointed out in the other thread about UDP Options, that the MTU over Internet paths could reasonably increase in future.  This implies that "jumbo" TCP segments would, at least for some time, share space with traffic still using today's Ethernet-sized packets, and the latter would be in the analogous position of Flow 2 in this test.  So do not dismiss Flow 2 as irrelevant to your interests.

Moving on to Reno, we see roughly the same effect, though the large Reno sawtooth means we have to peer through a lot of noise to see the trends.  Nevertheless, they are similar to those of DCTCP over a long average:

	https://sce.dnsmgr.net/results/mss-tests/reno-fq_codel-plot.html
	https://sce.dnsmgr.net/results/mss-tests/reno-pie-plot.html

The same holds for CUBIC:

	https://sce.dnsmgr.net/results/mss-tests/cubic-fq_codel-plot.html
	https://sce.dnsmgr.net/results/mss-tests/cubic-pie-plot.html

These results with standards-track CC confirm those from DCTCP, and the same logic applies.

Current practice on the Internet is to use time-domain ECN marking (because Codel is the most widely deployed AQM with ECN enabled) and RFC-3168 fragment reassembly.  The above results show that this is clearly superior to packet-mode marking with either RFC-3168 or "byte-preserving" reassembly rules.  Hence we should probably try to avoid encouraging the latter behaviour in new Internet specifications.

 - Jonathan Morton