Re: [tsvwg] Question regarding slide 3 of https://datatracker.ietf.org/meeting/106/materials/slides-106-tsvwg-sessb-31-tcp-prague-status-of-implementation-and-evaluation-00#page=3

"De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com> Wed, 09 June 2021 14:35 UTC

From: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>
To: Sebastian Moeller <moeller0@gmx.de>, TSVWG <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/PYx4AnWrctRPvBfS6Vbs3EUMpM8>

Hi Sebastian,

For these plots we typically load the system with one L4S flow and one Classic flow (this much is labeled 1-1 in the paper), then we add one of 2 load levels of dynamic traffic (labeled 1h-1h for high load and 1l-1l for low load in the paper). High load means 100 requests per second per 40Mbps of throughput per traffic type (Classic & L4S); low load is 10 times less (10 requests per second per 40Mbps per type).
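As a quick check of the load arithmetic above, here is a minimal sketch. The function and constant names are made up for illustration; only the 100 and 10 requests per second per 40Mbps figures come from this mail.

```python
# Hypothetical helper, not part of any test framework.
# Load levels as stated in the mail: requests/s per 40 Mbps, per traffic type.
REQS_PER_40MBPS = {"high": 100, "low": 10}


def request_rate(link_mbps: float, load: str) -> float:
    """Requests per second for ONE traffic type (Classic or L4S)."""
    return REQS_PER_40MBPS[load] * link_mbps / 40.0


if __name__ == "__main__":
    for load in ("high", "low"):
        print(load, request_rate(120, load), "requests/s per type at 120 Mbps")
```

For a 120Mbps link this gives 300 requests per second per type at high load and 30 at low load.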

On the iccrg presentation, you see we mention these details:
● fixed Ethernet
So smoothly shaped physical layer
● long-running TCPs: 1 ECN 1 non-ECN 
1 flow of each traffic type
● web-like flows @ 300/s ECN, 300/s non-ECN 
So this is the 1h-1h case: 3 times the 100 requests per second, which corresponds to a link of 3 times 40Mbps = 120Mbps (see further). It is not the 1l-1l case, for which 300/s would correspond to 30 times the 40Mbps.
● exponential arrival process
For reproducibility and fair comparison, we use a prepared scenario file that is generated from an exponential distribution with an average arrival rate of 100 requests per second in this (1h) case, played out 3 times as fast
● file sizes Pareto distr. α=0.9 1KB min 1MB max
Again for reproducibility and fair comparison, we use a prepared request size file with the parameters specified
● 120Mb/s 10ms base RTT
For this network throughput and base RTT
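To make the traffic description above concrete, here is a rough sketch of how such a prepared scenario could be generated. This is not the actual generator: the function name, the seed, reading 1KB/1MB as 1,000/1,000,000 bytes, and bounding the Pareto by inverse-CDF sampling are all assumptions. Only the rate, speed-up, α and size limits come from the bullets above.

```python
import random


def make_scenario(n, rate_per_s=100.0, speedup=3.0, alpha=0.9,
                  min_bytes=1_000, max_bytes=1_000_000, seed=1):
    """Return n (start_time_s, size_bytes) requests for one traffic type.

    Inter-arrival times are exponential with mean 1/rate_per_s, then
    compressed by `speedup` (the file is "played out 3 times as fast").
    Sizes follow a Pareto law bounded to [min_bytes, max_bytes]
    (assumption: bounded via the inverse CDF, not by clipping).
    """
    rng = random.Random(seed)           # fixed seed: same scenario every run
    ratio = (min_bytes / max_bytes) ** alpha
    t = 0.0
    scenario = []
    for _ in range(n):
        t += rng.expovariate(rate_per_s) / speedup
        u = rng.random()                # uniform in [0, 1)
        size = min_bytes / (1.0 - u * (1.0 - ratio)) ** (1.0 / alpha)
        scenario.append((t, min(max_bytes, int(size))))
    return scenario


if __name__ == "__main__":
    for start, size in make_scenario(5):
        print(f"{start:.6f} s  {size} bytes")
```

Using the same seed for every AQM under test is what gives the reproducibility mentioned above: each AQM sees exactly the same arrivals and sizes.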

In this slide we grouped 3 different experiments:
- DualPI2, using DCTCP for the L4S/ECN flows (a good version, like the original DCTCP in kernel 3.19, or Prague for recent kernels, as we didn't maintain DCTCP to be aligned with Prague) and Cubic without ECN for the classic flows
- FQ_Codel, using Cubic with ECN enabled for the ECN flows and Cubic without ECN for the classic flows
- PIE, using Cubic with ECN enabled for the ECN flows and Cubic without ECN for the classic flows
So here PIE and FQ_Codel don't use L4S.

The paper plots are using the same network and traffic conditions (for the first row, while the second row is the 1l-1l case), but with slightly different AQM mechanisms and traffic type combinations.
The difference in the paper is in the first (DualPI2, DCTCP/Cubic) experiment: in the presentation we had the time-shifted FIFO (you can see that the latency of Classic never exceeds that of L4S plus about 30ms, and vice versa), while if the WRR priority scheduler is used, the latencies are completely decoupled (so L4S gets even lower latencies and Classic is limited only by its own congestion control).
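For readers unfamiliar with the time-shifted FIFO, here is a minimal sketch of the selection rule as I understand it from the DualQ specification's description: serve the head packet that has waited longest, with the L4S head given a time-shift advantage. The function name and the use of None for an empty queue are assumptions, and the 30ms default is simply the "about 30ms" mentioned above.

```python
def ts_fifo_pick(l_head_wait_ms, c_head_wait_ms, tshift_ms=30.0):
    """Choose which queue to serve next: 'L' (L4S) or 'C' (Classic).

    Each argument is how long that queue's head packet has waited, in ms,
    or None if the queue is empty. Returns None if both are empty.
    """
    if l_head_wait_ms is None and c_head_wait_ms is None:
        return None
    if c_head_wait_ms is None:
        return "L"
    if l_head_wait_ms is None:
        return "C"
    # Classic is only served once its head has waited more than
    # tshift_ms longer than the L4S head.
    return "L" if l_head_wait_ms + tshift_ms >= c_head_wait_ms else "C"


if __name__ == "__main__":
    print(ts_fifo_pick(1.0, 20.0))   # L4S head still wins
    print(ts_fifo_pick(1.0, 40.0))   # Classic has waited too long
```

This rule is what couples the two latencies to within roughly the time shift, in contrast to the WRR scheduler, where they are decoupled.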

For PIE there shouldn't be much difference besides the typical variance we see below the 99.9th percentile for the classic traffic type (L4S is more consistent up to the 99.999th).

For FQ_Codel the experiment is completely different: the ECN-Cubic traffic is replaced by DCTCP traffic, and for ECT(1) we used the immediate 1ms threshold (you are correct that we forgot to mention this in Table 1, thanks for noticing) instead of the Codel-ECN marking that managed ECN-Cubic in the presentation.

Hope this clarifies up to here.

You probably wonder why the latency of L4S is much higher with FQ than with DualQ, which is simple to explain:
- In FQ every flow has its own queue AND its own AQM with a threshold of 1ms. So all flows are controlled around 1ms, while in DualQ the L4S flows usually just find an empty queue without the 1ms threshold being hit at all, because there is Classic traffic that keeps the L4S flows below the link capacity via the AQM coupling.
- When dynamic flows kick in, there will be many queues that need to be scheduled, plus new flows or "empty-Q" flows will get priority over the 1ms L4S flows. This means that L4S flows also have to wait much longer in FQ than in DualQ: first in their own queue of around 1ms, and second for their queue to get scheduled by the RR scheduler. Additionally, their bandwidth varies aggressively at a micro scale when flow queues build up and disappear, and when flows start and their packets get priority. This causes additional jitter for the FQ-L4S flows.

So from this comes my insight: use FQ if you think fair rate is most important, use DualQ if you think smooth throughput (low rate jitter) and low latency (jitter) are most important.

Hope this clarifies the difference between FQ and DualQ (and their results) too, and allows you to reproduce these results. Note that these delays are measured per packet and are only the sojourn time between enqueue and dequeue (so an exact queue delay CDF per packet), not any averaged/smoothed delay as is typically used in some test frameworks, which makes it impossible to spot these 99.999 percentile delays.
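To illustrate the point about per-packet versus smoothed delays, here is a small sketch on purely synthetic numbers (not measurement data). The nearest-rank percentile, the EWMA gain of 1/8 and the delay values are all assumptions chosen only to show the effect.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile (p in 0..100) of a list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100.0 * len(ordered)))
    return ordered[rank - 1]


def ewma(samples, gain=0.125):
    """Exponentially weighted moving average, one output per input."""
    out, avg = [], samples[0]
    for s in samples:
        avg += gain * (s - avg)
        out.append(avg)
    return out


def demo():
    # Synthetic per-packet sojourn times: 0.5 ms, with 4 isolated 20 ms packets.
    delays_ms = [0.5] * 200_000
    for i in range(25_000, 200_000, 50_000):
        delays_ms[i] = 20.0
    raw_tail = percentile(delays_ms, 99.999)
    smoothed_tail = percentile(ewma(delays_ms), 99.999)
    return raw_tail, smoothed_tail


if __name__ == "__main__":
    raw, smoothed = demo()
    print(f"99.999th percentile, per packet: {raw} ms")
    print(f"99.999th percentile, smoothed:   {smoothed} ms")
```

In this toy case the per-packet tail shows the full 20ms outliers, while the smoothed series reports a much smaller value, so the rare large delays are no longer visible.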

See also inline for some of your specific questions.

Koen.

-----Original Message-----
From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Sebastian Moeller
Sent: Wednesday, June 2, 2021 12:13 PM
To: TSVWG <tsvwg@ietf.org>
Subject: [tsvwg] Question regarding slide 3 of https://datatracker.ietf.org/meeting/106/materials/slides-106-tsvwg-sessb-31-tcp-prague-status-of-implementation-and-evaluation-00#page=3

Hi Bob, Koen,

I recently had a look at https://datatracker.ietf.org/meeting/106/materials/slides-106-tsvwg-sessb-31-tcp-prague-status-of-implementation-and-evaluation-00#page=3 again, and I wonder whether you could share more information about the data/experiments underlying that figure.

It looks similar to Figure 7 of https://www.bobbriscoe.net/projects/latency/dctth_journal_draft20190726.pdf, except the individual CDFs look slightly different. Figure 7 does not have much of a legend and is also not referenced at all in the manuscript. Could you share the exact test conditions, the modifications to fq_codel you mention in the manuscript, and the ce_threshold value you used (assuming you used that at all), please?

[K] You are right, it seems we forgot to mention in Table 1 of the manuscript that we used a 1ms threshold for FQ_Codel. But besides that:
"(we) used a modified version
of FQ-CoDel, where L4S support was added by using a shallow
ECN marking threshold for any ECT(1) packet, with an
additional check to ensure that the threshold is only applied
if there is more than 1 packet in the queue. The queue length
check was added to prevent 100% marking at lower link rates,
considering that packet serialisation in such cases takes longer"
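[K] The quoted rule can be sketched roughly as below. This is not the actual FQ-CoDel patch: the function and parameter names are hypothetical, and comparing the packet's sojourn time strictly against the 1ms threshold is an assumption.

```python
def should_mark_ce(is_ect1: bool, sojourn_ms: float, queue_len_pkts: int,
                   threshold_ms: float = 1.0) -> bool:
    """Shallow-threshold CE marking for ECT(1) packets in one flow queue.

    Marks only if the packet is ECT(1), its sojourn time exceeds the
    threshold, AND more than 1 packet is in the queue (the extra check
    that avoids 100% marking at low link rates, where serialising a
    single packet can already take longer than the threshold).
    """
    return is_ect1 and queue_len_pkts > 1 and sojourn_ms > threshold_ms


if __name__ == "__main__":
    print(should_mark_ce(True, 2.0, 5))   # over threshold, queue built up
    print(should_mark_ce(True, 2.0, 1))   # lone packet: not marked
```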

Also, in that manuscript you seem to have used fq_codel with target 5ms and interval 100ms and compared it against DualPI2 with a 1 ms ref_delay target for the L-queue. How does your modified fq_codel stack up in comparison, and how are target and interval interpreted for ECT(1) traffic?

[K] We really used a 1ms threshold for ECT(1), so a direct head-to-head comparison is possible in the paper (otherwise the 80th percentile would be around 5ms rather than 1ms).

Best Regards
	Sebastian