Re: [tcpPrague] A Congestion Control Independent L4S Scheduler - TCP Prague preliminary results

Szilveszter Nadas <Szilveszter.Nadas@ericsson.com> Wed, 02 December 2020 16:06 UTC

From: Szilveszter Nadas <Szilveszter.Nadas@ericsson.com>
To: "tcpprague@ietf.org" <tcpprague@ietf.org>
CC: Ferenc Fejes <fejes@inf.elte.hu>, Gombos Gergo <ggombos@inf.elte.hu>, LAKI Sandor <lakis@inf.elte.hu>, "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>, "Tilmans, Olivier (Nokia - BE/Antwerp)" <olivier.tilmans@nokia-bell-labs.com>
Thread-Topic: [tcpPrague] A Congestion Control Independent L4S Scheduler - TCP Prague preliminary results
Thread-Index: AQHWuNcd4GJeRrqtOUWvcu6WwpWhZqnkFQ3A
Date: Wed, 2 Dec 2020 16:05:17 +0000
Message-ID: <AM0PR07MB39538F2B78AD479F413691878BF30@AM0PR07MB3953.eurprd07.prod.outlook.com>
References: <AM0PR07MB39533753D400A496BD46CB588B110@AM0PR07MB3953.eurprd07.prod.outlook.com> <AM8PR07MB7476F8110C025D0F8855B960B9E70@AM8PR07MB7476.eurprd07.prod.outlook.com>
In-Reply-To: <AM8PR07MB7476F8110C025D0F8855B960B9E70@AM8PR07MB7476.eurprd07.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpprague/KAjT5XOQZo1GT0dpolK90rd5kvg>
Subject: Re: [tcpPrague] A Congestion Control Independent L4S Scheduler - TCP Prague preliminary results
X-BeenThere: tcpprague@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "To coordinate implementation and standardisation of TCP Prague across platforms. TCP Prague will be an evolution of DCTCP designed to live alongside other TCP variants and derivatives." <tcpprague.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tcpprague>, <mailto:tcpprague-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/tcpprague/>
List-Post: <mailto:tcpprague@ietf.org>
List-Help: <mailto:tcpprague-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tcpprague>, <mailto:tcpprague-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 02 Dec 2020 16:06:14 -0000

Hi Koen and Olivier,

Thanks for the answers.

Some follow-up questions:

== RTT unfairness as 1:5 ratio ==
Is there a way to compensate for it in the DualPI2 scheduler? My understanding was that, with the right parameters, it is compensated "automatically": there was good fairness between DCTCP and Cubic flows in the original (Dual)PI2 papers. I guess that for single PI2 that was because of the shared queue. But what about DualPI2? Can it be configured so that fairness is preserved even when the queueing delay experienced by Cubic is taken into account? (I vaguely remember a description of this in one of your papers, but I cannot find it now.) So I guess that the good fairness of DCTCP vs. Cubic was the actual anomaly, rather than the worse fairness of Prague vs. Cubic?

== RTT independent Prague version ==
Can you provide a pointer on how to configure that? Are the TCP Prague CC parameter defaults in the version we used not meaningful? Should we change other parameters?

Cheers,
Szilveszter


-----Original Message-----
From: Tilmans, Olivier (Nokia - BE/Antwerp) <olivier.tilmans@nokia-bell-labs.com>
Sent: November 5, 2020 12:10
To: Szilveszter Nadas <Szilveszter.Nadas@ericsson.com>; tcpprague@ietf.org
Cc: Ferenc Fejes <fejes@inf.elte.hu>; Gombos Gergo <ggombos@inf.elte.hu>; LAKI Sandor <lakis@inf.elte.hu>
Subject: RE: [tcpPrague] A Congestion Control Independent L4S Scheduler - TCP Prague preliminary results



Hi Szilveszter,



> The results are at the below link. We are somewhat surprised by much
> increased aggressiveness of TCP Prague. Can you comment on it? The settings
> and setup we used are described in the pdf (and in the original article).



Thanks for sharing these.



* Short answer:

DCTCP actively hurts itself, tripping the step AQM unnecessarily. This is mostly benign in DC environments due to the almost nonexistent RTT and small cwnd, but causes observable performance hits elsewhere. Prague implements fixes for these issues.



* Long answer:

Please find below the main changes from dctcp to prague. The intuition behind these is to avoid triggering the step AQM part of the dual queue when there is classic traffic, or to limit the extent of the received marks.

The first goal ensures that, when there is classic traffic, prague only receives marks from the coupled PI2 component, which are strong enough on their own to keep the L4S queue empty most of the time. Additionally, getting marks from the step AQM means that prague is hurting itself, as it will reduce more than necessary. The second goal ensures that prague does not unnecessarily drive the AQM (either PI2 or the step) to a high marking rate, again because that would hurt its throughput and also because it would create a standing queue.



Note that depending on your experimental setups, not all the below factors will contribute.



1. Prague is always paced, regardless of the egress qdisc on the data sender,
   and paces itself at 100% of the estimated BDP.
        DCTCP is only paced when used in combination with fq, and paces at 120%
        of the BDP when that is the case.
2. Prague actively limits its GSO bursts (defaults to 250us).
        DCTCP uses the Linux default, 1ms.
3. Prague implements a more accurate internal marking estimate (alpha).
        DCTCP, especially when operating with low marking probabilities, will
        tend to push alpha down to 0, delaying its response to marks (and thus
        increasing queue pressure/causing a larger reduction down the line).
4. Prague carries over sub-cwnd reductions, such that multiple marks in a row
   occurring at low overall probability will eventually cause a cwnd reduction.
        DCTCP's cwnd reduction code has no memory, i.e., as long as alpha is
        low enough compared to cwnd, no cwnd reduction will ever occur. This
        again drives the AQM to larger mark probabilities.
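
A minimal sketch of the difference described in point 4, under loud simplifying assumptions: the per-RTT cut is modelled as cwnd * alpha / 2 truncated to whole packets, alpha is held constant, and additive increase is ignored. The function names are hypothetical and this is not the Linux DCTCP or Prague code; it only illustrates why a memoryless integer reduction can stay at zero forever while a carried-over remainder eventually produces a cut.

```python
def memoryless_reduction(cwnd: int, alpha: float, rounds: int) -> int:
    """Illustrative model only: each round truncates the cut independently."""
    for _ in range(rounds):
        cut = int(cwnd * alpha / 2)  # truncates to 0 whenever cwnd*alpha/2 < 1
        cwnd -= cut
    return cwnd


def carry_over_reduction(cwnd: int, alpha: float, rounds: int) -> int:
    """Illustrative model only: the fractional part of each cut is remembered."""
    carry = 0.0
    for _ in range(rounds):
        carry += cwnd * alpha / 2    # accumulate the sub-packet reduction
        cut = int(carry)             # apply only whole packets
        cwnd -= cut
        carry -= cut                 # keep the remainder for later rounds
    return cwnd


if __name__ == "__main__":
    # cwnd = 8 packets, alpha = 1/16: the ideal cut is 0.25 packet per round.
    print(memoryless_reduction(8, 0.0625, 4))  # window never shrinks
    print(carry_over_reduction(8, 0.0625, 4))  # one packet removed after 4 rounds
```

In this toy model the memoryless variant never reacts to the marks at all, which matches the description above of the AQM being driven to larger marking probabilities before any reduction happens.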



All of these points contribute to prague being more gentle towards the queue and more reactive to the received marks (i.e., achieving a lower overall marking level). The direct effect is a smaller cwnd reduction (and thus a smaller sawtooth) than DCTCP, which is critical when operating at higher e2e RTTs with a shallow queue, since it lowers the time to recovery.



Additionally, when operating over a long RTT path and controlled by PI2 (i.e., random marking, such that it receives marks almost every RTT, as opposed to a step, where it receives nothing for several RTTs and then an RTT with most of the packets being marked), DCTCP suffers from the inaccurate/memoryless computations in 3./4. and their interactions with Linux's PRR code (which prague does not use), which can prevent it from ever reaching its expected rate.



I hope this helps to understand a bit better why prague appears to be more aggressive. It is actually the opposite: dctcp systematically hurts itself at longer base RTTs, and hence progressively gives way as the RTT increases.

You can validate this by observing that the expected rate ratios between classic flows and prague over the dualQ match the theoretical equations (compounding the queue delay difference) even at higher base RTTs, which is not the case for DCTCP.





Best,

Olivier


From: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>
Sent: November 12, 2020 10:35
To: Szilveszter Nadas <Szilveszter.Nadas@ericsson.com>; tcpprague@ietf.org
Cc: Ferenc Fejes <fejes@inf.elte.hu>; Gombos Gergo <ggombos@inf.elte.hu>; LAKI Sandor <lakis@inf.elte.hu>
Subject: RE: [tcpPrague] A Congestion Control Independent L4S Scheduler - TCP Prague preliminary results

Hi Szilveszter,

Interesting work, thanks for sharing. There is indeed a very big difference between how DCTCP behaves today in recent kernels and the version we used in the original L4S testing (Linux kernel 3.19). That original version was a "clean" DCTCP version that matched the theoretical equations very well: r=2/(p.RTT) for uniform, stable random marking and r=2/(p².RTT) for on/off marking. Due to many interactions, integer range constraints, pacing/GSO interactions and bugs, the performance really degraded in recent Linux versions. These issues were fixed in the Linux Prague version, which is why Prague should perform "better" with respect to matching the equations. It would be interesting to get an idea of the deviation from the r=2/(p.RTT) equation in your tests (relevant in DualPI2, as the coupling gives a smooth probability). Collecting the average RTT (base + queuing latency), marking probability, and rate would make it possible to evaluate this.

The main reason why we saw deviations is that the probability varies a lot over time. As you might know, DCTCP's (and Prague's) equation becomes r=2/(p².RTT) when it receives on/off marking episodes (on episodes in the order of 1 RTT, off episodes in the order of multiple RTTs). Any marking probability that is not so stable results in a rate between those 2 boundaries. This is the theory; I am looking forward to seeing how your practice matches it.
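
The check suggested above can be sketched as follows. This is a hedged illustration, not a tool from the thread: the function names are hypothetical, and it assumes r is expressed in packets per second with RTT in seconds and p as a fraction between 0 and 1.

```python
def rate_smooth(p: float, rtt_s: float) -> float:
    """r = 2/(p.RTT): expected rate under uniform, stable random marking."""
    return 2.0 / (p * rtt_s)


def rate_on_off(p: float, rtt_s: float) -> float:
    """r = 2/(p^2.RTT): expected rate under on/off marking episodes."""
    return 2.0 / (p * p * rtt_s)


def deviation(measured_rate: float, p: float, rtt_s: float) -> float:
    """Ratio of the measured rate to the smooth-marking prediction (1.0 = exact match)."""
    return measured_rate / rate_smooth(p, rtt_s)


def within_bounds(measured_rate: float, p: float, rtt_s: float) -> bool:
    """True if the measured rate lies between the two theoretical boundaries."""
    return rate_smooth(p, rtt_s) <= measured_rate <= rate_on_off(p, rtt_s)


if __name__ == "__main__":
    # Assumed sample measurement: p = 10%, average RTT = 20 ms, 1500 packets/s.
    p, rtt_s, measured = 0.1, 0.020, 1500.0
    print(rate_smooth(p, rtt_s), rate_on_off(p, rtt_s))
    print(deviation(measured, p, rtt_s), within_bounds(measured, p, rtt_s))
```

Feeding in the average RTT (base plus queuing latency), marking probability, and rate collected per flow would give one deviation value per test case.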

From a first quick look at the results, it might not deviate that much, I guess. The more aggressive part is probably due to the RTT unfairness: a 1 to 5 rate ratio (with a 5ms base RTT, Classic gets a queue of 5ms+20ms target, so about 25ms and 5 times less throughput). Did you try to use the RTT independent version? If you set it to the f=max(15, RTT) mode, the minimum effective RTT becomes 15ms, so the rate ratio is 3 to 5 in that case.
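
The arithmetic behind the 1 to 5 and 3 to 5 figures can be reproduced with a small sketch. It assumes, as a simplification, that each flow's rate is inversely proportional to its effective RTT, that the L4S queue adds no delay, and that the Classic queue sits at its 20ms target; the function name and parameters are hypothetical.

```python
from fractions import Fraction
from typing import Optional


def classic_to_prague_ratio(base_rtt_ms: int,
                            classic_queue_ms: int = 20,
                            min_effective_rtt_ms: Optional[int] = None) -> Fraction:
    """Classic rate divided by Prague rate, assuming rate ~ 1/effective RTT."""
    prague_rtt = base_rtt_ms                      # assumed: empty L4S queue
    if min_effective_rtt_ms is not None:
        # RTT independent mode, f = max(15, RTT) in the text above
        prague_rtt = max(min_effective_rtt_ms, prague_rtt)
    classic_rtt = base_rtt_ms + classic_queue_ms  # base RTT + Classic queue target
    return Fraction(prague_rtt, classic_rtt)


if __name__ == "__main__":
    print(classic_to_prague_ratio(5))                           # 1/5
    print(classic_to_prague_ratio(5, min_effective_rtt_ms=15))  # 3/5
```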

Thanks and Regards,
Koen.

From: tcpPrague <tcpprague-bounces@ietf.org> On Behalf Of Szilveszter Nadas
Sent: Tuesday, November 3, 2020 6:30 PM
To: tcpprague@ietf.org
Cc: Ferenc Fejes <fejes@inf.elte.hu>; Gombos Gergo <ggombos@inf.elte.hu>; LAKI Sandor <lakis@inf.elte.hu>
Subject: [tcpPrague] A Congestion Control Independent L4S Scheduler - TCP Prague preliminary results

Hi all,

During the review of our article "A Congestion Control Independent L4S Scheduler", one of the reviewers asked why we did not also evaluate TCP Prague. So we have now rerun the test cases, replacing DCTCP with TCP Prague.

The results are at the link below. We are somewhat surprised by the much increased aggressiveness of TCP Prague. Can you comment on it? The settings and setup we used are described in the pdf (and in the original article).
Results: http://ppv.elte.hu/tcp-prague/
Original article (including YouTube presentation): http://ppv.elte.hu/cc-independent-l4s/

Can you comment on the results and the settings we used?

Cheers,
Szilveszter