Re: [tsvwg] Prague requirements survey

"De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com> Sun, 18 April 2021 10:11 UTC

From: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>
To: Vidhi Goel <vidhi_goel@apple.com>, Sebastian Moeller <moeller0@gmx.de>
CC: tsvwg IETF list <tsvwg@ietf.org>
Thread-Topic: [tsvwg] Prague requirements survey
Date: Sun, 18 Apr 2021 10:11:38 +0000
Message-ID: <AM8PR07MB7476F513E7A6551F27DC7295B94A9@AM8PR07MB7476.eurprd07.prod.outlook.com>
References: <AM8PR07MB7476A907FDD0A49ADBD7CA7EB9BD0@AM8PR07MB7476.eurprd07.prod.outlook.com> <SN2PR00MB017475FC0E8C13754E531E17B6B69@SN2PR00MB0174.namprd00.prod.outlook.com> <AM8PR07MB7476FAE559719D241375A816B9B19@AM8PR07MB7476.eurprd07.prod.outlook.com> <HE1PR0701MB22999C8C05ECA3D995FA7FFEC28F9@HE1PR0701MB2299.eurprd07.prod.outlook.com> <AM8PR07MB7476E0EB3FC368D3C69A5466B98F9@AM8PR07MB7476.eurprd07.prod.outlook.com> <DBBPR07MB7481E1026CDE30D494856F15B9989@DBBPR07MB7481.eurprd07.prod.outlook.com> <AM8PR07MB7476FAEF53518DBFE457AC62B9949@AM8PR07MB7476.eurprd07.prod.outlook.com> <AM8PR07MB747629F14C5AEC5B47F40F56B94C9@AM8PR07MB7476.eurprd07.prod.outlook.com> <92C476A6-3E60-498B-A088-EF24E4B077AC@gmx.de> <83EC2DB8-C42F-4B1D-80C0-F01C2D393A9F@apple.com> <BB1A6362-FB51-471A-BF50-18C882C303E5@gmx.de> <DB7101BA-839C-44E2-B76E-C04F7963B5E5@apple.com>
In-Reply-To: <DB7101BA-839C-44E2-B76E-C04F7963B5E5@apple.com>
Accept-Language: nl-BE, en-US
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/ja9KA3tHGQxT5NNdu0cKYVaNTb0>
Subject: Re: [tsvwg] Prague requirements survey

Hi Sebastian, Vidhi,

Some background:

RTT (in)dependence is an end-point property that can best be corrected in the endpoints. It could be solved in the network if:
- the network can identify flows and schedule them accordingly: FQ_x
- the endpoints add more info (RTT) in the packet headers: RCP, XCP, ...
- the network can identify flows and adapts the marking/drop to let flows converge to the same throughput: CSAQM, ...
- the network can identify flows and "estimate/guess" their RTT and adapt the marking/drop probability: I'm sure you will find research...
- ...

So unless we further expand headers, or make it the network's responsibility to identify and schedule flows, measure rates or guess RTTs, we had better solve the problem in the entity that causes it: the endpoint. From here on I assume we restrict ourselves to that option:

Sebastian, as you mentioned, there has been a lot of research in the past, which resulted in a lot of solutions. The problem is not that these don't work, but that converging at a lower rate for the lower-end RTTs requires a clean-slate reset of all endpoints, as nobody would start doing that unilaterally. L4S is a clean-slate starting point where we set new rules. During the first Prague meeting there was a lot of enthusiasm, and there were many proposals, precisely because of this new opportunity. So if we want to introduce RTT independence, this is the moment, and all this previous research can now be used and deployed.

As L4S removes the queue completely when competing with Classic over a DualQ, and limits it to 1 ms when not, the previous role of the (large) queue in evening out RTT unfairness disappears completely. So even in an L4S-only world, the unfairness would be unsustainable and needs a solution. As this solution also resolves the imbalance created by the DualQ, it is part of the total concept (the imbalance could otherwise only be solved by setting both queues to the same RTT target, which would defeat the purpose of the two queues and of the DualQ altogether).

RTT independence then means that we need to converge to a (more) equal rate when RTTs are different. This means that, based on the marking signal, we need to agree on a common marking rate (which automatically emerges when a marker marks packets with equal probability and all flows have an equal rate). If such a marking rate in marks per second is defined, it can automatically be translated into a reference RTT by taking the marks per RTT of an existing congestion control into account. This is not a trick or a hack; it is just a result of the concept (RTT independence).

We can discuss a reference in marks per second (marking rate) or in terms of a Reference RTT (DCTCP AIMD, which converges to a fixed 2 marks per RTT, being a good reference base CC for that purpose). So "pretending" to be a 25 ms flow is exactly what we do if we set the Reference RTT to 25 ms, or set the marking rate to 80 marks per second. For evenly distributed marks this would be around 12.5 ms per mark, which would be a very frequent signal for converging to a fair rate. 25 ms is also a useful number because it is a practical lower limit for Classic RTTs on congested links, and it requires no changes for the Classic flows.

It is of course an additional opportunity for Classic CCs to also increase the rate for higher RTTs (but not necessarily for lower RTTs, although this wouldn't hurt them much, as under congestion they wouldn't see lower RTTs on the Internet anyway). CUBIC was designed as a fairness/performance compromise relative to Reno on the longer RTTs. Had it been acceptable at that time, its designers would have set the compromise more towards performance. Today most traffic comes from nearby datacenters, so most traffic experiences less than 50 ms latency, and I believe setting a reference RTT of 25 ms also for higher RTTs would be completely acceptable today.
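
The arithmetic above can be checked with a small sketch. It is only an illustration under simplifying assumptions: a steady state with marking probability p, a DCTCP-like flow that settles at 2 marks per RTT, and hypothetical function names that are not taken from any implementation. With a per-RTT target, the window settles at 2/p packets, so the rate is 2/(p*RTT) and depends on the RTT; with a per-second target of M marks, the rate settles at M/p regardless of RTT.

```python
import math

MARKS_PER_RTT = 2.0  # DCTCP-style steady state mentioned in the text


def marking_rate(reference_rtt_s, marks_per_rtt=MARKS_PER_RTT):
    """Marks per second that correspond to a given Reference RTT."""
    return marks_per_rtt / reference_rtt_s


def reference_rtt(marks_per_second, marks_per_rtt=MARKS_PER_RTT):
    """Reference RTT (seconds) that corresponds to a given marking rate."""
    return marks_per_rtt / marks_per_second


def rate_with_per_rtt_target(p, rtt_s, marks_per_rtt=MARKS_PER_RTT):
    """Packets/s when a flow settles at marks_per_rtt marks per own RTT.

    Window W satisfies p * W = marks_per_rtt, and rate = W / RTT.
    """
    return marks_per_rtt / (p * rtt_s)


def rate_with_per_second_target(p, marks_per_second):
    """Packets/s when a flow settles at a fixed number of marks per second.

    Rate r satisfies p * r = marks_per_second, independent of the RTT.
    """
    return marks_per_second / p


if __name__ == "__main__":
    m = marking_rate(0.025)
    print(f"25 ms reference RTT -> {m:.1f} marks/s, one mark every {1000 / m:.1f} ms")
    p = 0.01  # assumed marking probability, for illustration only
    for rtt in (0.005, 0.025, 0.050):
        print(
            f"RTT {rtt * 1000:4.0f} ms: per-RTT target {rate_with_per_rtt_target(p, rtt):8.0f} pkt/s, "
            f"per-second target {rate_with_per_second_target(p, m):8.0f} pkt/s"
        )
```

In this simplified model the two targets coincide for a flow whose RTT equals the Reference RTT, which is the sense in which a marking rate and a Reference RTT are two views of the same parameter.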

As a final remark, converging to a steady-state rate was in the past always seen as a property of a single mechanism (AIMD of +1 and /2 for Reno, and +cubic(t) and *0.7 for Cubic). I believe we are past simple single-ACK response mechanisms (see BBR, ...): models based on measurements and different states now adapt the response and select an appropriate mechanism. When we detect that we are out of steady state (0% or 100% marking for a while), the selected mechanism can be RTT dependent (getting up to speed, avoiding latency, ...); once back in sync with the steady marking rate, the RTT-independent response can be selected (whatever the mechanism is).
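
A minimal sketch of such a state-based selection is given below. It is an illustration only: the class name, the mode labels, and the reading of "for a while" as a fixed number of consecutive observation rounds are all assumptions, not part of any existing congestion control.

```python
from collections import deque


class ResponseSelector:
    """Pick an RTT-dependent or RTT-independent response from recent marking.

    Hypothetical helper: 'window' is the number of consecutive rounds that
    must show 0% or 100% marking before we consider ourselves out of steady
    state. The value is an assumption chosen for illustration.
    """

    def __init__(self, window=4):
        self.fractions = deque(maxlen=window)

    def update(self, marked, acked):
        """Record one round (marked and acked packet counts), return the mode."""
        if acked > 0:
            self.fractions.append(marked / acked)
        full = len(self.fractions) == self.fractions.maxlen
        saturated = full and (
            all(f == 0.0 for f in self.fractions)
            or all(f == 1.0 for f in self.fractions)
        )
        # Out of steady state: use the RTT-dependent mechanism (e.g. to get
        # up to speed or to avoid latency); otherwise use the RTT-independent one.
        return "rtt_dependent" if saturated else "rtt_independent"


if __name__ == "__main__":
    selector = ResponseSelector(window=3)
    for marked, acked in [(0, 10), (0, 10), (0, 10), (2, 10), (10, 10)]:
        print(marked, acked, selector.update(marked, acked))
```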

Hope this clarifies.

Koen.

-----Original Message-----
From: Vidhi Goel <vidhi_goel@apple.com> 
Sent: Sunday, April 18, 2021 4:48 AM
To: Sebastian Moeller <moeller0@gmx.de>
Cc: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; tsvwg IETF list <tsvwg@ietf.org>
Subject: Re: [tsvwg] Prague requirements survey

Hi Sebastian,

>> I think the simple proposal in Linux (which you already know) is a good starting point. A
> 
> 	Are we talking about the same Linux proposal here (https://github.com/L4STeam/linux/commit/a2ef76f8da1c9d1b13fa941f55607f3e60d4112e)? Where TCP Prague is instructed to basically behave as if ("pretend") there were a fixed lower-bound RTT when growing the congestion window? IMHO that is not a good starting point for fixing RTT-bias, given that over the internet RTTs easily range from low single-digit to low triple-digit ms. Unless we are prepared to set, say, 100 ms as our lower-bound RTT, this approach will not work for the internet; but once we do that, we give up one of the advantages that high-frequency congestion signaling promises: faster reactions to congestion signals.
> 
> 	I want to note that TCP Prague by default only tries to equalize RTTs up to 25 ms, which indicates that not even its developers consider it a generic solution (or they believe that 25 ms is a "magic" RTT on the internet). I also note that the root cause for adding that feature to TCP Prague was/is the fact that the dual queue coupled AQM failed to properly share capacity between its two queues at low RTTs. This failure is rationalized as being an effect of RTT-bias caused by the difference in queueing delay between the two queues (~1 ms for the LL queue, ~20 ms for the classic queue), and the proposed solution is to make TCP Prague not grow its congestion window faster than a 25 ms RTT flow. In other words, this is not meaningfully addressing RTT-bias, but is fixing a deficiency in L4S's reference AQM*.

Yes, we are talking about the same proposal.
At the time I read the Linux Prague proposal, I didn’t realize the rationale behind it and now I understand it better with your reasoning. I agree that we should not fix RTT bias which is purely created by the L4S dual queue.

>> s a community, we might come up with more heuristics / tunable parameters to handle edge cases.
> 
> 	Sorry, for the last decades people have worked on RTT-bias and no generic solution based solely on end-point actions has been found. I am not saying that this is impossible, but it is quite unlikely that this is easy enough for the community to come up with a solution. And TCP Prague is IMHO not a promising contender for a generic solution.
> 	IMHO, the problem is that the issue is not caused by the endpoints in the first place, but by the interaction of control loops of different "fidelity"/reaction times in bottleneck buffers. This can easily be seen in that a properly configured TCP flow can approach bottleneck capacity when run as the sole flow over a bottleneck, but will be suppressed if competing with TCP flows of shorter RTT in the same bottleneck. It hence seems clear that management of the bottleneck is at least as important to counter RTT-bias as the endpoints' control loops. The L4S approach of relegating the issue solely to the endpoints/protocols to fix, instead of also making the AQM part of the solution, strikes me as short-sighted, especially in light of deployment of an AQM being one of the core pillars of the L4S design.
The problem of RTT-unfairness arises from different ACK clocking speeds based on RTT. If the propagation delay is different for two flows, then there is nothing that an AQM can do. OTOH, if the propagation delay is the same for two flows, and it is really the buffering (queuing) delay that is causing RTT unfairness, then I agree with you that we should solve this problem at the bottleneck.
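
As a rough illustration of the ACK-clocking point, consider a Reno-style additive increase of one segment per RTT. Since rate = window / RTT, the sending rate then grows by 1/RTT^2 segments per second for every second that passes. The helper below is a hypothetical sketch under that simple assumption and ignores losses, marks, and queuing delay.

```python
import math


def rate_growth_per_second(rtt_s, segments_per_rtt=1.0):
    """Growth of the sending rate (segments/s gained per second).

    Assumes the window grows by segments_per_rtt every RTT, paced by
    returning ACKs, and that rate = window / RTT.
    """
    return segments_per_rtt / (rtt_s ** 2)


if __name__ == "__main__":
    short_rtt, long_rtt = 0.010, 0.100  # illustrative RTTs of 10 ms and 100 ms
    ratio = rate_growth_per_second(short_rtt) / rate_growth_per_second(long_rtt)
    print(f"A {short_rtt * 1000:.0f} ms flow ramps up about {ratio:.0f}x faster "
          f"than a {long_rtt * 1000:.0f} ms flow")
```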

I believe you are concerned about the latter scenario, and yes, in this case we should not try to solve the RTT bias at the endpoint, as that could be counterproductive to what we are trying to achieve with scalable congestion controllers.

Thanks,
Vidhi

> On Apr 16, 2021, at 3:45 PM, Sebastian Moeller <moeller0@gmx.de> wrote:
> 
> Hi Vidhi,
> 
> 
>> On Apr 16, 2021, at 23:21, Vidhi Goel <vidhi_goel@apple.com> wrote:
>> 
>> Hi Sebastian,
>> 
>>> If this is easy to implement, could you please propose a description of such a solution to the mailing list? As far as I can tell, RTT-bias has been a topic of research for decades and still no general solution has been presented, so I am quite interested to learn more about this comment. Even if the response is something like "for the expected range of RTTs from 1 ms to 20 ms, use a solution like TCP Prague's: pretend all RTTs are 20 ms", I am quite interested in Apple's thoughts.
>> 
>> I think the simple proposal in Linux (which you already know) is a good starting point. A
> 
> 	Are we talking about the same Linux proposal here (https://github.com/L4STeam/linux/commit/a2ef76f8da1c9d1b13fa941f55607f3e60d4112e)? Where TCP Prague is instructed to basically behave as if ("pretend") there were a fixed lower-bound RTT when growing the congestion window? IMHO that is not a good starting point for fixing RTT-bias, given that over the internet RTTs easily range from low single-digit to low triple-digit ms. Unless we are prepared to set, say, 100 ms as our lower-bound RTT, this approach will not work for the internet; but once we do that, we give up one of the advantages that high-frequency congestion signaling promises: faster reactions to congestion signals.
> 
> 	I want to note that TCP Prague by default only tries to equalize RTTs up to 25 ms, which indicates that not even its developers consider it a generic solution (or they believe that 25 ms is a "magic" RTT on the internet). I also note that the root cause for adding that feature to TCP Prague was/is the fact that the dual queue coupled AQM failed to properly share capacity between its two queues at low RTTs. This failure is rationalized as being an effect of RTT-bias caused by the difference in queueing delay between the two queues (~1 ms for the LL queue, ~20 ms for the classic queue), and the proposed solution is to make TCP Prague not grow its congestion window faster than a 25 ms RTT flow. In other words, this is not meaningfully addressing RTT-bias, but is fixing a deficiency in L4S's reference AQM*.
> 
>> s a community, we might come up with more heuristics / tunable parameters to handle edge cases.
> 
> 	Sorry, for the last decades people have worked on RTT-bias and no generic solution based solely on end-point actions has been found. I am not saying that this is impossible, but it is quite unlikely that this is easy enough for the community to come up with a solution. And TCP Prague is IMHO not a promising contender for a generic solution.
> 	IMHO, the problem is that the issue is not caused by the endpoints in the first place, but by the interaction of control loops of different "fidelity"/reaction times in bottleneck buffers. This can easily be seen in that a properly configured TCP flow can approach bottleneck capacity when run as the sole flow over a bottleneck, but will be suppressed if competing with TCP flows of shorter RTT in the same bottleneck. It hence seems clear that management of the bottleneck is at least as important to counter RTT-bias as the endpoints' control loops. The L4S approach of relegating the issue solely to the endpoints/protocols to fix, instead of also making the AQM part of the solution, strikes me as short-sighted, especially in light of deployment of an AQM being one of the core pillars of the L4S design.
> 
>> https://l4steam.github.io/PragueReqs/Linux_TCP_Prague_L4S_requirements_Compliance_and_Objections.pdf
> 
> Best Regards
> 	Sebastian
> 
> *) And doing so before actual deployment, at a point in time when that AQM could actually still be fixed for good.
> 
> 
>> 
>> Thanks,
>> Vidhi
>> 
>>> On Apr 16, 2021, at 7:16 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>> 
>>> Hi Koen,
>>> 
>>> Thanks.
>>> 
>>> Here is a question for Apple though:
>>> 
>>> "5. Reduce RTT dependence (A1.5)
>>> Section 4.3: A scalable congestion control MUST eliminate RTT bias as much as possible in the range between the minimum likely RTT and typical RTTs expected in the intended deployment scenario.
>>> Apple's comment:
>>> Again, agreed with the rationale behind this and the MUST compliance. This might be easy to implement as well based on heuristics but will require thorough testing."
>>> 
>>> 
>>> If this is easy to implement, could you please propose a description of such a solution to the mailing list? As far as I can tell, RTT-bias has been a topic of research for decades and still no general solution has been presented, so I am quite interested to learn more about this comment. Even if the response is something like "for the expected range of RTTs from 1 ms to 20 ms, use a solution like TCP Prague's: pretend all RTTs are 20 ms", I am quite interested in Apple's thoughts.
>>> 
>>> Best Regards
>>> 	Sebastian
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On Apr 16, 2021, at 14:52, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> An update on the survey is available. We received an additional input from Apple which we could share publicly (thanks Vidhi for providing this input). I also updated the consolidated view v2 (available on https://github.com/L4STeam/l4steam.github.io#prague-requirements-compliance).
>>>> 
>>>> I believe it is strongly in line with the previous survey conclusions as presented at the last tsvwg meeting. One main additional piece of feedback was on “7. Measuring Reordering Tolerance in Time Units”. There was disagreement that using time only, and not packet count, is a foolproof solution. As far as I understand, the objection is to the current wording, which says that a time-based mechanism is the only/sufficient way to assure this.
>>>> 
>>>> The objective of this requirement is to allow a certain level of reordering for L4S traffic (in practice, to avoid delaying packets in the network just to guarantee in-order delivery). I personally could support wording that expresses the core of the requirement and does not limit the text to one mechanism, which would allow alternative/more robust implementations. The requirement could be expressed as something like: “a scalable congestion control SHOULD be resilient to reordering over an (adaptive) (time?) interval, which scales with / adapts to throughput, as opposed to counting only in (fixed) units of packets (as in the 3 DupACK rule of RFC 5681 TCP), which is not scalable”. Let’s discuss further here on the list what wording could be acceptable to all parties.
>>>> 
>>>> Thanks,
>>>> Koen.
>>>> 
>>>> 
>>>> From: De Schepper, Koen (Nokia - BE/Antwerp) 
>>>> Sent: Sunday, March 7, 2021 1:57 AM
>>>> To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; tsvwg IETF list <tsvwg@ietf.org>
>>>> Cc: Bob Briscoe <ietf@bobbriscoe.net>
>>>> Subject: RE: Prague requirements survey
>>>> 
>>>> Hi all,
>>>> 
>>>> The details of the consolidated view of all feedback received are available at the following link: https://l4steam.github.io/PragueReqs/Prague_requirements_consolidated.pdf
>>>> 
>>>> The only strong objections were against the “MUST document” requirements, which will be removed from the next version of the draft. Some clarifications were asked for and have been (or will be) added.
>>>> For two requirements there was broad consensus that they should be developed and evolved as needed during the experiment.
>>>> All other requirements already had implementations or, if not, were seen as feasible/realizable and were planned to be implemented.
>>>> 
>>>> We will present an overview during the meeting.
>>>> 
>>>> Regards,
>>>> Koen.
>>>> 
>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>>> Sent: Wednesday, March 3, 2021 2:20 PM
>>>> To: tsvwg IETF list <tsvwg@ietf.org>
>>>> Subject: Re: [tsvwg] Prague requirements survey
>>>> 
>>>> Hi all,
>>>> 
>>>> We have received several surveys privately, for which I tried to get the approval for sharing those on the overview page: l4steam.github.io | L4S-related experiments and companion website
>>>> 
>>>> Thanks to NVIDIA for sharing their view and feedback for their GeforceNow congestion control. Their feedback was added to the above overview about a week ago. As we didn’t get the explicit approval for the others, we will share and present a consolidated view of all feedback received later and during the meeting.
>>>> 
>>>> Note: pdf versions are now also available on the above page for easier reading.
>>>> 
>>>> Koen.
>>>> 
>>>> 
>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>>> Sent: Monday, February 8, 2021 2:37 PM
>>>> To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>; tsvwg IETF list <tsvwg@ietf.org>
>>>> Subject: Re: [tsvwg] Prague requirements survey
>>>> 
>>>> Hi Ingemar,
>>>> 
>>>> Thanks for your contributions. I linked your doc to the https://l4steam.github.io/#prague-requirements-compliance web page (and will do so for others).
>>>> 
>>>> I didn’t see any issues or objections mentioned to the current requirements as specified in the draft. Does this mean you think they are all reasonable, valid and feasible?
>>>> 
>>>> Interesting observation (related to the performance optimization topic 1) that for the control packets “RTCP is likely not using ECT(1)”. Why is this not likely? I assume this will impact the performance? Do we need to recommend the use of ECT(1) on RTCP packets in the draft?
>>>> 
>>>> Thanks,
>>>> Koen.
>>>> 
>>>> From: Ingemar Johansson S <ingemar.s.johansson@ericsson.com> 
>>>> Sent: Monday, February 8, 2021 10:59 AM
>>>> To: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; tsvwg IETF list <tsvwg@ietf.org>
>>>> Cc: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
>>>> Subject: RE: Prague requirements survey
>>>> 
>>>> Hi
>>>> Please find attached (hopefully) a Prague requirements survey applied to SCReAM (RFC8298 std + running code)
>>>> 
>>>> Regards
>>>> Ingemar
>>>> 
>>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>>> Sent: den 6 februari 2021 23:20
>>>> To: tsvwg IETF list <tsvwg@ietf.org>
>>>> Subject: [tsvwg] Prague requirements survey
>>>> 
>>>> Hi all,
>>>> 
>>>> To get a better understanding on the level of consensus on the Prague requirements, we prepared an overview document listing the L4S-ID draft requirements specific to the CC (wider Prague requirements), as a questionnaire towards potential CC developers. If you are developing or have developed an L4S congestion control, you can describe the status of your ongoing development in the second last column. If you cannot share status, or plan-to/would implement an L4S CC, you can list what you would want to support (see feasible). In the last column you can put any description/limitations/remarks/explanations related to evaluations, implementations and/or plans (will implement or will not implement). Any expected or experienced issues and any objections/disagreements to the requirement can be explained and colored appropriately.
>>>> 
>>>> The document can be found on following link: https://raw.githubusercontent.com/L4STeam/l4steam.github.io/master/PragueReqs/Prague_requirements_Compliance_and_Objections_template.docx
>>>> 
>>>> As an example I filled it for the Linux TCP-Prague implementation on following link: https://l4steam.github.io/PragueReqs/Prague_requirements_Compliance_and_Objections_Linux_TCP-Prague.docx
>>>> 
>>>> Please send your filled document to the list (Not sure if an attachment will work, so I assume you also need to store it somewhere and send a link to it, or send to me directly).
>>>> 
>>>> We hope to collect many answers, to understand the positions of the different (potential) implementers, and to reach consensus faster.
>>>> 
>>>> Thanks,
>>>> Koen.
>>> 
>> 
>