Re: [tsvwg] Todays Meeting material for RTT-independence in TCP Prague

"De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com> Tue, 25 February 2020 16:46 UTC

From: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>
To: Sebastian Moeller <moeller0@gmx.de>, tsvwg IETF list <tsvwg@ietf.org>
Date: Tue, 25 Feb 2020 16:46:48 +0000
Message-ID: <AM4PR07MB34904548334F88D3E1D92452B9ED0@AM4PR07MB3490.eurprd07.prod.outlook.com>
References: <09E7F874-41FE-483E-B6AA-4403DD5DA4AB@gmx.de>
In-Reply-To: <09E7F874-41FE-483E-B6AA-4403DD5DA4AB@gmx.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/Avi7JypG31dLGCbSQVi9FuxAp7E>
Subject: Re: [tsvwg] Todays Meeting material for RTT-independence in TCP Prague

Hi Sebastian,

What I showed in the demo is what you call B) below. It was a convenient play-example to show easily that we can fully control the RTT function f(). It compensates for the 15 ms of extra latency that classic flows get because they need a bigger queue. So f()=RTT+15ms, which gives Prague and classic flows exactly the same rate if they have the same base RTT. It is not our recommended f(); it was just simple to show.
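
As a rough sketch of that play-example (not the TCP Prague code itself), the function below adds the 15 ms to the measured RTT. The names `f_demo`, `classic_rtt` and `CLASSIC_QUEUE_DELAY_MS` are made up for illustration, and the model assumes that a classic flow sees about 15 ms of queuing delay on top of its base RTT while a Prague flow sees roughly none.

```python
# Illustrative sketch only; names and the simple delay model are assumptions.
CLASSIC_QUEUE_DELAY_MS = 15.0  # extra queuing latency of classic flows, per the text


def f_demo(rtt_ms: float) -> float:
    """Option B) from the demo: f() = RTT + 15 ms."""
    return rtt_ms + CLASSIC_QUEUE_DELAY_MS


def classic_rtt(base_rtt_ms: float) -> float:
    """RTT a classic flow experiences: base RTT plus the classic queue (assumed)."""
    return base_rtt_ms + CLASSIC_QUEUE_DELAY_MS


if __name__ == "__main__":
    for base in (5.0, 20.0, 100.0):
        # The Prague flow's measured RTT is taken to be the base RTT (queue ~0 ms).
        print(base, f_demo(base), classic_rtt(base))
```

Under these assumptions the RTT that Prague reacts to matches the RTT a classic flow with the same base RTT actually experiences, which is why the two would be expected to get the same rate.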

If you want A) you need f()=max(15ms, RTT), meaning that any flow behaves as a 15ms flow (as long as its real RTT is not bigger than 15ms). We haven't tested RTT independence for flows with a real RTT larger than the target RTT. We'll leave it up to others to further test/improve the throughput for higher RTTs (which everyone seems to accept). 

The following plot shows, for A) where f()=max(15ms, RTT), the throughput for different 2-flow RTT mixes (similar to those in the paper you referred to):
https://l4steam.github.io/40Mbps%20RTT-mix%20Prague-RTT-indep.png
As you can see on the left half, flows below 15ms become RTT independent (they get the same rate). On the right half, flows with an RTT below 15ms are limited in how far they push away higher-RTT flows (100ms here): no further than a 15ms flow would.
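
A toy model of this behaviour, as a sketch under stated assumptions rather than a reproduction of the plot: `f_a` is option A), and `rate_shares` splits the link in proportion to 1/f(RTT). That proportionality is a simplifying assumption, the function names are hypothetical, and the 40 Mbps default is only taken from the plot's file name.

```python
# Illustrative sketch only; the 1/f(RTT) rate model is a simplifying assumption.
TARGET_RTT_MS = 15.0  # target RTT used for option A) in the text


def f_a(rtt_ms: float, target_ms: float = TARGET_RTT_MS) -> float:
    """Option A): f() = max(target, RTT)."""
    return max(target_ms, rtt_ms)


def rate_shares(rtts_ms, link_mbps=40.0, f=f_a):
    """Split the link capacity in proportion to 1/f(RTT) for each flow."""
    weights = [1.0 / f(rtt) for rtt in rtts_ms]
    total = sum(weights)
    return [link_mbps * w / total for w in weights]


if __name__ == "__main__":
    # Both flows below the target: the model gives them equal shares.
    print(rate_shares([1.0, 10.0]))
    # A 5 ms flow against a 100 ms flow behaves like a 15 ms flow would.
    print(rate_shares([5.0, 100.0]))
    print(rate_shares([15.0, 100.0]))
```

In this model a 1ms and a 10ms flow split the link evenly, and a 5ms flow takes no more from a 100ms flow than a 15ms flow would. Real throughput also depends on the marking probability and queue dynamics, which the sketch ignores.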

Our implementation currently has both implemented as options, plus one extra proposed function that Bob provided (gradually changing, with limited RTT independence for the lower RTTs). More on that later.

To be clear, we don't propose B), but rather something A)-like with a somewhat lower target RTT (5ms?) that still gives benefits for lower RTTs, but is also limited as Bob proposed.

Other possible solutions are:
- have the Classic AQM target at 1ms too
- have a bigger coupling factor
- make classic TCP RTT independent in the higher RTT range
- FQ
- provide RTT info in the packet header
- ...
but I don't think people will favor these in general... though where they are possible, they are still usable.

I don't understand why you keep on repeating that DualQ is broken. DualQ wants to reduce the latency for L4S, but it cannot do the same for Classic, because of limitations of Classic congestion control itself. We don't make Classic traffic RTT dependent; it is already RTT dependent, and has been since the beginning of congestion control. So my conclusion is that the problem is with TCP congestion control, which is RTT dependent, and with Classic, which is not happy with a short queue. How do you suggest solving this other than by making TCP less RTT dependent???

Regards,
Koen.

-----Original Message-----
From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of Sebastian Moeller
Sent: Friday, February 21, 2020 9:00 AM
To: tsvwg IETF list <tsvwg@ietf.org>
Subject: [tsvwg] Todays Meeting material for RTT-independence in TCP Prague

Dear All,


after today's virtual meeting I am still pondering Koen's RTT-independence day presentation. The more I am thinking about this the more confused I get about what was actually achieved. 

Was it:
A) true RTT independence between TCP Prague flows so flows of wildly differing RTTs will share an L4S AQM's LL queue fairly?

B) class RTT independence, that is adding the so far under-explained 15 ms target for L4S's non-LL queue to the internal RTT response generation in TCP Prague (which, let me be frank, would be a gross hack and solving the right problem (dualQ's failure to meet its goals robustly and reliably) in the wrong place)?

C) all of the above?

I had a look at the slides, and all I see is B) and no data for A), and IIRC the demo also focused on B); did I miss something? If you have data for A) please share with us, because B) alone is not well-described with the RTT-independence moniker.

Question: is it just me, or do others also get uneasy when a yet un-deployed transport protocol modification (TCP Prague) grows a magic +14.5ms constant somewhere in its innards to work around the existence of another under-explained 15ms constant somewhere in the innards of another yet un-deployed AQM, INSTEAD of simply fixing said un-deployed AQM to not require such an ugly hack in the first place? Are all L4S compliant transports expected to grow the same ~15ms constant? 
         What if in the future the dualQ AQM is superseded by something else, that for good justification* wants to implement a target of 5ms, do you envision all modified transport protocols to be changed?** 


	The fact of the matter is, the dual queue coupled AQM as currently implemented is broken, but I see


The rationale for why the magic f() would have been added to TCP Prague without the need to paper over dualQ's major failure was a bit thin in Koen's presentation, so please supply me with more reasons why this is a good idea and not simply the cheapest way to paper over dualQ brokenness without actual real engineering to fix the root cause.

Also, please show how these modifications make bandwidth sharing inside the LL-queue more equitable and significantly less RTT-dependent, ideally by using a mix of flows similar to that in The Good, the Bad and the WiFi: Modern AQMs in a residential setting: T.Høiland-Jørgensen, P. Hurtig, A. Brunstrom: https://www.sciencedirect.com/science/article/pii/S1389128615002479, so that your results can be compared to figure 6. Until that point I will assume that increased RTT-independence is still aspirational.

Best Regards
	Sebastian



*) I note again, that the CODEL RFC has a section that gives some rationale why 5ms is a reasonable target value for flows in the 20-200ms RTT range, and that the PIE proponents have not presented any clear study demonstrating that the chosen 15ms is optimal in any dimension, which would be interesting as DOCSIS-PIE actually seems to default to 10ms...


**) This is another sticking point: I have asked the L4S team repeatedly to use their test-bed (which should make testing different configurations a breeze) to measure between-class fairness and link-utilization between the LL- and the non-LL queues for short, medium and long RTTs with the non-LL queue's target set to 5ms. 
         And so far all I hear is something along the lines of, if that interests me, I could do my own tests. My interpretation is that either the test bed is far less flexible and easy to use, or there is the fear that the 5ms data would reveal something unpleasant?