Re: [tcpPrague] TSO burst sizing causing TCP Prague unfairness on high capacity links ?
"Tilmans, Olivier (Nokia - BE/Antwerp)" <olivier.tilmans@nokia-bell-labs.com> Wed, 10 June 2020 10:20 UTC
From: "Tilmans, Olivier (Nokia - BE/Antwerp)" <olivier.tilmans@nokia-bell-labs.com>
To: Ashutosh Srivastava <as12738@nyu.edu>, "tcpprague@ietf.org" <tcpprague@ietf.org>
Date: Wed, 10 Jun 2020 10:20:38 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpprague/OR-P5jJQdagFUrOZKhApnrAaH_c>
Hi Ashutosh,

> We observed unfairness between TCP Prague flows when running over high
> capacity links (not just wireless but in general) and I would like to
> share some of our findings here.

Thanks for sharing these!

> As you can observe, the second flow grabs almost all the available
> bandwidth and the first one is starved.

That's definitely not the expected behavior. It seems the first sender
somehow mis-detects an enormous number of CE marks, which keeps it at
min-cwnd. Unfortunately, I have not been able to reproduce this (on either
a virtual or a physical 10G testbed), so I would need some additional
help/details to diagnose it.

> This experiment was done using commit number e741f5a
> <https://github.com/L4STeam/linux/commit/e741f5ac756503e27be9c183dd107eadbea40c5c#diff-38ce93325583f02d790276f5cafd1c42>
> of the TCP Prague linux kernel implementation (Apr 8, 2020). After some
> investigation, we found that there might be something broken with the
> TSO burst sizing updates done by TCP Prague. I disabled the TSO burst
> size updates and ran the experiment with the exact same settings and
> found that the fairness / convergence this time was much better. (See
> next plot.)

Could you confirm that this also happens with the tip of the current
branch, i.e., 3b63cc0?

> Also, if interested you can look into the ss data plots (srtt and cwnd)
> for these two experiments at this link:
> https://drive.google.com/drive/folders/1pLC0dcMF0-M1cgtw9IhoFiYOMvFJ-cc7?usp=sharing

In addition to these, could you:
- Confirm that this happens with both the internal TCP pacing and the fq
  qdisc as pacer on the data sender?
- Confirm that this happens both with and without gro/gso/tso/lro on the
  endhosts (data sender and receiver, and also the aqm node, as fq does
  not do gro splitting prior to enqueue)?
- Confirm that the problem persists if you increase the base RTT? A simple
  netem qdisc on the reverse path adding 2-5ms should be sufficient. Do
  not forget to disable gro/gso, as they interact poorly with netem.
- Log the CE marks reported by the AQM, as well as the
  delivered_ce/received_ce counters throughout the experiment on the data
  sender/receiver?

All of these would help to pinpoint which component is at fault.

Thanks!

Best,
Olivier

>
> Thank you,
>
> Ashutosh Srivastava
> First year PhD student
> Department of Electrical and Computer Engineering
> NYU Tandon School of Engineering
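
In case it helps with scripting the checks listed above, here is a rough Python sketch of the test matrix (pacer x offloads), the extra netem delay, and a parser for the sender-side counters. It only builds command strings and does not run anything. The interface name, the use of pfifo as the non-fq root qdisc (so that TCP falls back to its internal pacing), the 2ms default, and the helper names are all assumptions for illustration. The parser assumes the `delivered:` and `delivered_ce:` fields as printed by recent iproute2 `ss -ti`, and treats a missing `delivered_ce` as zero.

```python
import itertools
import re

# Offload features to toggle together with `ethtool -K`.
OFFLOADS = ("gro", "gso", "tso", "lro")


def offload_cmd(dev, enabled):
    """Build the ethtool command that turns all listed offloads on or off."""
    state = "on" if enabled else "off"
    args = " ".join(f"{feature} {state}" for feature in OFFLOADS)
    return f"ethtool -K {dev} {args}"


def pacing_cmd(dev, pacer):
    """Build the tc command selecting the pacer on the data sender.

    "fq" installs the fq qdisc as pacer; "internal" installs a plain
    pfifo (an assumption) so the TCP stack paces internally.
    """
    if pacer not in ("fq", "internal"):
        raise ValueError(f"unknown pacer: {pacer}")
    qdisc = "fq" if pacer == "fq" else "pfifo"
    return f"tc qdisc replace dev {dev} root {qdisc}"


def netem_cmd(dev, delay_ms=2):
    """Build the tc command adding base RTT on the reverse path."""
    return f"tc qdisc add dev {dev} root netem delay {delay_ms}ms"


def experiment_matrix(dev):
    """Return one entry per (pacer, offloads) combination to test."""
    runs = []
    for pacer, offloads in itertools.product(("internal", "fq"), (True, False)):
        runs.append({
            "pacer": pacer,
            "offloads": offloads,
            "commands": [pacing_cmd(dev, pacer), offload_cmd(dev, offloads)],
        })
    return runs


def parse_ce_counters(ss_text):
    """Extract (delivered, delivered_ce) from one `ss -ti` info line."""
    delivered = re.search(r"\bdelivered:(\d+)", ss_text)
    delivered_ce = re.search(r"\bdelivered_ce:(\d+)", ss_text)
    return (
        int(delivered.group(1)) if delivered else 0,
        int(delivered_ce.group(1)) if delivered_ce else 0,
    )


if __name__ == "__main__":
    for run in experiment_matrix("eth0"):
        print(run["pacer"], "offloads on" if run["offloads"] else "offloads off")
        for cmd in run["commands"]:
            print("  ", cmd)
    print(netem_cmd("eth0", 5))
```

Sampling `parse_ce_counters` periodically on the sender and comparing the growth of delivered_ce with the AQM's own mark counter should show whether the starved flow really receives that many CE marks or only believes it does.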