Re: [tsvwg] Related to "Non-L4S traffic abusing the L-queue" discussion during the interim

Sebastian Moeller <moeller0@gmx.de> Fri, 25 February 2022 22:15 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <CAA93jw4CtiYjBg9RAFuOjJHX4T7aUQ07KdetWSgKrNgJg=DPPA@mail.gmail.com>
Date: Fri, 25 Feb 2022 23:14:23 +0100
Cc: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>, tsvwg IETF list <tsvwg@ietf.org>, Bob Briscoe <in@bobbriscoe.net>
Content-Transfer-Encoding: quoted-printable
Message-Id: <EBA8389B-CCD2-40E3-8D6D-2BC7327117C1@gmx.de>
To: Dave Täht <dave.taht@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/5bjGH1ncbdgfwn0k3wk8pTvRho4>
Subject: Re: [tsvwg] Related to "Non-L4S traffic abusing the L-queue" discussion during the interim

Hi Dave,


> On Feb 25, 2022, at 22:06, Dave Taht <dave.taht@gmail.com> wrote:
> 
> while I do not want to spend much time nitpicking this document...
> 
> "causing most of the time tail-drop" stood out. codel, fq_codel, cake
> all do head drop, and always have.

	I think that codel (as opposed to fq_codel configured with a single hash bucket) actually drops the newly arriving packet in its enqueue routine on overload. I would guess this is because most users followed Van's advice to use fq_codel instead... so in a sense sch_codel might be more documentation than a qdisc hardened by exposure to real usage. The linked page seems to indicate that fq_codel(1) was used, but it does not seem to say so for the figures we are discussing here.
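
	To make the head-drop versus tail-drop distinction concrete, here is a toy sketch. It is not the sch_codel or fq_codel code; the class `ToyQueue`, its parameters, and the limit of 3 packets are all made up for illustration. It only models what happens when a packet arrives at a full queue, not CoDel's sojourn-time logic.

```python
from collections import deque


class ToyQueue:
    """Bounded FIFO that sheds overload either at the tail or at the head.

    Hypothetical illustration only; not modelled on any real qdisc source.
    """

    def __init__(self, limit, drop_from_head):
        self.limit = limit
        self.drop_from_head = drop_from_head
        self.packets = deque()
        self.dropped = []

    def enqueue(self, packet):
        if len(self.packets) < self.limit:
            self.packets.append(packet)
            return
        if self.drop_from_head:
            # Head drop: discard the oldest queued packet, keep the new one.
            self.dropped.append(self.packets.popleft())
            self.packets.append(packet)
        else:
            # Tail drop: the arriving packet itself is discarded.
            self.dropped.append(packet)


def run(drop_from_head):
    """Offer packets 0..4 to a 3-packet queue and report (queued, dropped)."""
    q = ToyQueue(limit=3, drop_from_head=drop_from_head)
    for seq in range(5):
        q.enqueue(seq)
    return list(q.packets), q.dropped


tail_result = run(drop_from_head=False)
head_result = run(drop_from_head=True)
print("tail drop:", tail_result)
print("head drop:", head_result)
```

	With tail drop the queue keeps the oldest packets and the sender learns about the loss only after the standing queue has drained; with head drop the oldest packets go first, so the loss signal reaches the receiver sooner.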

Regards
	Sebastian


> 
> On Fri, Feb 25, 2022 at 4:02 PM Dave Taht <dave.taht@gmail.com> wrote:
>> 
>> I am happy to see some results of different scenarios here, and have
>> some suggestions as to others.
>> 
>> On Fri, Feb 25, 2022 at 1:30 PM De Schepper, Koen (Nokia - BE/Antwerp)
>> <koen.de_schepper@nokia-bell-labs.com> wrote:
>>> 
>>> Hi David,
>>> 
>>> 
>>> 
>>> To be sure, we re-did the overload tests recently, confirming the previous overload results. These results are available at: Overload results caused by non-responsive UDP traffic for PIE, DualPI2 and CoDel AQMs | l4steam.github.io
>> 
>> It really helps if you document what codebase you are testing. As one
>> example among many, fq_codel has a drop_batch facility now. An
>> enhancement to that, in place for 4? 5 years?, is that the codel
>> count gets incremented when drop_batch is hit.
>> 
>> The article sets flows=1 for fq_codel, and then makes claims that it
>> has no overload protection. The "overload protection" is in the fq
>> portion of the algorithm. I have long thought that, were ECN of any
>> form to become a thing in the real world, it would be best to also
>> modify codel to do "drop and mark" in the RTT seeking portion of the
>> algo.
>> 
>> Now, if you are trying to make the point that having a queue selector
>> that is kept from hop to hop and allows for differential treatment is
>> a win, I'd rather like to see results for a queue size that is
>> actually sufficient for pie/dualpi to operate at 10Gbit or above,
>> working at a mix of 10 to 60ms or greater RTTs. I assume the packet
>> limit for dualpi is 1000, shared between the two? What happens on this
>> test at queue 1000 at 10Gbit, or queue 10000 at 100Mbit? [1]
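
	For a rough feel of the two configurations asked about above, the following sketch converts a packet limit into the time a full queue takes to drain. It assumes every packet is 1500 bytes, which is an assumption of this example and not stated in the thread (GSO/GRO aggregates would change the numbers considerably); the function name is hypothetical.

```python
def drain_time_ms(packet_limit, link_bps, packet_bytes=1500):
    """Time to drain a full queue of equal-sized packets, in milliseconds.

    packet_bytes=1500 is an assumed MTU-sized packet, not a value
    taken from the tests under discussion.
    """
    return packet_limit * packet_bytes * 8 / link_bps * 1000


cases = {
    "1000 pkts @ 10 Gbit/s": drain_time_ms(1000, 10e9),
    "10000 pkts @ 100 Mbit/s": drain_time_ms(10000, 100e6),
}
for label, ms in cases.items():
    print(f"{label}: {ms:.1f} ms")
```

	Under that assumption, 1000 packets at 10 Gbit/s hold only about 1.2 ms of traffic, well below the 10 to 60 ms RTTs mentioned, while 10000 packets at 100 Mbit/s hold about 1.2 s.
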
>> 
>> In terms of future scaling issues for all these new forms of tcp and
>> qdiscs it would help if y'all (and by this I mean all parties here)
>> were testing 1gbit, 10gbit, and beyond traffic, at this point. The
>> default packet limit in codel is tuned to 10Gbit+, and has largely
>> been supplanted by a memory limit (as in cake) due to the prevalence
>> of gso/gro traffic.
>> 
>> Also, seeing the work on BIG TCP at Google, how well is anything
>> working with 4k packets?
>> 
>> [1] I've longed for a good rrul (simultaneous up/down) test on an
>> asymmetric network. 1Gbit down/50mbit up being common now.
>> 
>> 
>> 
>>> 
>>> 
>>> 
>>> Specifically, look at figure 8 at the end, which shows that L4S traffic gets marked, up to 100%, and is dropped appropriately once it reaches and exceeds the link capacity.
>>> 
>>> 
>>> 
>>> The test case of Jonathan is approximated by the test with 70Mbps of non-responsive ECT(1) UDP traffic on a 100Mbps link with DualPI2 (Prague+Cubic). In Jonathan’s case it was 40Mbps on a 50Mbps link. We also evaluated the extremes: sending non-responsive ECT(1) UDP traffic at 100Mbps on a 100Mbps link, and even exceeding it at 140Mbps and 200Mbps. You will see that the results are the same as on a single-queue PIE AQM. Note also that CoDel, which never drops ECT packets, actually causes close to starvation and high tail-drop delay, as shown in figure 1, even with ECT(0). So I guess all the concerns about FQ_CoDel and tunnels/hash collisions are equally severe and not related to L4S alone (they can already be exploited by ECT(0) traffic today!!).
>>> 
>>> 
>>> 
>>> Koen.
>>> 
>>> 
>>> 
>>> From: Black, David <David.Black@dell.com>
>>> Sent: Friday, February 25, 2022 7:04 PM
>>> To: Neal Cardwell <ncardwell@google.com>
>>> Cc: De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com>; tsvwg IETF list <tsvwg@ietf.org>; Jonathan Morton <chromatix99@gmail.com>; Bob Briscoe <in@bobbriscoe.net>; Black, David <David.Black@dell.com>
>>> Subject: RE: [tsvwg] Related to "Non-L4S traffic abusing the L-queue" discussion during the interim
>>> 
>>> 
>>> 
>>> Hi Neal,
>>> 
>>> 
>>> 
>>> So, I saw that explanation – could someone check the "running code" to make sure that the coupling and marking occur even when the L queue is always empty?
>>> 
>>> 
>>> 
>>> Thanks, --David
>>> 
>>> 
>>> 
>>> From: Neal Cardwell <ncardwell@google.com>
>>> Sent: Friday, February 25, 2022 12:58 PM
>>> To: Black, David
>>> Cc: De Schepper, Koen (Nokia - BE/Antwerp); tsvwg IETF list; Jonathan Morton; Bob Briscoe
>>> Subject: Re: [tsvwg] Related to "Non-L4S traffic abusing the L-queue" discussion during the interim
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Fri, Feb 25, 2022 at 11:56 AM Black, David <David.Black@dell.com> wrote:
>>> 
>>> Koen,
>>> 
>>> 
>>> 
>>> I'll observe that "traffic that is not responding at all to CE marks" is not necessary to achieve the reported results if the experimental setup "prevents the L queue from seeing any need to apply congestion signals, because it is always empty" as there would be no CE marks for the traffic in the L queue to respond to.
>>> 
>>> 
>>> 
>>> I think the key part here is "if". :-) The assertion "prevents the L queue from seeing any need to apply congestion signals, because it is always empty" is from:
>>> 
>>>  https://sce.dnsmgr.net/downloads/L4S-WGLC2-objection-details.pdf [sce.dnsmgr.net]
>>> 
>>> That assertion is inconsistent with the functioning of the Dual-Q algorithm, as described in:
>>> 
>>>  https://www.ietf.org/id/draft-ietf-tsvwg-aqm-dualq-coupled-21.html [ietf.org]
>>> 
>>> 
>>> 
>>> As Bob noted: "in the scenario shown, although the L queue is indeed always empty, it will see a high level of congestion signals (~10% in this case) via the coupling."
>>> 
>>> Here's Bob's e-mail for more context/details:
>>> 
>>>  https://mailarchive.ietf.org/arch/msg/tsvwg/joFr3sfOrxxkYhWdYrO2rLlCNUw/ [mailarchive.ietf.org]
>>> 
>>> 
>>> 
>>> thanks,
>>> 
>>> neal
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Please give that further consideration.
>>> 
>>> 
>>> 
>>> Thanks, --David (as an individual)
>>> 
>>> 
>>> 
>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>> Sent: Friday, February 25, 2022 4:29 AM
>>> To: tsvwg IETF list; Jonathan Morton
>>> Subject: Re: [tsvwg] Related to "Non-L4S traffic abusing the L-queue" discussion during the interim
>>> 
>>> 
>>> 
>>> 
>>> Hi Jonathan,
>>> 
>>> 
>>> 
>>> Can you confirm that this test is done with “Cubic” traffic that is not responding at all to CE marks? So it is just like any other non-responding traffic (like UDP CBR). We don’t see any other way to explain your results.
>>> 
>>> 
>>> 
>>> If so, we can/should remove this “issue” from the shepherd’s write-up, as such unresponsive flows will get the same throughput on any single-Q bottleneck with or without AQM (taildrop/PI2/PIE/CoDel/STEP/RED/…) with a latency that matches the AQM strategy.
>>> 
>>> 
>>> 
>>> Koen.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> From: tsvwg <tsvwg-bounces@ietf.org> On Behalf Of De Schepper, Koen (Nokia - BE/Antwerp)
>>> Sent: Thursday, February 17, 2022 7:01 PM
>>> To: tsvwg IETF list <tsvwg@ietf.org>; Jonathan Morton <chromatix99@gmail.com>
>>> Subject: [tsvwg] Related to "Non-L4S traffic abusing the L-queue" discussion during the interim
>>> 
>>> 
>>> 
>>> Hi Jonathan,
>>> 
>>> 
>>> 
>>> It seems that the following open issue identified by the chairs:
>>> 
>>> 
>>> 
>>> Non-L4S traffic abusing the L-queue
>>> 
>>> • ‘DualQ gives a large throughput bonus to L queue traffic, ie. a “fast lane”’
>>> 
>>> • Is this a matter specific for DualQ that can be left for experimentation?
>>> 
>>> 
>>> 
>>> is based on the following experiment you performed:
>>> 
>>> 
>>> 
>>>>            simple two-flow competition test on a standard dumbbell topology,
>>>>            with the bottleneck running a DualQ qdisc into a 50Mbps shaper.
>>>>            Both flows were configured to use CUBIC congestion control with
>>>>            ECN negotiated, but one was additionally tweaked to set ECT(1)
>>>>            instead of ECT(0) on all data segments, and to pace its output at
>>>>            40Mbps. This latter measure prevents the L queue from seeing any
>>>>            need to apply congestion signals, because it is always empty.  These
>>>>            tweaks allowed that flow to use 80% of the link capacity, gaining a
>>>>            fourfold advantage over its competitor,
>>> 
>>> 
>>> 
>>> If there is capacity seeking traffic in the Classic queue, then it is even desired that the L4S queue does not add extra marks. The L4S marks should come only from the Classic coupling.
>>> 
>>> Before diving into details, can you first explain why in your experiment the coupling from the Classic Q has no effect on your paced and ECT(1) labeled Cubic flow?
>>> 
>>> 
>>> 
>>> I would expect that this ECT(1) labeled Cubic flow would get even less throughput than the Classic Cubic flow, as the first gets the doubled coupled CE marking probability (eg 2*10% = 20%) for L4S flows instead of the squared CE marking probability (10%^2 = 1%) which ECT(0) traffic would get.
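
	The arithmetic in the paragraph above can be written out as a small sketch. It assumes a base probability of 10% and a coupling factor k = 2 (the "doubled" in the quoted text); the function name is hypothetical, and the sketch ignores any marking the L queue would apply natively, since that queue is described as always empty in this scenario.

```python
def coupled_probabilities(base_p, k=2.0):
    """Return (classic, l4s) congestion-signal probabilities.

    classic: squared base probability, as applied to ECT(0) traffic.
    l4s:     base probability scaled by the coupling factor k, capped at 1,
             as applied to ECT(1) traffic via the coupling.
    """
    classic = base_p ** 2
    l4s = min(k * base_p, 1.0)
    return classic, l4s


classic, l4s = coupled_probabilities(0.10)
print(f"classic (ECT(0)): {classic:.0%}, L4S (ECT(1)): {l4s:.0%}")
```

	A flow that reacts to CE marks the way Cubic does would therefore see roughly twenty times as many signals when labelled ECT(1), which is why less throughput, not more, would be expected for a responsive flow.
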
>>> 
>>> 
>>> 
>>> Thanks,
>>> 
>>> Koen.
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> --
>> I tried to build a better future, a few times:
>> https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org
>> 
>> Dave Täht CEO, TekLibre, LLC
> 
> 
> 
> -- 
> I tried to build a better future, a few times:
> https://wayforward.archive.org/?site=https%3A%2F%2Fwww.icei.org
> 
> Dave Täht CEO, TekLibre, LLC
>