Re: [tsvwg] FQ-CoDel response to unresponsive traffic (was: Related to "Non-L4S traffic abusing the L-queue" discussion during the interim)
Bob Briscoe <ietf@bobbriscoe.net> Sat, 26 February 2022 19:16 UTC
Message-ID: <a5794f7a-e5b9-f2ca-0f45-e396b76726da@bobbriscoe.net>
Date: Sat, 26 Feb 2022 19:16:11 +0000
To: Dave Taht <dave.taht@gmail.com>
Cc: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>, tsvwg IETF list <tsvwg@ietf.org>, codel@lists.bufferbloat.net
References: <AM9PR07MB7313D5AAF6B9D66C74CC35A1B9369@AM9PR07MB7313.eurprd07.prod.outlook.com> <AM9PR07MB7313F1401B14F6F2DB72A2B2B93E9@AM9PR07MB7313.eurprd07.prod.outlook.com> <MN2PR19MB40454F60DEE5735EAD428465833E9@MN2PR19MB4045.namprd19.prod.outlook.com> <CADVnQyk+uSX9GJtMBnsBhn9NzY+L3BKfhhUJ=yu4Aya98YEonw@mail.gmail.com> <MN2PR19MB40458624D266CDB54009AB19833E9@MN2PR19MB4045.namprd19.prod.outlook.com> <AM9PR07MB731311A9E4532FD501B5D94CB93E9@AM9PR07MB7313.eurprd07.prod.outlook.com> <CAA93jw4=JuO9UqBoHLHXCQrLn7toTqPDerFehDajEH2-2dtZWA@mail.gmail.com> <CAA93jw4CtiYjBg9RAFuOjJHX4T7aUQ07KdetWSgKrNgJg=DPPA@mail.gmail.com> <5114db28-89ac-1eae-b846-22ae37391c6c@bobbriscoe.net> <CAA93jw7BaL-=SOv_JicXPwD8_4Rs89NrUrxBdY8vO92KpthAow@mail.gmail.com>
From: Bob Briscoe <ietf@bobbriscoe.net>
In-Reply-To: <CAA93jw7BaL-=SOv_JicXPwD8_4Rs89NrUrxBdY8vO92KpthAow@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/uTOoxlpk9EIKn3AaLf1SprBP-a4>

Dave,

On 26/02/2022 17:13, Dave Taht wrote:
> At one level you are interpreting an observed behavior as "tail drop"
> - which may well be possible somewhere in the stack,
> but it's not clear if you were running a post 2016 kernel which is
> what added the drop_batch facility.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=9d18562a227
>
> This drops from the head, not the tail.

[BB] Yes, sry, I should have said 'head drop on arrival', not tail drop.
I was trying to say this is not drop driven by the AQM, rather it's drop
'cos there's no more room in buffer allocated to the qdisc.

>
> I was not satisfied with this solution btw, and in some later patch
> added an increment to the codel count in drop_batch so as to pass "bad
> things are happening elsewhere" back over to the main portion of the
> algorithm. I'm still very unsatisfied with the concept of a fixed and
> user configurable drop_batch length, rather than something that
> autotuned.
>
> elsewhere in the fq_codel_fast repo I experimented with eliminating
> the queue search, but accepting that small but constant cpu overhead
> for a optimizing for what is perceived to be (and may not be!) a
> rarely hit condition, or accepting the cost of the search when it
> happens, remains to be seen.
>
> So, while trying to disregard your conclusion this was tail drop, I am
> happy that you have clearly identified (with a kernel version), and
> described a test (yay!) that tickles a count caching problem and
> proposed some solutions here:
>
> https://bobbriscoe.net/projects/latency/CoDel-delta-bug.pdf
>
> cc-ing the codel list.

[BB] OK. And pls would someone also take note that the much more major
design flaw in the control law needs attention nearly 9 years after I
reported it (see the last slide in the link above).


Bob

>
> On Sat, Feb 26, 2022 at 8:45 AM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>> Dave,
>>
>> I will keep reminding everyone that this shift of topic to FQ-CoDel is
>> distracting from the task at hand:
>> "Is Jonathan going to confirm that his 'throughput bonus' and 'fast
>> lane' accusations against DualQ are baseless because his experiment was
>> broken?"
>>
>> Nonetheless, response on FQ-CoDel is below, tagged [BB]...
>>
>> On 25/02/2022 21:06, Dave Taht wrote:
>>> while I do not want to spend much time nitpicking this document...
>>>
>>> "causing most of the time tail-drop" stood out. codel, fq_codel, cake
>>> all do head drop, and always have.
>> [BB] For the list, we're talking about Figure 5 here:
>> https://l4steam.github.io/overload-results/
>>
>> I'm nearly certain that the cap at 600 ms is tail drop.
>> Cause: The control law increases head drop so slowly that the flow-queue
>> containing the unresponsive flow eventually fills the buffer allocated
>> to the whole qdisc. Then I believe it moves into what Jonathan calls
>> 'tallest sunflower' drop mode (tail drop focused on the longest flow-queue).
>>
>> To help prove this, here's an experiment Asad ran for me last Oct on
>> FQ-CoDel with an unresponsive flow rate just greater than the link rate.
>> https://bobbriscoe.net/projects/latency/CoDel-delta-bug.pdf#page=4
>> We were testing very slight overload, so it would stay in head drop
>> mode, without hitting the need for tail drop. The plot shows a similar
>> series of humps in the queue, but without the cut-off due to tail drop.
>> So it's fairly conclusive that Koen's Fig 5 is showing tail drop.
>>
>> I'll answer your question (on the SANE list) about why the humps repeat,
>> but that's a trivial bug compared to the time CoDel takes in the first
>> place.
>> It's a design flaw, not a bug.
>> The so-called 'control' law never even measures the queue it is meant to
>> be controlling.
>>
>> Here's some history:
>>
>>   * On 12-Nov-2013 I reported that to Kathie and Van as CoDel designers,
>>     cc the AQM list:
>>     https://mailarchive.ietf.org/arch/msg/aqm/l4H1QdRl8B-E5FWpJh4w50B_nQE/
>>   * No response by anyone for over 18 months, until...
>>   * 07-Jun-2015: Toke confirmed my analysis empirically (see it, via same
>>     thread above)
>>     Toke's plot:
>>     https://kau.toke.dk/ietf/codel-drop-rate/codel-drop-rate.svg
>>   * On 30-Sep-2015 you (DaveT) said "cake uses a better curve for CoDel
>>     but we still need to do more testing in the lab"
>>     As far as I understand it, that missed the point: CAKE's curve is
>>     still extremely slow, but somewhat faster than CoDel.
>>     But, CAKE's control law still never measures the queue it is meant
>>     to be controlling.
>>   * 25-Feb-2022: You say you don't want to spend much time nitpicking
>>     Koen's experiment.
>>     If not you, someone needs to grasp this nettle, given FQ-CoDel is
>>     the default qdisc in the Linux mainline.
>>
>>
>>
>> Bob
>>
>> --
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/
>>
>

--
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/
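
To make the "control law increases head drop so slowly" point concrete, here is a minimal sketch of the drop spacing in the CoDel control law as specified in RFC 8289, where each successive drop is scheduled interval/sqrt(count) after the previous one. It assumes the default 100 ms interval and an idealised unresponsive flow that keeps the queue above target throughout, so count simply increments at every drop. The function name and the drop-rate figures are hypothetical, chosen for illustration, and are not taken from the experiments linked above.

```python
import math

INTERVAL = 0.100  # seconds; CoDel's default interval (RFC 8289)


def time_to_reach_drop_rate(target_drops_per_sec, interval=INTERVAL):
    """Return (elapsed_seconds, count) until the control law's drop rate
    first reaches target_drops_per_sec.

    Idealised assumption: the queue never falls below target, so CoDel
    stays in its dropping state and count grows by 1 per drop.
    """
    elapsed = 0.0
    count = 1
    # With drops spaced interval/sqrt(count) apart, the instantaneous
    # drop rate is sqrt(count)/interval.
    while math.sqrt(count) / interval < target_drops_per_sec:
        elapsed += interval / math.sqrt(count)  # gap until the next drop
        count += 1
    return elapsed, count


if __name__ == "__main__":
    # Hypothetical drop rates an unresponsive flow might need in order
    # for the queue to stop growing.
    for rate in (100, 1000):
        elapsed, count = time_to_reach_drop_rate(rate)
        print(f"{rate} drops/s reached after {elapsed:.2f} s (count={count})")
```

In this idealised model the time to reach a given drop rate grows roughly with the square of that rate: just under 2 s for 100 drops/s, and about 20 s for 1000 drops/s. Nothing in the loop reads the queue length or delay, which is the property the message above objects to.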