Re: [tsvwg] start of WGLC on L4S drafts

Bob Briscoe <ietf@bobbriscoe.net> Thu, 14 October 2021 00:49 UTC

To: Sebastian Moeller <moeller0@gmx.de>
Cc: "tsvwg@ietf.org" <tsvwg@ietf.org>
References: <7dd8896c-4cd8-9819-1f2a-e427b453d5f8@mti-systems.com> <B575CC81-4633-471A-991F-8F78F3F2F47F@ericsson.com> <aa968ff5-262c-1fd4-981d-05507ac1e59e@erg.abdn.ac.uk> <360988450.1173982.1630607180962@mail.yahoo.com> <b6d9afb6-3328-cfdf-b7bf-2345049ea0dc@bobbriscoe.net> <8415BA1F-2806-4224-9D9F-EB256DA7DF41@gmx.de> <ba6ebe46-fb36-e982-bd2a-e05028e8accb@bobbriscoe.net> <89E5242A-6C45-4BF5-96D0-EAC6F39C7435@gmx.de>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <6caa034d-ef93-b611-d60e-203a9c3e488d@bobbriscoe.net>
Date: Thu, 14 Oct 2021 01:48:54 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/6hFHGSMgtjhAfUuTH5tEBS0kDFI>

Sebastian, see [BB2] (2nd response)

On 13/10/2021 08:25, Sebastian Moeller wrote:
> Hi Bob,
>
> more below, prefixed [SM2]
>
>
>> On Oct 13, 2021, at 00:02, Bob Briscoe <in@bobbriscoe.net> wrote:
>>
>> Sebastian,
>>
>> Sorry - just noticed I missed this email at the time. See [BB]...
>>
>> On 21/09/2021 21:54, Sebastian Moeller wrote:
>>> Hi Bob,
>>>
>>>
>>>
>>>> On Sep 20, 2021, at 18:40, Bob Briscoe <in@bobbriscoe.net> wrote:
>>>>
>>>> Alex, Gorry, [Sorry, just discovered I didn't click send on this]
>>>>
>>>> In "all of a user's applications can shift" the word "can" is important.
>>>> Similarly, in "The L4S architecture is intended to enable all Internet applications to transition" the word "enable" is important.
>>>>
>>>> Neither statement was meant to imply that all applications would move, or should move, just that they would all be _able_ to.
>>>> These statements are primarily intended as counterpoints to the commonly held view that low latency comes from priority scheduling (e.g. via differentiated service) so low latency can only be provided for a subset of traffic, because if all packets requested higher priority, there would be no difference to any packet's latency.
>>> 	[SM] Given that at the core of the L4S reference AQM there sits a (conditional) priority scheduler*, I fail to see how L4S' low latency does not come from priority scheduling? Also in L4S low latency can only be provided to a subset of traffic, namely flows that are well behaved and ECT(1)-marked (given the shared queue design, one bad apple in the L-queue can affect latency/throughput for all flows). I am not sure whether L4S really is contrary to the "commonly held view" here, sure the details differ a bit, but at its core it seems rather similar to traditional techniques for low latency.
>> [BB] You say 'L4S low latency can *only* be provided to a subset' (my emphasis), but the word 'only' is wrong.
>> That is the point. L4S low latency /can/ be provided to all traffic.
> 	[SM2] Well low latency "can" be provided by a dumb FIFO then, as long as the aggregate ingress traffic never exceeds the egress capacity... this is similarly unhelpful as your "can provide".
>
>> Yes, L4S low latency /is/ only provided to flows that are well-behaved and ECT(1) marked, but all traffic /can/ behave smoothly without losing out, and all traffic /can/ use ECT(1) without losing out.
>>
>> The priority scheduler really is not at the core.
> 	[SM2] My words were "at the core of the L4S reference AQM" and I maintain that that is strictly true, DualQ being:
> a) the reference AQM for L4S
> b) the only L4S AQM that has an open reference implementation for Linux
> c) has seen some modicum of testing by others than the developers
>
> I maintain that DualQ is at its core a (weighted) priority scheduler.
>
>
>> * Consider an L4S FQ scheduler like FQ-CoDel with the ce_threshold parameter applied to ECT(1) traffic. There is no priority scheduling but all traffic can still have very low latency.
> 	[SM2] Well, show me a) an operational reference implementation of such an AQM and b) sufficient testing data to demonstrate that it works under a broad set of realistic traffic conditions. Sure, I would assume an FQ-based AQM will have fewer downsides than DualQ, but I would still want to see data.

[BB2]  We published all the test data for the ce-threshold variant of 
FQ_CoDel alongside the DualQ results in July 2019 when I posted this 
link to this list:
     https://bobbriscoe.net/pubs.html#DCttH_TR
The FQ-CoDel tests with ce-threshold enabled for ECT(1) traffic are in 
the right-hand column of all the pages of plots at the end of the paper.
The Evaluation section explains the setup of all the tests plotted at 
the end, and provides commentary on the results.

This reveals to me that you have never read this material. I think you 
need to justify on this list why you have not read the primary 
performance evaluation material about L4S, given everything you have 
said about L4S performance.

You will note from Fig 7 that the tail latency of FQ-L4S under load from 
a mix of short and long flows is pretty good, but not quite as good as 
the DualQ. That's inevitable because when an approach gives priority to 
'new' flows, there will always be runs of 'new' packets that cause 
packets in the 'old' queues to have to wait longer than they would in a 
FIFO. Put another way, if you allow some other packet than the first in 
to be the first out, then you have to delay the first in, which is often 
counterproductive when the scalable congestion controls are collectively 
ensuring that nearly all packets are 'new' and any 'old' queues are 
extremely short anyway.
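
As a rough illustration of the first-in, first-out argument above, here is a toy sketch in Python. It is not taken from the paper or from any FQ-CoDel or L4S implementation: it assumes every packet is already queued, that one packet is sent per time slot, and it reduces "priority to new flows" to a plain "serve all 'new' packets before any 'old' one" rule. The name departure_slots and the packet labels are hypothetical.

```python
def departure_slots(packets, new_first):
    """Return {name: slot}, sending one packet per slot.

    packets is a list of (name, is_new) tuples in arrival order.
    With new_first=False the queue is a plain FIFO.
    With new_first=True every 'new' packet is served before any 'old' one
    (a deliberate simplification of new-flow priority, assumed for this toy).
    """
    if new_first:
        # sorted() is stable, so arrival order is kept within each class.
        order = sorted(packets, key=lambda p: not p[1])
    else:
        order = list(packets)
    return {name: slot for slot, (name, _) in enumerate(order)}


# One 'old' packet arrives first, followed by a run of 'new' packets.
arrivals = [("old0", False), ("new1", True), ("new2", True), ("new3", True)]

fifo = departure_slots(arrivals, new_first=False)
prio = departure_slots(arrivals, new_first=True)

print(fifo["old0"])  # 0: in a FIFO the first in is the first out
print(prio["old0"])  # 3: the first in waits behind the whole run of 'new' packets
```

The total work is the same in both cases; the only thing that changes is which packet absorbs the waiting time.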

> Plus, as long as DualQ is L4S's reference implementation that will be deployed at scale by cable ISPs, it does not really matter that there might be better L4S-AQM designs possible, does it?
>> * Similarly, an L4S DualQ Coupled AQM can work with any scheduler that provides isolation, for instance the Time-Shifted FIFO. Scheduling priority is not 'at the core', 'cos it's not even necessary. We only ended up choosing a priority scheduler because it gave slightly better tail latency.
> 	[SM2] Well, I am talking about the selected reference AQM for L4S. If you drop the DualQ draft and replace it with Time-Shifted FIFO, I will have a look at that AQM and discuss that. But at the current time DualQ is the relevant AQM, since it is the only one you are currently trying to standardize.

[BB2] I won't respond any further to everything above this point, 
because it has just descended into two opinions about what is the core 
idea of a technology.

But I will correct two possible misunderstandings about what is being 
standardized here.

#1/ DualQ and FQ are equally applicable for L4S. That change was made to 
the l4s-arch and ecn-l4s-id drafts in Feb 2020. The fact that there is a 
draft about a DualQ solution and not an FQ draft is partly because the 
FQ one is so similar to FQ_CoDel that it only needs a short sentence 
describing the difference.

Nonetheless, this is not a spectator sport. If you or anyone else on 
this list wants to prove that an FQ-L4S is better, or as good, or 
whatever, go ahead and try to repeat or improve on the results given 
above. A draft about FQ-L4S would also be extremely quick to write - 
probably less than a page, not including all the top and tail guff.

#2/ And there is a difference between standardization and implementation:
The DualQ Coupled AQM draft does not (experimentally) standardize a WRR 
scheduler. Everything in the appendix about DualPI2 (and the other 
appendix about Curvy RED) is informative. The aqm-dualq-coupled draft 
experimentally standardizes a framework for coupling AQMs across two 
queues (the first line of the introduction says this). The words chosen 
to be capitalized (i.e. normative) have been carefully selected to be as 
concrete as possible without limiting choice.

The only normative statements about the scheduler are:
"The scheduler draining the two queues MUST give L4S packets priority 
over Classic, although priority MUST be bounded in order not to starve 
Classic traffic. The scheduler SHOULD be work-conserving."
This is deliberately written to apply to the TS-FIFO, and all sorts of 
other possible schedulers.
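
To make the three requirements in that quote concrete, here is a minimal Python sketch of one scheduler that would satisfy them. It is only an illustration under stated assumptions, not the DualPI2 WRR scheduler and not the TS-FIFO: the class name BoundedPriorityScheduler and the max_l_run parameter (and its default of 4) are hypothetical and do not come from any draft.

```python
from collections import deque


class BoundedPriorityScheduler:
    """Toy two-queue scheduler: L4S first, but Classic is never starved."""

    def __init__(self, max_l_run=4):
        # max_l_run is a hypothetical knob: the most L packets sent in a row
        # while at least one Classic packet is waiting.
        self.max_l_run = max_l_run
        self.l_queue = deque()
        self.c_queue = deque()
        self._l_run = 0  # consecutive L packets sent while Classic waited

    def enqueue(self, packet, is_l4s):
        (self.l_queue if is_l4s else self.c_queue).append(packet)

    def dequeue(self):
        """Return the next packet to send, or None if both queues are empty."""
        classic_waiting = bool(self.c_queue)
        if self.l_queue and (not classic_waiting or self._l_run < self.max_l_run):
            # Priority to L4S; the run only counts while Classic is waiting.
            self._l_run = self._l_run + 1 if classic_waiting else 0
            return self.l_queue.popleft()
        if classic_waiting:
            # Bound reached, or L queue empty: serve Classic (work-conserving).
            self._l_run = 0
            return self.c_queue.popleft()
        return None


sched = BoundedPriorityScheduler(max_l_run=4)
for i in range(10):
    sched.enqueue(f"L{i}", is_l4s=True)
for i in range(2):
    sched.enqueue(f"C{i}", is_l4s=False)

order = []
while (pkt := sched.dequeue()) is not None:
    order.append(pkt)
print(order)
# ['L0', 'L1', 'L2', 'L3', 'C0', 'L4', 'L5', 'L6', 'L7', 'C1', 'L8', 'L9']
```

The link never idles while either queue holds a packet, L4S packets go first, and a Classic packet waits for at most max_l_run L4S packets, which is one way (among many) of bounding the priority.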

Regards



Bob


>
>> The real core of L4S is scalable congestion control. That is the difference that allows even capacity-seeking traffic to also maintain consistently very low latency. It's easier to see in the L4S FQ example, that it's what keeps a flow's latency down. Not the scheduler.
> 	[SM2] I would have assumed, if that was the core, that the L4S effort would have started with a draft for such a scalable congestion controller; instead TCP Prague came as an afterthought relatively late in the game...
>
>
>> If TCP had been designed with a scalable congestion control from the start, we would have had L4S with no need for a scheduler at all. The scheduler is primarily a transition mechanism, to isolate from Classic traffic. But no traffic /needs/ to be Classic, it's just how "stuff happened".
> 	[SM2] Realistically, a considerable fraction of internet traffic is going to stay on classic congestion controllers for a long time, so I fail to see how this is relevant for the near and intermediate future. Especially given the demonstrated failure modes of TCP Prague, where realistic traffic patterns result in abysmal throughput...
>
>
> Regards
> 	Sebastian
>
>>
>>
>> Bob
>>
>>> Regards
>>> 	Sebastian
>>>
>>>
>>> *) https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-ecn-l4s-id-19:
>>> "In this case it would not be appropriate to call the queue an L4S
>>>     queue, because it is shared by L4S and non-L4S traffic.  Instead it
>>>     will be called the low latency or L queue.  The L queue then offers
>>>     two different treatments:
>>>
>>>     o  The L4S treatment, which is a combination of the L4S AQM treatment
>>>        and a priority scheduling treatment;
>>>
>>>     o  The low latency treatment, which is solely the priority scheduling
>>>        treatment, without ECN-marking by the AQM."
>>>
>>>
>>> https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-l4s-arch-10:
>>> "The tension between prioritizing L4S and coupling the
>>>         marking from the Classic AQM results in approximate per-flow
>>>         fairness.  To protect against unresponsive traffic in the L4S
>>>         queue taking advantage of the prioritization and starving the
>>>         Classic queue, it is advisable not to use strict priority, but
>>>         instead to use a weighted scheduler (see Appendix A of [I-D.ietf-tsvwg-aqm-dualq-coupled])."
>>>
>>> https://datatracker.ietf.org/doc/html/draft-ietf-tsvwg-aqm-dualq-coupled
>>> "Classic traffic needs to build a large queue to prevent under-
>>>     utilization.  Therefore a separate queue is provided for L4S traffic,
>>>     and it is scheduled with priority over the Classic queue.  Priority
>>>     is conditional to prevent starvation of Classic traffic."
>>>
>>> I ignore the yarn spun in those drafts, why this prioritization would only result in latency but not longer-time-scale throughput prioritization, because Pete's data show that for quite a number of realistic traffic patterns DualQ fails to arbitrate capacity equitably between the two queues.
>>>
>>>
>>>
>>>> I've made a note to myself to make it clear that this is not an intention that all applications will transition, rather it is an intention not to prevent any application from transitioning.
>>>>
>>>>
>>>> Bob
>>>>
>>>> On 02/09/2021 19:26, alex.burr@ealdwulf.org.uk wrote:
>>>>> Hi Gorry, tsvwg;
>>>>>
>>>>>
>>>>> See inline [AB]
>>>>>
>>>>>
>>>>>
>>>>> On Thursday, August 12, 2021, 11:23:49 AM GMT+1, Gorry Fairhurst <gorry@erg.abdn.ac.uk> wrote:
>>>>>
>>>>> ---
>>>>> 1b) Issue: I’d also quibble with: “so that all of a user's applications
>>>>> can shift to it when their stack is updated.“ - again, is the word “all”
>>>>> really necessary or correct here?
>>>>> ---
>>>>>
>>>>> [AB]
>>>>> Not answering your point, but related:
>>>>>
>>>>> L4s-arch has at the very start the following: “The L4S architecture is intended to enable _all_ Internet applications to transition away from congestion control algorithms that cause queuing delay, to a new class of congestion controls that induce very little queuing, aided by explicit congestion signaling from the network.”
>>>>>
>>>>>
>>>>>   There are two quite different end states here:
>>>>>
>>>>> * An internet in which there are, end-to-end, two classes of traffic on an ongoing basis.
>>>>>
>>>>> * An internet in which nearly all traffic has transitioned to L4S, remaining non-L4S traffic being supported on a backwards-compatibility basis
>>>>>
>>>>> As I understand it, if L4S is deployed, either of these could be possible end states. At present it is not clear that all use cases could migrate (eg, long RTT traffic)
>>>>>
>>>>>
>>>>> It seems like it would be a good idea for the draft to make clear:
>>>>> * That either of these end states is possible
>>>>> * What the formal position of the WG would be if the drafts are accepted, i.e. what decision has been made (or left for later) regarding the desired end state.
>>>>>
>>>>>
>>>>> regards,
>>>>>
>>>>> Alex
>>>>>
>>>> -- 
>>>> ________________________________________________________________
>>>> Bob Briscoe                               http://bobbriscoe.net/
>>>>
>> -- 
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/