Re: [iccrg] [tsvwg] SCReAM (RFC8298) with CoDel-ECN and L4S

Bob Briscoe <in@bobbriscoe.net> Tue, 17 March 2020 18:45 UTC

To: Sebastian Moeller <moeller0@gmx.de>
Cc: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>, "iccrg@irtf.org" <iccrg@irtf.org>, Ingemar Johansson S <ingemar.s.johansson=40ericsson.com@dmarc.ietf.org>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/_3J_OxYBNdJJdY1DMFnEOb4LtnQ>

Sebastian,

On 17/03/2020 14:54, Sebastian Moeller wrote:
> Hi Bob,
>
>
>> On Mar 17, 2020, at 02:12, Bob Briscoe <in@bobbriscoe.net> wrote:
>>
>> Sebastian
>>
>> On 10/03/2020 10:07, Ingemar Johansson S wrote:
>>> Hi
>>>
>>> For the future studies we will only focus on L4S, as the scope is to study the performance gain that L4S gives, for instance, for AR/VR, gaming and remote control applications.
>>> Flow-aware AQMs with RTT estimates as metadata in the packets are outside the scope, as that would require packet inspection, which is not feasible if queues build up on the RLC layer in the 3GPP stack.
>>>
>>> /Ingemar
>>>
>>>> -----Original Message-----
>>>> From: Sebastian Moeller <moeller0@gmx.de>
>>>> Sent: den 10 mars 2020 10:45
>>>> To: Ingemar Johansson S
>>>> <ingemar.s.johansson=40ericsson.com@dmarc.ietf.org>
>>>> Cc: tsvwg@ietf.org; Ingemar Johansson S
>>>> <ingemar.s.johansson@ericsson.com>; iccrg@irtf.org
>>>> Subject: Re: [tsvwg] SCReAM (RFC8298) with CoDel-ECN and L4S
>>>>
>>>> Hi Ingemar,
>>>>
>>>> thanks for posting this interesting piece of data!
>> [BB] Yes, thanks, @Ingemar...
>>
>>>>> On Mar 10, 2020, at 09:02, Ingemar Johansson S
>>>> <ingemar.s.johansson=40ericsson.com@dmarc.ietf.org> wrote:
>>>>> Hi
>>>>>
>>>>> I recently updated the readme on the SCReAM github with a comparison of
>>>> SCReAM in three different settings
>>>>> 	• No ECN
>>>>> 	• CoDel ECN
>>>>> 	• L4S
>>>>> https://github.com/EricssonResearch/scream#ecn-explicit-congestion-notification
>>>>>
>>>>> Even though it is more than an order of magnitude difference in queue delay
>>>>> between CoDel-ECN and L4S,
>>>> 	[SM] So, in these simulations of a 20ms path, SCReAM over L4S gives ~10
>>>> times less queueing delay, but also ~2 times less bandwidth compared to SCReAM
>>>> over codel. You describe this as "L4S reduces the delay considerably more" and
>>>> "L4S gives a somewhat lower media rate". I wonder how many end-users would
>>>> tradeoff these 25ms in queueing delay against the decrease in video quality from
>>>> halving the bitrate?
>> [BB] This does seem more harsh than I would have imagined - so a useful data point; thx Ingemar. Nonetheless, this is the end-system taking bandwidth away from itself in order to give itself lower latency in the presence of varying network capacity. Nothing to stop other applications making different tradeoffs.
>>
>> It is worth thinking about the complexity of the policy control and signalling system that Ingemar's video would need in order to express the tradeoffs it is making here, if it had to get an FQ scheduler to allow these tradeoffs instead. The scheduler would not necessarily have to make all the tradeoffs itself - for instance the app could underutilize its 'fair' share, but it would need to be able to take more than its 'fair' share during periods of lower capacity.
> 	[SM] I fail to see it that way. In Codel, for example, it is the burst tolerance that allows exactly that kind of bandwidth trading with oneself (send more now, make up for it a bit later by sending less). In fq_codel it is the FQ component that decides when a flow/bucket/tin is eligible to send something, and inside each flow/bucket/tin is a codel instance that decides what/when to mark/drop. But as Ingemar's example shows, his application gets significantly more bandwidth with Codel than with L4S; it would be interesting to see a plot of video quality (measured at the receiver). Anything longer than a "burst" will get into tricky accounting territory if one wants to actually enforce long-term adherence to a set bandwidth share.
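
For readers less familiar with the burst tolerance mentioned above, here is a minimal sketch of a CoDel-style marking decision, assuming the commonly used defaults of a 5 ms target and a 100 ms interval. The class and method names are illustrative, and the sketch leaves out parts of the real algorithm (for example how the count is carried over between marking episodes), so treat it as an illustration rather than a reference implementation.

```python
import math


class CoDelSketch:
    """Simplified CoDel-style mark/drop decision. All times are in seconds."""

    def __init__(self, target=0.005, interval=0.100):
        self.target = target        # acceptable standing queue delay
        self.interval = interval    # how long delay may exceed target (burst tolerance)
        self.first_above = None     # deadline set when delay first exceeds target
        self.marking = False        # True while in the marking/dropping state
        self.count = 0              # marks issued in the current episode
        self.next_mark = 0.0        # time of the next scheduled mark

    def on_dequeue(self, now, sojourn):
        """Return True if the packet dequeued at `now` should be marked or dropped."""
        if sojourn < self.target:
            # Delay fell below target: the burst is forgiven and state resets.
            self.first_above = None
            self.marking = False
            return False
        if self.first_above is None:
            # Delay has just risen above target: start the interval clock.
            self.first_above = now + self.interval
            return False
        if not self.marking:
            if now < self.first_above:
                return False        # still inside the burst tolerance
            self.marking = True
            self.count = 1
            self.next_mark = now + self.interval / math.sqrt(self.count)
            return True
        if now >= self.next_mark:
            # Control law: marks get closer together while delay stays high.
            self.count += 1
            self.next_mark += self.interval / math.sqrt(self.count)
            return True
        return False


# A 90 ms excursion to 20 ms of queue delay, sampled every 10 ms.
q = CoDelSketch()
short_burst = [q.on_dequeue(t / 1000, 0.020) for t in range(0, 90, 10)]
print(any(short_burst))
```

In this sketch a queue that stays above target for less than one interval is never marked, which is the "send more now, make up for it later" allowance. Constructing it with interval=0.020 and target=0.001, the settings suggested further down the thread, shrinks that allowance to roughly 20 ms.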

[BB] These views are from one side of a long-standing philosophical 
debate, in which neither side holds a monopoly on the truth.

Focusing on the "significantly more bandwidth" side without also saying 
"significantly less latency" misses Ingemar's point. The app could have 
chosen either but it chose what it chose.

If an app chooses quality that leads to a worse quality score, that 
probably means that the weighting of the factors in the quality scoring 
is flawed. Shouldn't the quality score reflect what is important for the 
application?

The ideas you've expressed here such as:

  * the network decides your maximum burst tolerance
  * the network decides that applications can have no more flexibility
    than a burst tolerance
  * the network decides how much users should prefer bandwidth over
    latency, and
  * the network decides one objective video quality metric

...are from the Bell-head world. I'm not saying Bell-head is wrong. I'm 
saying it's pointless trying to insist that a Bell-head idea is more 
correct than a Net-head idea. They are different philosophies.

I prefer to shift the debate to how to design a Net-head solution with a 
configurable degree of Bell-headedness. That is what the combination of 
the DualQ plus per-flow Queue Protection is. If you wind the queue 
protection up tight, you emulate per-flow scheduling. If you loosen it 
off (or disable it, or don't deploy it), you get application freedom.

This is also why it's important to enable both DualQ and FQ. Neither is 
objectively correct. Let the market decide.
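
For readers unfamiliar with the DualQ end of this spectrum: in the DualQ Coupled AQM, a single base probability drives the signals in both queues, so rate shares emerge from the flows' own congestion responses rather than from a scheduler. The following is a minimal sketch, assuming the squared/linear coupling and the coupling factor k = 2 given in the DualQ Coupled AQM specification. The function and parameter names are illustrative, and queue protection is not modelled.

```python
def coupled_probabilities(p_base, p_native=0.0, k=2.0):
    """Map one base probability to (classic probability, L4S marking probability).

    p_base   -- internal base probability p' from the shared controller
    p_native -- marking probability from the L4S queue's own shallow-threshold AQM
    k        -- coupling factor (2 assumed here)
    """
    if not 0.0 <= p_base <= 1.0:
        raise ValueError("p_base must lie in [0, 1]")
    p_classic = p_base ** 2                  # squared: drop/mark for the classic queue
    p_coupled = min(k * p_base, 1.0)         # linear: coupled ECN marking for L4S
    p_l4s = max(p_native, p_coupled)         # L4S packets see the stronger of the two
    return p_classic, p_l4s


p_classic, p_l4s = coupled_probabilities(0.1)
print(p_classic, p_l4s)
```

The squaring compensates for classic congestion controls responding to the square root of the loss probability, while scalable controls respond roughly linearly to the marking probability; the network itself makes no per-flow decision here.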

>> In the list of SCE issues,
>> https://github.com/heistp/sce-l4s-bakeoff/blob/master/README.md#list-of-sce-issues
>> it says "Another argument is the perception that FQ can do harm to periodic bursty flows, however we have not yet shown this to be the case with hard evidence". I don't recognize that argument, but I would if it had said "...can do harm to flows needing variable throughput or smooth throughput in the presence of variable available capacity". If Ingemar had run a TCP flow in parallel here, I think that would go some way towards the hard evidence sought here?
> 	[SM] Only if we assume that the video stream would be more important to the link's user, what if the TCP flow's speedy completion would actually be more important to the user?
> 	That is, to be blunt, where I believe you fail to see the forest for the trees: the AQM has no chance of being able to optimally split bandwidth between eligible flows without additional information to base its optimization upon. So either you supply that information (say explicit DSCP marking within a well-managed DSCP-domain) or you need to accept that all you can do is aim for good enough and "do no harm".

[BB] Outside the forest that your mind is in, there is a wider set of 
forests where:
* in your forest, FQ attempts to optimally split bandwidth
* in other forests, the DualQ Coupled AQM doesn't even attempt to 
optimally split bandwidth, depending instead on "the application knows 
that best" (collectively)
* and in yet other forests (DualQ with configurable per-flow queue 
protection and policing), there can be various points on a spectrum 
between the two

L4S signalling was designed to support all these forests. But according 
to your ironic insult, one of the architects of this ecosystem of 
forests only sees trees. There are more forests than the one you are in.
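
To make the FQ end of that spectrum concrete, equal per-flow scheduling behaves roughly like a max-min fair allocation: a flow may use less than its share, and the leftover goes to the others, but no flow can take more than its share while others still want capacity. The following is a minimal sketch under that assumption; the function name, the flow names and the Mb/s figures are hypothetical and are not taken from Ingemar's simulations.

```python
def max_min_shares(capacity, demands):
    """Max-min fair allocation by progressive filling.

    capacity -- link capacity (any consistent unit, e.g. Mb/s)
    demands  -- dict mapping flow name to the rate that flow would like
    Returns a dict mapping flow name to its allocated rate.
    """
    alloc = {flow: 0.0 for flow in demands}
    active = set(demands)
    remaining = float(capacity)
    while active and remaining > 1e-12:
        share = remaining / len(active)
        # Flows whose outstanding demand fits within an equal share are satisfied.
        satisfied = {f for f in active if demands[f] - alloc[f] <= share}
        if not satisfied:
            # Everyone wants more than an equal share: split what is left equally.
            for f in active:
                alloc[f] += share
            remaining = 0.0
            break
        for f in satisfied:
            remaining -= demands[f] - alloc[f]
            alloc[f] = float(demands[f])
        active -= satisfied
    return alloc


# Video wants 8 Mb/s next to a greedy TCP flow; capacity dips from 20 to 10 Mb/s.
print(max_min_shares(20, {"video": 8, "tcp": 100}))
print(max_min_shares(10, {"video": 8, "tcp": 100}))
```

In the second call the video flow is held to an equal split of 5 Mb/s even though it wants 8, which is the case where an application would have to signal its preference to the scheduler if it wanted to ride out the dip at a steadier rate.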

> IMHO FQ strikes a decent balance here: it might rarely be optimal, but it also is equally rarely pessimal (while any unequal sharing mechanism will result in optimal and pessimal sharing, depending on whom you ask). In addition FQ allows much simpler prediction of what to expect from a link under saturating load.
> Of course this is just an opinion, and everybody here is entitled to their own opinion on this matter...

[BB] It is neither appropriate nor necessary for the IETF to decide to 
assign a codepoint in the IP header that only works well in your forest 
(FQ). The whole point of L4S was to broaden the ways extremely low 
latency could be provided (more forests): from FQ to DualQ with 
QProt+DualQ in between.



Bob

>
> Best Regards
> 	Sebastian
>
>>
>> Bob
>>
>>>> Could you repeat the Codel test with interval set to 20ms and target to 1ms,
>>>> please?
>>>>
>>>> If that improves things considerably it would argue for embedding the current
>>>> best RTT estimate into SCReAM packets, so an AQM could tailor its signaling
>>>> better to individual flow properties (and yes, that will require a flow-aware
>>>> AQM).
>>>>
>>>>
>>>>
>>>>> it is fair to say that these simple simulations should of course be seen as just a
>>>> snapshot.
>>>>
>>>> 	[SM] Fair enough.
>>>>
>>>>> We hope to present some more simulations with 5G access, and not just
>>>> simple bottlenecks with one flow, after the summer.
>>>>
>>>> 	[SM] Looking forward to that.
>>>>
>>>>> Meanwhile, the SCReAM code on github is freely available for anyone who
>>>> wishes to make more experiments.
>>>>> /Ingemar
>>>>> ================================
>>>>> Ingemar Johansson  M.Sc.
>>>>> Master Researcher
>>>>>
>>>>> Ericsson Research
>>>>> RESEARCHER
>>>>> GFTL ER NAP NCM Netw Proto & E2E Perf
>>>>> Labratoriegränd 11
>>>>> 971 28, Luleå, Sweden
>>>>> Phone +46-1071 43042
>>>>> SMS/MMS +46-73 078 3289
>>>>> ingemar.s.johansson@ericsson.com
>>>>> www.ericsson.com
>>>>>
>>>>>    Reality, is the only thing… That’s real!
>>>>>        James Halliday, Ready Player One
>>>>> =================================
>> -- 
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/
>>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/