Re: [tsvwg] path forward on L4S issue #16

Sebastian Moeller <moeller0@gmx.de> Thu, 11 June 2020 10:41 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <HE1PR0701MB2876B4A3087E2059BE1348E1C2800@HE1PR0701MB2876.eurprd07.prod.outlook.com>
Date: Thu, 11 Jun 2020 12:40:25 +0200
Cc: Jonathan Morton <chromatix99@gmail.com>, "tsvwg@ietf.org" <tsvwg@ietf.org>
To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/piWitLPngklkITq3b0aHdUxobcc>
Subject: Re: [tsvwg] path forward on L4S issue #16

Hi Ingemar,

as usual, more below in-line.

> On Jun 11, 2020, at 10:23, Ingemar Johansson S <ingemar.s.johansson@ericsson.com> wrote:
> 
> Hi
> 
> Please see inline [IJ]
> /Ingemar
> 
>> -----Original Message-----
>> From: Sebastian Moeller <moeller0@gmx.de>
>> Sent: 10 June 2020 22:05
>> To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
>> Cc: Jonathan Morton <chromatix99@gmail.com>; tsvwg@ietf.org
>> Subject: Re: [tsvwg] path forward on L4S issue #16
>> 
>> Hi Ingemar,
>> 
>> more below in-line.
>> 
>>> On Jun 10, 2020, at 19:15, Ingemar Johansson S <ingemar.s.johansson@ericsson.com> wrote:
>>> 
>>> Hi
>>> 
>>> As regards to dualQ (below), do you see any specific reason why it
>>> would not be possible to upstream (complexity/memory whatever) it or
>>> is your argument that it is just not done yet ?
>> 
>> 	[SM] Caveat: I am not involved in the Linux kernel and hence cannot
>> predict with any certainty dualQ's likelihood of getting included. BUT as
>> you know, I have been arguing for some time now that there is a discrepancy
>> between the dualQ claims and its actual performance. In other words, it is
>> not done, and given the responses from the L4S team to this issue (like the
>> 15ms hack in TCP Prague's CC response) I am more and more of the opinion
>> that this is not simply a case of dualQ just not being finished, but rather
>> that the current dualQ design simply cannot meet its promises and someone
>> would need to go back to the drawing board.
> 
> [IJ] Yes, and I understand that there are widely differing views about the
> severity of this possible problem, if it really is a problem.

	[SM] Sorry, not accepting that starving non-ECT(1) flows at short RTTs is a severe problem seems to be a position that not even team L4S took. It has been demonstrated from early on (it is visible in almost all L4S papers and in the measurements from team SCE) that the coupling heuristic, which is an essential component of dualQ's design, simply does not give the isolation guarantees needed to save flows in the non-LL queue from the consequences of its ~15:1 LL:non-LL queue scheduler. At its core dualQ is a ~16:1 scheduler that, by applying a number of heuristics, gives rough equivalence between the queues under a number of realistic conditions; unfortunately it fails to do so under a number of different, equally realistic conditions. Team L4S went so far as to propose an absolutely ugly hack in TCP Prague to paper over this mis-design in dualQ, thereby accepting that this is a severe problem.
	So are you really taking the position that this is not really a problem? Especially since, as I mentioned before, L4S has really only seen testing in short-RTT/few-hop scenarios, and is expected to be rolled out/used typically by close-by CDNs?
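
To put a number on the scheduler ratio mentioned above, here is a minimal sketch of the arithmetic. It assumes the worst case in which the LL queue is permanently backlogged and the coupling does nothing to throttle it, so only the scheduler weights decide the split; the 15:1 weights are taken from the ratio quoted above, and the function name is made up for illustration.

```python
from fractions import Fraction


def worst_case_share(ll_weight: int, classic_weight: int) -> Fraction:
    """Share of link capacity left to the non-LL queue when the LL queue
    is always backlogged and only the scheduler weights apply."""
    return Fraction(classic_weight, ll_weight + classic_weight)


# Assumed weights: 15 parts for the LL queue, 1 part for the non-LL queue.
share = worst_case_share(15, 1)
print(f"non-LL share: {share} = {float(share):.2%}")
```

Under these assumptions the non-LL queue is left with one sixteenth of the link, which is the floor that the coupling heuristic is supposed to lift back to rough equality.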


> 
>> 
>>> 
>>> Also, do you have any comments to my three other questions, please
>>> refer to earlier email in the thread for the context.
>> 
>> 	[SM] I snipped these out of my reply since I had nothing meaningful
>> to add to those.
>> 
>> 
>>> 1) Do you have any public sources that confirm the numbers you quote
>>> below ?. I tried to look up data on this but it surely is not easy.
>> 
>> 	[SM] I do not know where Jonathan's numbers come from. But
>> https://datatracker.ietf.org/meeting/98/materials/slides-98-maprg-tcp-ecn-experience-with-enabling-ecn-on-the-internet-padma-bhooma-00
>> has some numbers from Apple, I believe Bob cited these numbers multiple
>> times in the past.
>> Given the fact that 3gpp contains quite a lot of large carriers maybe that
>> would be a forum to ask for numbers?
> 
> [IJ] OK, that is the same reference that I have found, I was thinking that
> there is perhaps something more up to date available?

	[SM] I believe the problem is that almost nobody (except maybe the RIPE Atlas project) has deployed the testing nodes required to assess the prevalence of CE marking on the internet. Ideally you would have a network of widely distributed testing nodes in leaf networks, run saturating ECN-enabled transfers between them, and record the number of CE marks per path (pair of testing nodes). Nobody in academia really has such a network, but a few major carriers should be able to set something like this up easily. Given the big names in support of L4S, I would find it marvelous if these could cooperate to get the CE measurements done before proceeding with the L4S drafts.
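
The bookkeeping side of such a measurement is simple; a minimal sketch follows. It assumes each testing node logs one record per received packet as (sender, receiver, ECN field), which is a made-up record format for illustration, and it only counts marks; it says nothing about how the saturating transfers are generated. The codepoint values are the ones defined in RFC 3168.

```python
from collections import defaultdict

# ECN field codepoints from RFC 3168 (two low-order bits of the TOS byte).
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def ce_fraction_per_path(records):
    """Return {(src, dst): fraction of received packets carrying CE}.

    records: iterable of (src_node, dst_node, ecn_field) tuples, one per
    packet seen at the receiving node (hypothetical log format).
    """
    totals = defaultdict(int)
    marked = defaultdict(int)
    for src, dst, ecn in records:
        path = (src, dst)
        totals[path] += 1
        if (ecn & 0b11) == CE:
            marked[path] += 1
    return {path: marked[path] / totals[path] for path in totals}


# Hypothetical sample: two paths between nodes "a" and "b".
sample = [
    ("a", "b", ECT_0), ("a", "b", CE), ("a", "b", ECT_0), ("a", "b", CE),
    ("b", "a", ECT_0), ("b", "a", ECT_0),
]
for path, fraction in sorted(ce_fraction_per_path(sample).items()):
    print(path, fraction)
```

A non-zero fraction on a path shows that some node on that path is marking; it does not by itself tell whether that node is a single-queue or an fq AQM.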


> As regards to 3GPP
> access then I can very safely say that we can stop worrying, there is AFAIK
> no ECN marking (RFC3168 or anything else) in 3GPP base station nodes so
> there are no interop concerns here. There is a remote possibility that ECN
> marking can be enabled in e.g. industrial Cisco LTE routers. I tried this a
> while back but had problems with ECN bleaching, which is a configuration
> matter in 3GPP networks.

	[SM] I meant asking the 3GPP members to help in setting up an internet-wide trawl for CE-marked packets. For this, measurement nodes at or close to the leaf/edge seem required, and hence operators/carriers that connect end-users would be in a good position to measure CE mark probabilities, no?


> 
>> 
>> 
>>> 2) Which fora are the vendors that manufacture CPEs active in (if any)?
>> 
>> 	[SM] I believe that OpenWrt certainly supports rfc3168 behaviour,
>> and there are CPE that run on modified OpenWrt, so the OpenWrt forum
>> might be a decent starting point?
> 
> [IJ] OK, thanks, I was of the impression, that OpenWrt is on the table in
> the implementation of fq-codel and the like, is this correct?,

	[SM] SQM-scripts in OpenWrt allow choosing between cake and fq_codel, but cake can be run in single queue mode. QOS-scripts in OpenWrt IIRC default to fq_codel.


> if that is
> true then it perhaps fits best with L4S issue #17 but perhaps it is relevant
> for #16 and #17?

	[SM] I believe that Pete Heist's recent data indicates that, thanks to the birthday paradox (flows can hash into the same queue), stochastic fq as used in fq_codel and cake is not a full remedy for L4S's changed response to CE marks, so maybe that separation needs to be reverted again.
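
For a feel of how quickly hash collisions become likely, here is a minimal sketch of the textbook birthday calculation. It assumes a uniform hash into 1024 queues (fq_codel's default for its flows parameter) and ignores cake's set-associative hashing, which lowers the collision rate; it also computes the chance that any two flows share a queue, not specifically an L4S flow and an rfc3168 flow.

```python
def collision_probability(n_flows: int, n_buckets: int = 1024) -> float:
    """Probability that at least two of n_flows land in the same bucket,
    assuming each flow hashes uniformly and independently."""
    if n_flows > n_buckets:
        return 1.0  # pigeonhole: more flows than buckets
    p_all_unique = 1.0
    for i in range(n_flows):
        # The (i+1)-th flow must avoid the i buckets already taken.
        p_all_unique *= (n_buckets - i) / n_buckets
    return 1.0 - p_all_unique


for n in (2, 10, 38, 100):
    print(f"{n:4d} flows: {collision_probability(n):.3f}")
```

With these assumptions the probability passes one half at a few dozen concurrent flows, far below the number of queues.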

> Has this extended beyond the early adopter do-it-yourself community? I
> recall from earlier in this long thread that you found off the shelf routers
> in stores, were these shipped and ready with RFC3168 support or was an
> update necessary to make it RFC3168 capable?

	[SM] There is Evenroute's IQrouter, which you can order on Amazon, that runs a modified OpenWrt and does use fq (I think it defaults to cake nowadays, and cake is an rfc3168 AQM); then there is the Turris Omnia (and also the MOX, I believe), which use fq_codel for their guest networks. And since both of these are OpenWrt based they will default to fq_codel for all interfaces.


> 
>> 
>>> 3) As regards endpoints implementing RFC3168, do you refer to
>>> servers and clients or something else? My interpretation is servers
>>> and clients and I don't believe that they are central to this
>>> discussion, or am I missing something?
>> 
>> 	[SM] Well, it is these end-points that are going to suffer, when L4S
>> gets it wrong (when, not if). So these numbers give you an estimate of the
>> potential fall-out area.
> 
> [IJ] OK, then I understand.

	[SM] Please note that I cannot speak for Jonathan, so this was just my opinion on the matter.


> I was a bit misled as I assumed that we
> discussed network nodes that do RFC3168 ECN marking, not end hosts that
> negotiate ECN. I would like to see these as separate parts of the problem. 

	[SM] Fair enough. To assess the total risk from L4S's changed CE response, however, we need to look at all of these things together.

Best Regards
	Sebastian


> 
>> 
>> Best Regards
>> 	Sebastian
>> 
>> 
>>> 
>>> /Ingemar
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Sebastian Moeller <moeller0@gmx.de>
>>>> Sent: 10 June 2020 16:35
>>>> To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
>>>> Cc: Jonathan Morton <chromatix99@gmail.com>; tsvwg@ietf.org
>>>> Subject: Re: [tsvwg] path forward on L4S issue #16
>>>> 
>>>> Hi Ingemar,
>>>> 
>>>> to gently push back on some details.
>>>> 
>>>>> On Jun 10, 2020, at 15:59, Ingemar Johansson S <ingemar.s.johansson@ericsson.com> wrote:
>>>>> 
>>>>> [...]I understand that traffic shaping on outgoing interfaces can be
>>>>> applied in a Linux host but don't see why they become a problem
>>>>> especially as there are qdiscs that support dualQ.
>>>>> [...]
>>>> 
>>>> 	There seems to be a single out-of-the-mainline-Linux-tree repository
>>>> (https://github.com/L4STeam/linux) for both the dual queue coupled AQM
>>>> and TCP Prague.
>>>> 	I would not call that proof of sufficient existence of "qdiscs that
>>>> support dualQ" to allow Linux system admins to switch over to dualQ, and
>>>> I do not see how even inclusion into the mainline kernel* would solve
>>>> the issue for currently deployed Linux machines, which often use vendor
>>>> kernels that do not necessarily track mainline closely, especially for
>>>> server distributions.
>>>> 	I would respectfully argue that for safety considerations one should
>>>> look at the current state of the internet and not potentially less
>>>> problematic states one would like to find the internet in...
>>>> 
>>>> Best Regards
>>>> 	Sebastian
>>>> 
>>>> 
>>>> *) As far as I can tell there have been no attempts at upstreaming the
>>>> dual queue coupled AQM yet, so it is not clear what, if anything,
>>>> survives contact with the Linux kernel maintainers.