Re: [tsvwg] path forward on L4S issue #16

Sebastian Moeller <moeller0@gmx.de> Thu, 11 June 2020 11:55 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <HE1PR0701MB28767E75E3255BEC71630DE2C2800@HE1PR0701MB2876.eurprd07.prod.outlook.com>
Date: Thu, 11 Jun 2020 13:55:42 +0200
Cc: Jonathan Morton <chromatix99@gmail.com>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <59CDF039-0E63-48AD-9450-92E2A388BED6@gmx.de>
References: <HE1PR0701MB2876AA3CBBA215B9FB895B0AC2820@HE1PR0701MB2876.eurprd07.prod.outlook.com> <3637517F-63A0-4862-9885-AB5EA7E6C273@gmail.com> <VI1PR0701MB2877E21B7F406C3DFCFF08BCC2820@VI1PR0701MB2877.eurprd07.prod.outlook.com> <92525827-39B6-4E88-B453-660F8FE22523@gmx.de> <VI1PR0701MB287768D465C37DC46A459C12C2820@VI1PR0701MB2877.eurprd07.prod.outlook.com> <57D7632A-594E-47BC-B6B0-5FBC22AAFE37@gmail.com> <DF67B660-DE2B-4EB8-AD77-5FECF27D1BAC@gmx.de> <HE1PR0701MB287679D1842F15FCDAC6223EC2820@HE1PR0701MB2876.eurprd07.prod.outlook.com> <7EEF7075-396F-4565-89C6-674CBB1E6CB8@gmx.de> <HE1PR0701MB28767A1E570A705EBC65EFC4C2830@HE1PR0701MB2876.eurprd07.prod.outlook.com> <5ADEBD56-1A2C-4C34-9132-E50A4A7A4A42@gmx.de> <HE1PR0701MB287695F2245A480A43DB5F9BC2830@HE1PR0701MB2876.eurprd07.prod.outlook.com> <EEB9B5D1-5F06-4BA6-A078-BE9C26D0EAD6@gmail.com> <HE1PR0701MB2876726FC67BD2850B5D39FAC2830@HE1PR0701MB2876.eurprd07.prod.outlook.com> <E17FCDCD-6D2D-4773-98AA-44EB54A79F62@gmx.de> <HE1PR0701MB287670627208FB1075F78F6DC2830@HE1PR0701MB2876.eurprd07.prod.outlook.com> <E127BDE9-0249-47CD-ACBD-1C4296258B08@gmx.de> <HE1PR0701MB2876B4A3087E2059BE1348E1C2800@HE1PR0701MB2876.eurprd07.prod.outlook.com> <5B8EF39F-7EE9-4281-9BDC-04CD961FC6FE@gmx.de> <HE1PR0701MB28767E75E3255BEC71630DE2C2800@HE1PR0701MB2876.eurprd07.prod.outlook.com>
To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/v8ZzRp8-APTq4FtViR-7FQYRPNk>
Subject: Re: [tsvwg] path forward on L4S issue #16

Hi Ingemar,



> On Jun 11, 2020, at 13:07, Ingemar Johansson S <ingemar.s.johansson@ericsson.com> wrote:
> 
> Hi
> Inline...
> /Ingemar
> 
>> -----Original Message-----
>> From: Sebastian Moeller <moeller0@gmx.de>
>> Sent: den 11 juni 2020 12:40
>> To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
>> Cc: Jonathan Morton <chromatix99@gmail.com>; tsvwg@ietf.org
>> Subject: Re: [tsvwg] path forward on L4S issue #16
>> 
>> Hi Ingemar,
>> 
>> as usual more below in-line.
>> 
>>> On Jun 11, 2020, at 10:23, Ingemar Johansson S
>> <ingemar.s.johansson@ericsson.com> wrote:
>>> 
>>> Hi
>>> 
>>> Please see inline [IJ]
>>> /Ingemar
>>> 
>>>> -----Original Message-----
>>>> From: Sebastian Moeller <moeller0@gmx.de>
>>>> Sent: den 10 juni 2020 22:05
>>>> To: Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
>>>> Cc: Jonathan Morton <chromatix99@gmail.com>; tsvwg@ietf.org
>>>> Subject: Re: [tsvwg] path forward on L4S issue #16
>>>> 
>>>> Hi Ingemar,
>>>> 
>>>> more below in-line.
>>>> 
>>>>> On Jun 10, 2020, at 19:15, Ingemar Johansson S
>>>> <ingemar.s.johansson@ericsson.com> wrote:
>>>>> 
>>>>> Hi
>>>>> 
>>>>> As regards dualQ (below), do you see any specific reason why it
>>>>> would not be possible to upstream it (complexity/memory, whatever), or
>>>>> is your argument that it is just not done yet?
>>>> 
>>>> 	[SM] Caveat: I am not involved in the Linux kernel and hence cannot
>>>> predict with any certainty dualQ's likelihood of getting included.
>>>> BUT as you know, I have been arguing for some time now that there is a
>>>> discrepancy between the dualQ claims and its actual performance. In
>>>> other words, it is not done, and given the responses from the L4S team
>>>> to this issue (like the 15ms hack in TCP Prague's CC response) I am more
>>>> and more of the opinion that this is not simply a case of dualQ just not
>>>> being finished, but rather that the current dualQ design simply cannot
>>>> meet its promises and someone would need to go back to the drawing board.
>>> 
>>> [IJ] Yes, and I understand that there are widely differing views about
>>> the severity of this possible problem, if it really is a problem.
>> 
>> 	[SM] Sorry, not accepting that starving non-ECT(1) flows at short RTTs
>> is a severe problem seems to be a position not even team L4S took. It has
>> been demonstrated from early on (it is visible in almost all L4S papers and
>> in the measurements from team SCE) that the coupling heuristic that is an
>> essential component of dualQ's design simply does not give the isolation
>> guarantees needed to save flows in the non-LL queue from the consequences
>> of its ~15:1 LL:non-LL queue scheduler. At its core dualQ is a ~16:1
>> scheduler that, by applying a number of heuristics, will give rough
>> equivalence between the queues under a number of realistic conditions;
>> unfortunately it fails to do so under a number of different, equally
>> realistic conditions. Team L4S went so far as to propose an absolutely ugly
>> hack in TCP Prague to paper over this mis-design in dualQ, accepting that
>> this is a severe problem.
>> 	So are you really taking the position that this is not really a problem?
>> Especially since, as I mentioned before, L4S really only has seen testing
>> for short RTT/few hop scenarios, and is expected to be rolled out/used by
>> typically close-by CDNs?
>> 
> 
> [IJ] I believe that this discussion belongs to issue #28 so I won't delve
> into it in this thread.

	[SM] Fair enough; whether #28 will ever see a discussion at all is a different question.
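
To put the scheduler ratio from the quoted text above into numbers: the sketch below assumes a purely weighted scheduler with both queues permanently backlogged, and takes the ~15:1 LL:non-LL figure at face value. It ignores the coupling heuristic entirely, so it only shows the floor that the heuristics have to compensate for, not dualQ's actual behaviour. The function name is made up for illustration.

```python
def backlogged_shares(ll_weight: float, classic_weight: float) -> tuple[float, float]:
    """Capacity share per queue under a weighted scheduler.

    Assumes both queues always have packets waiting, so the link is
    split exactly in proportion to the weights.
    """
    total = ll_weight + classic_weight
    return ll_weight / total, classic_weight / total


# Weights taken from the "~15:1" figure in the message (an approximation).
ll_share, classic_share = backlogged_shares(15, 1)
print(f"LL queue: {ll_share:.2%}, non-LL queue: {classic_share:.2%}")
```

With these weights the non-LL queue is left with one sixteenth of the link whenever the coupling fails to hold the LL traffic back.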



> 
>> 
>> 
>>> 
>>>> 
>>>>> 
>>>>> Also, do you have any comments to my three other questions, please
>>>>> refer to earlier email in the thread for the context.
>>>> 
>>>> 	[SM] I snipped these out of my reply since I had nothing meaningful
>>>> to add to those.
>>>> 
>>>> 
>>>>> 1) Do you have any public sources that confirm the numbers you quote
>>>>> below ?. I tried to look up data on this but it surely is not easy.
>>>> 
>>>> 	[SM] I do not know where Jonathan's numbers come from. But
>>>> https://datatracker.ietf.org/meeting/98/materials/slides-98-maprg-tcp-ecn-experience-with-enabling-ecn-on-the-internet-padma-bhooma-00
>>>> has some numbers from Apple; I believe Bob cited these numbers multiple
>>>> times in the past.
>>>> Given the fact that 3GPP contains quite a lot of large carriers, maybe
>>>> that would be a forum to ask for numbers?
>>> 
>>> [IJ] OK, that is the same reference that I have found, I was thinking
>>> that there is perhaps something more up to date available?
>> 
>> 	[SM] I believe the problem is that almost nobody (except maybe for
>> the RIPE Atlas project) has deployed the testing nodes required to assess
>> the likelihood/prevalence of CE marking on the internet. Ideally you have
>> a network of widely distributed testing nodes in leaf networks and try to
>> run saturating ECN-enabled transfers between them, while recording the
>> number of CE marks per path (pair of testing nodes). Nobody in academia
>> really has such a network, but a few major carriers should be able to set
>> something like this up easily. Given the big names in support of L4S, I
>> would find it marvelous if these could cooperate to get the CE
>> measurements done before proceeding with the L4S drafts.
>> 
>> 
>>> As regards 3GPP
>>> access, I can very safely say that we can stop worrying: there is
>>> AFAIK no ECN marking (RFC3168 or anything else) in 3GPP base station
>>> nodes, so there are no interop concerns here. There is a remote
>>> possibility that ECN marking can be enabled in e.g. industrial Cisco
>>> LTE routers. I tried this a while back but had problems with ECN
>>> bleaching, which is a configuration matter in 3GPP networks.
>> 
>> 	[SM] I meant asking the 3GPP members to help in setting up an
>> internet-wide trawl for CE-marked packets. For this, measurement nodes at
>> or close to the leaf/edge seem required, and hence operators/carriers
>> that connect end-users would be in a good position to measure CE mark
>> probabilities, no?
> 
> [IJ] OK, my misunderstanding. I leave this uncommented as I don't personally
> know of any such activities. 

	[SM] Nor do I, but if the exact number of rfc3168 AQMs in the internet were a potential major hurdle for my project, and I had major ISPs/mobile carriers on my team, this is an activity I would start... According to Bob, Ericsson, Sprint, Google, Nokia Networks, AT&T and Vodafone are all behind L4S, so getting these numbers fresh out of the internet should not be an insurmountable obstacle, no?
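
As a rough illustration of the tallying step in such a measurement campaign (quoted above): given per-packet records from saturating ECN-enabled transfers, count the fraction of CE-marked packets per path. The record format (src, dst, TOS/traffic-class byte) and the function name are assumptions for this sketch; only the ECN codepoints (the two low-order bits, with CE = 0b11) come from RFC 3168.

```python
from collections import defaultdict

ECN_MASK = 0b11  # ECN field: two low-order bits of the TOS/traffic-class byte
CE = 0b11        # Congestion Experienced (RFC 3168)


def ce_fraction_per_path(records):
    """Return {(src, dst): fraction of packets carrying a CE mark}.

    `records` is an iterable of (src, dst, tos_byte) tuples; this capture
    format is hypothetical and stands in for whatever the test nodes log.
    """
    totals = defaultdict(int)
    marked = defaultdict(int)
    for src, dst, tos in records:
        path = (src, dst)
        totals[path] += 1
        if (tos & ECN_MASK) == CE:
            marked[path] += 1
    return {path: marked[path] / totals[path] for path in totals}


# Made-up sample: 0x02 = ECT(0), 0x01 = ECT(1), 0x03 = CE.
sample = [
    ("a", "b", 0x02), ("a", "b", 0x03), ("a", "b", 0x03), ("a", "b", 0x02),
    ("c", "d", 0x01),
]
result = ce_fraction_per_path(sample)
```

A path that never shows a CE mark under saturating load is evidence (not proof) that no marking AQM sits at its bottleneck.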


> 
>> 
>> 
>>> 
>>>> 
>>>> 
>>>>> 2) Which fora are the vendors that manufacture CPEs active in (if
>>>>> any)?
>>>> 
>>>> 	[SM] I believe that OpenWrt certainly supports rfc3168 behaviour,
>>>> and there are CPE that run on modified OpenWrt, so the OpenWrt forum
>>>> might be a decent starting point?
>>> 
>>> [IJ] OK, thanks, I was of the impression, that OpenWrt is on the table
>>> in the implementation of fq-codel and the like, is this correct?,
>> 
>> 	[SM] SQM-scripts in OpenWrt allows choosing between cake and
>> fq_codel, but cake can be run in single queue mode. QOS-scripts in OpenWrt
>> IIRC defaults to fq_codel.
>> 
>> 
>>> if that is
>>> true then it perhaps fits best with L4S issue #17 but perhaps it is
>>> relevant for #16 and #17?
>> 
>> 	[SM] I believe that Pete Heist's recent data indicates that, thanks to
>> the birthday paradox, stochastic fq as used in fq_codel and cake is not a
>> full remedy for L4S's changed response to CE-marks, so maybe that
>> separation needs to be reverted again.
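
For a feel of the birthday effect: a minimal sketch, assuming flows hash uniformly and independently into a fixed number of buckets. 1024 is the fq_codel default flow count; cake's set-associative hashing is not modelled here, and the function name is hypothetical.

```python
def collision_probability(n_flows: int, n_buckets: int = 1024) -> float:
    """Probability that at least two of n_flows share a hash bucket.

    Classic birthday computation: 1 - prod_{i=0}^{n-1} (1 - i / n_buckets),
    assuming a uniform, independent hash per flow.
    """
    if n_flows > n_buckets:
        return 1.0  # pigeonhole: more flows than buckets forces a collision
    p_no_collision = 1.0
    for i in range(n_flows):
        p_no_collision *= (n_buckets - i) / n_buckets
    return 1.0 - p_no_collision


for n in (2, 10, 38, 100):
    print(n, round(collision_probability(n), 3))
```

Under these assumptions a few dozen concurrent flows already give roughly even odds that some pair shares a queue, which is far fewer than the bucket count might suggest.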
>> 
>>> Has this extended beyond the early adopter do-it-yourself community?
>>> I recall from earlier in this long thread that you found off-the-shelf
>>> routers in stores; were these shipped and ready with RFC3168 support
>>> or was an update necessary to make them RFC3168 capable?
>> 
>> 	[SM] There is Evenroute's IQrouter, which you can order on Amazon; it
>> runs a modified OpenWrt and does use fq (I think it defaults to cake
>> nowadays, and cake is an rfc3168 AQM). Then there is the Turris Omnia (and
>> also the MOX, I believe) that use fq_codel for their guest networks. And
>> since both of these are OpenWrt-based they will default to fq_codel for
>> all interfaces.
> 
> [IJ] Thanks, looked up the Turris Omnia, it mentions automatic updates with
> the rationale of protecting against cyber threats. I interpret this nice
> feature as a possibility to also update to support L4S later on. This is of
> course pending interest from the developer community. But at least there
> are options to avoid the effects of issue #16 and #17 and avoid some of the
> issues with legacy AQMs that are only RFC3168 compatible, right?

	[SM] Yes, both Turris and Evenroute are good players that quite often offer new firmware that fixes issues, so they are in a position to also do this for L4S. Mind you, that response could well be to bleach ECT(1) unconditionally... We are putting the cart before the horse here: until it is demonstrated that L4S overall improves things, bleaching ECT(1) seems to be the safer position. The onus really is on L4S as the new kid on the block, I would say...


> Something similar seems to apply to the Evenroute quote "Over-the-air
> firmware updates to stay current".

	[SM] +1; both vendors are really good at keeping their promises about updates, but neither forces updates on its users, so having the machinery for automatic OTA updates is nice, but it will not guarantee that all deployed instances actually partake in that service.
	And then there is the long-tail issue with OpenWrt: people on older, less beefy routers often seem to stick to outdated versions, since newer versions require more resources and in some cases simply cannot run on old devices. So there will be legacy OpenWrt deployments in the field that require a fork-lift update (CPE replacement), which is hard to schedule from outside, no? (This does not affect Turris or Evenroute; as far as I understand both still support all the devices they ever deployed, but IMHO it is just a question of time until they will EOL/EOS some of them.)


Best Regards
	Sebastian


> 
>> 
>> 
>>> 
>>>> 
>>>>> 3) As regards to endpoints implementing RFC3168, do you refer to
>>>>> servers and clients or something else?. My interpretation is servers
>>>>> and clients and I don't believe that they are central  to this
>>>>> discussion, or do I miss something ?.
>>>> 
>>>> 	[SM] Well, it is these end-points that are going to suffer, when L4S
>>>> gets it wrong (when, not if). So these numbers give you an estimate
>>>> of the potential fall-out area.
>>> 
>>> [IJ] OK, then I understand.
>> 
>> 	[SM] Please note that I cannot speak for Jonathan, so this was just
>> my opinion on that matter.
>> 
>> 
>>> I was a bit misled as I assumed that we discussed network nodes that
>>> do RFC3168 ECN marking, not end hosts that negotiate ECN. I would like
>>> to see these as separate parts of the problem.
>> 
>> 	[SM] Fair enough. To assess the total risk from L4S's changed
>> CE-response, however, we need to look at all these things in integrated form.
> 
> [IJ] Sure, but it is still important to keep these things separate, after
> all one can alter the argument and say that end hosts can update to use L4S
> with BBRv2 or Prague..