Re: [tsvwg] Another tunnel/VPN scenario (was RE: Reasons for WGLC/RFC asap)

Sebastian Moeller <moeller0@gmx.de> Fri, 04 December 2020 13:38 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <216A1CE6-C7ED-4ACB-9E8A-AB0CC0408712@ericsson.com>
Date: Fri, 04 Dec 2020 14:38:35 +0100
Cc: Jonathan Morton <chromatix99@gmail.com>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <E95EDB52-C753-46E8-9188-30E3952FB031@gmx.de>
References: <MN2PR19MB4045A76BC832A078250E436483E00@MN2PR19MB4045.namprd19.prod.outlook.com> <HE1PR0701MB2876A45ED62F1174A2462FF3C2FF0@HE1PR0701MB2876.eurprd07.prod.outlook.com> <56178FE4-E6EA-4736-B77F-8E71915A171B@gmx.de> <0763351c-3ba0-2205-59eb-89a1aa74d303@bobbriscoe.net> <CC0517BE-2DFC-4425-AA0A-0E5AC4873942@gmx.de> <35560310-023f-93c5-0a3d-bd3d92447bcc@bobbriscoe.net> <b86e3a0d-3f09-b6f5-0e3b-0779b8684d4a@mti-systems.com> <7335DBFA-D255-43BE-8175-36AB231D101F@ifi.uio.no> <DA84354E-91EC-4211-98AD-83ED3594234A@gmail.com> <1AB2EA08-4494-4668-AD82-03AEBD266689@ifi.uio.no> <CC06401C-2345-4F68-96FA-B4A87C25064E@gmail.com> <24C55646-C786-4B55-BFEE-D30BBB4ED7C4@ifi.uio.no> <216A1CE6-C7ED-4ACB-9E8A-AB0CC0408712@ericsson.com>
To: Mirja Kuehlewind <mirja.kuehlewind=40ericsson.com@dmarc.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/v1DbC5-feSGuxJiYL6i7Aqv71TM>
Subject: Re: [tsvwg] Another tunnel/VPN scenario (was RE: Reasons for WGLC/RFC asap)

Hi Mirja,

more below in-line, prefixed with [SM].


> On Dec 4, 2020, at 13:32, Mirja Kuehlewind <mirja.kuehlewind=40ericsson.com@dmarc.ietf.org> wrote:
> 
> Hi all,
> 
> to add one more option.
> 
> To be clear, any issues discussed only occur if RFC3168-AQMs (without FQ) are deployed in the network (no matter if any traffic is actually RFC3168-ECN enabled or not).
> 
> My understanding of the current deployment status is that only ECN-AQMs with FQ support are deployed today and I don't think we should put a lot of effort in a complicated RFC3168-AQM detection mechanism which might negatively impact the L4S experiment if we have no evidence that these queues are actually deployed.

	[SM] So you seem to reject the tunnel argument. Could you please elaborate why tunnels seem ignorable in this context, but are a big argument for re-defining CE? These two positions seem logically hard to bring into alignment.

> 
> Further I would like to note that RFC3168 AQMs are already negatively impacting non-ECN traffic and advantaging ECN traffic.

	[SM] You must mean that RFC3168-enabled flows do not suffer the otherwise required retransmission after a drop and get slightly faster congestion feedback? That is a rather small benefit over non-ECN flows, but sure, there is a rationale for ECN usage.

> However, I think it's actually a feature of L4S to provide better performance than non-ECN and therefore providing a deployment incentive for L4S,

	[SM] There is a difference between performing better by doing something better and "making the existing traffic artificially slower" but that is what L4S does (it works on both ends), stacking the deck against non-L4S traffic.

> as long as non-L4S is not starved entirely.

	[SM] Please define what exactly you consider "starved entirely" to mean, otherwise this is not helpful.

> We really, really must stop talking about fairness as equal flow sharing.

	[SM] Yes, the paper about the "harm" framework that Wes posted earlier seems to be a much better basis here than a simplistic "all flows need to be strictly equal" strawman criterion.


> That is not the reality today (e.g. video traffic can take larger shares on low bandwidth links and that keeps it working)

	[SM] You wish. Please talk to end users who want to use video streaming and jitter-sensitive online gaming concurrently over their internet access link, and ask about the heroic measures they are willing to take to make this work. It is NOT working as desired out of the box, in spite of DASH video being adaptive and games requiring very little traffic in comparison. The solution would be to switch video over to a CBR type of streaming instead of the current bursty video delivery (but that is unlikely to change).


IMHO L4S will not change much here because a) it still aims to offer rough per-flow fairness (at least for flows with similar network paths) and b) the real solution is controlled/targeted unfairness, where a low-latency channel is carved out that works in spite of other flows not cooperating (L4S requires implicit coordination/cooperation of all flows to achieve its goals, which is, to stay civil, optimistic).


> and it is not desired at all because not all flows are equal!!!

	[SM] Now, if only we had a reliable and robust way to rank flows by importance that is actually available at the bottleneck link, we would be set. No amount of exclamation marks is going to solve the problem that the importance of flows is badly defined. If our flows cross on, say, our shared ISP's transit uplink, which is more important? And who should be the arbiter of importance: you, me, the ISP, the upstream ISP? That gets complicated quickly, no?


> The endpoints know what is required to make their application work and as long as there is a circuit breaker that avoids complete starving or collapse, the evolution of the Internet depends on this control in the endpoint and future applications that potentially have very different requirements. Enforcing equal sharing in the network hinders this evolution.

	[SM] Well, the arguments for equal-ish sharing are:
a) simple to understand, predict and measure/debug (also conceptually easy to implement).
b) avoids starvation as best as possible as evenly as possible
c) is rarely pessimal (and almost never optimal), often "good enough".

Compare this with your proposed "anything goes" approach (which does not reflect the current internet, which seems to be mostly roughly equitable sharing in nature):
a) extremely hard to make predictions unless the end point controls all flows over the bottleneck
b) has no inherent measures against starvation
c) has the potential to be optimal, but that requires a method to rate the relative importance/value of each packet, which rarely exists at the points of congestion.

How should the routers at a peering point between two ASes know which of the flows in my network I value most? Simply put, they can't, and hence they will not come up with the theoretically optimal sharing behavior. I really see no evolutionary argument for "anything goes" here.
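
A minimal sketch of the equal-ish sharing argued for above, assuming it is formalized as max-min fairness (that term, the function name max_min_fair_share, and the example rates are illustrative assumptions, not taken from this thread). Each flow gets at most an equal share of what is left, and capacity that a small flow does not use is redistributed to the others:

```python
def max_min_fair_share(capacity, demands):
    """Return a max-min fair allocation of `capacity` among flows.

    `demands` maps a flow name to the rate it would like to send at.
    Both values use the same (arbitrary) unit, e.g. Mbit/s.
    """
    allocation = {}
    remaining = float(capacity)
    # Serve the smallest demands first, so that whatever they leave
    # unused is shared among the flows that still want more.
    pending = sorted(demands.items(), key=lambda item: item[1])
    for index, (flow, demand) in enumerate(pending):
        flows_left = len(pending) - index
        fair_share = remaining / flows_left
        allocation[flow] = min(demand, fair_share)
        remaining -= allocation[flow]
    return allocation


# Hypothetical example: a sparse game flow and two greedy flows
# sharing a 10 Mbit/s bottleneck.
shares = max_min_fair_share(10, {"game": 0.5, "video": 8, "bulk": 100})
```

With these assumed numbers the sparse flow gets its full 0.5 Mbit/s and the two greedy flows get 4.75 Mbit/s each, without the bottleneck having to know which flow anyone values most.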

> 
> I also would like to note that L4S is not only about lower latency. Latency is the huge problem we have in the Internet today because the current network was optimized for high bandwidth applications, however, many of the critical things we do on the Internet today actually are more sensitive to latency. This problem is still not fully solved, even though smaller queues and AQM deployments are a good step in the right direction.

	[SM] Mirja, L4S really offers only very little advancement over state-of-the-art AQMs. 5 ms average queueing delay is current reality; L4S' 1 ms (99.9 quantile) queueing delay really will not change much here. Yes, 1 ms is smaller than 5 ms, but please show a realistic scenario where that difference matters.
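
How much 5 ms versus 1 ms matters depends on the rest of the path delay. A rough worked example, assuming a fixed base RTT and treating both figures as directly comparable queueing delays (a simplification, since one is an average and the other a 99.9 quantile; the function name and the base RTT values are hypothetical):

```python
def relative_latency_gain(base_rtt_ms, queue_old_ms=5.0, queue_new_ms=1.0):
    """Fraction by which total delay shrinks when only queueing delay changes.

    base_rtt_ms is the assumed path delay without any queueing.
    The defaults are the 5 ms and 1 ms figures quoted in this thread.
    """
    old_total = base_rtt_ms + queue_old_ms
    new_total = base_rtt_ms + queue_new_ms
    return (old_total - new_total) / old_total


# Hypothetical paths: a nearby server and a distant one.
gain_short_path = relative_latency_gain(20.0)
gain_long_path = relative_latency_gain(95.0)
```

Under these assumptions the absolute saving is 4 ms on either path, which is 16% of the total delay on the 20 ms path and 4% on the 95 ms path.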


> L4S goes even further and the point is not only about reducing latency but to enable the deployment of a completely new congestion control regime which takes into account all the lessons learnt from e.g. data center deployment where we do not have to be bound by today's limitation of "old" congestion controls and co-existence.

	[SM] I do smell second-system syndrome here. Instead of aiming for a revolution, how about evolving the existing CCs instead? The current attempts at making DCTCP fit for the wider internet in the guise of TCP Prague are quite disappointing in what they actually deliver. To be blunt, TCP Prague demonstrates quite well that the initial assumption, that DCTCP would work well over the internet if only it was safe to do so, was simply wrong. The long initial ramp-up time, the massively increased RTT bias, and the failure to compete well with CUBIC flows in FIFO bottlenecks are clear indicators that a new L4S reference transport protocol needs to be developed.


> L4S is exactly a way to transition to this new regime without starving "old" traffic but there also needs to be incentives to actually move to the new world. That's what I would like to see and why I'm excited about L4S.

	[SM] That is a procedural argument that seems to take L4S's promises at face value, while ignoring all the data that demonstrate L4S still has a long way to go to actually deliver on its promises.
	I also do not believe it to be an acceptable way to create incentives by essentially making existing transport protocols perform worse (over L4S-controlled bottlenecks). But that is what L4S does.

Best Regards
	Sebastian


> 
> Mirja
> 
> 
> 
> 
> On 04.12.20, 12:49, "tsvwg on behalf of Michael Welzl" <tsvwg-bounces@ietf.org on behalf of michawe@ifi.uio.no> wrote:
> 
> 
> 
>> On Dec 4, 2020, at 12:45 PM, Jonathan Morton <chromatix99@gmail.com> wrote:
>> 
>>> On 4 Dec, 2020, at 1:33 pm, Michael Welzl <michawe@ifi.uio.no> wrote:
>>> 
>>> Right; bad! But the inherent problem is the same: TCP Prague’s inability to detect the 3168-marking AQM algorithm. I thought that a mechanism was added, and then there were discussions of having it or not having it?  Sorry, I didn’t follow this closely enough.
>> 
>> Right, there was a heuristic added to TCP Prague to (attempt to) detect if the bottleneck was RFC-3168 or L4S.  In the datasets from around March this year, we showed that it didn't work reliably, with both false-positive and false-negative results in a variety of reasonably common scenarios.  This led to both the L4S "benefits" being disabled, and a continuation of the harm to conventional flows, depending on which way the failure went.
>> 
>> The code is still there but has been disabled by default, so we're effectively back to not having it.  That is reflected in our latest test data.
>> 
>> I believe the current proposals from L4S are:
>> 
>> 1: Use the heuristic data in manual network-operations interventions, not automatically.
>> 
>> 2: Have TCP Prague treat longer-RTT paths as RFC-3168 but shorter ones as L4S.  I assume, charitably, that this would be accompanied by a change in ECT codepoint at origin.
>> 
>> Those proposals do not seem very convincing to me, but I am just one voice in this WG.
> 
>    Yeah, so I have added my voice for this particular issue.
> 
>    Cheers,
>    Michael
> 
>