Re: [tsvwg] Another tunnel/VPN scenario (was RE: Reasons for WGLC/RFC asap)

Michael Welzl <michawe@ifi.uio.no> Fri, 04 December 2020 11:33 UTC

From: Michael Welzl <michawe@ifi.uio.no>
In-Reply-To: <DA84354E-91EC-4211-98AD-83ED3594234A@gmail.com>
Date: Fri, 04 Dec 2020 12:33:24 +0100
Cc: Wesley Eddy <wes@mti-systems.com>, tsvwg@ietf.org
Content-Transfer-Encoding: quoted-printable
Message-Id: <1AB2EA08-4494-4668-AD82-03AEBD266689@ifi.uio.no>
References: <MN2PR19MB4045A76BC832A078250E436483E00@MN2PR19MB4045.namprd19.prod.outlook.com> <HE1PR0701MB2876A45ED62F1174A2462FF3C2FF0@HE1PR0701MB2876.eurprd07.prod.outlook.com> <56178FE4-E6EA-4736-B77F-8E71915A171B@gmx.de> <0763351c-3ba0-2205-59eb-89a1aa74d303@bobbriscoe.net> <CC0517BE-2DFC-4425-AA0A-0E5AC4873942@gmx.de> <35560310-023f-93c5-0a3d-bd3d92447bcc@bobbriscoe.net> <b86e3a0d-3f09-b6f5-0e3b-0779b8684d4a@mti-systems.com> <7335DBFA-D255-43BE-8175-36AB231D101F@ifi.uio.no> <DA84354E-91EC-4211-98AD-83ED3594234A@gmail.com>
To: Jonathan Morton <chromatix99@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/cU98ozmqMCj7yN5B1QLFH7gKIJ4>


> On Dec 4, 2020, at 11:38 AM, Jonathan Morton <chromatix99@gmail.com> wrote:
> 
>> On 4 Dec, 2020, at 11:08 am, Michael Welzl <michawe@ifi.uio.no> wrote:
>> 
>> Imagine a scenario where L4S is deployed on some hosts feeding into a bottleneck, yet some other hosts don’t participate in the L4S experiment (e.g., these hosts could run a different OS). Now assume that the non-L4S-participating hosts want to try out 3168-style ECN **at this point in time** - and they find it performing terribly, worse than without ECN, and conclude that 3168-style ECN isn’t even worth trying. Then, the only possibility for improvement is to also participate in the L4S experiment - and what if this, then, also doesn’t deliver on its promises (e.g., because the only existing end system implementation doesn’t work well)? Won’t this lead to an overall conclusion that ECN isn’t worth it, and never has been?
> 
> In this situation, the effects would be seen by the non-L4S hosts even *before* they enabled ECN.  Here's an example of what they might encounter, noting that a Not-ECT CUBIC flow is involved here:
> 
> http://sce.dnsmgr.net/results/l4s-2020-11-11T120000-final/l4s-s6-rfc3168-1q/l4s-s6-rfc3168-1q-ns-prague-vs-cubic-noecn-pie-50Mbit-20ms_tcp_delivery_with_rtt.svg
> 
> For insight into this, consider that RFC-3168 AQMs are required to drop Not-ECT packets at the same rate as they mark ECT traffic.  That is, the decision to apply a congestion signal to a packet is made independently of its ECT status, and the latter only determines whether it can be a mark, or must be a drop.  The L4S traffic still looks like ECT packets (because RFC-3168 treats ECT1 as equivalent to ECT0), so the marking and dropping rates still go up in concert and the conventional traffic still gets squashed - only now it also has to perform retransmissions, so the goodput actually suffers a bit more than with ECN enabled.
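
To make the quoted point concrete, here is a minimal sketch of the signalling rule described above: the AQM first decides whether a packet receives a congestion signal, independently of its ECT status, and only then chooses between a mark and a drop. The fixed signal probability `p`, the alternating ECT/Not-ECT traffic mix, and the names `congestion_signal` and `simulate` are all assumptions for illustration; a real AQM such as PIE computes its probability dynamically from queue state.

```python
import random
from collections import Counter


def congestion_signal(ect: bool, p: float, rng: random.Random) -> str:
    """Return 'forward', 'mark', or 'drop' for a single packet.

    The decision to signal congestion is made without looking at `ect`;
    `ect` only selects the form the signal takes.
    """
    if rng.random() >= p:
        return "forward"          # no congestion signal for this packet
    return "mark" if ect else "drop"


def simulate(n: int, p: float, seed: int = 1) -> dict:
    """Push n packets (alternating ECT / Not-ECT) through the rule above."""
    rng = random.Random(seed)
    counts = {"ect": Counter(), "not_ect": Counter()}
    for i in range(n):
        ect = (i % 2 == 0)        # assumed 50/50 traffic mix
        key = "ect" if ect else "not_ect"
        counts[key][congestion_signal(ect, p, rng)] += 1
    return counts


if __name__ == "__main__":
    result = simulate(100_000, 0.1)
    print("ECT marks:     ", result["ect"]["mark"])
    print("Not-ECT drops: ", result["not_ect"]["drop"])
```

In this toy model the mark rate seen by ECT traffic and the drop rate seen by Not-ECT traffic track each other, so anything that drives up the marking probability also drives up losses for the Not-ECT flow.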

Right; bad! But the inherent problem is the same: TCP Prague’s inability to detect a 3168-marking AQM. I thought that a detection mechanism had been added, and that there were then discussions about whether to keep it? Sorry, I didn’t follow this closely enough.


> The conclusion that a user or ISP (who isn't familiar with L4S but is simply trying to roll out ECN) might draw from this is that AQM is bad and unreliable, rather than ECN per se; they would actually see a slight improvement from enabling ECN at the endpoint, but a much bigger one from disabling the AQM.  This runs contrary to the established wisdom that AQM reduces latency (which is good) without having too much effect on throughput (which would be bad).
> 
> Another possible conclusion that a more paranoid network operator could draw is that ECN *in general* is a potential DoS attack vector and must therefore be blackholed.  That would be bad for *everyone* including L4S, and would also prevent reverting to RFC-3168 deployment upon failure of the L4S experiment.  I think that is a substantial risk that must also be weighed.

I agree with that.

Cheers,
Michael