Re: [tsvwg] Another tunnel/VPN scenario (was RE: Reasons for WGLC/RFC asap)

Michael Welzl <michawe@ifi.uio.no> Fri, 04 December 2020 09:08 UTC

From: Michael Welzl <michawe@ifi.uio.no>
Message-Id: <7335DBFA-D255-43BE-8175-36AB231D101F@ifi.uio.no>
Content-Type: multipart/alternative; boundary="Apple-Mail=_DC3B342B-E141-4A63-8383-F07F190B0BD1"
Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.4\))
Date: Fri, 04 Dec 2020 10:08:13 +0100
In-Reply-To: <b86e3a0d-3f09-b6f5-0e3b-0779b8684d4a@mti-systems.com>
Cc: tsvwg@ietf.org
To: Wesley Eddy <wes@mti-systems.com>
References: <MN2PR19MB4045A76BC832A078250E436483E00@MN2PR19MB4045.namprd19.prod.outlook.com> <HE1PR0701MB2876A45ED62F1174A2462FF3C2FF0@HE1PR0701MB2876.eurprd07.prod.outlook.com> <56178FE4-E6EA-4736-B77F-8E71915A171B@gmx.de> <0763351c-3ba0-2205-59eb-89a1aa74d303@bobbriscoe.net> <CC0517BE-2DFC-4425-AA0A-0E5AC4873942@gmx.de> <35560310-023f-93c5-0a3d-bd3d92447bcc@bobbriscoe.net> <b86e3a0d-3f09-b6f5-0e3b-0779b8684d4a@mti-systems.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/UbsHCNP9KcBjgB9aGo2gWTsBwkc>
Subject: Re: [tsvwg] Another tunnel/VPN scenario (was RE: Reasons for WGLC/RFC asap)

Hi all,

I’m reacting to this email. I’ve been following this thread from a distance, mostly for its entertainment value (stuff like “I don’t know which school you learned networking at”).

So, personally, I think that a decision to go forward should be grounded in data - and I see it as an asset for the group that there’s an opposing team which presents a *critical* analysis. This should weigh more than data presented by the proponents, unless it’s provably erroneous (using broken code or whatever).  I have to apologize for not consulting the list of issues - but, from these emails, it seems that there are two major remaining concerns:


1) RTT unfairness among L4S flows:

If I understood this correctly from the emails, this issue has been addressed reasonably well, albeit perhaps with a hack. Unless the unfairness is truly extreme (to the degree of making some flows almost entirely unusable), I wouldn’t worry too much about it: deploying something that works very poorly is a matter of shooting yourself in the foot. The Internet has a way of self-regulating such too-aggressive behavior: when BIC was found to be a bit too aggressive, it was replaced with Cubic. Cubic used beta=0.8 for a while, which was found to be too aggressive, and now it uses 0.7. BBRv1 has received a lot of criticism for being too aggressive, but not much roll-out has been documented… and Google folks have been quick to try to address this with BBRv2. “The bully wins” isn’t really how congestion control tends to play out, perhaps because even a single host produces multiple flows, and people also live in households with multiple heterogeneous hosts.
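To make the Cubic beta remark concrete, here is a minimal sketch of multiplicative decrease, assuming beta is simply the fraction of the congestion window kept after a congestion event. The function names and the example window sizes are hypothetical illustrations, not taken from any real stack.

```python
def window_after_backoff(cwnd_segments: float, beta: float) -> float:
    """Multiplicative decrease: keep the fraction `beta` of the window."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must be strictly between 0 and 1")
    return cwnd_segments * beta


def backoffs_to_reach(cwnd_segments: float, target_segments: float, beta: float) -> int:
    """Count the congestion events needed to shrink the window to the target or below."""
    count = 0
    while cwnd_segments > target_segments:
        cwnd_segments = window_after_backoff(cwnd_segments, beta)
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical example: a 100-segment window that must shrink to 50 segments.
    for beta in (0.8, 0.7):
        events = backoffs_to_reach(100.0, 50.0, beta)
        print(f"beta={beta}: {events} congestion events to get from 100 to <= 50 segments")
```

A larger beta gives up less of the window per congestion event, so a flow using 0.8 needs more congestion signals than one using 0.7 to yield the same share, which is the sense in which it is more aggressive.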


2) Unfairness towards RFC-3168 style ECN:

This is a very different matter. Recently, there was some discussion of the slowly increasing 3168-style ECN deployment, and Bob has asked if these are all FQ cases (or else they might be irrelevant).
I don’t think that’s the right conversation to have. Imagine a scenario where L4S is deployed on some hosts feeding into a bottleneck, yet some other hosts don’t participate in the L4S experiment (e.g., these hosts could run a different OS). Now assume that the non-L4S-participating hosts want to try out 3168-style ECN **at this point in time** - and they find it performing terribly poorly, worse than without using ECN, concluding that 3168-style ECN isn’t even worth trying. Then, the only possibility for improvement is to also participate in the L4S experiment - and what if this, then, also doesn’t keep its promises (e.g., because the only existing end system implementation doesn’t work well)? Won’t this lead to an overall conclusion that ECN isn’t worth it, and never has been?

I would also like to remind people that there were two planned experiments: L4S and ABE (RFC 8511). Code for ABE exists but it isn’t really turned on by anyone, as far as I know - perhaps because people are waiting for this discussion’s dust to settle before even trying something else with ECN. I don’t know. Anyway, IIRC, the plan was to allow both experiments to play out individually, without one eliminating the possibility of even trying the other.

In conclusion: I’m inclined to say “let’s test this”, but only if point 2 is convincingly addressed by L4S. That hasn’t been the case for the last, what, 5 years or so, and unless this data is based on wrong or broken code, it doesn’t convince me otherwise: https://github.com/heistp/l4s-tests#unsafety-in-shared-rfc3168-queues
IIRC this was also the original plan.

Cheers,
Michael

PS: my personal view on L4S, in general: the use case is thin. L4S is “sold” as minimizing latency, but this is also easily achieved by a tiny DropTail queue. Less easily, and with its own issues, this is also achieved by FQ (or variants thereof). The large majority of interactive latency-critical flows have a low rate (and even high-rate interactive traffic is probably crafted to be adaptive and work below the average capacity limit, to avoid a latency build-up). A short DropTail queue or FQ would work reasonably well for such flows. L4S can attain high throughput without building a large queue, which is great, of course - for non-interactive traffic. For non-interactive traffic, though, I don’t understand why an e2e codepoint is needed: put a PEP there, let it pull data as early as it can from the sender, use L4S with a DSCP across the bottleneck towards the receiver. Why is this not the L4S deployment model, at least while it’s an experiment? Is this just because the IETF doesn’t approve of PEPs?
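As a back-of-the-envelope illustration of why a tiny DropTail queue bounds latency, the worst-case queuing delay is just the time needed to drain a full buffer at the link rate. This is a minimal sketch; the function name, packet size, queue lengths and link rate are hypothetical values chosen only for the arithmetic.

```python
def max_queue_delay_ms(queue_packets: int, packet_bytes: int, link_mbps: float) -> float:
    """Worst-case queuing delay: time to drain a full FIFO at the link rate."""
    if link_mbps <= 0:
        raise ValueError("link rate must be positive")
    queued_bits = queue_packets * packet_bytes * 8
    seconds = queued_bits / (link_mbps * 1_000_000)
    return seconds * 1000.0


if __name__ == "__main__":
    # Hypothetical 10 Mbit/s bottleneck carrying 1500-byte packets.
    for packets in (10, 1000):
        delay = max_queue_delay_ms(packets, 1500, 10.0)
        print(f"{packets:>4} packets -> at most {delay:.0f} ms of queuing delay")
```

With these assumed numbers, a 10-packet buffer cannot add more than about 12 ms, while a 1000-packet buffer can add over a second. The short queue buys that latency bound at the cost of throughput for window-based senders.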



> On Dec 3, 2020, at 11:17 PM, Wesley Eddy <wes@mti-systems.com> wrote:
> 
> (chair hat on)
> 
> I wish this thread was making progress in any way, but I don't see it, and it's become a bit heated again.
> 
> I think we can probably close this thread, since the scenario described is well understood, usual suspects have clearly expressed their thoughts, and going back and forth doesn't seem to be convincing to one another in any way.
> 
> FYI, there are 762 people on this mailing list, and we're aware that the tone getting too hot discourages some from participating in the conversation.  I *would* like to hear from others or anyone with new perspectives to contribute, rather than just the handful of people that we are hearing a lot from.
> 
>