Re: [tsvwg] l4s and sce testing on vpns

Jonathan Morton <chromatix99@gmail.com> Mon, 22 February 2021 03:10 UTC

From: Jonathan Morton <chromatix99@gmail.com>
In-Reply-To: <CAA93jw7Jo=adYmJBqt5DC=WUB8W3wzjOVWQj2Kd4APtnzg08aA@mail.gmail.com>
Date: Mon, 22 Feb 2021 05:10:34 +0200
Cc: tsvwg IETF list <tsvwg@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <B7F56C67-4902-4C66-9CD9-1711D165B694@gmail.com>
References: <CAA93jw7Jo=adYmJBqt5DC=WUB8W3wzjOVWQj2Kd4APtnzg08aA@mail.gmail.com>
To: Dave Taht <dave.taht@gmail.com>
X-Mailer: Apple Mail (2.3445.9.7)
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/tD5shmupI5nL1E-ff6ql44VN9sY>
Subject: Re: [tsvwg] l4s and sce testing on vpns

> On 22 Feb, 2021, at 3:53 am, Dave Taht <dave.taht@gmail.com> wrote:
> 
> Have the L4S and SCE codebases been updated to modern kernels so they
> can be easily tested again?

The SCE kernel has been updated to the 5.10 series, which is quite new and is also an LTS series.  That's actually how we detected the bug you mentioned: as part of forward porting, we switched over to Toke's new functions instead of some close equivalents we'd rolled ourselves.

Last we checked, the L4S kernel has not been updated recently.  I'm sure we'll hear about it from interested parties if that's incorrect.

> Has anyone done a study/survey of the effects of this fix and the
> original attempt? Was a detection method for L4S derived for it?

SCE is among a very small and experimental cadre of projects which exercise code related to ECT(1), which probably explains why the bug wasn't noticed for half a year.  We actually had to manually reconfigure Wireshark to verify header checksums as part of troubleshooting, since it doesn't do that by default.

What we noticed was that packets that were marked ECT(1) by a middlebox running the faulty code would be treated as dropped by the receiver, because they had the wrong checksum.  Needless to say, this had a marked effect on throughput and caused AIMD sawtooth behaviour with retransmissions, very unlike what SCE would normally do, and instantly recognisable in a time-series plot of the type Flent produces.  It didn't stop the flows from functioning, however.  When the checksum code was corrected, normal SCE behaviour resumed.

Since L4S sets ECT(1) at origin instead of modifying an existing packet, it probably wouldn't have exercised this bug in a straight connection, because the sender calculates the checksum ab initio over the whole header, not incrementally.  (If it had hit the bug, the effect would have been much more severe and immediately noticeable.)  However, it might have shown up if the L4S flow were run through a tunnel, as Toke added these functions as part of an overhaul of tunnel decapsulation.
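
To illustrate the difference between the two checksum paths, here is a minimal Python sketch.  It is not the kernel code in question; the function names and the sample header (documentation addresses 192.0.2.1 and 192.0.2.2) are made up for the example.  It computes an IPv4 header checksum from scratch, then rewrites the ECN field either with an incremental checksum update in the style of RFC 1624 or without touching the checksum at all, which is roughly the failure a receiver would see as a corrupt packet.

```python
import struct

ECT_1 = 0b01  # ECN codepoints from RFC 3168
ECT_0 = 0b10


def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit big-endian words, folding carries back in."""
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def full_checksum(header: bytes) -> int:
    """Checksum computed from scratch, with the checksum field zeroed."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return ~ones_complement_sum(zeroed) & 0xFFFF


def checksum_ok(header: bytes) -> bool:
    """A valid header sums to 0xFFFF when the checksum is included."""
    return ones_complement_sum(header) == 0xFFFF


def sample_header() -> bytes:
    """Hypothetical 20-byte IPv4 header marked ECT(0), checksum filled in."""
    hdr = bytearray.fromhex("4502005400004000400600" "00c0000201c0000202")
    hdr[10:12] = full_checksum(bytes(hdr)).to_bytes(2, "big")
    return bytes(hdr)


def set_ecn(header: bytes, ecn: int, fix_checksum: bool = True) -> bytes:
    """Rewrite the two ECN bits, optionally patching the checksum."""
    hdr = bytearray(header)
    old_word = (hdr[0] << 8) | hdr[1]          # version/IHL + DSCP/ECN
    hdr[1] = (hdr[1] & 0xFC) | (ecn & 0x03)
    new_word = (hdr[0] << 8) | hdr[1]
    if fix_checksum:
        old_csum = (hdr[10] << 8) | hdr[11]
        # RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m')
        total = (~old_csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
        while total >> 16:
            total = (total & 0xFFFF) + (total >> 16)
        hdr[10:12] = (~total & 0xFFFF).to_bytes(2, "big")
    return bytes(hdr)


if __name__ == "__main__":
    original = sample_header()
    patched = set_ecn(original, ECT_1)
    broken = set_ecn(original, ECT_1, fix_checksum=False)
    print("original valid:", checksum_ok(original))
    print("incremental update valid:", checksum_ok(patched))
    print("no update valid:", checksum_ok(broken))
```

A sender that sets ECT(1) at origin only ever goes through the equivalent of full_checksum(), so it never touches the incremental path; a middlebox re-marking a packet in flight does.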

> There were more than a few other things I'd wanted tested since last
> time. I've read over the SCE and L4S related reports as of August or
> so of last year. Is there new data?

We published some test data for November, and that included some investigation into VPN effects.  As we expected, flow-aware AQMs treat a tunnel as a single flow regardless of how many flows are actually inside it, and the effect when that includes a mixture of L4S and conventional flows is predictable from previous demonstrations of such mixtures in a single-queue AQM.  We take it to mean that single-queue AQM effects are not limited to deployments of native single-queue AQMs, but can also affect FQ and AF AQMs in the presence of tunnelled traffic.
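
The tunnel effect can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not how any particular qdisc is implemented: the hash, the queue count, the addresses and the fixed outer UDP port are all hypothetical, and it assumes the VPN presents one constant outer 5-tuple (some encapsulations vary the outer source port per inner flow, which would change the picture).

```python
import zlib
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src: str
    dst: str
    proto: str
    sport: int
    dport: int


def queue_index(flow: FiveTuple, n_queues: int = 1024) -> int:
    """Pick a queue by hashing the 5-tuple the AQM can actually see."""
    return zlib.crc32(repr(tuple(flow)).encode()) % n_queues


def encapsulate(inner: FiveTuple) -> FiveTuple:
    """Assumed VPN: every inner flow gets the same outer 5-tuple."""
    return FiveTuple("198.51.100.1", "198.51.100.2", "udp", 51820, 51820)


# Four hypothetical inner flows between the same pair of hosts.
inner_flows = [
    FiveTuple("192.0.2.1", "192.0.2.2", "tcp", 40000 + i, 443)
    for i in range(4)
]

if __name__ == "__main__":
    native = {queue_index(f) for f in inner_flows}
    tunnelled = {queue_index(encapsulate(f)) for f in inner_flows}
    print("queues used without tunnel:", len(native))
    print("queues used through tunnel:", len(tunnelled))
```

Without the tunnel the flows will usually hash to separate queues; through the tunnel they necessarily share one, so whatever happens between L4S and conventional flows in a single queue happens inside that one queue too.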

IIRC, we also demonstrated then that Not-ECT conventional flows are affected to the same extent as conventional ECN flows.  This is important, because it shows that users suffering from starvation due to L4S cannot avoid the problem by simply turning off ECN for themselves.  They must remove either the AQM or the L4S traffic.

 - Jonathan Morton