[tsvwg] The state of l4s, bbrv2, sce?

Dave Taht <dave.taht@gmail.com> Fri, 26 July 2019 15:05 UTC

From: Dave Taht <dave.taht@gmail.com>
Date: Fri, 26 Jul 2019 08:05:11 -0700
Message-ID: <CAA93jw53+Cmt=c4ie9PZbOvq6fewor_nsQ65OuP1POWjmGErmg@mail.gmail.com>
To: Pete Heist <pete@heistp.net>
Cc: "De Schepper, Koen (Nokia - BE/Antwerp)" <koen.de_schepper@nokia-bell-labs.com>, "ecn-sane@lists.bufferbloat.net" <ecn-sane@lists.bufferbloat.net>, "tsvwg@ietf.org" <tsvwg@ietf.org>, Neal Cardwell <ncardwell@google.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/ezUQnzkO0hsVvgwwR-KD4OVKz9E>
Subject: [tsvwg] The state of l4s, bbrv2, sce?

Changing the title....

I hope to add some features and boxes to the worldwide flent fleet to
gather some more data. The simple stuff includes verifying more fully,
worldwide, what happens when you twiddle the ecn bits. Mildly longer
term, I want to look at what happens when conflicting interpretations
of these bits are in play somewhere on the path. A bit longer than
that, I want to get an openwrt build up as a middlebox and vm, and
then finally, finally, see what happens on a couple of kinds of wifi.

There's now a flent server in Mumbai, in particular, which I hope will
shed some light on the state of networks in India, long term, on a
variety of fronts. But none of it is ready, lacking a good release to
freeze on.

1) BBRv2 is now available for public hacking. I had a good readthrough
last night.

The published tree applies cleanly (with a small patch) to net-next,
and the code has lots of good changes to bbr!

Although Neal was careful to say in iccrg that the optional ecn mode
uses "dctcp/l4s-style signalling", he did not identify how that was
actually applied at the middleboxes, and the supplied test scripts
(gtests/net/tcp/bbr/nsperf) don't do that. All we know is that it's
set to kick in at 20 packets. Is it fq_codel's ce_threshold? red? pie?
dualpi? Does it revert to drop on overload?
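
To make the question concrete, here is a minimal sketch of two kinds of
step marking, assuming nothing about what the BBRv2 tests actually ran.
One marks on queue depth in packets (the shape the 20 packet figure
suggests), the other on sojourn time, which is how fq_codel's
ce_threshold is expressed. The function names, the 1500 byte packet
size, and the 1000us threshold are made up for illustration; none of
them come from the BBRv2 tree or its test scripts.

```python
def mark_on_depth(queue_len_pkts: int, threshold_pkts: int = 20) -> bool:
    """Step marking on depth: CE-mark once the backlog reaches the threshold."""
    return queue_len_pkts >= threshold_pkts


def mark_on_sojourn(sojourn_us: float, threshold_us: float = 1000.0) -> bool:
    """Step marking on time: CE-mark packets that waited longer than the threshold."""
    return sojourn_us > threshold_us


def drain_time_us(pkts: int, pkt_bytes: int, rate_bps: int) -> float:
    """Time to serialize a backlog of equal-sized packets at a given link rate."""
    return pkts * pkt_bytes * 8 * 1_000_000 / rate_bps


if __name__ == "__main__":
    # The same 20 packet backlog means very different delays at different rates,
    # so a depth threshold and a time threshold are not interchangeable.
    for rate in (100_000_000, 1_000_000_000, 10_000_000_000):
        delay = drain_time_us(20, 1500, rate)
        print(f"{rate // 1_000_000:>6} Mbit/s: 20 pkts = {delay:8.1f} us, "
              f"depth mark={mark_on_depth(20)}, sojourn mark={mark_on_sojourn(delay)}")
```

A depth-based threshold fires at the same backlog regardless of link
rate, while a time-based one scales with it, which is one reason the
answer matters when comparing results.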

Is it running on bare metal? 260us is at the bare bottom of what Linux
can schedule reliably, and vms are much worse.

Couple notes:

BBRv2 doesn't use ect(1) as an identifier.

The chromium release has no support for ecn at all.

Adding back in the stuff I'd first done to rfc3168 bbrv1 looks
straightforward, making it do sce, less so.
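
For anyone following along without the bit layout in their head, a
small sketch of how the two ecn bits in the IP TOS / traffic class byte
decode under rfc3168. The helper name is made up for illustration. The
values themselves are the standard ones; what ect(1) should *mean* is
exactly what l4s (an identifier) and sce (a congestion signal) disagree
on, and this code takes no side.

```python
# The ecn field is the low two bits of the TOS / traffic class byte (rfc3168).
ECN_NAMES = {
    0b00: "Not-ECT",
    0b01: "ECT(1)",
    0b10: "ECT(0)",
    0b11: "CE",
}


def ecn_codepoint(tos: int) -> str:
    """Return the rfc3168 name of the ecn codepoint carried in a TOS byte."""
    return ECN_NAMES[tos & 0b11]


if __name__ == "__main__":
    for tos in (0x00, 0x01, 0x02, 0x03, 0xB9):
        print(f"tos=0x{tos:02x} dscp={tos >> 2:2d} ecn={ecn_codepoint(tos)}")
```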

2) To clarify something from the l4s team, are the results you've been
presenting for years all from the 3.19 kernel? bsd? microsoft? ns2?
ns3? what?

Is the code on github not worth testing against currently? It does
have some needed features, like a setsockopt for using up ect(1).

Should I use the issue tracker for that? I have some comments on
dualpi, in addition to my outstanding question about pie's default of
drop at a 10% mark rate vs dualpi's 0. Notably, dualpi's limit is set
to 1000 packets now (fq_codel defaults to 10,000, and we switched to
memory limits in both it and cake, given a modern packet's dynamic
range of 64b to 64k). I've observed that 10gige can be in the 2-3k
packet range... has dualpi been tested above 1gige yet?
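
The point about packet limits versus memory limits is easy to see with
a little arithmetic. A rough sketch, assuming "64b to 64k" means 64
byte minimum packets up to 64 KiB aggregated super-packets; the helper
name is made up, and real memory accounting also includes per-packet
overhead that is ignored here.

```python
def backlog_bytes(limit_pkts: int, pkt_bytes: int) -> int:
    """Payload bytes held by a queue that is full of equal-sized packets."""
    return limit_pkts * pkt_bytes


SMALL_PKT = 64          # assumed smallest packet, bytes
LARGE_PKT = 64 * 1024   # assumed largest aggregated packet, bytes

if __name__ == "__main__":
    for limit in (1000, 10_000):
        low = backlog_bytes(limit, SMALL_PKT)
        high = backlog_bytes(limit, LARGE_PKT)
        print(f"limit={limit:>6} pkts: {low:>10,} to {high:>13,} bytes "
              f"({high // low}x spread)")
```

The same packet-count limit can mean anywhere from tens of kilobytes
to tens or hundreds of megabytes of queue, which is why a byte-based
limit is the more predictable knob.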

3) The current patches for sce need to get rebased onto net-next. The
sch_cake mods are easy, but the dctcp code did morph a bit since the
sce work forked it, as did the other tcps. I took a stab at forward
porting it to net-next, but I figure that development is hot and heavy
and some patches will land after ietf. I do not mind taking another
stab at cleaning it up (it helps me understand what's going on), as
how the algos currently (as of, like, yesterday) work is clear to me.
What I'd like to do, at least, is also add 'em to the out-of-tree
fq_codel_fast implementation.

Did I miss anything about the current state of things?

My basic testbed is a string of containers on a couple of 12 core
boxes on bare metal, and the more advanced one is the openwrt stuff
that is part of my wifi lab. That's presently almost all 4.14-based,
on arm, mips, and x86, running both on real hardware and in emulation.

On Fri, Jul 26, 2019 at 6:10 AM Pete Heist <pete@heistp.net> wrote:
>
>
> > On Jul 25, 2019, at 12:14 PM, De Schepper, Koen (Nokia - BE/Antwerp) <koen.de_schepper@nokia-bell-labs.com> wrote:
> >
> > We have the testbed running our reference kernel version 3.19 with the drop patch. Let me know if you want to see the difference in behavior between the “good” DCTCP and the “deteriorated” DCTCP in the latest kernels too. There were several issues introduced which made DCTCP both more aggressive, and currently less aggressive. It calls for better regression tests (for Prague at least) to make sure its behavior is not changed too drastically by new updates. If enough people are interested, we can organize a session in one of the available rooms.
> >
> > Pete, Jonathan,
> >
> > Also for testing further your tests, let me know when you are available.
>
> Regarding testing, we now have a five node setup in our test environment running a mixture of tcp-prague and dualq kernels to cover the scenarios Jon outlined earlier. With what little time we’ve had for it this week, we’ve only done some basic tests, and seem to be seeing behavior similar to what we saw at the hackathon, but we can discuss specific results following IETF 105.
>
> Our intention is to coordinate a public effort to create reproducible test scenarios for L4S using flent. Details to follow post-conference. We do feel it’s important that all of our Linux testing be on modern 5.1+ kernels, as the 3.19 series was end of life as of May 2015 (https://lwn.net/Articles/643934/), so we'll try to keep up to date with any patches you might have for the newer kernels.
>
> Overall, I think we’ve improved the cooperation between the teams this week (from zero to a little bit :), which should hopefully help move both projects along...



-- 

Dave Täht
CTO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-831-205-9740