Re: [tsvwg] [Ecn-sane] Comments on L4S drafts

Dave Taht <> Wed, 24 July 2019 16:21 UTC

From: Dave Taht <>
Date: Wed, 24 Jul 2019 09:21:04 -0700
Message-ID: <>
To: Wesley Eddy <>
Cc: Dave Taht <>, "De Schepper, Koen (Nokia - BE/Antwerp)" <>, "" <>, "" <>
Subject: Re: [tsvwg] [Ecn-sane] Comments on L4S drafts
List-Id: Transport Area Working Group <>

On Fri, Jul 19, 2019 at 4:42 PM Dave Taht <> wrote:
> On Fri, Jul 19, 2019 at 3:09 PM Wesley Eddy <> wrote:
> >
> > Hi Dave, thanks for clarifying, and sorry if you're getting upset.
> There have been a few other disappointments this ietf. I'd hoped bbrv2
> would land for independent testing. Didn't.
> I have some "interesting" patches for bbrv1 but felt it would be saner
> to wait for the most current version (or for the bbrv2 authors to
> have the small rfc3168 baseline patch I'd requested tested by them
> rather than I), to bother redoing that series of tests and publishing.

The bbrv2 code did indeed land yesterday and (joy!) was accompanied
by test scripts for repeatable results. The iccrg preso was
impressive. Thank you, thank you. It's going to take a while to
retrofit my suggested simpler rfc3168 ecn handling, and/or sce, but not
as long as until the next ietf.

> I'd asked if the dctcp and dualpi code on github was stable enough to
> be independently tested. No reply.

In poking through the most current git trees, I see this commit
finally gave dctcp *sane behavior in response to loss*, which it
didn't have before.

commit aecfde23108b8e637d9f5c5e523b24fb97035dc3
Author: Koen De Schepper <>
Date:   Thu Apr 4 12:24:02 2019 +0000
    tcp: Ensure DCTCP reacts to losses

Which explains a few things. Now I get to throw out 8 years of test
results and start over. And throw out most of yours, also. Please note
that seeing a bug of this magnitude fixed gives me joy. Perhaps many
of the issues I saw were due to this, not theory/spec failures. This
brings up another issue I'll start a new subject line for.

This commit looks to make a dent in the GRO issue I've raised periodically:

commit e3058450965972e67cc0e5492c08c4cdadafc134
Author: Eric Dumazet <>
Date:   Thu Apr 11 05:55:23 2019 -0700
    dctcp: more accurate tracking of packets delivery

    After commit e21db6f69a95 ("tcp: track total bytes delivered with ECN CE marks")
    core TCP stack does a very good job tracking ECN signals.

    The "sender's best estimate of CE information" Yuchung mentioned in his
    patch is indeed the best we can do.

    DCTCP can use tp->delivered_ce and tp->delivered to not duplicate the logic,
    and use the existing best estimate.

    This solves some problems, since current DCTCP logic does not deal with losses
    and/or GRO or ack aggregation very well.
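
For anyone who wants to see what "use tp->delivered_ce and tp->delivered" amounts to, here is a rough sketch of a per-window DCTCP alpha update driven by the two cumulative counters. This is not the kernel code: the function name, the floating-point arithmetic and the argument layout are my own assumptions for illustration, and the gain g = 1/16 is the value recommended in RFC 8257.

```python
def dctcp_alpha_update(alpha, delivered, delivered_ce,
                       old_delivered, old_delivered_ce, g=1 / 16):
    """Return the new alpha after one observation window.

    delivered / delivered_ce are cumulative counts of packets delivered
    and packets delivered with a CE mark (stand-ins for tp->delivered
    and tp->delivered_ce). old_* are the values saved at the start of
    the window. All names here are hypothetical.
    """
    acked = delivered - old_delivered
    marked = delivered_ce - old_delivered_ce
    if acked <= 0:
        # Nothing was delivered in this window, so keep the old estimate.
        return alpha
    frac = marked / acked  # fraction of this window that was CE-marked
    # Exponentially weighted moving average of the marked fraction.
    return (1 - g) * alpha + g * frac


def dctcp_cwnd_after_mark(cwnd, alpha):
    """Window reduction proportional to alpha (halves when alpha is 1)."""
    return cwnd * (1 - alpha / 2)
```

Because both counters come from the core stack's delivery accounting, the marked fraction is computed over what was actually delivered, which is why losses and ack aggregation stop skewing the estimate.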


Still, it's hard to mark multiple packets in a gso/gro bundle - cake
does gso splitting by default, dualpi does not. Has tso/gro been
enabled or disabled for others' tests so far?
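
To make the bundle problem concrete, here is a toy model, not taken from cake or dualpi. It assumes the AQM makes one marking decision per queued unit, and that a mark on an unsplit bundle ends up on every segment in it. The "mark every Nth unit" rule is a deterministic stand-in for a real marking probability, and the function and parameter names are made up.

```python
def count_ce_segments(bundle_sizes, mark_every, split_gso):
    """Return (ce_marked_segments, total_segments) for a toy AQM.

    bundle_sizes: number of segments in each queued gso bundle.
    mark_every:   the toy AQM marks every mark_every-th unit it sees.
    split_gso:    if True, bundles are split so each segment is a unit;
                  if False, each whole bundle is a single unit.
    """
    if split_gso:
        units = [1] * sum(bundle_sizes)
    else:
        units = list(bundle_sizes)
    ce = 0
    for i, segments in enumerate(units, start=1):
        if i % mark_every == 0:
            # Assumption: marking a unit marks every segment inside it.
            ce += segments
    return ce, sum(units)


# Four bundles of 10 segments each, marking every 8th unit:
print(count_ce_segments([10] * 4, 8, split_gso=False))  # (0, 40)
print(count_ce_segments([10] * 4, 8, split_gso=True))   # (5, 40)
```

Without splitting, the signal the sender sees moves in steps of a whole bundle, so a low marking rate can show up as nothing at all or as a burst of marks.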

> The SCE folk did freeze and document a release worth testing.

But it looks to me like they were missing both of these commits.

> I did some testing on wifi at battlemesh but it's too noisy (but the
> sources of "noise" were important) and too obviously "ecn is not the
> wifi problem"
> I didn't know there was an "add a delay based option to cubic patch"
> until last week.
> So anyway, I do retain hope, maybe after this coming week and some
> more hackathoning, it might be possible to start getting reproducible
> and repeatable results from more participants in this controversy.
> Having to sit through another half-dozen presentations with
> irreproducible results is not something I look forward to, and I'm
> glad I don't have to.
> > When we're talking about keeping very small queues, then RTT is lost as
> > a congestion indicator (since there is no queue depth to modulate as a
> > congestion signal into the RTT).  We have indicators that include drop,
> > RTT, and ECN (when available).  Using rate of marks rather than just
> > binary presence of marking gives a finer-grained signal.  SCE is also
> > providing a multi-level indication, so that's another way to get more
> > "ENOB" into the samples of congestion being fed to the controllers.
> While this is extremely well said, RTT is NOT lost as a congestion
> indicator, it just becomes finer grained.
> While I'm reading tea-leaves... there's been a lot of stuff landing in
> the linux kernel from google around edf scheduling for tcp and the
> hardware enabled pacing qdiscs. So I figure they are now in the nsec
> category on their stuff but not ready to be talking.
> > Marking (whether classic ECN, mark-rate, or multi-level marking) is
> > needed since with small queues there's lack of congestion information in
> > the RTT.
> small queues *and isochronous, high speed, wired connections*.
> What will it take to get the ecn and especially l4s crowd to take a
> hard look at actual wireless or wifi packet captures? I mean, y'all
> are sitting staring into your laptops for a week, doing wifi. Would it
> hurt to test more actual transports during
> that time?

I do keep hoping someone will attempt to publish some wifi results. I guess
that might end up being me, next time around.

> How many ISPs would still be in business if wifi didn't exist, only {X}G?
> the wifi at the last ietf sucked...
> Can't even get close to 5ms latencies on any form of wireless/wifi.
> Anyway, I long ago agreed that multiple marks (of some sort) per rtt
> made sense (see my position statements on ecn-sane),
> but of late I've been leaning more towards really good pacing,  rtt
> and chirping with minimal marking required on
> "small queues *and isochronous, high speed, wired connections*".
> >
> > To address one question you repeated a couple times:
> >
> > > Is there any chance we'll see my conception of the good ietf process
> > > enforced on the L4S and SCE processes by the chairs?
> >
> > We look for working group consensus.  So far, we saw consensus to adopt
> > as a WG item for experimental track, and have been following the process
> > for that.
> Well, given the announcement of docsis low latency, and the size of
> the fq_codel deployment,
> and the l4s/sce drafts, we are light-years beyond anything I'd
> consider to be "experimental" in the real world.
> Would recognizing this reality and somehow converting this to a
> standards track debate within the ietf help anything?
> Would getting this out of tsvwg and restarting aqmwg help any?
> I was, up until all this blew up in december, planning on starting the
> process for an rfc8289bis and rfc8290bis on the standards track.
> >
> > On the topic of gaming the system by falsely setting the L4S ID, that
> > might need to be discussed a little bit more, since now that you mention
> > it, the docs don't seem to very directly address it yet.
> to me this has always been a game theory deal killer for l4s (and
> diffserv, intserv, etc). You cannot ask for
> more priority, only less. While I've been recommending books from
> kleinrock lately, another one that
> I think everyone in this field should have is:
> I've read it countless times (and can't claim to have understood more
> than a tiny percentage of it). I wasn't aware
> until this moment there was a kindle edition.
> > I can only
> > speak for myself, but assumed a couple things internally, such as (1)
> > this is getting enabled in specific environments, (2) in less controlled
> > environments, an operator enabling it has protections in place for
> > getting admission or dealing with bad behavior, (3) there could be
> > further development of audit capabilities such as in CONEX, etc.  I
> > guess it could be good to hear more about what others were thinking on this.
> I think there was "yet another queue" suggested for detected bad behavior.
> >
> > > So I should have said - "tosses all normal ("classic") flows into a
> > > single and higher latency queue when a greedy normal flow is present"
> > > ... "in the dualpi" case? I know it's possible to hang a different
> > > queue algo on the "normal" queue, but
> > > to this day I don't see the need for the l4s "fast lane" in the first
> > > place, nor a cpu efficient way of doing the right things with the
> > > dualpi or curvyred code. What I see, is, long term, that special bit
> > > just becomes a "fast" lane for any sort of admission controlled
> > > traffic the ISP wants to put there, because the dualpi idea fails on
> > > real traffic.
> >
> > Thanks; this was helpful for me to understand your position.
> Groovy.
> I recently ripped ecn support out of fq_codel entirely, in
> the fq_codel_fast tree. saved some cpu, still measuring (my real objective
> is to make that code multicore),
> another branch also has the basic sce support, and will have more
> after jon settles on a ramp and single queue fallbacks in
> sch_cake. btw, if anyone cares, there's more than a few flent test
> servers scattered around the internet now that
> do some variant of sce for others to play with....
> >
> >
> > > Well if the various WGs would exit that nice hotel, and form a
> > > diaspora over the city in coffee shops and other public spaces, and do
> > > some tests of your latest and greatest stuff, y'all might get a more
> > > accurate viewpoint of what you are actually accomplishing. Take a look
> > > at what BBR does, take a look at what IW10 does, take a look at what
> > > browsers currently do.
> >
> > All of those things come up in the meetings, and frequently there is
> > measurement data shown and discussed.  It's always welcome when people
> > bring measurements, data, and experience.  The drafts and other
> > contributions are here so that anyone interested can independently
> > implement and do the testing you advocate and share results.  We're all
> > on the same team trying to make the Internet better.
> Skip a meeting. Try the internet in Bali. Or africa. Or south america.
> Or on a boat, Or do an interim
> in places like that.
> >
> >
> --
> Dave Täht
> CTO, TekLibre, LLC
> Tel: 1-831-205-9740


Dave Täht
CTO, TekLibre, LLC
Tel: 1-831-205-9740