Re: [aqm] Review of draft-hoeiland-joergensen-aqm-fq-codel-00

Dave Taht <dave.taht@gmail.com> Sun, 26 October 2014 22:05 UTC

Date: Sun, 26 Oct 2014 15:05:47 -0700
Message-ID: <CAA93jw4VrWz=WeH-CopN7aZNCjVG5tYfDKgonWCGRqSeLQVKKw@mail.gmail.com>
From: Dave Taht <dave.taht@gmail.com>
To: Jim Gettys <jg@freedesktop.org>
Archived-At: http://mailarchive.ietf.org/arch/msg/aqm/rf06S9EIXFCxcCVDEcf_T6O2Oi4
Cc: "aqm@ietf.org" <aqm@ietf.org>
Subject: Re: [aqm] Review of draft-hoeiland-joergensen-aqm-fq-codel-00

We are working on revising the draft today; if anyone else has comments,
please let us know soonest.

On Tue, Sep 30, 2014 at 2:24 PM, Jim Gettys <jg@freedesktop.org> wrote:
> Looks pretty good to me. I took a pretty good read on the airplane west
> yesterday, and comments are below.
>                                   - Jim
> 1.
>
> "The rest of this document" is immediately followed by "The rest of this
> section".  That repetition is unfortunate, and the second is capitalized.

k.

> 1.2
>
> You should note again that fq_codel is not limited to hashing on 5-tuples;
> just that the current implementation defaults to hashing those.

k.

>
> Noting that fq_codel gets really good performance, usually better than
> most deployed ad-hoc classification schemes, is nice, but you should
> also note that if you want hard "guarantees" of performance, packet
> classification still has a role to play.  For example, multiplexing a
> control plane for a network over that network would be something that
> requires explicit classification to provide such guarantees.

I have thought about publishing something on how this stuff has mostly
been deployed, which is with a 3 or 4 level classification system, as
in SQM, qos-scripts, and free.fr's DRR + fq_codel system, as well as
what's currently in "cake".

I documented some of that here:

http://snapon.lab.bufferbloat.net/~d/draft-taht-home-gateway-best-practices-00.html

But as there is another working group entirely working on nailing down the
definitions of the diffserv codepoints and the expected behavior (not that
I agree that an insane level of detail is needed), and there hasn't been
much interest here in what I've been calling "comprehensive queue
management", I'm not going to bother polishing that up before codel,
fq-codel, and pie go standards track.

There are even more complex classification schemes in play in
"streamboost", and I haven't torn apart the netgear X4's "dynamic QoS"
scheme.

>
> 4.1.2 Target
>
> Should the target be tuned to at least the transmission time of an aggregate
> on aggregating media (e.g. Cable, 802.11n, etc)?  I think so....
> but we can/should check with Kathy and Van....

No. We really don't have a good handle on what to be doing in the case of
aggregation, and it's not really the limiting factor on how codel or fq_codel
behave in cable and 802.11n. In the case of wireless in particular, it's not
the aggregation that kills you so much as the "taxi stand topology".

fq_codel, pie and codel work ok with a minimally buffered p2p wireless network
today, but could be much better.

> 4.1.3
>
> This default of 10240 packets bothers me.  That's of order 15,360,000 bytes.

It's potentially worse than that in the case of TSO/GSO, as the actual range of
packet size is 64 bytes to 64k, not 64 to 1500 bytes.
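
For a sense of scale, the arithmetic behind these figures can be sketched as below. The 1500-byte and 64 KiB packet sizes are assumptions taken from the ranges mentioned here, the function name is made up for the example, and per-packet skb overhead is ignored.

```python
# Worst-case payload bytes that a packet-count limit can hold.
# Assumed packet sizes: 1500 bytes for an ordinary Ethernet-sized packet,
# 64 KiB for the largest TSO/GSO aggregate. skb overhead is not counted.

DEFAULT_LIMIT = 10240  # packets, the default under discussion


def worst_case_bytes(limit_packets: int, max_packet_bytes: int) -> int:
    """Upper bound on queued payload for a limit expressed in packets."""
    return limit_packets * max_packet_bytes


for label, size in (("1500-byte packets", 1500),
                    ("64 KiB TSO/GSO packets", 64 * 1024)):
    print(f"{label}: {worst_case_bytes(DEFAULT_LIMIT, size):,} bytes")
```

With these assumed sizes the bound grows from roughly 15 MB to several hundred MB, which is the gap between the two cases described above.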

Thankfully TSO/GSO offloads have not made it to the wifi stacks I'm aware
of yet, and there has been some work towards reducing the size of these
to sane values for the link rate.

> It should be computed on link speed, or maybe observed transmission rate.
> And should it be in bytes?

Given the 2+ years of experience we have with it now, I would definitely argue
for a do-over that was purely byte limited. This restricts the in-use dynamic
range of the limit to the overhead of the skb (256 bytes usually) + the payload.

The limit is rather hard to hit, though, as usually the aqm algorithms kick in
long before it is hit.

OpenWrt patches the limit to 1000 packets. (And again, offloads are not
commonly in use on routers.)

> How does one compute the "suitable size"?  So it's wrong by at least an
> order of magnitude for *most* current devices.
>
> We really don't want naive people on IoT devices to think that it will
> eat them out of house and home and therefore not use fq_codel.
>
> At a minimum, we should note that this is an artifact of the current
> implementation, and note the shortcoming of the current implementation.
>
> Sigh.
>
> Where did our "no knobs" mantra get lost?  I guess Eric is too used to
> machines with 10G and faster ethernet interfaces.

Well, if the actual link speed were available to the qdisc layer (it isn't), the
limit could be sized appropriately.
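
Purely as an illustration of what sizing from link speed could look like, here is a minimal sketch that sets a byte limit to what the link can drain within a time budget. Nothing here comes from the draft or the Linux code: the function name, the 100 ms budget, and the one-MTU floor are all hypothetical.

```python
# Hypothetical sizing rule: allow as many bytes as the link can drain in
# max_queue_ms, but never less than one full-size packet.

MTU_BYTES = 1500  # assumed floor so at least one full-size packet fits


def byte_limit(link_bps: int, max_queue_ms: int,
               floor_bytes: int = MTU_BYTES) -> int:
    """Bytes the link drains in max_queue_ms, clamped to floor_bytes."""
    drained = link_bps * max_queue_ms // (8 * 1000)
    return max(drained, floor_bytes)


for name, rate in (("1 Mbit/s", 1_000_000),
                   ("100 Mbit/s", 100_000_000),
                   ("10 Gbit/s", 10_000_000_000)):
    print(f"{name}: {byte_limit(rate, 100):,} bytes")
```

A rule of this shape scales the limit across four orders of magnitude of link rate, which a single fixed packet count cannot do.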

>
> 5.3
>
> The "Jenkins" hash should be referenced.  Which one is used?

It appears to be lookup3 from that suite, but I'd have to go look at the
code again...

>
> Possibly worth noting that some other hash could also be used (I presume
> it can be?), and why was the Jenkins hash chosen?

It was there. I have seen a few other hash variants (like spookyhash),
but I'd like to try them on a typical dataset rather than an artificial one.

There is very little entropy in the protocol portion of the tuple, for
example.
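
As a rough illustration of the hashing step discussed here, the sketch below maps a 5-tuple to a queue index using Jenkins's one-at-a-time hash. That is a simpler member of the Jenkins family, chosen for brevity; it is not lookup3 and not necessarily what the Linux implementation does. The function names, the 1024-queue count, and the salt parameter are assumptions made for the example.

```python
import ipaddress
import struct

MASK32 = 0xFFFFFFFF


def jenkins_one_at_a_time(key: bytes) -> int:
    """Bob Jenkins's one-at-a-time hash, kept to 32 bits."""
    h = 0
    for byte in key:
        h = (h + byte) & MASK32
        h = (h + (h << 10)) & MASK32
        h ^= h >> 6
    h = (h + (h << 3)) & MASK32
    h ^= h >> 11
    h = (h + (h << 15)) & MASK32
    return h


def flow_bucket(src: str, dst: str, proto: int, sport: int, dport: int,
                n_queues: int = 1024, salt: int = 0) -> int:
    """Map a 5-tuple to a queue index (n_queues and salt are assumed)."""
    key = (struct.pack("!I", salt)
           + ipaddress.ip_address(src).packed
           + ipaddress.ip_address(dst).packed
           + struct.pack("!BHH", proto, sport, dport))
    return jenkins_one_at_a_time(key) % n_queues


# Two flows that differ only in source port, both TCP (protocol 6).
print(flow_bucket("192.0.2.1", "198.51.100.7", 6, 40000, 443))
print(flow_bucket("192.0.2.1", "198.51.100.7", 6, 40001, 443))
```

Any other hash with decent mixing could be dropped in for jenkins_one_at_a_time here; comparing candidates on captured traffic rather than synthetic tuples would only require swapping that one function.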

I liked arris's 4 way set associative hash idea, btw.

And then of course, sch_fq uses a red/black tree and gets away from
hashing entirely, with spectacular results (on a server).

> 5.2
>
> "This is because otherwise the queue can" -> "Otherwise the queue could"
>
> The last paragraph does probability computations; IIRC, this was work
> done by Paul McKenney for SFQ.  It should be referenced.
>
> 6.2
>
> There are indenting problems in this section.
>
> 6.3
>
> "it cannot be easily retrofitted to devices".
>
> Cannot is a bit strong; some silicon may also have modifiable firmware.
>
> I'd say instead:
>
> "it may be impossible to retrofit devices that do most of their
> processing in silicon and lack space for timestamping"
>
> You say:
>    Also, timestamping functions in the core OS need to be very
>    efficient.
>
> Somehow we should make clear that perfection is not required.
> Timestamping to the order of a millisecond is fine; certainly getting time
> at that resolution is sufficient, so long as the overhead to get the time at
> that resolution is very low.

Well, I'd have actually preferred that the full timestamp be preserved.
(The 64-bit timestamp is presently shifted right 10 bits and turned into
a 32-bit one....)

It would make throwing a drop notification to userspace more "interesting".

Secondly, while a millisecond may well be enough at low rates,
at least one variant of codel will throw away a bunch of packets
with very similar timestamps at high rates when a high drop rate is
needed.
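
To put rough numbers on both points, here is a small sketch. The 10-bit shift and the 32-bit result come from the description above; the function names, the nanosecond clock source, and the 10 Gbit/s link with 1500-byte packets are assumptions made for the example.

```python
# Timestamp truncation as described above: a 64-bit nanosecond clock is
# shifted right 10 bits and kept as a 32-bit value, so one tick is 1024 ns.

SHIFT = 10
MASK32 = 0xFFFFFFFF


def truncated_time(ns: int) -> int:
    """Reduce a 64-bit nanosecond timestamp to 32 bits of 1024 ns ticks."""
    return (ns >> SHIFT) & MASK32


def packets_per_tick(link_bps: float, packet_bytes: int, tick_ns: int) -> float:
    """Average number of packets a saturated link sends in one clock tick."""
    packets_per_second = link_bps / 8 / packet_bytes
    return packets_per_second * tick_ns / 1e9


# The 32-bit value wraps after 2**32 ticks of 1024 ns each.
wrap_seconds = (1 << (32 + SHIFT)) / 1e9
print(f"wraps after about {wrap_seconds:.0f} s")

# Assumed link: 10 Gbit/s carrying 1500-byte packets.
for label, tick in (("1024 ns tick", 1 << SHIFT), ("1 ms tick", 1_000_000)):
    print(f"{label}: {packets_per_tick(10e9, 1500, tick):.2f} packets")
```

Under these assumptions a millisecond-resolution clock would stamp several hundred consecutive packets identically, while the 1024 ns tick still tells most of them apart.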

>
>
> 6.5
>
> You say:
>    Finally, FQ-CoDel drops packets from the largest flows sooner and
>    more accurately than CoDel alone, and it is more responsive to
>    changes in bandwidth, and in number of flows, than CoDel alone.
>
> Why is this true?  It's an assertion without explanation.  I'd be happier
> if there is a one/two sentence explanation.

I labored over that sentence for ages. A more complicated explanation
would not fit into the margins of this draft, and is worthy of a paper all
by itself.

Let me stare into space on that for a while.



-- 
Dave Täht

http://www.bufferbloat.net/projects/bloat/wiki/Upcoming_Talks