Re: [tcpPrague] [aqm] L4S status update

Jonathan Morton <> Tue, 29 November 2016 03:42 UTC

> On 29 Nov, 2016, at 04:55, Matt Mathis <> wrote:
> Bob's point is that fq_anything forfeits any mechanism for an application or user to imply the value of the traffic by how much congestion they are willing to inflict on other traffic.

Yes, it does.

I actually consider that a good thing, because most applications, given the choice, will inflict more congestion on other traffic in order to boost their own performance.  There are honourable exceptions, but it’s not a behaviour we can rely on.

For example, Steam uses between four and eight parallel TCP streams (I can’t figure out what the number depends on) to receive game updates, when one or two would already saturate most domestic Internet connections.  This magnifies the impact on other things the user might be doing with that connection, such as - ironically enough - playing multiplayer games.  You’d think Valve, of all companies, would keep that in mind.
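
To put a rough number on that magnification, here is a minimal sketch.  It assumes an idealised bottleneck where every TCP flow ends up with an equal share of capacity, which real links only approximate; the function name is made up for illustration.

```python
from fractions import Fraction

def share_of_bottleneck(own_flows, other_flows):
    """Fraction of capacity taken by `own_flows` flows, assuming every
    flow at the bottleneck gets an equal share (idealised assumption)."""
    return Fraction(own_flows, own_flows + other_flows)

# One competing flow (say, the game session) against 1, 4 or 8 download streams.
for n in (1, 4, 8):
    print(n, share_of_bottleneck(n, 1))
```

Under that assumption, one download stream leaves the competing flow half the link, while eight streams leave it one ninth.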

> This concept is the foundation of ConEx and related technologies which could move the capacity allocation problem into the economic domain.

“Economic domain” only works if there is a financial cost borne by the causer of the congestion.  Good luck making that work, in a world where IoT device manufacturers don’t bear the costs of DDoSes launched through them.

> That said, fq_anything does not work at core router scale.


> Note that there are two views, each of which is self consistent:
> 1) You need fq_* to isolate flows; prioritization must be done with IP/TOS/DSCP bits; aggressive flows can't hurt other flows; low delays require that flows sharing a Q to be nice to each other and respond to AQM

…or that different flows are carefully kept in different queues.

Cake uses set-associative flow hashing to achieve flow isolation much more reliably than the current version of fq_codel.
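
As a rough illustration of the idea (not Cake's actual code), a set-associative table hashes each flow to a small set of queues and then searches within that set, so two flows with the same hash can still get separate queues.  The class name, the sizes and the injectable `hash_fn` below are all assumptions made for the sketch.

```python
import zlib

class SetAssociativeFlowTable:
    """Toy set-associative flow-to-queue mapping (illustrative only)."""

    def __init__(self, num_sets=128, ways=8, hash_fn=None):
        self.num_sets = num_sets
        self.ways = ways
        # Default hash is an arbitrary choice for the sketch.
        self.hash_fn = hash_fn or (lambda flow: zlib.crc32(repr(flow).encode()))
        # One tag per queue; None means the queue is unclaimed.
        self.tags = [None] * (num_sets * ways)

    def queue_for(self, flow):
        base = (self.hash_fn(flow) % self.num_sets) * self.ways
        slots = range(base, base + self.ways)
        # 1. The flow already owns a queue in its set.
        for i in slots:
            if self.tags[i] == flow:
                return i
        # 2. Otherwise claim a free queue in the same set.
        for i in slots:
            if self.tags[i] is None:
                self.tags[i] = flow
                return i
        # 3. Set is full: fall back to sharing the first queue.
        return base
```

With `ways=1` this degenerates to plain direct-mapped hashing, where any two flows with the same hash share a queue.  A real implementation would also release a tag once its queue drains; that is left out here.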

Cake also applies Codel and BLUE in parallel, each covering a different AQM regime - BLUE takes over if and when Codel fails to control a particular queue.  If both Codel and BLUE fail to control the queue, Cake uses head-drops from the longest queue to remain within a total memory budget.  All of these avoid penalising well-behaved traffic whenever possible.
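
The last-resort step can be sketched as follows.  This is a simplified model, not Cake's implementation: it measures “longest” in bytes, represents packets by their sizes only, and leaves out Codel and BLUE entirely.  All names are hypothetical.

```python
from collections import deque

class BudgetedQueues:
    """Per-flow queues sharing one memory budget (illustrative only)."""

    def __init__(self, budget_bytes):
        self.budget = budget_bytes
        self.queues = {}    # flow -> deque of packet sizes
        self.backlog = {}   # flow -> queued bytes
        self.total = 0

    def enqueue(self, flow, size):
        """Queue a packet, then head-drop from the longest queue until
        the total backlog fits the budget. Returns the dropped packets."""
        self.queues.setdefault(flow, deque()).append(size)
        self.backlog[flow] = self.backlog.get(flow, 0) + size
        self.total += size
        dropped = []
        while self.total > self.budget:
            victim = max(self.backlog, key=self.backlog.get)
            lost = self.queues[victim].popleft()  # oldest packet goes first
            self.backlog[victim] -= lost
            self.total -= lost
            dropped.append((victim, lost))
        return dropped
```

Because the victim is always the flow with the largest backlog, a sparse flow sharing the link with a bulk flow keeps its packets even when the bulk flow overruns the budget.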

Cake also has mechanisms which consider “per host” fairness simultaneously with “per flow” fairness.  Incidentally, the Linux wifi stack got “per station airtime fairness” along with the fq_codel upgrade, achieving a similar aim by different means.

SFB uses a Bloom filter to a similar end, though that’s not strictly an FQ qdisc.
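
For readers unfamiliar with SFB, here is a minimal sketch of the Bloom-filter idea: each flow maps to one bin per level, every bin carries a marking probability, and a flow is marked with the minimum over its bins.  An unresponsive flow drives all of its bins up, while an innocent flow only suffers if it collides at every level.  The class, the parameters and the `hash_fn(flow, level)` hook are assumptions for illustration, not the Linux qdisc's code.

```python
import hashlib

class SfbSketch:
    """Toy Stochastic Fair Blue bin accounting (illustrative only)."""

    def __init__(self, levels=2, bins=16, step=0.25, hash_fn=None):
        self.levels = levels
        self.bins = bins
        self.step = step
        self.hash_fn = hash_fn or self._default_hash
        self.p = [[0.0] * bins for _ in range(levels)]

    @staticmethod
    def _default_hash(flow, level):
        # An independent hash per level, derived from one digest.
        digest = hashlib.sha256(f"{level}:{flow!r}".encode()).digest()
        return int.from_bytes(digest[:4], "big")

    def _bins_of(self, flow):
        return [self.hash_fn(flow, lvl) % self.bins for lvl in range(self.levels)]

    def penalise(self, flow):
        """Raise the marking probability of every bin the flow maps to."""
        for lvl, b in enumerate(self._bins_of(flow)):
            self.p[lvl][b] = min(1.0, self.p[lvl][b] + self.step)

    def relax(self, flow):
        """Lower the marking probability of every bin the flow maps to."""
        for lvl, b in enumerate(self._bins_of(flow)):
            self.p[lvl][b] = max(0.0, self.p[lvl][b] - self.step)

    def mark_probability(self, flow):
        return min(self.p[lvl][b] for lvl, b in enumerate(self._bins_of(flow)))
```

In a real AQM, `penalise` and `relax` would be driven by bin occupancy (overflowing versus empty); that trigger is left out of the sketch.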

> 2) Uniform AQM/drop/mark per packet permits shared economic view of the value of the traffic (e.g. a price) ; traffic is prioritized by how aggressive of CC you choose; low delay [is/should be] a design property of the shared CC and AQM algorithms.

It’s fair to say that “uniform AQM” is the only mechanism available at the core level, mainly because there are too many flows there to treat individually.  But at that level, network engineering is supposedly all about providing sufficient link capacity so that AQM of any kind is unnecessary, because there is no congestion.  Still, plain AQM is a valid and potentially useful mitigation against transient overloads.

There are exceptions.  Some ISPs have been known to restrict peering capacity in certain directions in order to cause congestion for specific types of traffic, supposedly as a financial lever.  These ISPs would not be interested in applying AQM to reduce the impact of this deliberately-induced congestion.

I do have to ask, though, what protection this “shared economic view” provides against a single bulk flow which simply ignores all congestion signals?

> If you have a way to create proper incentives about congestion (e.g. price and chargeback), #2 is probably a strong system; if that fails #1 is probably stronger.
> Note that half solutions or solutions split between the models don't work. period.  Arguing about incomplete systems that are missing some of the parts is pointless because they don't work at some level (often layer 8 or 9).

Indeed.  I’m not aware of any “complete” systems in category 2 - and I don’t count “data caps” among them, unpopular though they are.

 - Jonathan Morton