Re: [quicwg/base-drafts] ACK generation recommendation (#3304)

mjoras <> Tue, 14 January 2020 23:19 UTC

Some tests, as promised, cc @janaiyengar @dtikhonov.

The first test is an "iperf-style" setup: a single stream on a single connection, with one thread each on the client and server. The server uses BBR as its congestion controller. The server is an infinite source and the client an infinite sink; each test runs for 60 seconds and measures average throughput.
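For readers who want to reproduce the shape of this experiment, here is a minimal sketch of such an infinite-source/infinite-sink throughput harness. It is an illustration only: it uses plain TCP over loopback as a stand-in (not QUIC/mvfst), and the duration is shortened; all names are made up for the example.

```python
import socket
import threading
import time

def measure_throughput(duration=1.0, bufsize=65536):
    """Run an infinite source and an infinite sink over loopback for
    `duration` seconds and return the average throughput in bytes/sec.
    TCP stand-in for illustration; the real tests used QUIC (mvfst)."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]

    def source():
        # The "server" side: send zeros as fast as possible until the deadline.
        conn, _ = srv.accept()
        payload = b"\0" * bufsize
        deadline = time.time() + duration
        try:
            while time.time() < deadline:
                conn.sendall(payload)
        except OSError:
            pass
        conn.close()

    t = threading.Thread(target=source)
    t.start()

    # The "client" side: sink everything and count bytes.
    sink = socket.create_connection(("127.0.0.1", port))
    received, start = 0, time.time()
    while True:
        chunk = sink.recv(bufsize)
        if not chunk:
            break
        received += len(chunk)
    elapsed = time.time() - start

    t.join()
    sink.close()
    srv.close()
    return received / elapsed
```

Comparing the return value of two runs that differ only in ACK generation interval is the comparison being made in the numbers below.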

The first scenario uses the loopback interface on Linux, a standard MTU, and no introduced delay or loss. Changing the ACK generation interval from 10 -> 2, we observe a **20% relative decrease in throughput**.
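To be precise about what "relative decrease" means in the percentages throughout this comment, here is the arithmetic (the sample numbers are invented for illustration, not measurements):

```python
def relative_change(baseline, treatment):
    """Relative change of a measurement versus a baseline.
    Negative values are a relative decrease."""
    return (treatment - baseline) / baseline

# Illustrative only: a run at 10 Gbps dropping to 8 Gbps after the
# config change is a 20% relative decrease.
assert relative_change(10.0, 8.0) == -0.2
```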

Running the same test with a 15ms netem delay and mild loss, the results are more dramatic. The ACK ranges expand considerably, which imposes a significant cost on both the client and the server and results in a **50% relative decrease in throughput**.

These tests don't really reflect how most people plan to use QUIC (on the internet, not with multi-Gbps sustained transfers), but I believe they are illustrative of the costs we're dealing with. Note that the vast majority of the profiled time in mvfst's stacks for these tests is spent in sendmsg, recvmsg, crypto, and serializing QUIC frames. There are still opportunities for us to micro-optimize our implementation, but relative to the "fixed" costs most implementations are paying, that is likely worth only a few percent here and there. ACK handling, for example, which is very implementation-dependent, ends up being **less costly to the server** than writing ACK frames when the ACK interval is 2 rather than 10.

We also have a way to test this using a real reverse proxy with synthetic traffic. This particular setup uses a **real** transatlantic backbone link with minimal loss. The link has an RTT of ~100ms, and the server again uses BBR as its congestion controller. Two tests are of interest: one is generally not CPU bound on the server, while the other is. The one typically not CPU bound has many clients each requesting a single 1MB resource; the one that typically becomes CPU bound has many clients each requesting ten 1MB resources.

For one 1MB resource per client, changing the ACK interval from 2 -> 10 actually **increases RPS by about 20%**. I think this is largely due to congestion controller benefits from a higher ack frequency (note that this **does not** start out by acking every other packet; it starts out acking every 10th).

For the ten 1MB resources per client case we see the dramatic results again: the increased ACK frequency causes a **60-70% drop** in RPS. In this case it seems the client is really the one causing the regression, since it has to ACK much more frequently.

All of this is to say, I think there are dragons lurking here. We've had good success with the heuristic of ACKing every 10th packet combined with a 1/4 or 1/8 RTT timer. After some thought, I don't think including this (and the "after 100" optimization) as the default recommendation is problematic, as long as we have some nice language conveying the basics of the tradeoffs at play here.
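For concreteness, the heuristic described above could be sketched roughly as follows. This is my reading of it, not mvfst's actual implementation; the class and parameter names are invented, and the "after 100" warmup threshold and 1/4-RTT fraction are the values mentioned in this thread, not normative:

```python
class AckScheduler:
    """Sketch of the ACK heuristic discussed above: ack every Nth
    ack-eliciting packet, or when a fraction of the smoothed RTT has
    elapsed since the last ACK, whichever comes first. During a warmup
    of the first `warmup_packets` packets ("after 100" optimization),
    ack every other packet instead, to help the peer's congestion
    controller ramp up."""

    def __init__(self, rtt_fraction=0.25, ack_interval=10, warmup_packets=100):
        self.rtt_fraction = rtt_fraction      # e.g. 1/4 or 1/8 RTT
        self.ack_interval = ack_interval      # ack every 10th packet
        self.warmup_packets = warmup_packets  # ack every other until here
        self.total_received = 0
        self.unacked = 0
        self.last_ack_time = 0.0

    def on_packet(self, now, srtt):
        """Called per ack-eliciting packet; returns True if an ACK
        should be sent now."""
        self.total_received += 1
        self.unacked += 1
        interval = 2 if self.total_received <= self.warmup_packets \
            else self.ack_interval
        if self.unacked >= interval or \
                (now - self.last_ack_time) >= self.rtt_fraction * srtt:
            self.unacked = 0
            self.last_ack_time = now
            return True
        return False
```

The RTT-based timer bounds how stale the peer's RTT and loss signal can get on slow paths, while the packet-count interval bounds ACK overhead on fast ones; the tests above are probing the cost side of that tradeoff.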
