Re: [tcpm] TCP Tuning for HTTP - update

Willy Tarreau <> Wed, 17 August 2016 18:13 UTC

Date: Wed, 17 Aug 2016 20:08:02 +0200
From: Willy Tarreau <>
To: Joe Touch <>
Cc: Mark Nottingham <>,, HTTP Working Group <>, Patrick McManus <>, Daniel Stenberg <>

On Wed, Aug 17, 2016 at 08:14:08AM -0700, Joe Touch wrote:
> There are many other sites - and books - that already indicate how to
> configure systems efficiently.
> So if your argument is that a man page summary is needed, sure - but
> again, is a new one needed? And why is this then needed as an RFC?

The difference is that a man page is OS-specific while an RFC gets a
unique number and serves as a reference. It can be cited in new RFCs
to justify certain choices. It can receive errata. You would probably
say that the current document is very Linux-centric for now, but I
understood it as the beginning of something more generic, where advice
is given based on principles before sysctls, and where the resulting
sysctls are just examples of applications of the principles in the

> > Also, I don't know if there have been any update, but these documents use
> > SunOS 4.1.3 running on a sparc 20 as a reference. While I used to love
> > working on such systems 20 years ago, they predate the web era and systems
> > have evolved a lot since to deal with high traffic. ...
> Yes, and discussing those issues would be useful - but not in this
> document either.

Why? Lots of admins don't understand why the time_wait timeout remains
at 240 seconds on Solaris, with people saying "if you want to be conservative
don't touch it but if you want to be modern simply shrink it to 30 seconds
or so". People need to understand why the advice has changed over 3 decades.
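
To give an idea of what is at stake with this timeout, here is a rough
sketch of the arithmetic. It assumes a steady load and that the machine
being counted is the side that actively closes the connections; the
1000 closes per second figure and the function name are only illustrative
assumptions, not measurements.

```python
def time_wait_sockets(closes_per_second: int, time_wait_seconds: int) -> int:
    """Steady-state number of sockets sitting in TIME_WAIT.

    Little's law: entries in the state = arrival rate * time spent there.
    """
    return closes_per_second * time_wait_seconds


# Hypothetical load: 1000 actively closed connections per second.
for timeout in (240, 30, 1):
    print(f"time_wait={timeout:>3}s -> {time_wait_sockets(1000, timeout)} sockets")
```

The count scales linearly with the timeout, which explains why shrinking
it is tempting, but says nothing about the protection against stray
segments that the state exists to provide.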

> > So you need to expect that only researchers and maybe TCP stack developers
> > will find your work useful these days, server admins can hardly use this
> > anymore. However it is very possible that some TCP stacks have taken benefit
> > of your work to reach the level of performance they achieve right now, I
> > don't know. Thus I think that Daniel's work completes quite well what you've
> > done in that it directly addresses people's concerns without requiring the
> > scientific background.
> Let me see if I get your complete argument:
>     - the appropriate refs are 20 years old
>     - server admins need a doc
> What exactly do server admins need regarding Nagle (which is configured
> inside the app already), socket sizing (configured inside the app), etc?

Lots of things :
  - time_wait tuning (on which everybody gives different advice; I've
    even seen firewall vendors recommend shrinking it to one second because
    it allowed their product to perform better in benchmarks)

  - TCP timestamps: what they provide, what are the risks (some people in
    banking environments refuse to enable them so that they cannot be used
    as an oracle to help in timing attacks).

  - window scaling : how much is needed.

  - socket sizing : contrary to what you write, there's a lot of tuning
    on the web where people set the default buffer sizes to 16MB without
    understanding the impacts when dealing with many sockets

  - SACK : why it's better. DSACK : what it adds on top of SACK.

  - ECN : is it needed ? does it really work ? where does it cause issues ?

  - SYN cookies : benefits, risks

  - TCP reuse/recycling : benefits, risks

  - dealing with buffer bloat : tradeoffs between NIC-based acceleration
    and pacing

  - what are orphans and why you should care about them in HTTP close mode

  - TCP fastopen : how it works, what type of workload is improved,
    what the risks are (i.e. do not enable socket creation without cookie
    by default just because you find it reduces your server load)

  - whether to choose a short or a large SYN backlog depending on your
    workload (i.e. do you prefer to process everything even if the dequeuing
    is expensive, or to drop early in order to recover fast).

... and probably many others that don't immediately come to my mind. None
of these was a real issue 20 years ago. All of them became issues for
many web server admins who just copy-paste random settings from various
blogs found on the net, which just copy the same stupidities over and over,
resulting in the same trouble being caused to each of their readers.
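
For the window scaling and socket sizing items, the orders of magnitude
can be sketched as below. This is only an illustration under stated
assumptions: the link speed, RTT and socket count are hypothetical, "16MB"
is taken as 16 MiB for both the send and the receive buffer, and the
memory figure is an upper bound reached only if every socket fills its
buffers. The function names are made up for the example.

```python
MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit TCP window field
MAX_WINDOW_SHIFT = 14        # largest window scale shift allowed by RFC 7323


def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def window_shift_needed(window_bytes: int) -> int:
    """Smallest window scale shift whose maximum window covers window_bytes."""
    shift = 0
    while (MAX_UNSCALED_WINDOW << shift) < window_bytes and shift < MAX_WINDOW_SHIFT:
        shift += 1
    return shift


def worst_case_buffer_bytes(sockets: int, rcvbuf: int, sndbuf: int) -> int:
    """Upper bound on buffer memory if every socket fills both buffers."""
    return sockets * (rcvbuf + sndbuf)


# Hypothetical path: 1 Gbit/s with 100 ms RTT.
bdp = bdp_bytes(1_000_000_000, 100)
print(bdp, window_shift_needed(bdp))   # 12500000 8

# Hypothetical server: 10000 sockets with 16 MiB buffers in each direction.
mib = 1024 * 1024
total = worst_case_buffer_bytes(10_000, 16 * mib, 16 * mib)
print(total / 2**30, "GiB")            # 312.5 GiB
```

A buffer size that looks harmless for one fast long-distance transfer
becomes hundreds of gigabytes of potential commitment once multiplied by
the number of sockets a busy server holds.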

> I.e., at the most this is a man page (specific to an OS). At the least,
> this isn't useful at all.

As you can see above, nothing I cited was OS-specific, only
workload-specific. That's why I think that an informational RFC is much more suited
to this than an OS-specific man page. The OS man page may rely on the RFC
to propose various tuning profiles for different workloads however.