Re: [tcpm] Faster application handshakes with SYN/ACK payloads

Joe Touch <touch@ISI.EDU> Mon, 04 August 2008 18:21 UTC

Date: Mon, 04 Aug 2008 11:20:39 -0700
From: Joe Touch <touch@ISI.EDU>
To: Adam Langley <>

Hi, Adam.

Adam Langley wrote:
| On Sun, Aug 3, 2008 at 2:13 PM, Joe Touch <> wrote:
|> | However, without SYNACK payloads, this banner would cost another round
|> | trip.
|> I have noted this before; an extra RTT doesn't do much of anything.
|> There are more delays in HTTP over TCP that incur round trips, e.g.,
|> dealing with interactions with sending an ACK every other segment, as
|> well as apps that fail to disable Nagle optimizations (as they should
|> when they're interactive and have messages longer than a single byte).
| An extra round trip is very important. See the two quotes that I
| highlighted above for some evidence. At Google we work to shave off
| every millisecond and anything which required a whole extra round trip
| would, rightly, be dismissed immediately.

IMO, you are confusing TCP with RTP if that's the case. I appreciate the
idea of optimizing TCP, but there are so many other places where RTTs
are not under your control, and so many RTTs in a connection, that
removing (or adding) a single one anywhere seems in the noise.
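As a side note on the Nagle point in the quoted text: interactive
applications can turn Nagle off with the standard TCP_NODELAY socket
option. A minimal sketch (ordinary sockets API, nothing specific to the
proposal in this thread):

```python
import socket

# Interactive applications should disable Nagle's algorithm so that
# small writes go out immediately instead of being coalesced while
# waiting for outstanding ACKs.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Any nonzero value from getsockopt means the option is on.
assert s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
s.close()
```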

Let's agree to disagree on that point, and let others weigh in. Moving on...

|> | With SYNACK payloads, we can add this very useful banner without adding a
|> | round trip. However, servers can't just start including the banner
|> | whenever the server's kernel supports the SYNACK payload setsockopt
|> | because it would break all the existing clients who wouldn't know what
|> | to make of it.
|> This is where you've lost me. Server TCPs can send data in the SYNACK.
|> Clients that ACK just the SYN cause the server to retransmit; clients
|> that ACK the whole segment work as you intend.
|> What part of this isn't sufficient?
| Taking HTTP as an example, the very first bytes from the server must
| be the reply to the first request from the client. If servers started
| sending something else beforehand, no matter what packet it's carried
| in, all existing HTTP clients would break.
| I'm suggesting that, if the ability to get a short banner from the
| server to the client in the SYNACK had been generally available when
| HTTP was created, the authors would have used it and the
| application-level protocol would have been different. The fact that it
| otherwise requires an extra round trip was strong reason for the
| client to start the exchange.
| However, since it wasn't available, in order for some protocols
| (including HTTP) to take advantage of such a capability we need an option to
| signal to both ends that they can use a slightly different application
| level protocol. At the server side, to add the extra banner, and at
| the client side to expect it.

That seems like - as you note - an application layer protocol issue. It
doesn't seem to have anything to do with TCP.
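For concreteness, the negotiation Adam describes might be pictured like
this (purely illustrative; "SYNACK_PAYLOAD_OK" is an invented name
standing in for the proposed signal, not a real TCP option):

```python
# Hypothetical negotiation sketch: only a client that signalled
# support in its SYN ever sees the banner, so unmodified legacy
# clients never receive unexpected bytes.
BANNER = b"server-banner\r\n"

def server_first_bytes(client_syn_flags, http_response):
    if "SYNACK_PAYLOAD_OK" in client_syn_flags:
        # New-style client: the banner can ride in the SYNACK.
        return BANNER + http_response
    # Legacy client: the very first bytes must be the HTTP response.
    return http_response

assert server_first_bytes(set(), b"HTTP/1.1 200 OK\r\n") == b"HTTP/1.1 200 OK\r\n"
assert server_first_bytes({"SYNACK_PAYLOAD_OK"}, b"HTTP/1.1 200 OK\r\n").startswith(BANNER)
```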

|> This code is confusing; is it a library that happens to be in the
|> kernel, or is it intended to be code inside TCP?
|> If this is intended inside TCP, you're having TCP insert data into the
|> user stream. That changes TCP semantics.
|> Can you explain which of the above applies, or some other interpretation
|> that explains the code without such inconsistencies?
| Imagine that we change the sockets interface such that applications
| get a callback when a SYN frame is received on a listening socket.
| Applications could inspect the SYN frame and choose to enqueue data to
| be written. Hopefully the kernel would send some of that data in the
| SYNACK, but that's unimportant here.

Why wouldn't the application just enqueue the data? Why does it need to
know that the other end supports this?
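The callback interface Adam imagines might look roughly as follows
(entirely hypothetical; no such callback exists in today's sockets API,
and the names here are made up):

```python
# Hypothetical sketch: the kernel would invoke on_syn() when a SYN
# arrives on a listening socket; whatever the application returns is
# queued, and might then be sent inside the SYNACK.
class Listener:
    def __init__(self, on_syn):
        self.on_syn = on_syn
        self.early_data = {}

    def syn_received(self, conn_id, syn_info):
        data = self.on_syn(syn_info)          # application inspects the SYN
        if data:
            self.early_data[conn_id] = data   # queued for the SYNACK

def banner_for(syn_info):
    # Only offer the banner if the client signalled it understands one.
    return b"greeting\r\n" if syn_info.get("payload_ok") else None

l = Listener(banner_for)
l.syn_received(1, {"payload_ok": True})
l.syn_received(2, {})
assert l.early_data == {1: b"greeting\r\n"}
```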

|> Did you actually try reducing the overall latency of a single TCP
|> connection by 1 RTT and see if it actually matters? If these apps are
|> using persistent connections, then the first RTT exchange is all they'll
|> see anyway.
| Yes. These are internal numbers, sadly, but an RTT really does matter.
| For well connected users, an RTT might be only 20ms to a multi-homed
| service. That, I'll agree, is fairly inconsequential. But San
| Francisco <-> London is about 150ms and if you're in less well
| connected continents (like Africa), RTTs are hundreds of milliseconds
| to anywhere. An RTT really matters.
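To put numbers on the figures quoted above (a simplified model in which
only round trips cost time, ignoring server processing):

```python
# The client sees the server's first bytes after 2 RTTs normally
# (handshake, then request/response), or after 1 RTT if a banner
# rides in the SYNACK. Server think time is ignored.
def first_bytes_ms(rtt_ms, banner_in_synack):
    return rtt_ms * (1 if banner_in_synack else 2)

for rtt_ms in (20, 150, 300):  # well-connected, SF<->London, far-flung
    saved = first_bytes_ms(rtt_ms, False) - first_bytes_ms(rtt_ms, True)
    assert saved == rtt_ms     # the saving is always exactly one RTT
```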

150ms is large when considered in the context of a small number of RTTs,
but in the context of a connection that lasts 10 RTTs, saving one is only
10%; the impact drops as you keep the connection going.

I.e., a reduction of a single RTT is important only when the total
exchange lasts 2-3 RTTs. That's 8KB-16KB of data - and it matters only
for the first chunk of data in a persistent connection. For web pages
with dozens of embedded components (specifically, more than 4, which is
the typical limit for simultaneous connections), again this seems very
much in the noise...
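
The proportionality argument, made concrete (under the simplifying
assumption that every RTT costs the same):

```python
# Saving one RTT removes 1/N of the total time of an N-RTT exchange.
def saving_fraction(total_rtts):
    return 1.0 / total_rtts

assert saving_fraction(2) == 0.5     # short exchange: a big win
assert saving_fraction(10) == 0.1    # the 10-RTT case above: 10%
assert saving_fraction(100) < saving_fraction(10)  # long-lived: noise
```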

