Re: [tcpm] Faster application handshakes with SYN/ACK payloads

Joe Touch <touch@ISI.EDU> Sun, 03 August 2008 21:14 UTC

Hi, Adam,

Adam Langley wrote:
| On Sun, Aug 3, 2008 at 6:26 AM, Joe Touch <touch@isi.edu> wrote:
|> As a minor note, it doesn't seem useful to extend that socket option to
|> say how much data to put in the SYNACK; nothing else in TCP knows about
|> application data boundaries, and I think it would be a mistake to assume
|> that.
|
| For protocols like SMTP where a constant banner is sent at the
| beginning of every connection, that's true. As long as the data, if
| not sent in the SYNACK, is enqueued to be sent anyway, that would
| certainly work.
|
| However, it appears that I've not made the case for the option yet. To
| do this, let me take HTTP as an example.
|
| HTTP is very latency sensitive[1].

This has been known for a while ;-)

e.g.: John Heidemann, Katia Obraczka, and Joe Touch, "Modeling the
Performance of HTTP Over Several Transport Protocols", IEEE/ACM
Transactions on Networking, V5, N5, Oct. 1997, pp.616-630.

| Because of this, and guided by the
| sockets API, the client starts the exchange. Thus, if the client
| wishes to probe for optional features (like TLS upgrade: RFC2817) the
| exchange works like this:
|   Client --- OPTIONS ---> Server
|   Client <--- 101 Switching --- Server
|   Client --- GET ---> Server
|   ...
|
| Given the latency sensitivity, nobody actually probes for
| opportunistic TLS like this.
|
| If HTTP sent a banner, like SMTP servers, then my life would be a lot
| easier! The banner could advertise all the extensions supported.

That sounds like a useful thing to raise to the httpbis WG, FWIW.

| However, without SYNACK payloads, this banner would cost another round
| trip.

I have noted this before; an extra RTT doesn't do much of anything.
There are more delays in HTTP over TCP that incur round trips, e.g.,
dealing with interactions with sending an ACK every other segment, as
well as apps that fail to disable Nagle optimizations (as they should
when they're interactive and have messages longer than a single byte).
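
As a minimal sketch of the Nagle point, assuming a POSIX sockets
environment: the standard TCP_NODELAY socket option turns the Nagle
algorithm off. The helper names disable_nagle and nagle_disabled are
illustrative only and do not come from any existing library.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: turn off the Nagle algorithm on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int disable_nagle(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: report whether TCP_NODELAY is currently set.
 * Returns 1 if set, 0 if not set, -1 on error. */
int nagle_disabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

An interactive app would call disable_nagle(fd) once, right after
socket() or accept(), before its first write.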

| With SYNACK payloads, we can add this very useful banner without adding a
| round trip. However, servers can't just start including the banner
| whenever the server's kernel supports the SYNACK payload setsockopt
| because it would break all the existing clients who wouldn't know what
| to make of it.

This is where you've lost me. Server TCPs can send data in the SYNACK.
Clients that ACK just the SYN cause the server to retransmit; clients
that ACK the whole segment work as you intend.

What part of this isn't sufficient?
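
To make the two client behaviors concrete, here is a rough sketch of
the sequence-number arithmetic, assuming only that the SYN occupies one
sequence number ahead of the payload. The function name and parameters
are hypothetical; this is not code from any TCP implementation.

```c
#include <stdint.h>

/* Hypothetical helper: given the server's initial send sequence number
 * (iss), the number of payload bytes carried in the SYNACK, and the ACK
 * number the client returned, report how many payload bytes were
 * acknowledged. Returns -1 if the ACK falls outside the range
 * [iss + 1, iss + 1 + payload_len]. Arithmetic is modulo 2^32. */
int64_t synack_payload_acked(uint32_t iss, uint32_t payload_len, uint32_t ack)
{
    /* The SYN consumes sequence number iss; payload starts at iss + 1. */
    uint32_t acked = ack - (iss + 1u);

    if (acked > payload_len)
        return -1;
    return (int64_t)acked;
}
```

A result of 0 is the client that ACKs just the SYN, so the server
retransmits the payload; a result equal to payload_len is the client
that ACKs the whole segment.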

| Had SYNACK payloads been generally available when HTTP was created, we
| would have ended up with a different protocol.

Please explain. From the app point of view, *nothing changes*. TCP is a
stream oriented protocol, and ***NOTHING*** the app does necessarily
correlates to particular segments. The app sees the data get there.

I.e., apps that require any sort of realtime behavior ought not be using
TCP.
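
A small sketch of what "stream oriented" means at the API, under one
loud assumption: it uses a local AF_UNIX SOCK_STREAM socketpair as a
stand-in for a TCP connection, since both present a byte stream with no
message boundaries. The function name is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: the sender issues two separate writes; the reader
 * simply reads until the expected byte count has arrived. Nothing tells
 * the reader where one write ended and the next began.
 * Returns the number of bytes stored in out (NUL-terminated), or -1. */
ssize_t two_writes_one_stream(char *out, size_t cap)
{
    const char *a = "hello, ";
    const char *b = "world";
    size_t want = strlen(a) + strlen(b);
    size_t got = 0;
    int sv[2];
    int ok;

    if (cap < want + 1)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Two application writes into the same stream. */
    ok = write(sv[0], a, strlen(a)) == (ssize_t)strlen(a) &&
         write(sv[0], b, strlen(b)) == (ssize_t)strlen(b);

    /* Read until all bytes arrive; each read may return any amount. */
    while (ok && got < want) {
        ssize_t n = read(sv[1], out + got, want - got);
        if (n <= 0)
            ok = 0;
        else
            got += (size_t)n;
    }

    close(sv[0]);
    close(sv[1]);
    if (!ok)
        return -1;
    out[got] = '\0';
    return (ssize_t)got;
}
```

The reader gets the concatenated bytes; how many read calls that takes,
and how the bytes were carried underneath, is not visible to it.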

...
|> As a final question (I'm not a sockets expert), it might be that there
|> are other socket modifications to allow users to write to connections
|> that are not yet open (or can they just be queued up? in which case, it
|> seems like the socket must be completely bound, i.e., this might not
|> work for unbound LISTENs).
|
| This would be a major change to the sockets API.

I agree, but isn't that where this is going? If not, how do you expect
it to be used?

...
| To avoid this, I'm fixing the logic that the application can use and
| moving it to the kernel. I propose that the logic be:
|
| unsigned payload_length;
| u8 *payload = applications_constant_payload(&payload_length);
|
| if (include_random_nonce && payload_length >= 8) {
|   u8 random[8];
|   generate_random_bytes(random, 8);
|   memcpy(payload, random, 8);
| }
|
| if (only_send_if_requested) {
|   if (SA_PP_option_seen_in_syn) {
|     write(payload, payload_length);
|   }
| } else {
|   write(payload, payload_length);
| }

This code is confusing; is it a library that happens to be in the
kernel, or is it intended to be code inside TCP?

If it's just a library, it doesn't matter where it is. To TCP, the
"write" calls are its "application". There is no such thing as a payload
length at that point; the app has no knowledge of how writes correspond
to payload entities.

If this is intended inside TCP, you're having TCP insert data into the
user stream. That changes TCP semantics.

Can you explain which of the above applies, or some other interpretation
that explains the code without such inconsistencies?

|> I see no good way to have the application tell TCP how many bytes to
|> send, or to have TCP tell the app this. That is an API change that lets
|> the app know about segment boundaries, and that's inconsistent with
|> TCP's API semantics.
|
| In correctness terms, you are correct. It doesn't matter when the
| banner gets sent, or if it gets split over several packets. However,
| applications would very much like to know how many bytes the kernel
| will agree to transmit in reply to a SYN frame because, if a banner
| would cost an extra round trip they might prefer not to include it.

***WHY???***

I know I'm repeating myself here, but ***TCP IS NOT A REALTIME
PROTOCOL***, and there is *no correlation between user writes and
segment boundaries*.

...
| [1] (justification that HTTP is very latency sensitive:)
|
| I have access to private data that confirms that user-perceived
| latency is very important, but here are some public quotes to that
| effect:
|
| "In A/B tests (at Amazon.com), we tried delaying the page in
| increments of 100 milliseconds and found that even very small delays
| would result in substantial and costly drops in revenue."
| (http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html)
|
| "would you like 10, 20, or 30 results (from a Google search). Users
| unanimously wanted 30. But 10 did way better in A/B testing (30 was
| 20% worse) due to lower latency of 10 results. 30 is about twice the
| latency of 10"
| (http://perspectives.mvdirona.com/2008/05/29/IO2008RoughNotesFromMarissaMayerDay2KeynoteAtGoogleIO.aspx)

Did you actually try reducing the overall latency of a single TCP
connection by 1 RTT and see if it actually matters? If these apps are
using persistent connections, then the first RTT exchange is all they'll
see anyway.
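
As back-of-the-envelope arithmetic only, under a deliberately crude
model that is an assumption here and not a measurement: one RTT for the
handshake plus one RTT per request on a persistent connection. The
function name is hypothetical.

```c
/* Hypothetical model: fraction of a connection's total round trips that
 * the handshake RTT represents, assuming 1 RTT for the handshake and
 * 1 RTT for each request carried on the same persistent connection. */
double handshake_share(unsigned requests)
{
    return 1.0 / (1.0 + (double)requests);
}
```

Under that model, one request per connection puts the handshake at half
of the round trips, and three requests put it at a quarter; the share
keeps shrinking as the connection is reused.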

Joe