Re: [tcpm] Faster application handshakes with SYN/ACK payloads

"Adam Langley" <agl@imperialviolet.org> Sun, 03 August 2008 18:06 UTC

Message-ID: <396556a20808031106q18f6145cu7f6911ad8277d60c@mail.gmail.com>
Date: Sun, 3 Aug 2008 11:06:31 -0700
From: "Adam Langley" <agl@imperialviolet.org>
To: "Joe Touch" <touch@isi.edu>
In-Reply-To: <4895B1F0.3070102@isi.edu>
MIME-Version: 1.0
Content-Disposition: inline
References: <396556a20807311252j67b1ab26mf6511dbdae780fdd@mail.gmail.com> <48924496.9060907@isi.edu> <396556a20807311640w2b17d447ud0c51241dc84f682@mail.gmail.com> <48935337.5060205@isi.edu> <396556a20808021200u75c3bdd5h77c328a9b61f8d78@mail.gmail.com> <4895B1F0.3070102@isi.edu>
Cc: tcpm@ietf.org
Subject: Re: [tcpm] Faster application handshakes with SYN/ACK payloads
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>

On Sun, Aug 3, 2008 at 6:26 AM, Joe Touch <touch@isi.edu> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> As a minor note, it doesn't seem useful to extend that socket option to
> say how much data to put in the SYNACK; nothing else in TCP knows about
> application data boundaries, and I think it would be a mistake to assume
> that.

For protocols like SMTP where a constant banner is sent at the
beginning of every connection, that's true. As long as the data, if
not sent in the SYNACK, is enqueued to be sent anyway, that would
certainly work.

However, it appears that I've not made the case for the option yet. To
do this, let me take HTTP as an example.

HTTP is very latency sensitive[1]. Because of this, and guided by the
sockets API, the client starts the exchange. Thus, if the client
wishes to probe for optional features (like TLS upgrade: RFC2817) the
exchange works like this:
  Client --- OPTIONS ---> Server
  Client <--- 101 Switching --- Server
  Client --- GET ---> Server
  ...

Given the latency sensitivity, nobody actually probes for
opportunistic TLS like this.
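
For reference, the RFC 2817 probe itself is an ordinary HTTP exchange
(a sketch; example.com is a placeholder and headers are abbreviated):

  C: OPTIONS * HTTP/1.1
  C: Host: example.com
  C: Upgrade: TLS/1.0
  C: Connection: Upgrade

  S: HTTP/1.1 101 Switching Protocols
  S: Upgrade: TLS/1.0, HTTP/1.1
  S: Connection: Upgrade

  (TLS handshake follows, and only then the real GET)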

If HTTP sent a banner, like SMTP servers, then my life would be a lot
easier! The banner could advertise all the extensions supported.
However, without SYNACK payloads, this banner would cost another round
trip.

With SYNACK payloads, we can add this very useful banner without
adding a round trip. However, servers can't just start including the
banner whenever the server's kernel supports the SYNACK payload
setsockopt, because it would break all the existing clients, which
wouldn't know what to make of it.

Had SYNACK payloads been generally available when HTTP was created, we
would have ended up with a different protocol. So, to take advantage
of SYNACK payloads in HTTP, we need to make a backwards-incompatible
modification to the protocol. The socket option, then, is what decides
whether this different protocol is in effect, which is why both sides
need to reach a consistent view of it.

This same logic applies to any protocol that eschewed a server banner
for latency reasons. So the new application-level exchange looks like:

Client <--- Banner --- Server
Client --- Request ---> Server
Client <--- Reply --- Server

But, because of SYNACK payloads, the number of round trips can be the same.
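
On the wire, the banner rides in the SYNACK, so this exchange needs no
more round trips than today's banner-less protocols (a sketch):

Client --- SYN ---> Server
Client <--- SYNACK + banner --- Server
Client --- ACK + Request ---> Server
Client <--- Reply --- Server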

> As a final question (I'm not a sockets expert), it might be that there
> are other socket modifications to allow users to write to connections
> that are not yet open (or can they just be queued up? in which case, it
> seems like the socket must be completely bound, i.e., this might not
> work for unbound LISTENs).

This would be a major change to the sockets API. Additionally, it
would require waking up userspace processes for every SYN, making SYN
floods more effective.
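
Instead, the server would hand the kernel its constant banner once, at
listen time. Here's a sketch of what that might look like from
userspace (the option name TCP_SYNACK_PAYLOAD and its number are made
up for illustration; the draft would define the real interface, and no
current kernel implements it):

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical option number and limit; the draft would define the
 * real ones.  No current kernel implements this. */
#define TCP_SYNACK_PAYLOAD 9999
#define SYNACK_PAYLOAD_MAX 64

/* Install a constant banner on a listening socket so that the kernel
 * can attach it to outgoing SYNACKs.  Returns 0 on success and -1 on
 * failure (e.g. ENOPROTOOPT on a kernel without the extension, in
 * which case the server just writes the banner after accept() as it
 * does today). */
static int install_synack_banner(int fd, const void *banner, size_t len)
{
    if (len > SYNACK_PAYLOAD_MAX)
        return -1;  /* would cost an extra round trip; don't bother */
    return setsockopt(fd, IPPROTO_TCP, TCP_SYNACK_PAYLOAD,
                      banner, (socklen_t)len);
}
```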

To avoid this, I'm fixing the logic that the application can use and
moving it to the kernel. I propose that the logic be:

unsigned payload_length;
u8 *payload = applications_constant_payload(&payload_length);

if (include_random_nonce && payload_length >= 8) {
  u8 random[8];

  /* Overwrite the first 8 bytes of the constant payload with a
   * freshly generated nonce. */
  generate_random_bytes(random, 8);
  memcpy(payload, random, 8);
}

if (only_send_if_requested) {
  /* Only reply with data if the client's SYN carried the option
   * permitting a SYNACK payload. */
  if (SA_PP_option_seen_in_syn)
    write(payload, payload_length);
} else {
  write(payload, payload_length);
}

> I see no good way to have the application tell TCP how many bytes to
> send, or to have TCP tell the app this. That is an API change that lets
> the app know about segment boundaries, and that's inconsistent with
> TCP's API semantics.

In correctness terms, you are correct: it doesn't matter when the
banner gets sent, or whether it gets split over several packets.
However, applications would very much like to know how many bytes the
kernel will agree to transmit in reply to a SYN segment because, if a
banner would cost an extra round trip, they might prefer not to
include it. It's because of this that the draft specifies the 64-byte
limit. This does expose aspects of segmentation to userspace; however,
many sacrifices are made in the name of DDoS mitigation.

> | I'm proposing that a constant payload (optionally with 8 random bytes)
> | overcomes R1 and R3. And limiting the size of that payload to 64 bytes
> | overcomes R2.
>
> I don't understand the "8 random bytes" issue; the data path should be
> sending data from the app only, and that's not random.

The optional inclusion of random bytes covers the one case where
requiring a constant payload is problematic: cryptographic protocols
very often want to include a random nonce. Since the kernel can easily
fill in random bytes itself, I'm making this one exception to the
"constant payload" rule.
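
As a userspace sketch of that kernel logic (names are illustrative,
and getentropy() stands in for the kernel's RNG):

```c
#include <string.h>
#include <unistd.h>   /* getentropy(), glibc >= 2.25 */

#define NONCE_LEN 8

/* Build the SYNACK payload from the application's constant template,
 * overwriting its first NONCE_LEN bytes with fresh random data, as
 * the proposed kernel logic would.  Returns 0 on success, -1 if the
 * payload is too short to carry a nonce or the RNG fails. */
static int fill_nonce_payload(unsigned char *dst,
                              const unsigned char *tmpl, size_t len)
{
    if (len < NONCE_LEN)
        return -1;              /* too short to carry a nonce */
    memcpy(dst, tmpl, len);
    return getentropy(dst, NONCE_LEN);
}
```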


[1] (justification that HTTP is very latency sensitive:)

I have access to private data that confirms that user-perceived
latency is very important, but here are some public quotes to that
effect:

"In A/B tests (at Amazon.com), we tried delaying the page in
increments of 100 milliseconds and found that even very small delays
would result in substantial and costly drops in revenue."
(http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html)

"would you like 10, 20, or 30 results (from a Google search). Users
unanimously wanted 30. But 10 did way better in A/B testing (30 was
20% worse) due to lower latency of 10 results. 30 is about twice the
latency of 10" (http://perspectives.mvdirona.com/2008/05/29/IO2008RoughNotesFromMarissaMayerDay2KeynoteAtGoogleIO.aspx)


Cheers,


AGL

-- 
Adam Langley agl@imperialviolet.org http://www.imperialviolet.org
_______________________________________________
tcpm mailing list
tcpm@ietf.org
https://www.ietf.org/mailman/listinfo/tcpm