Re: [tcpm] Faster application handshakes with SYN/ACK payloads

"Adam Langley" <agl@imperialviolet.org> Sun, 03 August 2008 18:06 UTC

Message-ID: <396556a20808031106q18f6145cu7f6911ad8277d60c@mail.gmail.com>
Date: Sun, 3 Aug 2008 11:06:31 -0700
From: "Adam Langley" <agl@imperialviolet.org>
To: "Joe Touch" <touch@isi.edu>
In-Reply-To: <4895B1F0.3070102@isi.edu>
MIME-Version: 1.0
Content-Disposition: inline
References: <396556a20807311252j67b1ab26mf6511dbdae780fdd@mail.gmail.com> <48924496.9060907@isi.edu> <396556a20807311640w2b17d447ud0c51241dc84f682@mail.gmail.com> <48935337.5060205@isi.edu> <396556a20808021200u75c3bdd5h77c328a9b61f8d78@mail.gmail.com> <4895B1F0.3070102@isi.edu>
Cc: tcpm@ietf.org
Subject: Re: [tcpm] Faster application handshakes with SYN/ACK payloads
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>

On Sun, Aug 3, 2008 at 6:26 AM, Joe Touch <touch@isi.edu> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> As a minor note, it doesn't seem useful to extend that socket option to
> say how much data to put in the SYNACK; nothing else in TCP knows about
> application data boundaries, and I think it would be a mistake to assume
> that.

For protocols like SMTP where a constant banner is sent at the
beginning of every connection, that's true. As long as the data, if
not sent in the SYNACK, is enqueued to be sent anyway, that would
certainly work.

However, it appears that I've not made the case for the option yet. To
do this, let me take HTTP as an example.

HTTP is very latency sensitive[1]. Because of this, and guided by the
sockets API, the client starts the exchange. Thus, if the client
wishes to probe for optional features (like TLS upgrade: RFC2817) the
exchange works like this:
  Client --- OPTIONS ---> Server
  Client <--- 101 Switching --- Server
  Client --- GET ---> Server
  ...

Given the latency sensitivity, nobody actually probes for
opportunistic TLS like this.
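
For reference, the RFC 2817 probe itself is an ordinary HTTP exchange
(a sketch; example.com is a placeholder and headers are abbreviated):

  C: OPTIONS * HTTP/1.1
  C: Host: example.com
  C: Upgrade: TLS/1.0
  C: Connection: Upgrade

  S: HTTP/1.1 101 Switching Protocols
  S: Upgrade: TLS/1.0, HTTP/1.1
  S: Connection: Upgrade

  (TLS handshake follows, and only then the real GET)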

If HTTP sent a banner, like SMTP servers, then my life would be a lot
easier! The banner could advertise all the extensions supported.
However, without SYNACK payloads, this banner would cost another round
trip.

With SYNACK payloads, we can add this very useful banner without
adding a round trip. However, servers can't just start including the
banner whenever the server's kernel supports the SYNACK payload
setsockopt, because it would break all the existing clients, which
wouldn't know what to make of it.

Had SYNACK payloads been generally available when HTTP was created, we
would have ended up with a different protocol. So, to take advantage
of SYNACK payloads in HTTP, we need to make a backwards-incompatible
modification to the protocol. The socket option, then, is what decides
whether this different protocol is in effect, which is why both sides
need to reach a consistent view of it.

This same logic applies to any protocol that eschewed a server banner
for latency reasons. So the new application-level exchange looks like:

Client <--- Banner --- Server
Client --- Request ---> Server
Client <--- Reply --- Server

But, because of SYNACK payloads, the number of round trips can be the same.
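
On the wire, the banner rides in the SYNACK, so this exchange needs no
more round trips than today's banner-less protocols (a sketch):

Client --- SYN ---> Server
Client <--- SYNACK + banner --- Server
Client --- ACK + Request ---> Server
Client <--- Reply --- Server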

> As a final question (I'm not a sockets expert), it might be that there
> are other socket modifications to allow users to write to connections
> that are not yet open (or can they just be queued up? in which case, it
> seems like the socket must be completely bound, i.e., this might not
> work for unbound LISTENs).

This would be a major change to the sockets API. Additionally, it
would require waking up userspace processes for every SYN, making SYN
floods more effective.
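
Instead, the server would hand the kernel its constant banner once, at
listen time. Here's a sketch of what that might look like from
userspace (the option name TCP_SYNACK_PAYLOAD and its number are made
up for illustration; the draft would define the real interface, and no
current kernel implements it):

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical option number and limit; the draft would define the
 * real ones.  No current kernel implements this. */
#define TCP_SYNACK_PAYLOAD 9999
#define SYNACK_PAYLOAD_MAX 64

/* Install a constant banner on a listening socket so that the kernel
 * can attach it to outgoing SYNACKs.  Returns 0 on success and -1 on
 * failure (e.g. ENOPROTOOPT on a kernel without the extension, in
 * which case the server just writes the banner after accept() as it
 * does today). */
static int install_synack_banner(int fd, const void *banner, size_t len)
{
    if (len > SYNACK_PAYLOAD_MAX)
        return -1;  /* would cost an extra round trip; don't bother */
    return setsockopt(fd, IPPROTO_TCP, TCP_SYNACK_PAYLOAD,
                      banner, (socklen_t)len);
}
```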

To avoid this, I'm fixing the logic that the application can use and
moving it to the kernel. I propose that the logic be:

unsigned payload_length;
u8 *payload = applications_constant_payload(&payload_length);

if (include_random_nonce && payload_length >= 8) {
  u8 random[8];

  /* Overwrite the first 8 bytes of the constant payload with a
   * freshly generated nonce. */
  generate_random_bytes(random, 8);
  memcpy(payload, random, 8);
}

if (only_send_if_requested) {
  /* Only reply with data if the client's SYN carried the option
   * permitting a SYNACK payload. */
  if (SA_PP_option_seen_in_syn)
    write(payload, payload_length);
} else {
  write(payload, payload_length);
}

> I see no good way to have the application tell TCP how many bytes to
> send, or to have TCP tell the app this. That is an API change that lets
> the app know about segment boundaries, and that's inconsistent with
> TCP's API semantics.

In correctness terms, you are correct: it doesn't matter when the
banner gets sent, or whether it gets split over several packets.
However, applications would very much like to know how many bytes the
kernel will agree to transmit in reply to a SYN segment because, if a
banner would cost an extra round trip, they might prefer not to
include it. It's because of this that the draft specifies the 64-byte
limit. This does expose aspects of segmentation to userspace; however,
many sacrifices are made in the name of DDoS mitigation.

> | I'm proposing that a constant payload (optionally with 8 random bytes)
> | overcomes R1 and R3. And limiting the size of that payload to 64 bytes
> | overcomes R2.
>
> I don't understand the "8 random bytes" issue; the data path should be
> sending data from the app only, and that's not random.

The optional inclusion of random bytes covers the one case where
requiring a constant payload is problematic: cryptographic protocols
very often want to include a random nonce. Since the kernel can easily
fill in random bytes itself, I'm making this one exception to the
"constant payload" rule.
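
As a userspace sketch of that kernel logic (names are illustrative,
and getentropy() stands in for the kernel's RNG):

```c
#include <string.h>
#include <unistd.h>   /* getentropy(), glibc >= 2.25 */

#define NONCE_LEN 8

/* Build the SYNACK payload from the application's constant template,
 * overwriting its first NONCE_LEN bytes with fresh random data, as
 * the proposed kernel logic would.  Returns 0 on success, -1 if the
 * payload is too short to carry a nonce or the RNG fails. */
static int fill_nonce_payload(unsigned char *dst,
                              const unsigned char *tmpl, size_t len)
{
    if (len < NONCE_LEN)
        return -1;              /* too short to carry a nonce */
    memcpy(dst, tmpl, len);
    return getentropy(dst, NONCE_LEN);
}
```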


[1] (justification that HTTP is very latency sensitive:)

I have access to private data that confirms that user-perceived
latency is very important, but here are some public quotes to that
effect:

"In A/B tests (at Amazon.com), we tried delaying the page in
increments of 100 milliseconds and found that even very small delays
would result in substantial and costly drops in revenue."
(http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html)

"would you like 10, 20, or 30 results (from a Google search). Users
unanimously wanted 30. But 10 did way better in A/B testing (30 was
20% worse) due to lower latency of 10 results. 30 is about twice the
latency of 10" (http://perspectives.mvdirona.com/2008/05/29/IO2008RoughNotesFromMarissaMayerDay2KeynoteAtGoogleIO.aspx)


Cheers,


AGL

-- 
Adam Langley agl@imperialviolet.org http://www.imperialviolet.org
_______________________________________________
tcpm mailing list
tcpm@ietf.org
https://www.ietf.org/mailman/listinfo/tcpm