Re: [TLS] Last Call: draft-ietf-tls-renegotiation

Marsh Ray <marsh@extendedsubset.com> Fri, 04 December 2009 05:17 UTC


Michael D'Errico wrote:
> 
> If someone plans to make a device to detect the RI extension (possibly
> as part of a firewall or similar), then the closer it is to the front
> of the list, the less delay will be added to the processing of the
> connection, and the more connections can be handled per second.  We
> definitely don't want our fix to slow down the establishment of all
> secure connections by any significant amount.

We are talking about iterating over bits of data that generally come in
over a network, right?

Let's say the IETF gets really creative and we end up with 20
extensions on every Client Hello. The order is well-randomized, so the
server has to "wade through" an average of 10 extensions to find the RI.
Some of these extensions are really, really big (several KB). So big
that for half of them the beginning of the next one hasn't even been
fetched into the data cache yet.

So it's reasonable to assume that the cost will be dominated by five
cold accesses to slow DRAM at 200ns each. Let's see, that comes to... 1 us!
Well a microsecond is nothing to sneeze at.
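
For anyone who wants to check that arithmetic, here is a minimal Python
sketch. It uses only the figures assumed above (20 extensions, half of
the scanned ones cold, 200ns per DRAM access); the function and
parameter names are made up for illustration.

```python
# Back-of-envelope latency estimate. Every figure is an assumption
# taken from the paragraphs above, not a measurement.
NUM_EXTENSIONS = 20      # extensions on every Client Hello
COLD_FRACTION = 0.5      # share of scanned extensions whose successor is not cached
DRAM_ACCESS_NS = 200     # cost of one cold DRAM access, in nanoseconds


def scan_cost_ns(num_extensions=NUM_EXTENSIONS,
                 cold_fraction=COLD_FRACTION,
                 dram_access_ns=DRAM_ACCESS_NS):
    """Estimated time to reach a randomly placed extension, in nanoseconds."""
    avg_scanned = num_extensions / 2              # about 10 with random ordering
    cold_accesses = avg_scanned * cold_fraction   # 5 cold accesses
    return cold_accesses * dram_access_ns


print(scan_cost_ns())  # 1000.0 ns, i.e. 1 us
```

Changing any of the three inputs shows how quickly the estimate moves;
even at ten times the DRAM cost the total is still only 10 us.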

During this microsecond many modern CPUs can be processing bits of other
connections (that do happen to be in the data cache), so some of that
might be hidden in the total throughput. Not to mention that presumably
there will be other reasons to "wade through" the extensions like if
somebody actually wants to process one for something useful. (For
comparison, my modern CPU takes about 18 ms to do an RSA operation in
the handshake, but I know dedicated hardware is much, much faster).

Well, we don't know how long the device takes to process a handshake
without extensions, so it's hard to say what impact randomized ordering
will have.

We do know from experience that many devices end up being network-bound
rather than CPU- or memory-access-bound. It looks like there's a
two-byte length for all extensions, which strongly implies that the
maximum extension data is 64KB. Let's assume those five memory accesses
were over 16KB of data (data that somehow didn't make it into the cache).

What speed of network is necessary to be only 10 times slower than us
wading through this morass of extensions? 16KB*8/1us/10 ... I get about
13 Gbit/s.
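
The same calculation as a small Python sketch, assuming 16KB scanned in
1 us and a network that is 10 times slower than the scan. The helper
name is hypothetical.

```python
KB = 1024


def break_even_link_bps(bytes_scanned, scan_seconds, slowdown=10):
    """Link speed (bit/s) at which the network is `slowdown` times slower
    than scanning `bytes_scanned` bytes in `scan_seconds` seconds."""
    scan_rate_bps = bytes_scanned * 8 / scan_seconds
    return scan_rate_bps / slowdown


rate = break_even_link_bps(16 * KB, 1e-6)
print(f"{rate / 1e9:.1f} Gbit/s")  # 13.1 Gbit/s
```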

So my rough estimate is that an implementation using a conventional
memory architecture could saturate a full-duplex GigE link doing nothing
but exchanging hellos loaded-out with extensions and still have 10-100x
of headroom.

This assumes of course that magic hardware makes all the other
calculations instantaneous. To the extent that assumption is false, the
overhead of wading through the extensions becomes rapidly even less
significant.

Now what happens if 90% of implementors take your recommendation and put
the RI first in the list every time (for optimum performance of course)?

A few things happen:

1. Some development group accidentally writes code that makes an
assumption that the ordering is consistent (possibly they aren't native
speakers of RFC 2119).

2. A Testing and Verification "QA" department tests their product's
interoperability with four different independent implementations. They
don't realize that all four took the recommendation on ordering, but
others that they didn't test with use some other ordering.

3. Two percent of the servers on the internet end up hanging the
connection in a slow failure mode when they get a client hello with
sub-optimally ordered extensions.

4. One of the top four major web browsers sends extensions in some
arbitrary data-dependent order. The RI extension ends up with about a
33% chance of being first. It takes a minor research effort to figure
out why 1.33% of connections end up falling back to extensionless SSLv3.

5. The hello extension mechanism gets a permanent reputation as "not
ready to require the use of on the real internet".

6. Consequently, the SNI extension is unsuitable for deployment by
shared hosting providers, so every HTTPS site needs its own IP. The
world develops an acute shortage of routable IPv4 addresses when
Twitbook buys Verisign and convinces everyone that they need a
personalized certificate for their page. Everyone switches to IPv6 (ok
now it's getting a bit far-fetched).

And last but not least:

7. Some big bloated client hello message somewhere gets processed 1
microsecond faster.
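
To make the point in item 1 concrete, here is a rough Python sketch of
an extension lookup that makes no assumption about ordering. It is an
illustration only, not anybody's actual implementation: the function
names are hypothetical, the input is just the extensions block (a
two-byte total length, then type/length/data entries), and 0xff01 is
the renegotiation_info type value from RFC 5746.

```python
import struct

RENEGOTIATION_INFO = 0xFF01  # renegotiation_info extension type (RFC 5746)


def find_extension(block, wanted_type):
    """Return the data of the first extension of wanted_type, or None.

    `block` is a 2-byte total length followed by entries of
    (2-byte type, 2-byte length, data), in any order.
    """
    if len(block) < 2:
        raise ValueError("truncated extensions block")
    (total,) = struct.unpack_from("!H", block, 0)
    end = 2 + total
    if end != len(block):
        raise ValueError("extensions length does not match block size")
    pos = 2
    while pos < end:
        if end - pos < 4:
            raise ValueError("truncated extension header")
        ext_type, ext_len = struct.unpack_from("!HH", block, pos)
        pos += 4
        if end - pos < ext_len:
            raise ValueError("truncated extension data")
        if ext_type == wanted_type:
            return block[pos:pos + ext_len]
        pos += ext_len  # skip this extension, whatever it is
    return None


def build_block(extensions):
    """Test helper: serialize (type, data) pairs into an extensions block."""
    body = b"".join(struct.pack("!HH", t, len(d)) + d for t, d in extensions)
    return struct.pack("!H", len(body)) + body


ri = (RENEGOTIATION_INFO, b"\x00")  # one zero byte: empty renegotiated_connection
other = (0x0000, b"placeholder")    # stand-in data, not a real server_name payload

ri_first = build_block([ri, other])
ri_last = build_block([other, ri])
assert find_extension(ri_first, RENEGOTIATION_INFO) == b"\x00"
assert find_extension(ri_last, RENEGOTIATION_INFO) == b"\x00"
```

Testing against both orderings, as the last two lines do, is the kind
of check that would catch the failure described in items 1 and 2.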

- Marsh