[hybi] Proposal: A TLS-based handshake

Adam Barth <ietf@adambarth.com> Sun, 15 August 2010 20:02 UTC

As requested by the chairs, please find below a proposal for a
TLS-based design for the WebSocket handshake.  I believe this
handshake to be technically superior to the current handshake design
in a number of ways:

1) We can have high confidence that this protocol is not vulnerable to
cross-protocol attacks.
2) This handshake has a higher success rate: 95% for this design
compared with 63% for the previous design [1].
3) This handshake is secure against active network attacks (i.e.,
man-in-the-middle attacks).
4) This handshake works in the presence of transparent proxies.
5) Contrary to popular belief, using TLS does not impose much
additional load on the server [2].

The handshake below is adapted from
<http://www.whatwg.org/specs/web-socket-protocol/>.  I've attempted to
keep the structure of the text the same to allow for easy comparison
with the previous handshake.  For example, I've left the original step
numbers.  Please excuse any typos.  I've omitted the server-side
version of the handshake, but I can provide it if it's unclear how
that works.

Kind regards,
Adam

[1] http://www.ietf.org/mail-archive/web/hybi/current/msg01605.html
[2] "In January this year (2010), Gmail switched to using HTTPS for
everything by default. Previously it had been introduced as an option,
but now all of our users use HTTPS to secure their email between their
browsers and Google, all the time. In order to do this we had to
deploy no additional machines and no special hardware. On our
production frontend machines, SSL/TLS accounts for less than 1% of the
CPU load, less than 10KB of memory per connection and less than 2% of
network overhead. Many people believe that SSL takes a lot of CPU time
and we hope the above numbers (public for the first time) will help to
dispel that."
-- http://www.imperialviolet.org/2010/06/25/overclocking-ssl.html


== TLS-based handshake ==

5.1.  Opening handshake

   When the user agent is to *establish a WebSocket connection* to a
   host /host/, on a port /port/, from an origin whose ASCII
   serialization is /origin/, and with a (possibly empty) list of
   strings giving the /protocols/, it must run the following steps.
   The /host/ must have been punycode-encoded already if necessary
   (i.e. it does not contain characters above U+007E).  The /origin/
   must not contain characters in the range U+0041 to U+005A (i.e.
   LATIN CAPITAL LETTER A to LATIN CAPITAL LETTER Z).  The various
   strings in /protocols/ must all be non-empty strings with
   characters in the range U+0021 to U+007E, and must all be unique.
   [ORIGIN]
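
The preconditions in the paragraph above can be checked mechanically
before any connection is attempted.  The following is a minimal sketch
in Python, not part of the proposal; the function name
check_handshake_inputs is hypothetical.

```python
from typing import Sequence


def check_handshake_inputs(host: str, origin: str,
                           protocols: Sequence[str]) -> None:
    """Raise ValueError if the inputs violate the preconditions of 5.1.

    Hypothetical helper for illustration only.
    """
    # /host/ must already be punycode-encoded: nothing above U+007E.
    if any(ord(ch) > 0x7E for ch in host):
        raise ValueError("host contains characters above U+007E")

    # /origin/ must not contain U+0041..U+005A (upper-case A-Z).
    if any(0x41 <= ord(ch) <= 0x5A for ch in origin):
        raise ValueError("origin contains upper-case ASCII letters")

    # Each protocol is non-empty and limited to U+0021..U+007E.
    for proto in protocols:
        if not proto:
            raise ValueError("empty protocol string")
        if any(not (0x21 <= ord(ch) <= 0x7E) for ch in proto):
            raise ValueError("protocol contains a character outside "
                             "U+0021..U+007E")

    # The protocol strings must all be unique.
    if len(set(protocols)) != len(protocols):
        raise ValueError("protocols must be unique")
```

A caller would run this once, before step 1, and treat a ValueError as
a programming error rather than a network failure.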

   1.   If the user agent already has a WebSocket connection to the
        remote host (IP address) identified by /host/, even if known by
        another name, wait until that connection has been established or
        for that connection to have failed.  If multiple connections to
        the same IP address are attempted simultaneously, the user agent
        must serialize them so that there is no more than one connection
        at a time running through the following steps.

        If the user agent cannot determine the IP address of the remote
        host (for example because all communication is being done
        through a proxy server that performs DNS queries itself), then
        the user agent must assume for the purposes of this step that
        each host name refers to a distinct remote host, but should
        instead limit the total number of simultaneous connections that
        are not established to a reasonably low number (e.g., in a Web
        browser, to the number of tabs the user has open).

        NOTE: This makes it harder for a script to perform a denial of
        service attack by just opening a large number of WebSocket
        connections to a remote host.  A server under attack can
        further reduce its load by pausing before closing the
        connection, as that will reduce the rate at which the client
        reconnects.

        NOTE: There is no limit to the number of established WebSocket
        connections a user agent can have with a single remote host.
        Servers can refuse to connect users with an excessive number of
        connections, or disconnect resource-hogging users when suffering
        high load.
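
One way to read step 1 is as a per-address lock that is held only while
a handshake is in progress.  The sketch below is an illustration under
that reading, not the proposal's text; the class HandshakeGate and its
method lock_for are hypothetical names, and the sketch does not cover
the fallback limit for the case where the IP address is unknown.

```python
import threading
from collections import defaultdict


class HandshakeGate:
    """Allows at most one in-progress handshake per remote IP address.

    Hypothetical helper: established connections are not limited,
    because the lock is released as soon as the handshake finishes.
    """

    def __init__(self) -> None:
        self._guard = threading.Lock()            # protects _locks
        self._locks = defaultdict(threading.Lock)  # one lock per IP

    def lock_for(self, ip: str) -> threading.Lock:
        """Return the lock that serializes handshakes to this IP."""
        with self._guard:
            return self._locks[ip]
```

A client would wrap steps 2 onward in "with gate.lock_for(ip):" so that
a second attempt to the same address waits until the first has either
been established or failed.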

   2.   _Connect_: If the user agent is configured to use a proxy when
        using the WebSocket protocol to connect to host /host/ and/or
        port /port/, then connect to that proxy and ask it to open a TCP
        connection to the host given by /host/ and the port given by
        /port/.

           EXAMPLE: If the user agent uses an HTTP proxy for all
           traffic and were to try to connect to port 80 on server
           example.com, it might send the following lines to the
           proxy server:

              CONNECT example.com:80 HTTP/1.1
              Host: example.com

           If there was a password, the connection might look like:

              CONNECT example.com:80 HTTP/1.1
              Host: example.com
              Proxy-authorization: Basic ZWRuYW1vZGU6bm9jYXBlcyE=

        Otherwise, if the user agent is not configured to use a proxy,
        then open a TCP connection to the host given by /host/ and the
        port given by /port/.

        NOTE: Implementations that do not expose explicit UI for
        selecting a proxy for WebSocket connections separate from other
        proxies are encouraged to use a SOCKS proxy for WebSocket
        connections, if available, or failing that, to prefer the proxy
        configured for HTTPS connections over the proxy configured for
        HTTP connections.

        For the purpose of proxy autoconfiguration scripts, the URL to
        pass the function must be constructed from /host/, /port/,
        /resource name/, and the /secure/ flag using the steps to
        construct a WebSocket URL.

        NOTE: The WebSocket protocol can be identified in proxy
        autoconfiguration scripts from the scheme ("ws:" for unencrypted
        connections and "wss:" for encrypted connections).
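
The CONNECT requests shown in the example for step 2 can be produced
with a few lines of code.  This is a sketch only; the function name
build_connect_request and its parameters are hypothetical, and it
assumes Basic proxy authentication as in the example.

```python
import base64
from typing import Optional


def build_connect_request(host: str, port: int,
                          username: Optional[str] = None,
                          password: Optional[str] = None) -> bytes:
    """Build the bytes of an HTTP CONNECT request for a proxy.

    Hypothetical helper mirroring the example in step 2.
    """
    lines = [
        "CONNECT {}:{} HTTP/1.1".format(host, port),
        "Host: {}".format(host),
    ]
    if username is not None:
        # Basic credentials are base64("user:password").
        raw = "{}:{}".format(username, password or "").encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        lines.append("Proxy-authorization: Basic " + token)
    # Header lines end in CRLF; a blank line ends the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

Calling build_connect_request("example.com", 80, "ednamode",
"nocapes!") yields the second request in the example; those
credentials are what the Basic token shown there decodes to.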

   3.   If the connection could not be opened, then fail the WebSocket
        connection and abort these steps.

   4.   Perform a TLS handshake over the connection.  If this fails
        (e.g. the server's certificate could not be verified), then
        fail the WebSocket connection and abort these steps.
        Otherwise, all further communication on this channel must run
        through the encrypted tunnel.  [RFC2246]

        User agents must use the Server Name Indication extension in the
        TLS handshake.  [RFC4366]

        User agents must use the Next Protocol Negotiation extension in
        the TLS handshake, selecting the "776562736f636b6574" protocol
        ("websocket" in UTF-8).  [NPN]

        *TODO*: If we wish to have two levels of security, we could skip
        the server certificate check in this step for the less secure version.
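
As a rough illustration of step 4, the sketch below configures a TLS
client with Python's ssl module.  Two assumptions are worth stating
loudly: the function names are hypothetical, and the sketch uses ALPN
as a stand-in for Next Protocol Negotiation, since NPN is what the
step actually asks for and current TLS stacks may not offer it.

```python
import socket
import ssl

# Hex form of the protocol identifier given in step 4.
PROTOCOL_HEX = "776562736f636b6574"
PROTOCOL_NAME = bytes.fromhex(PROTOCOL_HEX).decode("utf-8")


def make_tls_context() -> ssl.SSLContext:
    """Return a client context that verifies the server certificate.

    Hypothetical helper.  ALPN is used here in place of NPN, which is
    an assumption of this sketch and not what the step specifies.
    """
    # The default context requires a valid certificate and checks
    # that it matches the host name.
    ctx = ssl.create_default_context()
    if ssl.HAS_ALPN:
        ctx.set_alpn_protocols([PROTOCOL_NAME])
    return ctx


def tls_connect(sock: socket.socket, host: str,
                ctx: ssl.SSLContext) -> ssl.SSLSocket:
    """Run the TLS handshake over an already-open TCP connection."""
    # server_hostname makes the client send Server Name Indication.
    # A failed handshake raises ssl.SSLError, which the caller would
    # treat as "fail the WebSocket connection".
    return ctx.wrap_socket(sock, server_hostname=host)
```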

   6.   Let /fields/ be an empty list of strings.

   13.  Add the string consisting of the concatenation of the string
        "Origin:", a U+0020 SPACE character, and the /origin/ value, to
        /fields/.

   14.  If the /protocols/ list is empty, then skip this step.

        Otherwise, add the string consisting of the concatenation of the
        string "Sec-WebSocket-Protocol:", a U+0020 SPACE character, and
        the strings in /protocols/, maintaining their relative order and
        each separated from the next by a single U+0020 SPACE character,
        to /fields/.

   24.  For each string in /fields/, in a random order: send the string,
        encoded as UTF-8, followed by a UTF-8-encoded U+000D CARRIAGE
        RETURN U+000A LINE FEED character pair (CRLF).  It is important
        that the fields be output in a random order so that servers do
        not depend on the particular order used by any particular
        client.

        NOTE: Only fields explicitly mentioned in the requirements of
        this specification are sent.  For example, user agents do not
        send fields such as |User-Agent|, |Accept-Language|, or
        |Content-Type| in the WebSocket handshake.

   25.  Send a UTF-8-encoded U+000D CARRIAGE RETURN U+000A LINE FEED
        character pair (CRLF).
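
Steps 6 through 25 amount to building a short list of field lines and
writing them out in shuffled order.  A minimal sketch follows; the
function names build_request_fields and serialize_fields are
hypothetical, as is the rng parameter used to make the shuffle
testable.

```python
import random
from typing import List, Sequence


def build_request_fields(origin: str,
                         protocols: Sequence[str]) -> List[str]:
    """Steps 6, 13 and 14: collect the field lines to send."""
    fields = []                                   # step 6
    fields.append("Origin: " + origin)            # step 13
    if protocols:                                 # step 14
        fields.append("Sec-WebSocket-Protocol: " + " ".join(protocols))
    return fields


def serialize_fields(fields: Sequence[str], rng=random) -> bytes:
    """Steps 24 and 25: emit the fields in random order, then CRLF."""
    shuffled = list(fields)
    rng.shuffle(shuffled)                         # random order
    out = b"".join(f.encode("utf-8") + b"\r\n" for f in shuffled)
    return out + b"\r\n"                          # step 25: blank line
```

The bytes returned by serialize_fields would be written to the TLS
connection from step 4.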

   31.  Let /fields/ be a list of name-value pairs, initially empty.

   32.  _Field_: Let /name/ and /value/ be empty byte arrays.

   33.  Read a byte from the server.

        If the connection closes before this byte is received, then fail
        the WebSocket connection and abort these steps.

        Otherwise, handle the byte as described in the appropriate entry
        below:

        -> If the byte is 0x0D (UTF-8 CR)
           If the /name/ byte array is empty, then jump to the fields
           processing step.  Otherwise, fail the WebSocket connection
           and abort these steps.

        -> If the byte is 0x0A (UTF-8 LF)
           Fail the WebSocket connection and abort these steps.

        -> If the byte is 0x3A (UTF-8 :)
           Move on to the next step.

        -> If the byte is in the range 0x41 to 0x5A (UTF-8 A-Z)
           Append a byte whose value is the byte's value plus 0x20 to
           the /name/ byte array and redo this step for the next byte.

        -> Otherwise
           Append the byte to the /name/ byte array and redo this step
           for the next byte.

        NOTE: This reads a field name, terminated by a colon, converting
        upper-case letters in the range A-Z to lowercase, and aborting
        if a stray CR or LF is found.

   34.  Let /count/ equal 0.

        NOTE: This is used in the next step to skip past a space
        character after the colon, if necessary.

   35.  Read a byte from the server and increment /count/ by 1.

        If the connection closes before this byte is received, then fail
        the WebSocket connection and abort these steps.

        Otherwise, handle the byte as described in the appropriate entry
        below:

        -> If the byte is 0x20 (UTF-8 space) and /count/ equals 1
           Ignore the byte and redo this step for the next byte.

        -> If the byte is 0x0D (UTF-8 CR)
           Move on to the next step.

        -> If the byte is 0x0A (UTF-8 LF)
           Fail the WebSocket connection and abort these steps.

        -> Otherwise
           Append the byte to the /value/ byte array and redo this step
           for the next byte.

        NOTE: This reads a field value, terminated by a CRLF, skipping
        past a single space after the colon if there is one.

   36.  Read a byte from the server.

        If the connection closes before this byte is received, or if the
        byte is not a 0x0A byte (UTF-8 LF), then fail the WebSocket
        connection and abort these steps.

        NOTE: This skips past the LF byte of the CRLF after the field.

   37.  Append an entry to the /fields/ list that has the name given by
        the string obtained by interpreting the /name/ byte array as a
        UTF-8 byte stream and the value given by the string obtained by
        interpreting the /value/ byte array as a UTF-8 byte stream.

   38.  Return to the "Field" step above.

   39.  _Fields processing_: Read a byte from the server.

        If the connection closes before this byte is received, or if the
        byte is not a 0x0A byte (UTF-8 LF), then fail the WebSocket
        connection and abort these steps.

        NOTE: This skips past the LF byte of the CRLF after the blank
        line after the fields.
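
The byte-at-a-time reader in steps 31 through 39 translates fairly
directly into code.  The sketch below is one possible rendering, not
a reference implementation: the names HandshakeError and read_fields
are hypothetical, and replacing undecodable bytes when interpreting
the arrays as UTF-8 is an assumption, since the steps do not say how
to handle invalid sequences.

```python
import io
from typing import List, Tuple


class HandshakeError(Exception):
    """Signals "fail the WebSocket connection and abort"."""


def _read_byte(stream: io.BufferedIOBase) -> int:
    data = stream.read(1)
    if not data:
        raise HandshakeError("connection closed during handshake")
    return data[0]


def read_fields(stream: io.BufferedIOBase) -> List[Tuple[str, str]]:
    fields = []                                    # step 31
    while True:
        name, value = bytearray(), bytearray()     # step 32
        while True:                                # step 33
            byte = _read_byte(stream)
            if byte == 0x0D:                       # CR
                if name:
                    raise HandshakeError("CR inside a field name")
                if _read_byte(stream) != 0x0A:     # step 39
                    raise HandshakeError("expected LF after final CR")
                return fields
            if byte == 0x0A:                       # stray LF
                raise HandshakeError("LF inside a field name")
            if byte == 0x3A:                       # colon ends the name
                break
            if 0x41 <= byte <= 0x5A:               # A-Z to lowercase
                byte += 0x20
            name.append(byte)
        count = 0                                  # step 34
        while True:                                # step 35
            byte = _read_byte(stream)
            count += 1
            if byte == 0x20 and count == 1:
                continue                           # skip one space
            if byte == 0x0D:
                break
            if byte == 0x0A:
                raise HandshakeError("LF inside a field value")
            value.append(byte)
        if _read_byte(stream) != 0x0A:             # step 36
            raise HandshakeError("expected LF after CR")
        # Step 37.  errors="replace" is an assumption of this sketch.
        fields.append((name.decode("utf-8", errors="replace"),
                       value.decode("utf-8", errors="replace")))
        # Step 38: the outer loop returns to the "Field" step.
```

Any object with a blocking read(1) method, such as the file object
returned by a socket's makefile("rb"), can serve as the stream.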

   41.  If there is not exactly one entry in the /fields/ list whose
        name is "sec-websocket-origin", or if the /protocols/ list was
        not empty but there is not exactly one entry in the /fields/
        list whose name is "sec-websocket-protocol", or if there are
        any entries in the /fields/ list whose names are the empty
        string, then fail the WebSocket connection and abort these
        steps.  Otherwise, handle each entry in the /fields/ list as
        follows:

        -> If the entry's name is "sec-websocket-origin"
           If the value is not exactly equal to /origin/, then fail the
           WebSocket connection and abort these steps.  [ORIGIN]

        -> If the entry's name is "sec-websocket-protocol"
           If the /protocols/ list was not empty, and the value is not
           exactly equal to one of the strings in the /protocols/ list,
           then fail the WebSocket connection and abort these steps.

           If the value contains any characters outside the range
           U+0021 to U+007F, then fail the WebSocket connection and
           abort these steps.

           Otherwise, let the *selected WebSocket subprotocol* be the
           entry's value.

        -> Any other name
           Ignore it.
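
The checks in step 41 can be sketched as a single function over the
parsed name-value pairs.  The names HandshakeError and process_fields
are hypothetical, and the sketch reads "the /protocol/ was specified"
as "the /protocols/ list was not empty", which is an interpretation
rather than something the step states.

```python
from typing import List, Optional, Sequence, Tuple


class HandshakeError(Exception):
    """Signals "fail the WebSocket connection and abort"."""


def process_fields(fields: List[Tuple[str, str]], origin: str,
                   protocols: Sequence[str]) -> Optional[str]:
    """Validate the server's fields; return the selected subprotocol.

    Returns None when the server did not select a subprotocol.
    """
    names = [name for name, _ in fields]
    if names.count("sec-websocket-origin") != 1:
        raise HandshakeError("need exactly one sec-websocket-origin")
    if protocols and names.count("sec-websocket-protocol") != 1:
        raise HandshakeError("need exactly one sec-websocket-protocol")
    if "" in names:
        raise HandshakeError("field with an empty name")

    selected = None
    for name, value in fields:
        if name == "sec-websocket-origin":
            if value != origin:
                raise HandshakeError("origin mismatch")
        elif name == "sec-websocket-protocol":
            if protocols and value not in protocols:
                raise HandshakeError("server chose an unoffered protocol")
            # Range as written in step 41: U+0021 to U+007F.
            if any(not (0x21 <= ord(ch) <= 0x7F) for ch in value):
                raise HandshakeError("invalid character in protocol")
            selected = value
        # Any other name is ignored.
    return selected
```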

   47.  The *WebSocket connection is established*.  Now the user agent
        must send and receive to and from the connection as described in
        the next section.