Re: [hybi] A WebSocket handshake

Maciej Stachowiak <mjs@apple.com> Fri, 08 October 2010 21:09 UTC

From: Maciej Stachowiak <mjs@apple.com>
In-reply-to: <AANLkTimQ5x-v+Mz_OHrNDdtVd94E+HOBWwo3_f1ktEeg@mail.gmail.com>
Date: Fri, 08 Oct 2010 14:10:13 -0700
Content-transfer-encoding: quoted-printable
Message-id: <AB43D171-AC38-47CE-BDC7-401E6D782622@apple.com>
References: <AANLkTimQ5x-v+Mz_OHrNDdtVd94E+HOBWwo3_f1ktEeg@mail.gmail.com>
To: Adam Barth <ietf@adambarth.com>
X-Mailer: Apple Mail (2.1081)
X-Brightmail-Tracker: AAAAAA==
Cc: Hybi <hybi@ietf.org>
Subject: Re: [hybi] A WebSocket handshake

The following comments are not informed by the follow-ups to this thread (I haven't had a chance to read them yet) but I wanted to give my comments ASAP.

Here are some issues (mostly non-security, but still important IMO):

1) There doesn't seem to be a provision for passing a resource path from the client to the server, since the CONNECT method does not include a path. This is a problem because:

   a) It precludes having multiple independent WebSocket services on the same host and port. That seems like a problem, since most ports are effectively unavailable on the public internet.
   b) The WebSocket client API and the ws: URI scheme are based on the premise of identifying a specific WebSocket resource on a server.

This could be addressed by exchanging further messages after the handshake to identify the resource, or using the provision for metadata, but:

   i. This potentially adds round trips.
   ii. It would make life harder for meta-servers that serve both HTTP and multiple WebSocket services and dispatch internally - they would have to actually understand WebSocket protocol enough to read message frames before they could do the dispatch.
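
To make point 1 concrete, here is a minimal sketch of how a meta-server might dispatch on the request line alone. The routing table, service names, and function names are hypothetical; the point is only that a CONNECT request target is an authority (host:port), so there is no path to route on.

```python
# Hypothetical routing table for a meta-server hosting several services.
SERVICES = {"/chat": "chat-service", "/ticker": "ticker-service"}


def parse_request_line(line: str) -> tuple[str, str, str]:
    """Split an HTTP/1.1 request line into method, target, and version."""
    method, target, version = line.strip().split(" ", 2)
    return method, target, version


def dispatch(request_line: str) -> str:
    """Pick a backend using only the first line of the request."""
    method, target, _version = parse_request_line(request_line)
    if method == "CONNECT":
        # The target is "host:port", so it carries no resource path.
        return "undispatchable"
    return SERVICES.get(target, "not-found")
```

With a path-bearing request line the dispatch is a single lookup; with the proposed CONNECT line the server would have to read further into the (encrypted) stream before it could choose a service.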

2) Is there a way for this handshake to pass through non-transparent intermediaries (i.e. an explicitly set proxy server)? At least as specified, it seems like it would always fail, which is unfortunate if the non-transparent intermediary could work.

3) Is there a way for this handshake to pass through transparent intermediaries that are aware of the WebSocket protocol? Doesn't seem like it. It would be unfortunate to cause a hard failure in all such cases.

4) Nothing in the handshake identifies the target host. This seems like a problem because:

   a) It makes it effectively impossible to offer a WebSocket service on a shared virtual host. I gather that virtual hosting scenarios were one of the attack vectors considered, but failing to address this use case doesn't seem like a good solution.
   b) It exacerbates the risk of DNS rebinding attacks. I don't know offhand if a DNS rebinding attack against WebSocket would do anything bad, but sending and checking an explicit host identifier would be a total defense, so it seems unfortunate that this is missing.

(Maybe it is intended that the host is identified e.g. via a header specifying the target Origin, but this is not spelled out).
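
As a rough sketch of the explicit host check mentioned in 4b, assuming the client sends the intended host name in some header and the server knows which names it serves (the function name and the set of served hosts are hypothetical):

```python
def host_allowed(host_header: str, served_hosts: set[str]) -> bool:
    """Return True only if the client-supplied host is one this server serves."""
    host = host_header.strip().lower()
    if host.startswith("["):
        # IPv6 literal such as "[::1]:8080": keep the bracketed part only.
        host = host[: host.find("]") + 1]
    else:
        # Drop an optional ":port" suffix.
        host = host.split(":", 1)[0]
    return host in served_hosts
```

Under DNS rebinding the browser still sends the attacker's host name, so a server that rejects names it does not serve would refuse the connection.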


I like the general approach (based on CONNECT and encrypting the data stream); it seems more robust than depending on the details of error handling code.

Regards,
Maciej



On Oct 5, 2010, at 3:15 PM, Adam Barth wrote:

> Please find below a proposal for a new WebSocket handshake.  The
> handshake attempts to combine the benefits of the HTTP handshake with
> the benefits of a TLS-based handshake.  The handshake incorporates
> ideas from a number of the other handshakes discussed previously,
> including those from Maciej Stachowiak, Ian Hickson, and Greg Wilkins.
> In addition to proposing a handshake, the document also contains a
> threat model and a security analysis.  Feedback appreciated.
> 
> Kind regards,
> Adam
> 
> 
> Pretty HTML version:
> 
> https://docs0.google.com/document/edit?id=1hRLcVc8FHsXOQvaulG2KmvGKepgFffcevyJn-dAEsrI&hl=en&authkey=COOWhaAD&pli=1
> 
> Not-so-pretty text version:
> 
> = A WebSocket Handshake =
> 
> Adam Barth
> Eric Rescorla
> October 5, 2010
> 
> == Introduction ==
> 
> This document describes a handshake for the WebSocket protocol that
> resists cross-protocol attacks.  The handshake sends a fixed sequence
> of bytes and a random nonce from the client to the server to establish
> two keys for a bidirectional encrypted tunnel, which the parties then
> use for further communication.  Although an eavesdropper can determine
> the encryption keys, computing the keys requires knowledge of a
> globally unique identifier, making it unlikely that an observer
> unfamiliar with the WebSocket protocol will interpret the
> encrypted bytes on the wire as anything other than random bytes.
> Before explaining the handshake, we present a model of the threats
> posed by exposing a new network protocol to untrusted content running
> in a web browser.  We then work through some simple handshake designs
> to build intuition for what can go wrong in a flawed design.
> 
> == Threats ==
> 
> In this document, we evaluate the risks posed by exposing the
> WebSocket protocol to untrusted web content in a standard web browser.
> We make the usual assumption in web security that the user visits the
> attacker’s web site.
> 
> Web browsers already expose an HTTP-based networking facility to
> untrusted web content.  In designing WebSockets, we are concerned with
> the additional risks incurred by granting the attacker additional
> network privileges.  We are chiefly concerned with three scenarios:
> 
> 1) The attacker uses the WebSocket protocol to attack a server that
> does not support the WebSocket protocol.  In this scenario, we are
> concerned with protecting a wide variety of servers that implement a
> wide variety of protocols.
> 
>    a) We do not assume the server implements any particular protocol
> exactly according to its specification.  Instead, we aim for “real
> world” security in which servers might have a number of common bugs.
> 
>    b) We do not assume the server uses a strong authentication
> scheme.  In particular, we are concerned with protecting servers that
> rely on connectivity alone for authentication (e.g., inside a
> corporate intranet).  Although using strong authentication is a best
> practice, strong authentication is far from universal in deployments.
> 
> 2) The attacker uses other network facilities in the browser to attack
> a WebSocket server.  For example, the attacker might use an HTML form
> element to generate an HTTP message targeted at a WebSocket server.
> In this scenario, we do not assume the WebSocket server follows the
> WebSocket protocol specification in every detail.  Instead, we seek to
> protect WebSocket servers that contain some implementation errors.  Of
> course, we cannot hope to protect servers with arbitrary
> implementation errors (e.g., memory safety errors), but, when given a
> choice, we prefer protocols whose security is robust to sloppy
> implementation.  We are concerned with two kinds of attacks in this
> model:
> 
>    a) The attacker crafts an HTTP request that confuses the WebSocket
> server into performing an undesirable mutation to its internal state.
> 
>    b) The attacker crafts an HTTP request that confuses the WebSocket
> server into responding with content that the browser then interprets
> to the detriment of the server (e.g., allows the attacker to mount a
> cross-site scripting attack against the server’s origin).
> 
> 3) The attacker communicates with a WebSocket server, but the ensuing
> traffic confuses a network intermediary.  Without loss of generality,
> we can assume that the WebSocket server colludes with the attacker to
> aid him or her in confusing the intermediary.  In particular, we are
> especially concerned with transparent HTTP proxies in corporate
> intranets because these proxies are common and confusing such a proxy
> could let the attacker extract confidential information from the
> corporation.
> 
> == Strawmen ==
> 
> One natural approach is to design the handshake to mimic an HTTP POST
> request.  Using a POST request as a template is attractive because an
> attacker can already generate POST requests to many network locations
> using the HTML form element.  If WebSockets are less generative than
> the form element, then we can argue by reduction that WebSockets do
> not increase the attack surface for cross-protocol attacks.  Here’s an
> example WebSocket handshake templated on a POST request:
> 
> Client -> Server:
> POST /path/of/attackers/choice HTTP/1.1
> Host: host-of-attackers-choice.com
> Sec-WebSocket-Key: <connection-key>
> 
> Server -> Client:
> HTTP/1.1 200 OK
> Sec-WebSocket-Accept: <connection-key>
> 
> The idea behind this protocol is that by echoing back the
> connection-key, the server has agreed to establish a WebSocket
> connection.  Unfortunately, this handshake has serious problems.  If
> the attacker can host an htaccess file at any location on a target HTTP
> server, the attacker can opt the server into using WebSockets.  The
> server will believe the first HTTP request is complete and is
> expecting another HTTP request on the socket.  However, the attacker
> can now send (roughly) arbitrary bytes on the socket, spoofing HTTP
> requests and reading back the response.
> 
> To repair this vulnerability, we replace the value of the
> Sec-WebSocket-Accept response header with HMAC-SHA1(<connection-key>,
> <uuid>), on the assumption that a simple configuration file will be
> unable to compute an HMAC.  However, this modification is insufficient.
> 
> Consider, for example, a virtual hosting environment in which the
> attacker can place PHP scripts on the server.  Such hosting
> environments are widely available commercially (e.g., from
> 1and1.com).  Now, the attacker can complete the WebSocket handshake
> because the PHP script can compute the HMAC and send the appropriate
> response header.  The attacker has now opted into the WebSocket
> protocol on behalf of the rest of the entire socket.  Unfortunately,
> the attacker is only empowered to speak on behalf of his own virtual
> host.  This privilege escalation is likely to be exploitable by
> spoofing further HTTP requests in WebSocket message frames.  In these
> spoofed messages, the attacker can spoof the Host header and interact
> with other virtual hosts reachable on the same socket.
> 
> To attempt to repair this vulnerability, we remove the attacker’s
> ability to designate a PHP script on the server:
> 
> Client -> Server:
> OPTIONS * HTTP/1.1
> Host: host-of-attackers-choice.com
> Sec-WebSocket-Key: <connection-key>
> 
> Server -> Client:
> HTTP/1.1 200 OK
> Sec-WebSocket-Accept: HMAC(<connection-key>, “...”)
> 
> This handshake still has problems in more sophisticated virtual
> hosting scenarios, but let’s put those aside for the moment to
> consider how this handshake interacts with transparent HTTP proxies.
> Recall that the browser will not use the proxy version of the
> handshake because the proxy is transparent.
> 
> Unfortunately, this handshake is likely to confuse a transparent
> proxy.  After seeing these messages exchanged, a transparent proxy
> will likely believe that the next bytes emitted by the browser will be
> another HTTP request.  However, the browser believes it has
> established a WebSocket connection and will let the attacker send
> WebSocket frames to the transparent proxy.  The attacker can likely
> use these frames to spoof HTTP requests for intranet resources (again,
> by spoofing the Host header) and read back the response, stealing
> confidential information from the corporation’s intranet.
> 
> To attempt to repair this vulnerability, we add the Upgrade header to
> inform the transparent proxy that the socket is switching protocols:
> 
> Client -> Server:
> OPTIONS * HTTP/1.1
> Host: host-of-attackers-choice.com
> Connection: Upgrade
> Sec-WebSocket-Key: <connection-key>
> Upgrade: WebSocket
> 
> Server -> Client:
> HTTP/1.1 101 Switching Protocols
> Connection: Upgrade
> Upgrade: WebSocket
> Sec-WebSocket-Accept: HMAC(<connection-key>, “...”)
> 
> Unfortunately, the RFC 2817 HTTP upgrade mechanism is virtually unused
> in practice.  If you search the web for references to upgrade, you
> either find links to RFC 2817 or discussion of the WebSocket protocol.
> It seems entirely likely that some number of transparent proxies will
> be oblivious to the HTTP upgrade mechanism.  Organizations could
> easily deploy such proxies and never have any operational issues with
> them.  For this reason, assuming that transparent proxies understand the HTTP
> upgrade mechanism is a dangerous assumption.  If the proxy is
> oblivious to HTTP upgrade, the proxy could easily treat this handshake
> the same way it would treat the previous iteration, which allows the
> attacker to steal confidential information from corporate intranets.
> 
> Rather than relying upon the rarely used HTTP upgrade mechanism to
> inform network intermediaries that the remainder of the socket is not
> HTTP, we propose using the RFC 2817 CONNECT mechanism.  This mechanism
> is widely used on the Internet to tunnel TLS connections through
> proxies.  Proxy implementations that lack support for the CONNECT
> mechanism will likely discover and repair that oversight quickly.
> 
> == Proposal ==
> 
> In this section, we present our proposal for a WebSocket handshake and
> tunnel.  The handshake establishes a shared “secret” between the
> client and the server, which they use to encrypt subsequent traffic.
> This handshake lacks a number of endpoint and extension negotiation
> features of the current handshake.  We expect the working group to add
> these features inside the encrypted tunnel.
> 
> === Handshake Request ===
> 
> To establish a WebSocket connection, the browser sends an RFC 2817
> CONNECT request:
> 
> Client -> Server:
> CONNECT 1C1BCE63-1DF8-455C-8235-08C2646A4F21.invalid:443 HTTP/1.1
> Host: 1C1BCE63-1DF8-455C-8235-08C2646A4F21.invalid:443
> Sec-WebSocket-Key: <connection-key1>
> 
> where <connection-key1> is a 128-bit random number encoded in base64.
> This initial message has several desirable properties:
> 
> 1) The attacker cannot influence any of the bytes included in the
> message.  Instead of using the attacker’s host name, we use an invalid
> host name (per RFC 2606).  Although we could use any invalid host
> name, we use this host name as a globally unique identifier for the
> WebSocket protocol.
> 
> 2) Any intermediaries that understand this message according to its
> HTTP semantics will route the request to a non-existent domain and
> fail the request.  In particular, they will not route the
> Sec-WebSocket-Key to the attacker, making it difficult for the
> attacker to perform actions based on the key.
> 
> 3) Transparent proxies are likely to interpret this request as an
> HTTPS connect request and assume the remainder of the socket is
> unintelligible.  Because the remainder of the bytes on the socket are
> encrypted (see below), the attacker is unlikely to be able to trick
> the transparent proxy into taking further action.
> 
> 4) This message cannot be generated by a web attacker in today’s browsers.
> 
> 5) A server that wishes to multiplex HTTP and WebSockets on the same
> port can use the request-line to distinguish the two protocols.
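
A minimal sketch of the quoted handshake request and of the request-line check from property 5. It assumes standard HTTP/1.1 CRLF line endings and that <connection-key1> is 16 raw random bytes sent as base64; the function names are hypothetical and not part of the proposal.

```python
import base64
import os

# Fixed authority from the quoted proposal; it doubles as the protocol identifier.
WS_AUTHORITY = "1C1BCE63-1DF8-455C-8235-08C2646A4F21.invalid:443"


def build_handshake_request() -> tuple[bytes, bytes]:
    """Return (connection-key1, request bytes) for the CONNECT handshake."""
    key1 = os.urandom(16)  # 128-bit random nonce
    lines = [
        f"CONNECT {WS_AUTHORITY} HTTP/1.1",
        f"Host: {WS_AUTHORITY}",
        f"Sec-WebSocket-Key: {base64.b64encode(key1).decode('ascii')}",
        "",  # blank line ends the header block
        "",
    ]
    return key1, "\r\n".join(lines).encode("ascii")


def is_websocket_request_line(line: bytes) -> bool:
    """Distinguish this handshake from ordinary HTTP by its request line."""
    expected = f"CONNECT {WS_AUTHORITY} HTTP/1.1".encode("ascii")
    return line.rstrip(b"\r\n") == expected
```

Apart from the random key, every byte of the request is fixed, which is what property 1 relies on.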
> 
> The client can also include additional information in the first
> handshake message by encrypting that information in AES-128-CTR using
> the key HMAC-SHA1(<connection-key1>,
> “C1BA787A-0556-49F3-B6AE-32E5376F992B”) and a counter block that is
> the byte number represented in 128-bit network byte order
> (big-endian).  We expect browsers to use this additional information
> to include additional meta-data about the connection (e.g., the origin
> of the web site that created the WebSocket) rather than
> application-layer messages.
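
A sketch of the key and counter-block construction described in the quoted paragraph, using only the Python standard library (the AES step itself is omitted). Three points are assumptions rather than anything the proposal states: the first HMAC argument is treated as the HMAC key, <connection-key1> is taken as the raw 16 bytes rather than its base64 text, and the 160-bit HMAC-SHA1 output is truncated to its first 128 bits to obtain an AES-128 key.

```python
import hashlib
import hmac

# UUID from the quoted proposal for the first-message metadata key.
METADATA_UUID = b"C1BA787A-0556-49F3-B6AE-32E5376F992B"


def metadata_key(key1: bytes) -> bytes:
    """Derive the AES-128 key for metadata in the first handshake message."""
    digest = hmac.new(key1, METADATA_UUID, hashlib.sha1).digest()  # 20 bytes
    return digest[:16]  # ASSUMPTION: truncate 160 bits to 128 bits


def counter_block(byte_number: int) -> bytes:
    """Encode a byte number as a 128-bit big-endian (network order) block."""
    return byte_number.to_bytes(16, "big")
```

How the byte number advances relative to 16-byte AES blocks is not spelled out in the proposal, so the sketch leaves that choice to the caller.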
> 
> Encrypting the additional information makes it difficult for the
> attacker to predict the bytes that appear on the wire.  Without the
> ability to predict on-the-wire bytes, the attacker will have
> difficulty crafting a network message that confuses a non-WebSocket
> server or an intermediary.  Effectively, the attacker is limited to
> sending random traffic to a chosen server.  To limit opportunities for
> abuse, the browser should limit the amount of unsolicited data the
> attacker can send (500 bytes?) before the server accepts the WebSocket
> connection to avoid spamming unwitting servers with too much traffic.
> 
> === Handshake Response ===
> 
> To accept the request, the server replies with the following message:
> 
> Server -> Client:
> HTTP/1.1 200 OK
> Sec-WebSocket-Accept: <hmac>
> Sec-WebSocket-Key: <connection-key2>
> 
> where <hmac> is HMAC-SHA1(<connection-key1>,
> “258EAFA5-E914-47DA-95CA-C5AB0DC85B11”) encoded in base64 and
> <connection-key2> is a 128-bit random number encoded in base64.  If
> <connection-key2> is identical to <connection-key1>, the client aborts
> the handshake.  This message completes the CONNECT mechanism.
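
A minimal sketch of the response check described above. As assumptions, the first HMAC argument is treated as the HMAC key and both connection keys are handled as raw 16-byte values; the function names are hypothetical.

```python
import base64
import hashlib
import hmac

# UUID from the quoted proposal for the Sec-WebSocket-Accept value.
ACCEPT_UUID = b"258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(key1: bytes) -> str:
    """Compute the base64 HMAC-SHA1 the server places in Sec-WebSocket-Accept."""
    mac = hmac.new(key1, ACCEPT_UUID, hashlib.sha1).digest()
    return base64.b64encode(mac).decode("ascii")


def client_accepts(key1: bytes, accept_header: str, key2_header: str) -> bool:
    """Client-side check of the server's handshake response headers."""
    expected = accept_value(key1).encode()
    if not hmac.compare_digest(expected, accept_header.strip().encode()):
        return False
    try:
        key2 = base64.b64decode(key2_header, validate=True)
    except ValueError:
        return False  # malformed base64
    # Abort if the server simply echoed the client's key.
    return len(key2) == 16 and key2 != key1
```

The comparison uses `hmac.compare_digest` as a general precaution; the proposal itself does not require a constant-time check.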
> 
> The entity that generated the HMAC has demonstrated understanding of
> the WebSocket protocol by including the UUID in the HMAC.  Because the
> original network message did not designate any particular host, we can
> have reasonable assurance that the entity that generated the HMAC
> speaks on behalf of the entire socket (and not just on behalf of one
> virtual host).  Because the HMAC occurs near the beginning of the
> socket (and is preceded by a fixed string), we mitigate the risk that
> the replying entity is actually speaking a non-HTTP, non-WebSocket
> protocol.
> 
> After sending the handshake response, the server can begin sending
> information over the encrypted tunnel described in the following
> section.  We expect that the first message sent by the server will
> contain meta-data about the connection and that subsequent messages
> will contain application-layer messages.
> 
> === Tunnel ===
> 
> The handshake establishes two keys, which the client and server use to
> form an encrypted tunnel for further communication:
> 
> Client -> Server Key:
> HMAC-SHA1(<connection-key1> || <connection-key2>,
>                       “363A6078-74D2-4C0B-8CBC-1E6A36E83442”)
> 
> Server -> Client Key:
> HMAC-SHA1(<connection-key1> || <connection-key2>,
>                       “2306C3BE-0ACF-42C0-B69E-DFFE02CFA346”)
> 
> All subsequent bytes are encrypted using AES-128-CTR with the
> appropriate directional key and a counter block that is the byte
> number represented in 128-bit network byte order (big-endian).
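
The two directional keys might be derived roughly as follows. This is a sketch under assumptions the proposal leaves open: "||" is byte concatenation of the raw keys, the concatenation is used as the HMAC key with the UUID as the message, and each 160-bit result is truncated to 128 bits for AES-128.

```python
import hashlib
import hmac

# Direction-specific UUIDs from the quoted proposal.
C2S_UUID = b"363A6078-74D2-4C0B-8CBC-1E6A36E83442"
S2C_UUID = b"2306C3BE-0ACF-42C0-B69E-DFFE02CFA346"


def tunnel_keys(key1: bytes, key2: bytes) -> tuple[bytes, bytes]:
    """Return (client-to-server key, server-to-client key)."""
    secret = key1 + key2  # <connection-key1> || <connection-key2>
    c2s = hmac.new(secret, C2S_UUID, hashlib.sha1).digest()[:16]  # ASSUMPTION: truncate
    s2c = hmac.new(secret, S2C_UUID, hashlib.sha1).digest()[:16]  # ASSUMPTION: truncate
    return c2s, s2c
```

Because each direction uses a different UUID, the two keystreams differ even though both sides start the byte counter at zero.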
> 
> Encrypting the tunnel makes it difficult for an attacker to use the
> browser’s HTTP network facilities to attack a poorly implemented
> WebSocket server.  Because the attacker is unable to learn the
> <connection-key2> chosen by the server, the attacker will have
> difficulty crafting an HTTP request that the WebSocket server will
> decrypt to something sensible.
> 
> Encrypting the traffic from the server to the client makes it
> difficult for the attacker to generate an HTTP request to an honest but
> poorly implemented WebSocket server that causes its response to be
> interpreted to its detriment by the browser.  In particular, it is
> unlikely that the server’s response will be treated as an HTML
> document by the browser, preventing the attacker from leveraging the
> WebSocket server to mount a cross-site scripting attack against the
> server’s origin.
> 
> == Analysis ==
> 
> We analyze the risks of this protocol in the three scenarios of interest:
> 
> 1) The attacker uses the WebSocket protocol to attack a server that
> does not support the WebSocket protocol.  There are two cases to
> consider: the server is familiar with HTTP semantics or the server is
> oblivious of HTTP:
> 
>    a) The attacker will find it difficult to attack a server that
> is familiar with HTTP semantics with this handshake because the HTTP
> semantics of the handshake point to routing the request to a
> non-existent network location.  If the request somehow routes to the
> attacker, the HTTP semantics then point to transporting opaque data
> over the socket.
> 
>    b) The attacker will find it difficult to attack an HTTP-oblivious
> server with this handshake because the attacker can send only a fixed
> message followed by seemingly random bytes.  None of the bytes sent to
> the server can be controlled directly by the attacker.  It seems
> unlikely that the attacker will be able to advance the non-WebSocket
> server very far down its state machine.
> 
> 2) The attacker uses other network facilities in the browser to attack
> a WebSocket server.  There are two cases to consider: the server
> implements the WebSocket protocol correctly or the server implements
> an imperfect version of the WebSocket protocol:
> 
>    a) If the server correctly implements the WebSocket protocol, the
> attacker will be unable to use the other network facilities of the
> browser to complete the handshake with the server because the attacker
> is unable to generate the first network message.
> 
>    b) If the server implements an imperfect version of the WebSocket
> protocol, the attacker will be unable to learn the value of either of
> the directional keys for the tunnel.  Without knowledge of these keys,
> the attacker will find it difficult (i) to craft a message that
> decrypts to something meaningful to the WebSocket server and (ii) to
> trick the WebSocket server into responding with something meaningful
> to the browser.
> 
> 3) The attacker communicates with a WebSocket server, but the ensuing
> traffic confuses a network intermediary.  If the intermediary attempts
> to route the request (e.g., because the intermediary is an HTTP
> proxy), the handshake will fail because the request does not contain
> any routing information for the target server.  If the handshake
> completes and the intermediary understands HTTP semantics (as widely
> used), the intermediary will likely reason that the remainder of the
> socket is an opaque TLS connection.  In either case, the intermediary
> is unlikely to take undesirable actions as a result of the WebSocket
> connection.
> 
> == Conclusion ==
> 
> We believe this handshake is superior to the current handshake because
> this handshake has a stronger argument for security.  Because the
> attacker cannot control any of the bytes sent by the browser, the
> attacker will have difficulty mounting a cross-protocol attack using
> this handshake.
> 
> That said, there is no guarantee that this handshake resists
> cross-protocol attacks.  These security properties are not very well
> studied, making designing protocols that achieve these properties more
> art than science.  However, the handshake we propose has a number of
> heuristic properties that suggest it might stand up to further
> scrutiny.