Re: [hybi] Web sockets and existing HTTP stacks

Maciej Stachowiak <mjs@apple.com> Mon, 01 February 2010 06:31 UTC

From: Maciej Stachowiak <mjs@apple.com>
Date: Sun, 31 Jan 2010 22:31:48 -0800
To: Justin Erenkrantz <justin@erenkrantz.com>
Cc: hybi@ietf.org
Subject: Re: [hybi] Web sockets and existing HTTP stacks

Consolidating replies to multiple messages...

On Jan 31, 2010, at 8:24 PM, Justin Erenkrantz wrote:

> On Sun, Jan 31, 2010 at 7:19 PM, Maciej Stachowiak <mjs@apple.com> wrote:
>> What are the RFC2616 requirements on HTTP upgrade?
> 
> If the server accepts the Upgrade request, it will send back a 101 in
> an HTTP/1.1-formatted response and then *after* the blank line
> terminating the response, the new protocol is in effect.  So, in
> essence, you have one HTTP/1.1 request and one HTTP/1.1 response, a
> blank line (CRLF), and then all bets are off.  (See 10.1.2 of RFC
> 2616.)

Sure, but I don't see a limit on the reasons a client or server could reject an upgrade attempt, including looking at details of the HTTP header that HTTP itself does not care about.
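
To make the upgrade sequence quoted above concrete, here is a minimal Python sketch of how a client might separate the 101 response from whatever bytes follow it. The function name and the `example/1` protocol token are made up for illustration; the only thing taken from RFC 2616 is that the new protocol starts after the blank line that ends the response.

```python
def split_upgrade_response(buf: bytes):
    """Split a buffer into (status_code, header_block, leftover_bytes).

    Returns None if the blank line that ends the HTTP response has not
    arrived yet. Everything after that blank line belongs to the
    upgraded protocol, not to HTTP.
    """
    head, sep, rest = buf.partition(b"\r\n\r\n")
    if not sep:
        return None  # header block is still incomplete
    status_line = head.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    if len(parts) < 2 or not parts[1].isdigit():
        raise ValueError("malformed status line")
    return int(parts[1]), head, rest


# Hypothetical response; "example/1" is a placeholder protocol token.
raw = (b"HTTP/1.1 101 Switching Protocols\r\n"
       b"Upgrade: example/1\r\n"
       b"Connection: Upgrade\r\n"
       b"\r\n"
       b"\x00hello\xff")
status, head, rest = split_upgrade_response(raw)
```

Nothing in this split stops the client from inspecting `head` further and rejecting the upgrade for its own reasons, which is the point made above.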

> 
>> Does it limit the set of criteria that an upgrade protocol may use to reject an upgrade request or response? I didn't see such a requirement. So I think WebSocket does satisfy the contract for port 80, though admittedly in a way that may be inconvenient for some deployed software.
> 
> No, because the drafts mandate "specific opaque bytes" rather than
> anything that looks like an HTTP/1.1 request. 

The spec requires sending requests and responses that are valid HTTP syntax. It also requires rejecting some upgrade requests or responses that would also be valid (and equivalent) at the HTTP level. This is a mild layering violation, but I don't see how it "breaks the contract".

In any case, I'm more interested in what poses a practical problem for implementations than in theoretical purity.

On Jan 31, 2010, at 8:05 PM, Greg Wilkins wrote:

> Maciej Stachowiak wrote:
> 
>> This reduces the burden on server implementors. 
> 
> It actually increases the burden.   If there are only a few fixed headers then it's easier
> to correctly order and format them.   With arbitrary headers we have to allow existing mechanisms
> add/examine their headers, but then make sure they've not actually broken any websocket
> restrictions.

Would a strict status line requirement, followed by headers that can have any order or capitalization, be reasonable?
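
As a rough Python sketch of what such a relaxed check could look like on the client side: the status line is compared byte for byte, while the headers are collected into a case-insensitive map so that order and capitalization do not matter. The function name is hypothetical, and the expected `Upgrade` and `Connection` values are assumptions based on the drafts under discussion.

```python
REQUIRED_STATUS_LINE = "HTTP/1.1 101 Web Socket Protocol Handshake"


def accept_handshake(response_head: str) -> bool:
    """Return True if the response head passes the relaxed check."""
    lines = response_head.split("\r\n")
    # Strict part: the status line must match exactly.
    if lines[0] != REQUIRED_STATUS_LINE:
        return False
    # Relaxed part: headers may appear in any order, with any capitalization.
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # blank line ends the header block
        name, sep, value = line.partition(":")
        if not sep:
            return False  # not a "Name: value" line
        headers[name.strip().lower()] = value.strip()
    # Assumed values; compared case-insensitively.
    return (headers.get("upgrade", "").lower() == "websocket"
            and headers.get("connection", "").lower() == "upgrade")
```

A server built on a general-purpose HTTP stack could then add its own headers, or emit these two in whatever order its framework prefers, without failing the check.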

> 
>> But it also seems to reduce the security benefit. In particular, the part of the handshake where
>> the server echoes back the origin is not part of the hardcoded handshake but rather just a normal header. In light of this I'm not sure the fixed header is pulling its weight.
>> It does probably add some amount of protection for resources that suffer header injection attacks - to fake the WebSocket handshake you'd have to inject before the real HTTP
>> response header, and could not rely on injecting in the middle of a response. 
> 
> But then the sentinel framing of websocket is completely vulnerable to injection attacks.
> All websocket endpoints will have to validate that utf-8 data given to them really is utf-8 data.

I believe the only component that absolutely has to check for violations of the sentinel framing to maintain security is the browser - it needs to prevent script code running in the browser from doing shady things to an existing connection, so it can't let scripts inject a sentinel. If your threat model is non-browser code, that presumably lacks any user credentials you are using for access control, so the risk is limited. If you are thinking about an out-of-browser client that does have the user's credentials, then the client can already send any valid message it chooses, so there's no motive to try to abuse the protocol framing to inject messages.

For robustness, of course, all parties should check both incoming and outgoing framing for correctness.
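
A minimal Python sketch of that kind of check, assuming the sentinel framing of the current drafts (a 0x00 byte, the UTF-8 payload, then a 0xFF byte). The function names are hypothetical. The useful property is that 0xFF never occurs in valid UTF-8, so validating the payload as UTF-8 is enough to rule out an embedded terminator.

```python
def encode_text_frame(payload: bytes) -> bytes:
    """Wrap a payload in sentinel framing, refusing invalid UTF-8."""
    # Strict decoding raises UnicodeDecodeError on invalid UTF-8,
    # which includes any 0xFF byte inside the payload.
    payload.decode("utf-8")
    return b"\x00" + payload + b"\xff"


def decode_text_frame(frame: bytes) -> str:
    """Unwrap one complete frame and validate its payload."""
    if len(frame) < 2 or frame[0] != 0x00 or frame[-1] != 0xFF:
        raise ValueError("bad sentinel framing")
    # An interior 0xFF would also fail here, as invalid UTF-8.
    return frame[1:-1].decode("utf-8")
```

A browser would run the outgoing check on anything a script hands it; other endpoints can run both checks purely for robustness.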

> And I still don't get the protection it is giving?  Can you describe a concrete example
> of an attack that could happen if arbitrary ordering of headers?

I should note that requiring a hardcoded format is not my idea, and I'm not even sure it's a good idea, so I am probably not the best person to give a full justification. However, my understanding is that the threat models that this requirement tries to address are:

1) Ordinary HTTP server has a header injection vulnerability. Hostile code in the browser attempts to get unauthorized access to HTTP resources with the user's credentials, via WebSocket.
   => Header injection vulnerabilities are most likely to occur in the middle of the headers, not before the status line, which is why a hardcoded prefix provides some defense. (But I'm not sure status line plus two fixed headers is any stronger than just a requirement on the status line.)

2) Non-HTTP network service gives a connecting party some limited control over what it echoes back. If an attacker can, through normal use of WebSocket, get it to produce any response that looks like a valid WebSocket handshake, then they may be able to get unauthorized access from the browser.
   => In general, the less flexible the handshake format, the harder it is to find a protocol you can abuse this way. However, I think the key defense for this kind of thing is some kind of challenge/response where the response echoed back is based on the request, but in a way that does not just consist of literally echoing back some of the characters.

My own judgment would be that for both of these threat models, requiring a specific status line but allowing the Connection and Upgrade headers to appear anywhere in the response headers and with any capitalization would provide about the same level of defense. I can also ask some security experts I know to evaluate this reasoning.
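
To illustrate threat model 1, here is a small Python simulation. The vulnerable server is entirely hypothetical: it copies attacker-controlled text into a response header without escaping CRLF. The injected headers land in the middle of the response, so a client that insists on the exact status line still rejects the forged handshake.

```python
def vulnerable_response(echoed: str) -> str:
    # Hypothetical buggy server: user input goes into a header unescaped.
    return ("HTTP/1.1 200 OK\r\n"
            "X-Echo: " + echoed + "\r\n"
            "Content-Length: 0\r\n"
            "\r\n")


def status_line_ok(response: str) -> bool:
    # Client-side check on the status line only.
    first_line = response.split("\r\n", 1)[0]
    return first_line == "HTTP/1.1 101 Web Socket Protocol Handshake"


# The attacker smuggles handshake headers in through the echoed value.
injected = "x\r\nUpgrade: WebSocket\r\nConnection: Upgrade"
forged = vulnerable_response(injected)

has_injected_headers = "\r\nUpgrade: WebSocket\r\n" in forged
rejected = not status_line_ok(forged)
```

Here `has_injected_headers` is true while `rejected` is also true: the injection succeeds at the header level but cannot rewrite the status line that precedes it.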


On Jan 31, 2010, at 8:19 PM, Justin Erenkrantz wrote:

> On Sun, Jan 31, 2010 at 7:08 PM, Maciej Stachowiak <mjs@apple.com> wrote:
> 
>> But it also seems to reduce the security benefit.
> 
> I've noticed a few mentions so far of "security" as a key driver for
> having an hardcoded initialization sequence, but I can't just envision
> the tangible security benefits from mandating this.
> 
> So, what is the threat model that this mechanism is trying to prevent?

See above. I believe the mechanism is intended to reduce the attack surface that WebSocket creates in existing HTTP and non-HTTP Web services. I am not sure I agree with the exact tradeoff made, but I can certainly see the logic. That said, having the client send a nonce and the server respond with a particular hash of that nonce would be even more effective at protecting existing servers than anything that only includes fixed components and excerpts of the request text.

> How do these threats differ from other attacks against HTTP?

Existing browser mechanisms to send HTTP requests can be subject to superficially similar attacks:

A) Form submission can be used to send HTTP requests to non-HTTP services, whether they run on port 80 or some other port. Due to the same-origin policy, the result is not disclosed to the attacker, so these are only attacks on integrity, not confidentiality. To defend a non-HTTP service against this threat model, you need to ensure that anything with the syntax of a valid HTTP request that might be sent by a browser cannot have any side effects on your service. You don't need to worry about the risk of information disclosure in your response, though.

B) HTTP header injection attacks could be used against HTTP services, and may pose various security risks. In the case of vanilla HTTP, though, we can't mitigate this risk by forcing certain headers to be in a specific place. An example of this risk: if the attacker has hostile JS running in a browser that supports CORS, they may be able to get access to a confidential resource, if they can inject response headers "Access-Control-Allow-Credentials: true" and "Access-Control-Allow-Origin: <ORIGIN>" where <ORIGIN> is the attacker's security origin. However, this can't really be mitigated in the same way as WebSocket, and the integrity risk is somewhat lower since it's hard to use this technique to inject arbitrary additional requests.
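
For threat A, one possible defense for a non-HTTP, line-based service is to refuse any connection whose first line has the shape of an HTTP request line, before any command is interpreted. The Python sketch below is only an illustration of that idea; the function name and the pattern are assumptions, and a real service would need to consider what else a browser can be made to send.

```python
import re

# Shape of an HTTP request line: METHOD SP target SP HTTP/x.y
HTTP_REQUEST_LINE = re.compile(rb"[A-Z]+ \S+ HTTP/\d\.\d")


def looks_like_http(first_line: bytes) -> bool:
    """Return True if the first line of a connection resembles HTTP."""
    return HTTP_REQUEST_LINE.fullmatch(first_line.rstrip(b"\r\n")) is not None
```

A service that drops the connection whenever `looks_like_http` returns true never reaches the point where a browser-submitted form could trigger a side effect.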

I think WebSocket presents a somewhat different attack surface, so it's worth considering additional mitigation, but I am not sure the current mechanism strikes quite the right balance.

For better protection with less impact on server code, I'd include a nonce in the handshake request (provided by the UA), require the status line of the response to include a particular hash of the nonce and the origin (it doesn't even have to be a cryptographically strong hash, it just needs to result in something other than literal echoing), and remove verbatim requirements from the rest of the response.
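
A minimal Python sketch of that proposal follows. Everything specific in it is an assumption made for illustration: the function names, the use of SHA-256 (any transformation other than literal echoing would do), the way nonce and origin are combined, and the placement of the digest as the reason phrase of the status line.

```python
import hashlib
import secrets


def make_nonce() -> str:
    # Chosen by the UA and sent in the handshake request.
    return secrets.token_hex(16)


def expected_token(nonce: str, origin: str) -> str:
    # Hash of nonce and origin; not a literal echo of either.
    data = (nonce + "\n" + origin).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def server_status_line(nonce: str, origin: str) -> str:
    # Hypothetical format: the digest takes the place of the reason phrase.
    return "HTTP/1.1 101 " + expected_token(nonce, origin)


def client_accepts(status_line: str, nonce: str, origin: str) -> bool:
    return status_line == "HTTP/1.1 101 " + expected_token(nonce, origin)
```

A service that merely echoes parts of the request back cannot produce the expected status line, because the digest appears nowhere in what the client sent.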


On Jan 31, 2010, at 8:05 PM, Greg Wilkins wrote:

> Maciej Stachowiak wrote:
> 
> 

>> OTOH just the special status line ("HTTP/1.1 101 Web Socket Protocol Handshake") guarantees this.
>> Would it be reasonable to limit the hardcoded part of the handshake to the status line?
> 
> It does not need to be expressed as hardcoded bytes.
> It can be expressed as a HTTP response with status code of 101 and
> reason of "Web Socket Protocol Handshake"

I can see why requiring exact ordering and capitalization of some of the response header fields would be a burden for server-side software that wants to plug into a general-purpose HTTP server. Is a requirement for an exact character sequence in the status line a burden in the same way? Or is this a more theoretical concern?

> Then if we ever get HTTP/1.2 or HTTP/2.0, websocket will not break!


If the client tries to upgrade an HTTP/1.1 request to WebSocket and the server gives an HTTP/1.2 or HTTP/2.0 response, then it had better break! We'll need to adjust the client side of the protocol before we can let servers respond with different versions of HTTP.

Regards,
Maciej