Re: [hybi] WebSocket -76 is incompatible with HTTP reverse proxies

Ian Hickson <ian@hixie.ch> Wed, 21 July 2010 21:24 UTC

Date: Wed, 21 Jul 2010 21:24:16 +0000
From: Ian Hickson <ian@hixie.ch>
To: "hybi@ietf.org" <hybi@ietf.org>
In-Reply-To: <8B0A9FCBB9832F43971E38010638454F03E9DCCAD4@SISPE7MB1.commscope.com>
Message-ID: <Pine.LNX.4.64.1007211706030.7242@ps20323.dreamhostps.com>
References: <20100706210039.GA12167@1wt.eu> <B709B846-2A8C-4B84-8F4D-B06B81D91A7B@brandedcode.com> <20100707044129.GH12126@1wt.eu> <AANLkTik-i_9a7JpaFRqPLBr68buPM5Ml3N1iabaJby8k@mail.gmail.com> <8B0A9FCBB9832F43971E38010638454F03E9DCCA29@SISPE7MB1.commscope.com> <AANLkTima-dMQjX7S0WURFPrY--bTJJUs9PZcd4bNmNdW@mail.gmail.com> <8B0A9FCBB9832F43971E38010638454F03E9DCCAD4@SISPE7MB1.commscope.com>

On Tue, 6 Jul 2010, Willy Tarreau wrote:
> 
> Last week, it was reported to me that a site that was running fine on 
> draft 75 could not get the draft 76 handshake to complete via a HAProxy 
> load balancer, which runs as an HTTP reverse proxy. The connection would 
> remain open between the client and haproxy, and between haproxy and the 
> server, with the server never responding. The same client (Chromium 
> 6.0.414.0) directly connected to the server worked fine.
> 
> The guy was kind enough to send me some network captures which show an 
> obvious problem: the 8-byte nonce from the client is not advertised as 
> a content-length, so it is not forwarded by the reverse proxy as it is 
> either part of a next request or pending data for when the handshake 
> completes.

Right, you need to update all the server-side components to support 
WebSocket. WebSocket is a new protocol. Similar updates would be needed to 
support other new protocols.


> I can't agree with that because until the handshake completes, the proxy 
> does not know whether the server will handle the request as a WS 
> handshake or anything else, and it must absolutely not accept to blindly 
> trust any random client who sets an Upgrade header that any server is 
> free to ignore.

Obviously all server-side components have to be configured to know the 
setup that they are in. This includes telling load balancers and other 
front-end intermediaries which hosts are ready to handle WebSocket 
connections and which are not. Just like a reverse proxy would not be 
configured to forward a connection to an SMTP server behind the firewall, 
it wouldn't be configured to send WebSocket traffic to HTTP servers.


> Conversely, having no Content-Length header in the request means that we 
> don't know what a reverse proxy will do if it receives a valid one. For 
> instance, we could very well imagine that some reverse proxies which 
> will assume that Content-Length == 8 for any request containing 
> "Upgrade: WebSocket" will have trouble when receiving a different 
> Content-Length header. This could be used to pass larger amounts of data 
> than what is allowed by the protocol to a second reverse-proxy, which, 
> if it is able to parallelize pipelined requests, will forward the first 
> one to the server and the second one (embedded in the apparent data) to 
> another server.

The spec is very clear about how the server side is to parse the handshake. 
I don't think there's any ambiguity here. There's no need for the reverse 
proxy to "assume a Content-Length" or anything like that; if it decides 
that the request is a WebSocket request (e.g. based on the presence of an 
"Upgrade: WebSocket" field, or based on the target IP or the given 
resource name), then it should follow the Web Socket spec.
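
As a rough illustration of the first option mentioned above (deciding based on the presence of an "Upgrade: WebSocket" field), a front-end component might classify a request head along these lines. This is only a sketch under stated assumptions: the function name is hypothetical, the matching is case-insensitive by assumption, and a real deployment would also consult its own configuration about which back-end hosts are WebSocket-ready.

```python
def looks_like_websocket_request(head: bytes) -> bool:
    """Return True if the request head carries an Upgrade: WebSocket field.

    Hypothetical helper for illustration; not taken from any proxy or spec.
    """
    lines = head.split(b"\r\n")
    for line in lines[1:]:  # skip the request line
        if not line:
            break  # a blank line ends the header fields
        name, sep, value = line.partition(b":")
        if not sep:
            continue  # not a "name: value" field, ignore it
        if name.strip().lower() == b"upgrade" and value.strip().lower() == b"websocket":
            return True
    return False
```

A request classified this way would then be handled according to the WebSocket handshake rules rather than generic HTTP message framing.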


> The first obvious solution that comes to mind is to comply with the HTTP 
> protocol which will be implemented along the whole chain and to simply 
> add a "Content-Length: 8" header in the request.

As far as I can tell there is nothing here that contradicts the HTTP spec. 
If there is a specific requirement in the HTTP spec that is being 
contradicted, please cite it.

We could add Content-Length: 0, but as far as I can tell that's implied 
for GET anyway, so it wouldn't change anything in conforming software. 
(This isn't very clear in the HTTP spec though.)

We can't add Content-Length: 8, since that would mean the data would be 
sent through with the first request even in non-WebSocket-aware 
man-in-the-middle proxies, which defeats the point.
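
To make the difference concrete, here is a minimal sketch of how a plain, non-WebSocket-aware HTTP parser might frame the bytes it receives. It assumes (as discussed above, and not spelled out clearly in the HTTP spec) that a GET with no Content-Length is treated as having an empty body. The function name and the sample request are hypothetical.

```python
def split_http_request(data: bytes):
    """Split bytes into (head, body, leftover) the way a plain HTTP parser might.

    Hypothetical helper: body length comes only from Content-Length,
    and is assumed to be 0 when that field is absent.
    """
    head, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete request head")
    length = 0  # assumption: no Content-Length means no body
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    # Bytes beyond the declared body are not part of this request.
    return head, rest[:length], rest[length:]


nonce = b"12345678"  # stand-in for the 8 bytes that follow the fields
base = (b"GET /demo HTTP/1.1\r\nHost: example.com\r\n"
        b"Upgrade: WebSocket\r\nConnection: Upgrade\r\n")

# Without Content-Length the 8 bytes are not part of the request.
_, body_without, leftover_without = split_http_request(base + b"\r\n" + nonce)

# With Content-Length: 8 they become the request body and get forwarded.
_, body_with, leftover_with = split_http_request(
    base + b"Content-Length: 8\r\n\r\n" + nonce)
```

In the first case the parser sees an empty body and holds the 8 bytes back as pending data; in the second it would pass them along as part of the request.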


> But this raises a second point : shouldn't we switch to POST instead of 
> GET then?

We could use another method, but there doesn't seem to be much reason to 
do so. GET turns out to be the simplest to deal with in a variety of 
situations.


> Anyway, we have to do something now because we've reached the point Ian
> tried to ensure we would avoid a long time ago : the deadlock which is
> undetectable by the client.

A deadlock isn't a big deal. The problem was a false-positive situation, 
where the handshake works but frames don't go through.


On Wed, 7 Jul 2010, Thomson, Martin wrote:
> > 
> > Content-length: 0 also makes sense but it means that the nonce will be 
> > sent *after* the handshake, which means we'd have a second round-trip.
> 
> The round-trip thing is a fallacy.  Just as you can pipeline requests, 
> so can you send extra handshakey parts after the headers.
> 
> Solution:  The handshake includes a complete HTTP message, PLUS extra 
> stuff.  All of this is sent at once, but the HTTP stuff stops half way.

That's exactly what the spec does.
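
For illustration only, a client-side sketch of "everything sent at once" could look like the following. It is simplified: it omits other fields that the draft requires in the handshake, and the function name, host, and path are hypothetical. The point is just that the fields and the trailing 8 bytes go out in a single buffer, so no extra round trip is involved.

```python
import os


def build_handshake(host: str, path: str, nonce: bytes) -> bytes:
    """Build a simplified handshake: request fields followed by 8 raw bytes.

    Hypothetical helper; a real client must include the other fields
    required by the draft, which are left out here.
    """
    if len(nonce) != 8:
        raise ValueError("nonce must be exactly 8 bytes")
    head = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Upgrade: WebSocket\r\n"
        "Connection: Upgrade\r\n"
        "\r\n"
    ).encode("ascii")
    # No Content-Length: the nonce simply follows the blank line.
    return head + nonce


# One buffer, which a client could hand to a single send call.
message = build_handshake("example.com", "/demo", os.urandom(8))
```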


On Thu, 8 Jul 2010, Greg Wilkins wrote:
>
> You are correct that it is not an extra round trip.  But I do not think 
> it is a good solution to send a complete HTTP message PLUS extra stuff 
> in the request.
>
> If the handshake is legal HTTP, the server should be able to reject the 
> websocket upgrade without closing the connection.  This would allow the 
> connection to remain in the browser's pool of connections and avoid an 
> extra round trip to establish another connection if the application 
> falls back to non-websocket transports.

The browser can't know if the server is really an HTTP server, so it can't 
possibly reuse the connection. It could in fact be a huge security hole, 
depending on how we did this. It is, in either case, far more complexity 
than is in any way justified here.

All of these problems come from thinking of Web Sockets as a subprotocol 
of HTTP. It isn't. Web Sockets is its own high-level protocol built on top 
of TCP. It just happens to look enough like HTTP that you can reuse the 
port, but that doesn't mean it's an HTTP-based protocol. Thinking of Web 
Sockets as having anything to do with HTTP is a mistake.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'