Re: [hybi] WebSocket -76 is incompatible with HTTP reverse proxies

Willy,

Thanks for raising this point. I'm the author of a node.js-powered WebSocket 
server, and this was actually a really awkward part of the WebSocket spec to 
comply with when writing my server.

Without a Content-Length header, the HTTP parser in node doesn't know 
how to handle the extra data it receives in that initial handshake. What we 
ended up doing as a fix was to provide an upgradeHead: a slice of the
buffer from where the headers end to the end of the packet.
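
Roughly, the workaround amounts to something like the sketch below. This is a
simplified illustration, not the actual code from my server or from node's
parser; the function name splitHandshake and the shape of its return value are
made up for the example, and it assumes the whole handshake arrives in one
packet.

```javascript
// Hypothetical helper: split a raw handshake packet into the header block
// and whatever bytes follow it (the "upgrade head").
function splitHandshake(packet) {
  const marker = Buffer.from('\r\n\r\n'); // blank line that ends the headers
  const end = packet.indexOf(marker);
  if (end === -1) {
    // Headers are not complete yet, so there is nothing to hand over.
    return null;
  }
  const bodyStart = end + marker.length;
  return {
    headers: packet.subarray(0, bodyStart).toString('latin1'),
    // Everything after the headers, e.g. the 8 bytes sent by a -76 client.
    upgradeHead: packet.subarray(bodyStart),
  };
}
```

For a draft 76 handshake, upgradeHead would then hold the 8 bytes that follow
the headers, as long as they arrived in the same packet.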

In short, I'd be all for having a Content-Length header sent, either to declare 
the extra bytes as a body, or to declare them as extra data to be handled post 
upgrade. From what Ian is telling me over IRC, he'd be more inclined to add 
Content-Length: 0 as a header than Content-Length: 8.
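
To make the difference between those two options concrete, here is a small
sketch of how a server could interpret the bytes after the headers. It is only
an illustration under my own assumptions: splitByContentLength is a
hypothetical name, and treating a missing header the same as
Content-Length: 0 is a choice made for the example, not something the draft
says.

```javascript
// Hypothetical helper: given the header text and the bytes that followed it,
// decide which bytes are request body and which are post-upgrade data.
function splitByContentLength(headerText, trailing) {
  // Header names are case-insensitive; match one Content-Length line.
  const match = /^content-length:[ \t]*(\d+)/im.exec(headerText);
  // Assumption for this sketch: no header means no declared body.
  const declared = match ? parseInt(match[1], 10) : 0;
  const bodyLength = Math.min(declared, trailing.length);
  return {
    body: trailing.subarray(0, bodyLength),
    postUpgrade: trailing.subarray(bodyLength),
  };
}
```

With Content-Length: 8 the eight bytes come back as body; with
Content-Length: 0 they come back as postUpgrade, i.e. data to be handled once
the upgrade has gone through.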

Hopefully this reply makes some sense.

Yours,
Micheil Smith
--
BrandedCode.com

PS: I'm hoping this actually replies to the mailing list, rather than to the individual.

On 07/07/2010, at 7:00 AM, Willy Tarreau wrote:

> Hi,
> 
> I was having a private discussion with Ian last week about a recent
> issue introduced in draft 76, but I realized it would be more useful
> and constructive to bring the issue to the list than to privately
> discuss possible fixes.
> 
> Last week, it was reported to me that a site that was running fine
> on draft 75 could not get the draft 76 handshake to complete via a
> HAProxy load balancer, which runs as an HTTP reverse proxy. The
> connection would remain open between the client and haproxy, and
> between haproxy and the server, with the server never responding.
> The same client (Chromium 6.0.414.0) directly connected to the
> server worked fine.
> 
> The guy was kind enough to send me some network captures which show
> an obvious problem: the 8-byte nonce from the client is not advertised
> as a content-length, so it is not forwarded by the reverse proxy as
> it is either part of a next request or pending data for when the
> handshake completes. Unfortunately, the server wants those bytes to
> complete the handshake, so we have a dirty deadlock not even detectable
> by the end user.
> 
> Ian proposed to upgrade the reverse proxy to detect the WebSocket
> handshake in the request (before it completes) and that it accepts
> to forward those 8 bytes.
> 
> I can't agree with that because until the handshake completes, the
> proxy does not know whether the server will handle the request as
> a WS handshake or anything else, and it must absolutely not accept
> to blindly trust any random client who sets an Upgrade header that
> any server is free to ignore. Doing so would make the reverse-proxy
> vulnerable to HTTP request smuggling attacks [1] or even to any type
> of filtering bypass depending on the length of the data it lets go.
> Even with 8 bytes it is possible to send a "GET /xx\n" which is a
> valid HTTP/0.9 request and is accepted by some servers in a keep-alive
> connection (including Apache).
> 
> Example :
> 
>       GET /index.html HTTP/1.1
>       Host: example.com
>       Connection: Upgrade
>       Sec-WebSocket-Key2: 12998 5 Y3 1  .P00
>       Sec-WebSocket-Protocol: sample
>       Upgrade: WebSocket
>       Sec-WebSocket-Key1: 4 @1  46546xW%0l 1 5
>       Origin: http://example.com
> 
>       GET /..
> 
> If the server does not handle WebSocket for this resource, it will
> happily return the index and/or a 404 and proceed with next request
> contained in the 8 bytes it received.
> 
> Conversely, having no Content-Length header in the request means that
> we don't know what a reverse proxy will do if it receives a valid one.
> For instance, we could very well imagine that some reverse proxies
> which will assume that Content-Length == 8 for any request containing
> "Upgrade: WebSocket" will have trouble when receiving a different
> Content-Length header. This could be used to pass larger amounts of
> data than what is allowed by the protocol to a second reverse-proxy,
> which, if it is able to parallelize pipelined requests, will forward
> the first one to the server and the second one (embedded in the apparent
> data) to another server.
> 
> The first obvious solution that comes to mind is to comply with the
> HTTP protocol which will be implemented along the whole chain and to
> simply add a "Content-Length: 8" header in the request. This fixes
> the dirty hang, this fixes the fact that reverse proxies have to blindly
> trust the client, this fixes the case with the different content length,
> and it even makes it possible for WebSocket aware reverse proxies to
> refuse requests which don't have exactly "Content-Length: 8".
> 
> But this raises a second point : shouldn't we switch to POST instead of
> GET then ? After all, GET + content-length is not well defined, httpbis
> p1 says :
> 
>  The presence of a message-body in a request is signaled by the
>  inclusion of a Content-Length or Transfer-Encoding header field in
>  the request's header fields.  When a request message contains both a
>  message-body of non-zero length and a method that does not define any
>  semantics for that request message-body, then an origin server SHOULD
>  either ignore the message-body or respond with an appropriate error
>  message (e.g., 413).  A proxy or gateway, when presented the same
>  request, SHOULD either forward the request inbound with the message-
>  body or ignore the message-body when determining a response.
> 
> So that means that we're not certain again that the data will pass
> through all reverse proxies. That said, we could consider that WebSocket
> aware reverse proxies will let the data flow, but the problem of the hung
> connection remains if the reverse proxy eats the data before forwarding
> to the server.
> 
> The POST solves all that. POST + content-length is normal and well
> supported everywhere. The POST also has the nice advantage that we're
> sure we won't get a reply from a cache (and the POST method is even
> one of the 3 defined methods which must invalidate caches).
> 
> The POST also has a nice advantage for various client implementations
> that it is easier to implement than GET with some data.
> 
> After some thinking, I'm wondering why we want to pass the nonce as
> data in the request instead of passing it as headers. After all, we're
> interested in data in the response to ensure we're able to let data
> flow and that the whole chain understands the Upgrade request. In fact,
> if any one intermediate ignores the 101 and takes it as a 100 (as is
> explicitly permitted by httpbis-p2), the response will be aborted
> because the pending data does not look like an HTTP response, and this
> is perfect, it's what we're looking for. But in the request ? I fail
> to see the added value of having it in the data. Probably that it would
> be easier to put it in a header and keep the GET.
> 
> Anyway, we have to do something now because we've reached the point Ian
> tried to ensure we would avoid a long time ago : the deadlock which is
> undetectable by the client. And it already happens through a common
> reverse proxy since draft 76, and will do through load balancers, IDS
> and HTTP-aware firewalls for the same reasons, without any simple way
> to fix it without breaking HTTP security on those components. And it's
> not as if we could imagine those components will not be in use where
> WebSocket is deployed!
> 
> Best regards,
> Willy
> 
> [1]   http://www.owasp.org/index.php/HTTP_Request_Smuggling
> 
> 
> _______________________________________________
> hybi mailing list
> hybi@ietf.org
> https://www.ietf.org/mailman/listinfo/hybi
