Re: [hybi] Web sockets and existing HTTP stacks

Ian Hickson <ian@hixie.ch> Sun, 31 January 2010 09:22 UTC

Date: Sun, 31 Jan 2010 09:22:57 +0000
From: Ian Hickson <ian@hixie.ch>
To: Greg Wilkins <gregw@webtide.com>, Justin Erenkrantz <justin@erenkrantz.com>, Jamie Lokier <jamie@shareable.org>
In-Reply-To: <4B2C287E.1030006@webtide.com>
Message-ID: <Pine.LNX.4.64.1001310835410.3846@ps20323.dreamhostps.com>
References: <557ae280911171402v7546e5e7n93a1e57f87dc10e5@mail.gmail.com> <557ae280911200711i5493e654k67c1f5f07336bfb9@mail.gmail.com> <Pine.LNX.4.62.0912032347360.15540@hixie.dreamhostps.com> <4B2C1D52.9020505@webtide.com> <5c902b9e0912181640n497169cdrfa71f9a2908e6ef3@mail.gmail.com> <20091219005442.GA10949@shareable.org> <4B2C287E.1030006@webtide.com>
Cc: hybi@ietf.org
Subject: Re: [hybi] Web sockets and existing HTTP stacks

(-cc whatwg)

On Sat, 19 Dec 2009, Greg Wilkins wrote:
>
> To support websocket in Jetty, we had to use our existing HTTP stack 
> because we don't know that the request is a websocket upgrade until we 
> have already accepted it and parsed it using our existing HTTP stack.

Well, yeah. That's going to be the case with any protocol that shares its 
port with HTTP. Web Socket tries to make this easier by making it at least 
_possible_ to parse the header with an HTTP stack, if not necessarily 
easy.
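
To illustrate the point about a shared port, here is a minimal sketch in
Python of how a server might parse a request head with ordinary HTTP
rules and only then discover that it is a Web Socket upgrade. The
function names, the sample request, and the host example.com are
hypothetical; this is not taken from Jetty or from any real stack.

```python
def parse_request_head(raw: bytes):
    """Split an HTTP-style request head into (method, target, version, headers)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


def is_websocket_upgrade(raw: bytes) -> bool:
    """Return True if the parsed request asks to upgrade to Web Socket."""
    method, _target, _version, headers = parse_request_head(raw)
    connection_tokens = {
        token.strip().lower()
        for token in headers.get("connection", "").split(",")
    }
    return (
        method == "GET"
        and headers.get("upgrade", "").lower() == "websocket"
        and "upgrade" in connection_tokens
    )


# Hypothetical sample requests, for illustration only.
UPGRADE_REQUEST = (
    b"GET /demo HTTP/1.1\r\n"
    b"Upgrade: WebSocket\r\n"
    b"Connection: Upgrade\r\n"
    b"Host: example.com\r\n"
    b"Origin: http://example.com\r\n"
    b"\r\n"
)
PLAIN_REQUEST = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
```

The decision can only be made after the whole head has been read, which
is why the existing HTTP parser ends up on the path either way.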
 

On Fri, 18 Dec 2009, Justin Erenkrantz wrote:
> 
> Ditto - this is a critical problem with the current websocket RFC.  I 
> would place more credence into the argument that websocket RFC has 
> nothing at all to do with HTTP if it didn't rely upon the HTTP/1.1 
> Upgrade semantics. Either it's related, or it's not.

It's related in the sense that it shares the port and is designed such 
that its handshake can be interpreted as an HTTP Upgrade request. It's 
unrelated in that it is an independent protocol and doesn't need to be 
bootstrapped off HTTP in any way.


> Let's please not make the mistake of trying to tunnel protocols thru 
> HTTP firewalls just because you think it's fun.

I'm not sure what you mean by HTTP firewall. If you mean a regular 
firewall that only allows port 80, then tunnelling through is one of the 
requirements Web Socket was designed to address -- it certainly wasn't 
done for fun. If you mean an HTTP proxy, then Web Socket is explicitly 
designed to _not_ go through them.


On Sat, 19 Dec 2009, Jamie Lokier wrote:
> 
> Given WebSocket's stated goals (on this list) of:
> 
>    - Not working through HTTP proxies, including intercepting proxies.
>    - Not being HTTP compatible.
>    - Not being correctly parsable by a correct HTTP request parser.
> 
> I'm thinking it would be easier to just use port 81 and be done with it.

Unfortunately one of the other requirements is "can be used in 
environments that only have ports 80 and 443 open". Sharing port 80 with 
an HTTP server isn't the interesting case for Web Sockets; it's being 
hosted on port 443 that's the interesting case.

(I originally proposed using port 81, but when registering the port was 
told to use port 80 instead.)


On Fri, 18 Dec 2009, Justin Erenkrantz wrote:
> 
> Proxies (mainly reverse proxies) and caching are incredibly important to 
> any real scalability efforts.  I'm not aware of many high-traffic web 
> setups that don't rely upon massive amounts of front-ends.  So, any port 
> 80 traffic to any big sites is very likely to hit a hardware box (F5, 
> etc.) or a software balancer (a la mod_proxy or varnish or squid).  If 
> the current WebSockets ID is intended to *fail* in those situations, the 
> deployment story is likely to be a real non-starter...

How so? If you want to deploy a Web Socket server, then yeah, you have to 
update your scalers and so on. That doesn't seem like a shocking 
revelation, it seems obvious. Do you expect to use HTTP caching servers 
and load balancers in front of an IRC or SSH server?


> I'm curious what the makeup of this hybi@ list/WG is - how many server 
> or intermediary devs are on this?  From browsing the archives, it looks 
> like it is mostly browser/user-agent developers so far.

I can't speak for other members, but my current employer is the third 
largest server developer by active server count [1], and is also a pretty 
significant user of server-side intermediaries such as load balancers, 
routers, and the like. (At the time I started working on Web Socket, then 
called TCPConnection, I worked for Opera.)

[1] http://news.netcraft.com/archives/2010/01/07/january_2010_web_server_survey.html


On Sat, 19 Dec 2009, Jamie Lokier wrote:
>
> I'd say there are many server and application developers on the list 
> (including me), but the WebSocket protocol appears to be more driven 
> from the browser API side.  It is explicitly designed to support a 
> Javascript API, and pays little attention to issues like network 
> performance and scalability.

There's not much mention of it in the protocol, but those issues were 
considered in its development. It is likely that the balance is more 
towards the end-user and Web author side of things than is usual at the 
IETF, but having seen how complex some IETF protocols are (e.g. HTTP, 
XMPP, BEEP), I don't think that's a bad direction to go in.

If there are improvements that can be made for network performance and 
scalability while maintaining the author-friendliness of the protocol, 
please do send the feedback in. I've already discussed this protocol with 
people involved in mobile networks, large scale server-side deployments, 
small-scale server-side developers, and developers of server-side 
frameworks of various sizes, but more feedback is always welcome.


On Tue, 22 Dec 2009, Greg Wilkins wrote:
> 
> But now that you have been encouraged to use port 80, then you need to 
> play by the existing rules on port 80 - up until the time the upgrade 
> response is sent. After that, you can go crazy with the cheeze-whiz on 
> new protocol and any intermediary that does pass the upgrade, but does 
> not handle the subsequent traffic is in the wrong.
> 
> But prior to the upgrade response, you are in HTTP and should be 
> governed by RFC2616 and not just pretend HTTP.

The client is required to send something that is HTTP-compliant. There's 
no pretense here, it's really HTTP compliant. It's over-constrained 
relative to what HTTP allows, but that's fine, because it's a Web Socket 
client, not an HTTP client, so there's no difficulty in implementing such 
extra requirements.

On the server side, if it's a Web Socket server, then HTTP is irrelevant, 
and again, the extra requirements don't matter. It can just send the Web 
Socket response (which happens to look like HTTP, but that's mostly just 
for consistency with the request). If it's an HTTP server, then Web Socket 
doesn't apply.

So I really don't understand what it is you think needs to be changed. 
You've said things like the headers should be allowed to be in any 
order... but the client is a Web Socket client, so why would we 
arbitrarily say that the client could send back headers in a different 
order? It wouldn't help anyone, since the client isn't an HTTP client, and 
is unlikely to share any of the relevant code. (Code for some of the 
authentication headers could be shared, but those _are_ allowed in any 
order, in their spot in the request.) Similarly, the server, when parsing 
the headers in "HTTP" mode, is unaffected by the order -- and indeed, the 
Web Socket spec doesn't require _anything_ from the server in terms of 
parsing the client request. You can completely ignore it for all the spec 
cares. All that matters is that you send back a specific handshake. But if 
you're sending back the handshake, then you're a Web Socket server, so why 
do we need to follow HTTP rules? We've already established the client is a 
Web Socket client, so what on earth is the point of using HTTP rules?
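
A rough sketch of the two claims above, in Python: a server that parses
the request in "HTTP" mode sees the same fields whatever order they
arrive in, and what it sends back is a fixed handshake. The function
names and sample requests are hypothetical, and the response header
names are an assumption based on the drafts circulating at the time;
check them against the draft itself before relying on them.

```python
def header_fields(raw: bytes) -> dict:
    """Collect header fields into a dict, which discards their order."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    fields = {}
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        fields[name.strip().lower()] = value.strip()
    return fields


def handshake_response(origin: str, location: str) -> bytes:
    """Build a fixed handshake reply (header names assumed, see lead-in)."""
    lines = [
        "HTTP/1.1 101 Web Socket Protocol Handshake",
        "Upgrade: WebSocket",
        "Connection: Upgrade",
        f"WebSocket-Origin: {origin}",
        f"WebSocket-Location: {location}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Two hypothetical requests that differ only in header order.
REQUEST_A = (
    b"GET /demo HTTP/1.1\r\n"
    b"Upgrade: WebSocket\r\n"
    b"Connection: Upgrade\r\n"
    b"Host: example.com\r\n"
    b"Origin: http://example.com\r\n"
    b"\r\n"
)
REQUEST_B = (
    b"GET /demo HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Origin: http://example.com\r\n"
    b"Connection: Upgrade\r\n"
    b"Upgrade: WebSocket\r\n"
    b"\r\n"
)

same_fields = header_fields(REQUEST_A) == header_fields(REQUEST_B)
reply = handshake_response("http://example.com", "ws://example.com/demo")
```

Here same_fields is True: the dict-based parse treats both requests
identically, so fixing the order on the client costs such a server
nothing.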


> > It's only through three years of responding to feedback that we've 
> > reached this point, and I've done my best to ensure that the handshake 
> > really is compatible with HTTP
> 
> Why "compatible with HTTP"?  Why not just be real HTTP prior to the 
> upgrade?

I really don't understand the practical difference. Are you suggesting 
that a Web Socket client should look around at what HTTP connections are 
already open to the same port and if there is one, just do an Upgrade 
half-way through the connection? That sounds like a nightmare waiting to 
happen. What practical real problem would supporting that solve?


> This list and the squid list and other forums are full of people raising 
> use-cases that are not supported.  This thread was started by a post 
> about a C HTTP stack that had difficulty with the "compatible HTTP", 
> I've posted similar issues with the Jetty implementation.

It seems pretty obvious that trying to jam a two-way protocol into a 
server designed for a request-response protocol is going to be non-trivial 
_regardless_ of what the protocol looks like. Making it possible to 
upgrade half-way through a connection or requiring that the server support 
even more headers or making it possible to hide the Upgrade request in the 
trailer headers of a chunked request or something equally asinine simply 
isn't going to make it easier.


> The squid list shows how a caching proxy is having problems with just 
> "compatibility"

A caching proxy makes as much sense with Web Socket as with, say, SSH.


> and there have been plenty of other use-cases listed about load 
> balancers, SSL offload, etc. etc.

As far as I can tell, these issues have been resolved.


On Sat, 19 Dec 2009, Greg Wilkins wrote:
> Jamie Lokier wrote:
> > For protocols which don't look like HTTP, you mainly do it because 
> > sometimes TCP port 80 is the only available outgoing port in 
> > firewalled environments.  Ironically, there is often an intercepting 
> > proxy in such environments, so it won't actually work.
> 
> In countries with repressive regimes that are against free speech ( eg. 
> Australia), there will always be a proxy on port 80 making sure that you 
> don't read about religions (eg falun gong) that offend our major trading 
> partners (eg china) etc.
> 
> So websocket is specified to not work at all in Australia. Of course it 
> probably will work - but not because it is specified to do so, but 
> because the proxies just happen to be implemented in certain ways.  
> Happenstance is not a great basis for a protocol.

Can you use TLS from Australia on port 443? If you can, then Web Socket 
should work fine. Port 80 really isn't the interesting story.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'