Re: [hybi] Multiplexing in WebSocket

Ian Hickson <ian@hixie.ch> Thu, 22 October 2009 11:47 UTC

Date: Thu, 22 Oct 2009 12:01:04 +0000
From: Ian Hickson <ian@hixie.ch>
To: Martin Tyler <martin.tyler@boo.org>, Julian Reschke <julian.reschke@gmx.de>, Greg Wilkins <gregw@webtide.com>, Wellington Fernando de Macedo <wfernandom2004@gmail.com>

On Sat, 17 Oct 2009, Martin Tyler wrote:
> On Sat, Oct 17, 2009 at 3:34 AM, Ian Hickson <ian@hixie.ch> wrote:
> > 
> > You are assuming monolithic servers like Apache, with multiple 
> > frameworks plugged into those servers. I'm imagining 100-line perl 
> > scripts. While we have such differing goals, I don't think we can come 
> > to an agreement on this. I want the protocol to be so simple that the 
> > frameworks are unnecessary; that if they are used, they would be the 
> > ones implementing the entire protocol on the server-side. In this 
> > world, there is no server to handle the multiplexing across multiple 
> > frameworks.
>
> Could you explain why you are 'imagining' this?

Because I think that it would result in a healthier ecosystem.


> It's not how things are done now, so why do you think server side 
> development is going to suddenly totally change just because WebSocket 
> is available?

It is how things are done now, for CGI scripts. It's not how things are 
done for HTTP servers.

I don't think server-side development will totally change, but I think 
that it will enable a class of development that simply isn't possible 
today. Imagine how the world might be different if instead of having to 
twist Apache to do your bidding, you could just as easily write your own 
fully-conforming replacement HTTP server in a weekend. Some of the world's 
biggest Web sites have a competitive advantage because they can afford to 
roll their own HTTP server highly optimised for their own setup. By having 
a simpler protocol, with Web Socket, that advantage will be available to 
anyone, not just those who can afford a development team dedicated to 
implementing the big spec.


> Are these multiple 100 line scripts all going to be separate processes 
> listening on separate ports on the same server? This just doesn't sound 
> likely at all.

That depends on what the author wants to do. I expect most sites will only 
need one Web Socket server, which might service multiple URLs.


> I'm imagining server side APIs that wrap this up for you, handling the 
> multiplexing if we go down Greg's route, but even if not you still want 
> something to handle the protocol for you. It would be nice if there 
> could be a single server side API, but that is unlikely due to the 
> number of server side technologies and frameworks - but something fairly 
> consistent could be developed.

Having a simple protocol doesn't preclude this.

Having a complex protocol requires it.


> I really don't get this 11-year-olds writing Perl scripts idea - have 
> you explained the reasons, use cases, benefits etc of this?

Simplicity is its own benefit.


On Sat, 17 Oct 2009, Julian Reschke wrote:
> Ian Hickson wrote:
> > On Wed, 14 Oct 2009, Julian Reschke wrote:
> > > So, for clarity, it would probably be good to state in the draft 
> > > that this question was considered not to be an issue.
> > 
> > I don't think it's not an issue. It's an issue like many others. I'm 
> > not going to start listing all the things that were considered in the 
> > design of the protocol, that'd just be a massive list of things.
> 
> Well, earlier on you said:
> 
> > FWIW, so far I've concluded (based on long discussions with a number 
> > of developers and implementors) that multiplexing would actually not 
> > be that great. It could probably only be used for multiplexing to a 
> > single websocket path, without making the handshake highly asymmetric 
> > or breaking the HTTP compatibility. If it's intended for per-client 
> > multiplexing, it can easily be implemented at the application layer, 
> > where a much better job can be done than could be done at the protocol 
> > layer. If it's intended for multiclient multiplexing, then it's only 
> > helping server load, not client performance, and server load isn't 
> > especially an issue here.
> 
> So it appears that you are aware of that drawback, but chose not to address 
> this. I still think it would be helpful to document that.

I also chose not to support per-message addressing, per-message 
out-of-band metadata, priority channels, compression, connection close 
handshakes, and Gopher compatibility. However, I don't think it's valuable 
for the spec to have a section discussing all the things it doesn't 
support, since that list is infinite.


On Sat, 17 Oct 2009, Greg Wilkins wrote:
> Ian Hickson wrote:
> 
> > Then the author really should just combine the event sources into one 
> > (or use the EventSource feature
> 
> But there might not be 1 author!  Mash-ups and portals and such things 
> will combine the work of many authors unknowingly onto the same page. We 
> want to encourage reuse of components and not require every webapp to be 
> rewritten from scratch with a shared messaging infrastructure.

I don't buy that these widgets will all be communicating with the same 
backend.


> > Could you describe the scenario in which you imagine an application 
> > needing multiple two-way communication channels one after the other, 
> > each for short periods of time, each with a different server-side 
> > development team? I'm not really seeing the use case here.
> 
> Consider the google home page.  This can be customized with lots of 
> widgets from third parties.

If the widgets are from third parties, then they wouldn't all talk back to 
the same server.

If the widgets are all talking back to the same server, then the server 
provider can also provide a common library that channels all communication 
over a single socket. Consider widgets used on sites like Yahoo! and 
MySpace, based on Caja -- they wouldn't allow direct access to WebSocket 
at all, and would channel everything through a single socket. This allows 
them to control the multiplexing.


> Imagine the same sort of site, but fronting an organization with lots of 
> real time data. A stock trading site for example.
> 
> They may allow a user to populate their pages with widgets supplied by 
> third parties that may perform all sorts of trading analysis, monitoring 
> and trading.  A widget may establish "connections" to:
> 
>     + receive live price information
>     + monitor the state of orders placed
>     + interact with others in a process to create a
>       complex instrument, swap etc.
> 
> These "connections" may be short or long lived (who knows how long an 
> order might take to complete).
> 
> For scalability of the service, you don't want every widget to be given 
> its own TCP/IP connection.

Indeed -- and you wouldn't. You'd provide a shared worker that handled the 
connection management and which multiplexed all your messages for you, in 
a manner that makes the most sense for this site.
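
As a rough sketch of what such application-layer multiplexing could look 
like (written in Python for brevity; in a browser the same idea would live 
in the shared worker's script), each message is wrapped in a small 
envelope naming its logical channel and dispatched on arrival. The 
Multiplexer class, the JSON envelope with "channel" and "payload" keys, 
and the send callback are all assumptions made for illustration, not 
anything defined by Web Socket.

```python
import json
from typing import Any, Callable, Dict, List


class Multiplexer:
    """Shares one connection among many logical channels (illustrative)."""

    def __init__(self, send: Callable[[str], None]) -> None:
        # `send` is a hypothetical callback that writes one text message
        # to the single underlying connection.
        self._send = send
        self._handlers: Dict[str, List[Callable[[Any], None]]] = {}

    def subscribe(self, channel: str, handler: Callable[[Any], None]) -> None:
        """Register a handler for messages arriving on `channel`."""
        self._handlers.setdefault(channel, []).append(handler)

    def publish(self, channel: str, payload: Any) -> None:
        """Wrap the payload in an envelope and send it on the shared socket."""
        self._send(json.dumps({"channel": channel, "payload": payload}))

    def on_message(self, raw: str) -> None:
        """Unwrap an incoming envelope and hand it to that channel's handlers."""
        envelope = json.loads(raw)
        for handler in self._handlers.get(envelope["channel"], ()):
            handler(envelope["payload"])


# Loopback demo: the "wire" is a list, and messages are fed straight back
# in, standing in for a server that echoes what it receives.
wire: List[str] = []
mux = Multiplexer(wire.append)
prices: List[Any] = []
mux.subscribe("prices", prices.append)
mux.publish("prices", {"symbol": "XYZ", "bid": 10.5})
mux.publish("orders", {"id": 7, "state": "filled"})
for raw in wire:
    mux.on_message(raw)
print(prices)
```

Only the "prices" handler fires here; the "orders" message is dropped 
because nothing subscribed to it. How to treat unknown channels, 
priorities, or fairness is exactly the kind of choice the site can make 
for itself at this layer.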


> Of course the organization could provide a js framework that did that 
> multiplexing - but widget developers would be highly unlikely to use it.

I don't see why. I think it's in fact far more likely that they'd use a 
provided library than try to reverse-engineer the site's protocol.


> If they use websocket directly, then the code they write can access the 
> service both in the portal site and from other sites that don't have the 
> framework. Plus if they don't use the framework, they get their own 
> dedicated TCP/IP connection and they will get some latency/performance 
> benefits of other widgets that are sharing a connection. Of course this 
> creates an arms race as it only works if everybody else is sharing. Soon 
> all widgets will be using their own dedicated connection, then each 
> widget would start opening multiple dedicated connections (eg 1 per 
> share), trying to get more and more resources allocated.
>
> Developers and resources are like 3-year-olds and toys! There is no 
> such thing as optional sharing!

I disagree with your pessimism here. I think there is ample evidence that 
people are happy to use JS libraries to talk to services. Consider 
Google's GData API. Do you know anyone who uses this API without using the 
GData JS library?


> I really do not think you should be using your influence over the 
> websocket protocol to drive some agenda to revolutionize server side 
> development.
> 
> We cannot design a protocol that is only suitable for some imagined 
> future scenario.
> 
> Trying to enforce this future by making future protocols badly support 
> monolithic servers strikes me as some Orwellian mind control exercise!

Web Sockets supports monolithic servers fine, it just doesn't require 
them.

I don't think you should be using _your_ influence by continuing the trend 
of over-engineered protocols that require monolithic servers. :-)


> > If the channels are multiplexed, then we're assuming that the 
> > server-side is an "aggregating resource" as you put it. If it's 
> > possible to use an aggregating resource with a multiplexed connection, 
> > then why would it not be possible to do it with a Web Socket 
> > connection? I don't understand.
> 
> Because in the real world, we have one set of developers writing servers 
> and user-agents and another set of developers writing applications that 
> run on them.

That can still happen. I just think we shouldn't artificially create a 
market for the server developers by making them necessary to application 
development. That's unhealthy.


> The infrastructure developers impose all sorts of restrictions on the 
> application developers

That pretty much summarises the problem with complex protocols.


> In this world, the application developers will throw off the 
> infrastructure shackles imposed on them, because indeed a dedicated 
> TCP/IP connection is better than a shared one. Cross domain development 
> is so much easier without security models etc. Better let them run as 
> root as well, because then they can open any port and up their priority 
> to stop sharing CPU.

The Web Socket protocol has a built-in security model. The rest of your 
comment is a wild exaggeration of what I'm saying.


> There are really really good reasons why infrastructure is developed for 
> application developers.  Infrastructure is not only to enable 
> applications, but it is to constrain them and to force them to share 
> resources fairly and to respect security models.
>
> But there is absolutely no way that any responsible organisation is 
> going to deploy a system where sockets opened by applications 
> programmers are exposed directly to the internet and run with ad-hoc 
> protocol implementations.

Web Sockets doesn't require this. It allows you to be as constraining as 
you like.


> > I assume it is a given that you would want a protocol to have the 
> > property that connecting with path A, then opening a channel for path 
> > B, should result in a connection with the same internal state as 
> > connecting with path B, then opening a channel for path A.
> > 
> > Given the HTTP Upgrade mechanism whereby an HTTP server can have a 
> > WebSocket script assigned to each path, you can end up in a situation 
> > where connecting to A and connecting to B establish connections with 
> > different server-side scripts.
> 
> How so?
> 
> I'm advocating abstracting developers away from connections.
> Handler A will send/receive messages for path A
> Handler B will send/receive messages for path B
> 
> Neither handler has any need to know if a shared connection was
> used or not.

The model you describe is incompatible with light-weight self-contained 
servers without a monolithic controlling server. I'm not interested in 
artificially supporting such monolithic server development when we don't 
have to.


> > The entire purpose of Web Socket is to provide a TCP connection to 
> > JavaScript. That's the goal.
> 
> Really?  If it was, then why not just expose the real socket API?

Because that wouldn't be Web-safe, and it wouldn't be compatible with the 
Web browser quantised event model.

Web Socket is the smallest delta I could find from TCP to be Web-safe and 
quantised. It adds a handshake to prevent connection to WebSocket-unaware 
servers; it adds a security layer to prevent XSS- and CSRF-like attacks; 
and it adds framing to change TCP from a stream model to a packet model. 
Other than that, it adds nothing.
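
As a rough illustration of the framing step only (not the handshake or 
the origin checks), here is a minimal Python sketch of the 0x00 ... 0xFF 
text frames that appear in the examples in this thread. The function 
names are made up for illustration, and the sketch assumes every frame is 
a UTF-8 text frame; the byte 0xFF never occurs in valid UTF-8, which is 
why it can serve as a terminator.

```python
from typing import List, Tuple


def encode_frame(text: str) -> bytes:
    """Wrap one UTF-8 text message in a 0x00 / 0xFF delimited frame."""
    return b"\x00" + text.encode("utf-8") + b"\xff"


def decode_frames(buffer: bytes) -> Tuple[List[str], bytes]:
    """Split a TCP byte stream into complete messages.

    Returns the decoded messages plus any trailing bytes that do not yet
    form a complete frame, to be retried once more data arrives.
    """
    messages: List[str] = []
    while buffer:
        if buffer[0] != 0x00:
            raise ValueError("unexpected frame type 0x%02x" % buffer[0])
        end = buffer.find(b"\xff", 1)
        if end == -1:
            break  # incomplete frame; wait for more bytes
        messages.append(buffer[1:end].decode("utf-8"))
        buffer = buffer[end + 1:]
    return messages, buffer


# Two complete frames followed by the start of a third.
stream = encode_frame("ping") + encode_frame("pong") + b"\x00par"
messages, rest = decode_frames(stream)
print(messages, rest)
```

The leftover bytes are returned rather than discarded because TCP may 
deliver a frame in several pieces; the caller appends the next read to 
them and calls decode_frames again.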


> Surely the aim is to establish bidirectional communication with the 
> server.  Who's to say that TCP/IP will be the favoured transport for the 
> entire life span of the applications using the websocket API?

If something else becomes better, then let's introduce a new protocol. For 
now, though, that's not a problem.


> In any case, if the IETF is to endorse a new web protocol, I believe 
> that their charter is to consider more than just the needs of javascript 
> developers.

The IETF is welcome to do whatever it likes. The only reason Web Sockets 
is here is that the IETF wanted it here.

Web Sockets has as its goal to safely expose TCP to JS developers. If 
there are other goals that need to be met, then other protocols should be 
developed. We shouldn't artificially try to jam multiple goals into one 
protocol just because we happen to be on one mailing list together.


On Sun, 18 Oct 2009, Wellington Fernando de Macedo wrote:
>
> I would like to make a contribution to this discussion. I think the ws 
> protocol is good as it is right now. But I think multiplexing is a 
> valuable feature. I don't like the BWTP proposal because I think it is 
> expensive. Considering the current protocol, I suggest we introduce 
> "channel handshakes", using another frame type. Each WebSocket object 
> instance would be associated with one channel. I don't intend to make 
> another proposal, so I will illustrate what I'm thinking using an 
> example and I hope it can be useful in some way :)
> 
> 1. Suppose there is already an established connection:
> 
> User agent
> -------------
> GET /ws/script1.php HTTP/1.1
> Upgrade: WebSocket
> Connection: Upgrade
> Host: www.myhost.com
> Origin: file://
> CR LF
> 
> Server
> -------------
> HTTP/1.1 101 Web Socket Protocol Handshake
> Upgrade: WebSocket
> Connection: Upgrade
> websocket-origin: file://
> websocket-location: ws://www.myhost.com/ws/script1.php
> CR LF
> 
> 2. A normal exchange of data.
> 
> User Agent
> --------------
> 0x00 ping 0xff
> 
> Server
> --------------
> 0x00 pong 0xff
> 
> 3. In the meantime the user agent has to open 
> ws://www.myhost.com/ws/script2.php. Then, using the same connection, the 
> user agent could use the 0x01 frame type as a way to perform the 
> "channel handshake". Also, the request must have a websocket-channel-id 
> header (containing a two-byte numeric value) which would identify the 
> channel.
> 
> User Agent
> --------------
> 0x01 GET /ws/script2.php HTTP/1.1
> Upgrade: WebSocket
> Connection: Upgrade
> Host: www.myhost.com
> Origin: file://
> websocket-channel-id: 1
> CR LF 0xff
> 
> Server
> -------------
> 0x01 HTTP/1.1 101 Web Socket Protocol Handshake
> Upgrade: WebSocket
> Connection: Upgrade
> websocket-origin: file://
> websocket-location: ws://www.myhost.com/ws/script2.php
> websocket-channel-id: 1
> CR LF 0xff
> 
> 4. After that, both channels could exchange data using a third frame 
> type (0x02), followed by the channel id. The initial channel could still 
> use the 0x00 frame type, however. For instance, an exchange of data on 
> the initial channel would be:
> 
> User Agent
> --------------
> 0x00 ping 0xff
> 
> Server
> --------------
> 0x00 pong 0xff
> 
> or
> 
> User Agent
> --------------
> 0x02 0x00 0x00 ping 0xff
> 
> Server
> --------------
> 0x02 0x00 0x00 pong 0xff
> 
> 5. And an exchange of data on the new channel:
> User Agent
> --------------
> 0x02 0x00 0x01 ping 0xff
> 
> Server
> --------------
> 0x02 0x00 0x01 pong 0xff

The problem with this model is that if the connections are being handed 
from an HTTP server to various Web Socket servers, one for each URL, the 
above wouldn't work -- you'd end up connecting to /ws/script1.php, and 
then asking _that_ script to connect to /ws/script2.php, which makes no 
sense.
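
For reference, the channel data frames in the quoted proposal can be 
sketched as follows. This is only a reading of the example bytes above: 
it assumes the two-byte channel id is big-endian (as "0x00 0x01" for 
channel 1 suggests), that the payload is UTF-8 text, and that each call 
sees exactly one complete frame. The function and constant names are 
hypothetical.

```python
import struct
from typing import Tuple

DATA_FRAME = 0x02  # frame type the quoted proposal uses for channel data


def encode_channel_frame(channel_id: int, text: str) -> bytes:
    """Build: 0x02, two-byte channel id, UTF-8 payload, 0xFF."""
    header = bytes([DATA_FRAME]) + struct.pack(">H", channel_id)
    return header + text.encode("utf-8") + b"\xff"


def decode_channel_frame(frame: bytes) -> Tuple[int, str]:
    """Parse one complete channel data frame into (channel id, text)."""
    if len(frame) < 4 or frame[0] != DATA_FRAME or frame[-1] != 0xFF:
        raise ValueError("not a complete channel data frame")
    # The id is read by position, not by scanning, because an id byte
    # may itself be 0xFF (for example channel 255).
    (channel_id,) = struct.unpack(">H", frame[1:3])
    return channel_id, frame[3:-1].decode("utf-8")


frame = encode_channel_frame(1, "ping")
print(frame)
print(decode_channel_frame(frame))
```

Note that the sketch says nothing about which script on the server ends 
up receiving the frame, which is where the difficulty described above 
arises.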

It's also unclear how you would handle the security model in this world. 
What if the server returns a different origin for the nested channel? 
Also, how do you handle connection closure? What if the user opens a dozen 
tabs to a server, and then closes all but one -- does the server have to 
keep all the channels open?

This kind of model would introduce the possibility of a proxy that merges 
multiple Web Socket connections together, but what if the author tested 
without a proxy, and so never saw different users' connections merged 
onto the same TCP connection, and therefore assumed per-connection 
security? Also, how would a heavy-traffic site handle the one connection 
from a proxy at a large site, where channels were continually opened?

Generally, this introduces a host of thorny problems that TCP has already 
solved. I think rather than reinventing TCP inside WebSocket, we're better 
off leaving this as an application-layer problem, where authors are much 
better positioned to make educated choices for how to implement this.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'