Re: [hybi] Multiplexing in WebSocket (Was: HyBi Design Space)

Ian Hickson <ian@hixie.ch> Sat, 17 October 2009 02:22 UTC

Date: Sat, 17 Oct 2009 02:34:47 +0000
From: Ian Hickson <ian@hixie.ch>
To: Greg Wilkins <gregw@webtide.com>
In-Reply-To: <4AD53DCA.6050304@webtide.com>
Message-ID: <Pine.LNX.4.62.0910170203460.9145@hixie.dreamhostps.com>
References: <4ACE50A2.5070404@ericsson.com> <3a880e2c0910081600v3607665dp193f6df499706810@mail.gmail.com> <4ACF4055.6080302@ericsson.com> <Pine.LNX.4.62.0910092116010.21884@hixie.dreamhostps.com> <4AD2E353.8070609@webtide.com> <4AD2F43D.6030202@ninebynine.org> <4AD39A64.4080405@webtide.com> <Pine.LNX.4.62.0910132335390.25383@hixie.dreamhostps.com> <4AD53DCA.6050304@webtide.com>
Content-Language: en-GB-hixie
Content-Style-Type: text/css
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Cc: "hybi@ietf.org" <hybi@ietf.org>
Subject: Re: [hybi] Multiplexing in WebSocket (Was: HyBi Design Space)

On Wed, 14 Oct 2009, Greg Wilkins wrote:
> >>> 
> >>> However, WebSocket connections are long lived by design,
> >>
> >> I do not believe you can anticipate websocket usage over the long 
> >> term.  Just because you think now they will only be used for long 
> >> lived connections, that may not always be the case.
> > 
> > Well using Web Socket for a short request-response burst would be 
> > pretty stupid (given HTTP)... What other uses do you have in mind?
> 
> Which is why having a websocket be associated with 1 and only 1 server 
> side resource is a problem.  A client may be interested in many event 
> sources from a server, but each might only exist (or be of interest) for 
> a short period of time.

Then the author really should just combine the event sources into one (or 
use the EventSource feature, which is probably more appropriate for this 
use case anyway). I certainly don't think we should be optimising this 
protocol for brief sessions, that would just encourage people to use it 
for that purpose. It performs much better with long-lived sessions.


> Sure it is possible to create an aggregating resource to avoid having to 
> have multiple short lived connections to each real resource as it comes 
> and goes.... but only if all the resources are from the same 
> application/framework/development.

Could you describe the scenario in which you imagine an application 
needing multiple two-way communication channels one after the other, each 
for short periods of time, each with a different server-side development 
team? I'm not really seeing the use case here.


> >> But even if the frameworks implemented the same wire protocol for 
> >> multiplexing, they would not be able to:
> >>
> >>   + share connections between frameworks
> > 
> > Even with multiplexing we wouldn't be able to share connections 
> > between server-side frameworks -- why would someone use two 
> > client-side frameworks but the same server-side framework?
> 
> Because multiplexing should be built into the transport and will not 
> rely on a server side framework.

You are assuming monolithic servers like Apache, with multiple frameworks 
plugged into those servers. I'm imagining 100-line perl scripts. While we 
have such differing goals, I don't think we can come to an agreement on 
this. I want the protocol to be so simple that the frameworks are 
unnecessary; that if they are used, they would be the ones implementing 
the entire protocol on the server-side. In this world, there is no server 
to handle the multiplexing across multiple frameworks.


> Making multiplexing a transport concern means that applications and 
> frameworks do not need to consider connection efficiency when designing 
> their event streams.

If the channels are multiplexed, then we're assuming that the server-side 
is an "aggregating resource" as you put it. If it's possible to use an 
aggregating resource with a multiplexed connection, then why would it not 
be possible to do it with a Web Socket connection? I don't understand.
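
A minimal sketch of such an application-level "aggregating resource" 
running over a single connection, assuming a hypothetical JSON envelope 
with "channel" and "data" fields. The class and function names are 
illustrative only and are not part of any Web Socket draft:

```python
import json
from collections import defaultdict
from typing import Callable, Dict, List


class Aggregator:
    """Fans messages from one connection out to many logical channels."""

    def __init__(self) -> None:
        # channel name -> list of callbacks interested in that channel
        self._handlers: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._handlers[channel].append(handler)

    def unsubscribe(self, channel: str) -> None:
        # A short-lived event source simply goes away; the connection stays.
        self._handlers.pop(channel, None)

    def dispatch(self, frame: str) -> int:
        """Deliver one text frame; return how many handlers received it."""
        envelope = json.loads(frame)
        handlers = self._handlers.get(envelope["channel"], [])
        for handler in handlers:
            handler(envelope["data"])
        return len(handlers)


def wrap(channel: str, data: str) -> str:
    """Build the (assumed) envelope that the sending side would transmit."""
    return json.dumps({"channel": channel, "data": data})
```

Here the channel identifier lives in the application payload rather than 
in the transport framing, so event sources can come and go without 
opening or closing the underlying connection.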


> >> It is short sighted to build such limitations into a once-in-a-decade 
> >> chance to upgrade the protocols we use on the web.
> > 
> > This is not a once-in-a-decade chance.
> 
> I hope you are right.  But that is still no reason not to try to get it 
> as right as possible in the next iteration.

Sure, so long as we don't sacrifice something else for perfection (such as 
time-to-market or simplicity of implementation).


> >>> and the cost of setting up a connection therefore is insignificant 
> >>> compared to the cost of the protocol itself.
> >>
> >> There is also a cost associated with maintaining and servicing a 
> >> connection.
> > 
> > This would be the same cost associated with maintaining and servicing 
> > a channel in a multiplexed connection.
> 
> Sorry but this is just wrong.
> 
> Both approaches need the application end point for the channel
> and all its associated resources.
> 
> But only the multiple TCP/IP connection approach needs additional
> kernel space buffers, file descriptors and larger select sets etc.
> 
> The key difference is that with multiple TCP/IP connections, the server 
> has to undertake to read a window's worth of data on each of those 
> connections.  With a multiplexed solution, only a single window needs to 
> be read before back pressure is asserted and the flow of additional data 
> slowed to match the rate of consumption.

Servers already have to deal with multiple connections per user, for HTTP. 
I do not see how WebSocket makes things worse.

If your OS' kernel is badly optimised such that multiple connections are 
expensive, then instead of changing the protocol, fix the kernel. That way 
it'll also work better for multiple IMAP connections, multiple Jabber 
connections, multiple HTTP connections, etc.


> >> Implementing connection policies is not something that should be left 
> >> to the application developer. It is far far better that the browsers 
> >> implement connection control policies.
> > 
> > I disagree. I think it's far better that the _server_ decide the 
> > connection control policies. That's the model Web Sockets follows (as 
> > an author you'll do whatever the server expects).
> 
> How can a server influence the connections that a client opens, other 
> than to refuse excess ones?  That is pretty coarse-grained "control".

You don't need any finer-grained control. You provide a service, and you 
provide the terms of service. If an author violates those terms of service 
(e.g. opening multiple connections instead of using a shared worker), then 
you terminate his connection, so that his app fails. His users will then 
exert pressure on the author to fix his app. There is no greater pressure.
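
A minimal sketch of that kind of server-side policy, assuming a 
hypothetical per-user limit of one connection. The class name and the 
limit are illustrative assumptions, not something specified by the 
protocol:

```python
from collections import defaultdict
from typing import Dict


class ConnectionPolicy:
    """Tracks open connections per user and refuses any beyond a limit."""

    def __init__(self, max_per_user: int = 1) -> None:
        self.max_per_user = max_per_user
        self._open: Dict[str, int] = defaultdict(int)

    def try_open(self, user: str) -> bool:
        """Return True if the connection is accepted, False if refused."""
        if self._open[user] >= self.max_per_user:
            return False  # terms of service violated: refuse the connection
        self._open[user] += 1
        return True

    def close(self, user: str) -> None:
        """Record that one of the user's connections has gone away."""
        if self._open[user] > 0:
            self._open[user] -= 1
```

A page that opens a second connection instead of sharing one would see 
try_open() return False for it, and its author would have to fix the app.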


> >> But multiplexing can be much much simpler.  At its simplest, it just 
> >> needs a channel id in each frame and the channel creation can be 
> >> automatic.
> > 
> > Multiplexing at the per-path level would be of extremely limited use, 
> > IMHO. If there really are multiple widgets talking back to the same 
> > _server_, they certainly won't be talking back to the same _path_.
> 
> If by path, you mean server side resource or URI, then I'm definitely 
> saying that we should be able to multiplex multiple paths over the same 
> TCP/IP connection.

That would be incompatible with the goal of sharing the port with an HTTP 
server, since that requires the path name to be given up front.
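
To illustrate why the path has to come first, here is a rough sketch of a 
front-end that shares one port between ordinary HTTP and per-path Web 
Socket scripts. It assumes an HTTP-style opening request carrying an 
"Upgrade: WebSocket" header; the script names and the route() helper are 
hypothetical:

```python
from typing import Dict, Tuple


def parse_request(raw: bytes) -> Tuple[str, Dict[str, str]]:
    """Extract the request path and lower-cased headers from a raw request."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    lines = head.split("\r\n")
    _method, path, _version = lines[0].split(" ", 2)
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return path, headers


def route(raw: bytes, scripts: Dict[str, str]) -> str:
    """Pick the script for this connection, or fall back to plain HTTP.

    The decision is made once, from the path in the opening request,
    before any Web Socket frame has been exchanged.
    """
    path, headers = parse_request(raw)
    if headers.get("upgrade", "").lower() == "websocket" and path in scripts:
        return scripts[path]
    return "http"
```

Once route() has handed the connection to a script, the front-end is no 
longer in the loop, so a later request for a different path on the same 
connection has nothing to dispatch it.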


> HTTP is not limited to get the same URI over the same connection, so why 
> should websocket or similar?

I assume it is a given that you would want a protocol to have the property 
that connecting with path A, then opening a channel for path B, should 
result in a connection with the same internal state as connecting with 
path B, then opening a channel for path A.

Given the HTTP Upgrade mechanism whereby an HTTP server can have a 
WebSocket script assigned to each path, you can end up in a situation 
where connecting to A and connecting to B establish connections with 
different server-side scripts.

Then, opening the other channel would not have the desired effect, as the 
HTTP process wouldn't be involved anymore.



> why must routing of messages between client side components and server 
> side components be tied to a TCP/IP connection.  The path is identified 
> by a URL and neither the client nor server should care if a dedicated or 
> shared TCP/IP connection is used.

The entire purpose of Web Socket is to provide a TCP connection to 
JavaScript. That's the goal.


> >>   + share connections with code written directly to the websocket API.
> > 
> > That's basically just another framework.
> 
> except that it is one into which you cannot easily insert 
> additional layered protocols.  Let's say RFC66666 comes out which is the 
> universally agreed best way to multiplex messages on top of your 
> proposed websocket protocol. How are you going to get an implementation 
> of that inserted into every direct usage of the websocket API?

The same way you got Web Socket use in the first place -- by writing it. 
Unless RFC66666 is more complicated than Web Sockets itself, it wouldn't 
be much of a burden.


> >>   + share connections with other iframes, tabs and/or windows.
> > 
> > This seems far too infrequent an occurrence to really optimise for.
> 
> Hello????  Do you ever use a browser????  Are you telling me you have 
> never opened two tabs/windows to the same website either accidentally or 
> on purpose???

For sites that want to support this use case, we have shared workers.


> One of the keen users of this technology that I have encountered is 
> traders, who habitually have 3 huge screens on their desks with umpteen 
> windows to their trading applications, all with slightly different view.

Sure, and such systems would use shared workers. There's far more state 
that they'll want to share here than just the TCP connection -- common 
client-side database backends, common UI state, etc.


> >>   + implement a strong origin policy for security
> > 
> > I don't understand what situation would lead to multiple channels 
> > having different origins but talking to the same server with the same 
> > authentication cookies and the same path.
> 
> It is entirely plausible that a service might publish events, some of 
> which are only for code from the same origin and others are either for 
> code from any origin or from a specific set of origins.

Given the sensitivity of this kind of thing, I would feel _very_ 
uncomfortable relying on authors getting the server-side of this right, 
and even less comfortable relying on proxy implementors getting the proxy 
side of this right. I'd much rather have different connections for each 
origin, so that we can rely on the user agent to get this right. There are 
far fewer of them and they get more QA.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'