Re: [hybi] Multiplexing in WebSocket

Ian Hickson <ian@hixie.ch> Sat, 24 October 2009 10:55 UTC

Date: Sat, 24 Oct 2009 11:09:07 +0000
From: Ian Hickson <ian@hixie.ch>
To: Greg Wilkins <gregw@webtide.com>
In-Reply-To: <4AE23D7A.2060009@webtide.com>
Message-ID: <Pine.LNX.4.62.0910240926500.9145@hixie.dreamhostps.com>
References: <4ACE50A2.5070404@ericsson.com> <3a880e2c0910081600v3607665dp193f6df499706810@mail.gmail.com> <4ACF4055.6080302@ericsson.com> <Pine.LNX.4.62.0910092116010.21884@hixie.dreamhostps.com> <4AD2E353.8070609@webtide.com> <4AD2F43D.6030202@ninebynine.org> <4AD39A64.4080405@webtide.com> <Pine.LNX.4.62.0910132335390.25383@hixie.dreamhostps.com> <4AD53DCA.6050304@webtide.com> <Pine.LNX.4.62.0910170203460.9145@hixie.dreamhostps.com> <4ADA7FD4.9010406@webtide.com> <4ADB6F0B.4000004@gmail.com> <Pine.LNX.4.62.0910221120380.9145@hixie.dreamhostps.com> <4AE08907.7080402@webtide.com> <Pine.LNX.4.62.0910230348470.9145@hixie.dreamhostps.com> <4AE1E659.5050507@webtide.com> <Pine.LNX.4.62.0910232154470.13521@hixie.dreamhostps.com> <4AE23D7A.2060009@webtide.com>
Content-Language: en-GB-hixie
Content-Style-Type: text/css
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Cc: hybi@ietf.org
Subject: Re: [hybi] Multiplexing in WebSocket

On Fri, 23 Oct 2009, Greg Wilkins wrote:
> Ian Hickson wrote:
> > On Fri, 23 Oct 2009, Greg Wilkins wrote:
> >> Ian Hickson wrote:
> >>> On Thu, 22 Oct 2009, Joshua Bell wrote:
> >>>
> >>>> * I seem to recall that one of the desires for sentinel-based 
> >>>> frames was to allow octet streams for which the length was not 
> >>>> known in advance.
> >>>
> >>> No; the only reason for sentinel-based frames was to not rely on 
> >>> authors having to determine the length of their UTF-8-encoded 
> >>> strings, which in many environments can be easy to get wrong.
> >>
> >> Authors that can't determine the length of a UTF-8 string are not 
> >> exactly the sort of developers that should be implementing network 
> >> protocols.
> > 
> > Wow.
> > 
> > I cannot stand behind such a judgmental statement. Personally I would 
> > like to make this kind of thing accessible to as many people as 
> > possible.
> 
> I don't understand your Wow.  Buffer overflows have historically been 
> one of the biggest security issues with any services exposed to the 
> internet.

My "Wow" was at your comment that some people are not the sort of 
developers who should be implementing something.

WebSocket is actually intentionally designed to make buffer overruns 
harder to accidentally write into code. For binary data, there is no 
sentinel, so you can allocate the given amount of memory and fill it, 
without any risk of overrunning anything. Indeed, you can't do anything 
else, since there's no way to detect the end of the frame other than by 
its length. The only risk is with zero-length frames, but no data is 
written into those, so the damage, if any, is limited. For text data, there 
is no length marker, so the only way to have an overrun is to decide 
on a fixed maximum size and then have someone send you a frame that's 
larger. Provided you don't make that mistake, there's no way to be tricked 
into a buffer overrun as far as I can tell, since there's no up-front 
length declaration.
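
As a rough illustration of the reading side described above, here is a 
minimal Python sketch. It is not taken from the spec text. The names 
read_frame and limit are made up for the example, and the exact length 
encoding for binary frames (7 bits per byte, high bit meaning "more bytes 
follow") is an assumption. The point is only that a binary frame is read 
by its declared length, while a text frame is accumulated into a growable 
buffer until the 0xFF sentinel, so no fixed-size buffer is ever overrun.

```python
import io

def read_frame(stream, limit=1 << 20):
    """Read one frame from a binary stream.

    Returns (frame_type, payload_bytes), or None at a clean end of stream.
    `limit` is a hypothetical cap: oversized frames are refused with an
    error rather than written past the end of anything.
    """
    head = stream.read(1)
    if not head:
        return None
    frame_type = head[0]

    if frame_type & 0x80:
        # Length-prefixed (binary) frame.
        # ASSUMPTION: the length is a run of bytes carrying 7 bits each,
        # with the high bit set on every byte except the last.
        length = 0
        while True:
            b = stream.read(1)
            if not b:
                raise EOFError("stream ended inside the length field")
            length = length * 128 + (b[0] & 0x7F)
            if not b[0] & 0x80:
                break
        if length > limit:
            raise ValueError("declared length exceeds limit")
        payload = stream.read(length)
        if len(payload) != length:
            raise EOFError("stream ended inside the payload")
        return frame_type, payload

    # Sentinel-terminated (text) frame: no length is declared up front,
    # so grow the buffer as bytes arrive and stop at 0xFF.
    buf = bytearray()
    while True:
        b = stream.read(1)
        if not b:
            raise EOFError("stream ended before the 0xFF sentinel")
        if b[0] == 0xFF:
            return frame_type, bytes(buf)
        buf.append(b[0])
        if len(buf) > limit:
            raise ValueError("text frame exceeds limit")
```

Reading byte by byte is slow but keeps the sketch short. A real reader 
would buffer its input.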


> Utf-8 will mostly have a 1 character to 1 byte mapping (at least for 
> english speakers), so many programmers will write code that they think 
> works by allocating byte buffers the same size as their characters 
> strings.  But then when given multi-byte characters they will suffer 
> buffer overflows.

The way the protocol is designed, they don't need to allocate those 
buffers. If they use UTF-8 internally, they can just output the string 
straight into the I/O layer, with a 0x00 byte first, and a 0xFF byte 
after. That's the reason for the server-to-client data not having a length 
field for UTF-8, actually.
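
A minimal Python sketch of that sending path, assuming the application 
already holds its text as UTF-8 bytes. The name send_text and the 
file-like `out` object are made up for the example. Nothing is measured 
or copied into a sized buffer; the bytes go out between the two marker 
bytes. This works because the byte 0xFF never occurs in valid UTF-8, so 
the terminator cannot be confused with payload.

```python
import io

def send_text(out, utf8_bytes):
    """Write one text frame: 0x00, the UTF-8 payload, then 0xFF.

    `out` is any object with a write(bytes) method (hypothetical I/O layer).
    No length is computed and no intermediate buffer is allocated.
    """
    out.write(b"\x00")
    out.write(utf8_bytes)
    out.write(b"\xff")


# Usage: an in-memory stream stands in for the socket.
out = io.BytesIO()
send_text(out, "h\u00e9llo".encode("utf-8"))
wire = out.getvalue()  # 0x00, then the UTF-8 bytes of the text, then 0xFF
```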

(The protocol wasn't intended to be symmetric, it just ended up that way 
because that design made sense given the goals of making parsing and 
serialising easy for the server side.)


> I've made this programming error myself many times! So perhaps I too am 
> one of the authors that should not be writing network protocols.

If you don't think you should be writing one, you almost certainly 
shouldn't be designing one. :-)


> The level of programming skill needed to manage meta data or channels is 
> entirely of the same order of magnitude as managing utf-8 encoding.

The main difference is that most people won't manage UTF-8 encoding. In 
Perl, for instance, there's the "Encode" library that you can just pass 
strings to in order to have them converted into UTF-8, and vice versa.
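
The same idea in Python, as an illustrative sketch only (the helper name 
utf8_sizes is made up): the built-in str.encode and bytes.decode do the 
conversion, and the byte count is taken from the encoded result instead 
of being guessed from the number of characters.

```python
def utf8_sizes(text):
    """Return (character count, UTF-8 byte count) for a string."""
    encoded = text.encode("utf-8")  # the library does the encoding work
    return len(text), len(encoded)


# "naive" spelled with U+00EF, followed by a space and U+263A:
# 7 characters, but 10 bytes once encoded (U+00EF takes 2, U+263A takes 3).
chars, octets = utf8_sizes("na\u00efve \u263a")

# Decoding the bytes gives back the original string.
round_trip = "na\u00efve \u263a".encode("utf-8").decode("utf-8")
```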


> So my point is probably better expressed by saying that programmers who 
> are able to write a websocket implementation that correctly handles 
> multi-byte utf-8 characters, will be entirely capable of handling the 
> additional "complexity" of some of the additional capabilities being 
> discussed here.

Complexity in one part of the protocol is not a license to make the 
protocol complicated everywhere.

That's a red herring, though. Multiplexing (for one) isn't omitted just, 
or even primarily, because it would be complicated, but because the 
cost-benefit ratio doesn't make it worth it. We can push the complexity of 
multiplexing (say) to the application level, so that authors who need it 
can use it, and others don't need to worry about it.

This is very much in line with the Unix philosophy of making small, 
self-contained building blocks instead of monolithic solutions.


> >> It seems an entirely reasonable simplification of websocket to use 
> >> only length limited framing and to use the type byte to indicate such 
> >> things as content charset
> > 
> > What's the use case for doing anything other than UTF-8?
> 
> Even limiting myself to js in the browser, I can think of reasons 
> that a content type other than UTF-8 would be desirable.
> 
> * Compressed UTF-8

Binary data will be supported in v2, once the JS side supports it. Then 
you can compress your data easily.

If it turns out that short messages would in fact benefit from compression 
(which personally I am skeptical about), then we can easily add it in a 
future version as a dedicated feature, too.


> * UTF-16 for those whose language happens to use a lot of >2 byte
>   utf-8 characters

If we were worried about bandwidth, gzip compressing UTF-8 would give 
better results across the board than allowing the author to switch between 
UTF-8 and UTF-16.
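
One way to check this for a given payload is simply to measure it. The 
Python sketch below is an illustration, not a benchmark. The helper name 
encoded_sizes and the sample text are made up, and zlib.compress is used 
as a stand-in for gzip (the same DEFLATE algorithm with a smaller 
wrapper). The sample is deliberately repetitive; results will vary with 
message size and content, and very short messages may not shrink at all.

```python
import zlib

def encoded_sizes(text):
    """Byte counts for one string under three candidate encodings."""
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return {
        "utf8": len(utf8),
        "utf16": len(utf16),
        "utf8_deflate": len(zlib.compress(utf8)),
    }


# Hypothetical sample: a short Japanese phrase repeated 40 times.
# Every character here takes 3 bytes in UTF-8 and 2 bytes in UTF-16.
sample = "\u3053\u3093\u306b\u3061\u306f\u3001\u4e16\u754c\u3002" * 40
sizes = encoded_sizes(sample)
```

For this sample, raw UTF-16 is smaller than raw UTF-8, and compressed 
UTF-8 is smaller than both.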


> * UTF-16 for those that can't deal with the uncertainty and/or
>   unpredictability of the length of a UTF-8 string

UTF-16 is far worse than UTF-8 when it comes to character-to-byte length 
issues, in practice.
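
A small Python sketch of one such issue (the helper name utf16_units is 
made up): UTF-16 is also a variable-width encoding, because characters 
outside the Basic Multilingual Plane take two 16-bit code units. Code 
that assumes two bytes per character works on common text and then 
breaks on these.

```python
def utf16_units(text):
    """Number of 16-bit code units needed to encode `text` in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


bmp = "\u263a"         # U+263A, inside the Basic Multilingual Plane
astral = "\U0001047e"  # U+1047E, outside it

bmp_units = utf16_units(bmp)        # one code unit (2 bytes)
astral_units = utf16_units(astral)  # a surrogate pair (4 bytes)
```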


> If we consider what a browser itself might like to do with websocket, or 
> a non-browser client might like, then we have:
> 
> * Mime encoded content for those that want to send other
>   content types down a stream.  Images, sounds etc.

We have binary frames already staked out in the protocol, waiting for 
support in JS for binary.


> * Some other framed content for those that want to implement
>   multiplexing on top of websocket (as you advocate) but don't
>   want to have to do it with text based framing.

Again, binary data will be supported in v2. The protocol already supports 
it, it's just not in the API.


> * Anybody that wants to send any content with 0xFF in it
>   and does not want to base64 code their entire content
>   as a result

Same.


> * A mobile phone that is restricted to a single outgoing
>   connection (not uncommon) so the browser wants to
>   transport HTTP over the websocket connection

Could you elaborate on this use case?


> * Something that none of us has thought of yet

Once we think of it, we can address it. There's no point trying to 
optimise for something we don't know about; we have no way to know if we 
are optimising in the right way.


> You also say that future multiplexing (or other complex things), should 
> be built on top of websocket. Yet when anybody asks for additional 
> content types to help do that, you say they are not needed and it all 
> can be done in UTF-8.

I stand by that. Multiplexing can be done in UTF-8 easily.
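
As a sketch of what that could look like at the application level, here 
is one possible convention in Python. It is entirely hypothetical and 
not part of any WebSocket draft: each text message starts with a decimal 
channel number and a colon. The names mux and demux are made up for the 
example.

```python
def mux(channel, message):
    """Tag a message with its channel number: 'channel:message'."""
    if channel < 0:
        raise ValueError("channel must be non-negative")
    return f"{channel}:{message}"


def demux(frame_text):
    """Split 'channel:message' back into (channel, message).

    Only the first colon is significant, so the message itself may
    contain colons.
    """
    channel, sep, message = frame_text.partition(":")
    if not sep or not (channel.isascii() and channel.isdigit()):
        raise ValueError("not a multiplexed message")
    return int(channel), message


# Usage: two logical channels sharing one connection.
outgoing = [mux(1, "chat: hello"), mux(2, '{"x": 10}')]
incoming = [demux(text) for text in outgoing]
```

Each tagged string would then be sent as an ordinary UTF-8 text message, 
and the receiver would dispatch on the channel number.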


> I find this amusing, because Websocket has gone against the common 
> convention for web protocols being humanly readable ascii encoded. 
> Instead it is a byte squeezed binary protocol.

I'm glad you think IP, TCP, and DNS are unconventional. :-)

WebSocket's handshake is ASCII based. It makes no sense to make the 
framing ASCII based; that would just make the error handling really 
complicated. One byte to introduce the frame and one byte to end it is 
simpler.


> Yet when it comes to building protocols on top of websocket, then all of 
> a sudden you are an advocate of text encoded protocols.

WebSocket supports both text and binary equally. The only reason the 
binary part is a second-class citizen right now is that JS doesn't support 
binary yet.


> To send a stream of images with websocket, you will need to have a 
> sentinel framed UTF-8 message that contains a JSON or mime header to 
> give the content type of the base64 or hex encoded image.  That's 3 
> envelopes around each image! Ouch!!!

To send a stream of images with WebSocket, you will need to wait until 
JavaScript gives you a way to get the image data in a usable form. I 
certainly wouldn't recommend sending it as text, that would be stupid.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'