Re: [hybi] Multiplexing in WebSocket
Ian Hickson <ian@hixie.ch> Sat, 24 October 2009 10:55 UTC
Date: Sat, 24 Oct 2009 11:09:07 +0000
From: Ian Hickson <ian@hixie.ch>
To: Greg Wilkins <gregw@webtide.com>
In-Reply-To: <4AE23D7A.2060009@webtide.com>
Message-ID: <Pine.LNX.4.62.0910240926500.9145@hixie.dreamhostps.com>
References: <4ACE50A2.5070404@ericsson.com> <3a880e2c0910081600v3607665dp193f6df499706810@mail.gmail.com> <4ACF4055.6080302@ericsson.com> <Pine.LNX.4.62.0910092116010.21884@hixie.dreamhostps.com> <4AD2E353.8070609@webtide.com> <4AD2F43D.6030202@ninebynine.org> <4AD39A64.4080405@webtide.com> <Pine.LNX.4.62.0910132335390.25383@hixie.dreamhostps.com> <4AD53DCA.6050304@webtide.com> <Pine.LNX.4.62.0910170203460.9145@hixie.dreamhostps.com> <4ADA7FD4.9010406@webtide.com> <4ADB6F0B.4000004@gmail.com> <Pine.LNX.4.62.0910221120380.9145@hixie.dreamhostps.com> <4AE08907.7080402@webtide.com> <Pine.LNX.4.62.0910230348470.9145@hixie.dreamhostps.com> <4AE1E659.5050507@webtide.com> <Pine.LNX.4.62.0910232154470.13521@hixie.dreamhostps.com> <4AE23D7A.2060009@webtide.com>
Content-Language: en-GB-hixie
Content-Style-Type: text/css
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Cc: hybi@ietf.org
Subject: Re: [hybi] Multiplexing in WebSocket
On Fri, 23 Oct 2009, Greg Wilkins wrote:
> Ian Hickson wrote:
> > On Fri, 23 Oct 2009, Greg Wilkins wrote:
> >> Ian Hickson wrote:
> >>> On Thu, 22 Oct 2009, Joshua Bell wrote:
> >>>
> >>>> * I seem to recall that one of the desires for sentinel-based
> >>>> frames was to allow octet streams for which the length was not
> >>>> known in advance.
> >>>
> >>> No; the only reason for sentinel-based frames was to not rely on
> >>> authors having to determine the length of their UTF-8-encoded
> >>> strings, which in many environments can be easy to get wrong.
> >>
> >> Authors that can't determine the length of a UTF-8 string are not
> >> exactly the sort of developers that should be implementing network
> >> protocols.
> >
> > Wow.
> >
> > I cannot stand behind such a judgmental statement. Personally I would
> > like to make this kind of thing accessible to as many people as
> > possible.
>
> I don't understand your Wow. Buffer overflows have historically been
> one of the biggest security issues with any services exposed to the
> internet.

My "Wow" was to your comment saying that someone might not be the sort
of developer that should be implementing something.

WebSocket is actually intentionally designed to make buffer overruns
harder to accidentally write into code.

For binary data, there is no sentinel, so you can allocate the given
amount of memory and fill it, without any risk of overrunning anything.
Indeed, you can't do anything else, since there's no way to detect the
end of the frame other than by its length. The only risk is with null
length frames, but there no data is written into them, so the damage, if
any, is limited.

For text data, there is no length marker, so the only way to have an
overrun is to decide on a fixed maximum size and then have someone send
you a frame that's larger. Provided you don't make that mistake, there's
no way to be tricked into a buffer overrun as far as I can tell, since
there's no up-front length declaration.

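To make the text-frame case concrete, here is a minimal sketch in
Python. It assumes only the 0x00 / 0xFF delimiters mentioned further
down in this message; the function names are hypothetical and are not
taken from any spec draft or library. The reader grows its buffer as
bytes arrive, so there is no fixed-size allocation to overrun, and the
writer never has to compute a byte length.

```python
import io

TEXT_FRAME_START = 0x00  # byte that opens a text frame
TEXT_FRAME_END = 0xFF    # byte that closes it; never occurs in valid UTF-8


def encode_text_frame(text: str) -> bytes:
    """Wrap UTF-8 text in the two delimiter bytes; no length is computed."""
    return bytes([TEXT_FRAME_START]) + text.encode("utf-8") + bytes([TEXT_FRAME_END])


def read_text_frame(stream: io.BufferedIOBase) -> str:
    """Read one sentinel-delimited text frame from a byte stream."""
    first = stream.read(1)
    if first != bytes([TEXT_FRAME_START]):
        raise ValueError("expected the start of a text frame")
    payload = bytearray()  # grows on demand, so no fixed maximum is assumed
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside a frame")
        if byte[0] == TEXT_FRAME_END:
            return payload.decode("utf-8")
        payload.append(byte[0])
```

A real implementation would still want a policy for absurdly large
frames (for example, closing the connection), but that is a resource
limit rather than a memory-safety problem.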
> Utf-8 will mostly have a 1 character to 1 byte mapping (at least for
> english speakers), so many programmers will write code that they think
> works by allocating byte buffers the same size as their characters
> strings. But then when giving multi-byte characters they will suffer
> buffer overflows.

The way the protocol is designed, they don't need to allocate those
buffers. If they use UTF-8 internally, they can just output the string
straight into the I/O layer, with a 0x00 byte first, and a 0xFF byte
after. That's the reason for the server-to-client data not having a
length field for UTF-8, actually. (The protocol wasn't intended to be
symmetric, it just ended up that way because that design made sense
given the goals of making parsing and serialising easy for the server
side.)

> I've made this programming error myself many times! So perhaps I too am
> one of the authors that should not be writing network protocols.

If you don't think you should be writing one, you almost certainly
shouldn't be designing one. :-)

> The level of programming skill needed to manage meta data or channels is
> entirely of the same order of magnitude as managing utf-8 encoding.

The main difference is that most people won't manage UTF-8 encoding. In
Perl, for instance, there's the "Encode" library that you can just pass
byte strings to to have them converted into UTF-8, and vice-versa.

> So my point is probably better expressed by saying that programmers who
> are able to write a websocket implementation that correctly handles
> multi-byte utf-8 characters, will be entirely capable of handling the
> additional "complexity" of some of the additional capabilities being
> discussed here.

Complexity in one part of the protocol is not a license to make the
protocol complicated everywhere.

That's a red herring, though. Multiplexing (for one) isn't omitted just,
or even primarily, because it would be complicated, but because the
cost-benefit ratio doesn't make it worth it.
We can push the complexity of multiplexing (say) to the application
level, so that authors who need it can use it, and others don't need to
worry about it. This is very much in line with the Unix philosophy of
making small self-contained building blocks, instead of monolithic
solutions.

> >> It seems an entirely reasonable simplification of websocket to use
> >> only length limited framing and to use the type byte to indicate such
> >> things as content charset
> >
> > What's the use case for doing anything other than UTF-8?
>
> Even limiting myself to js in the browser, I can think of reasons
> that a content type other than UTF-8 would be desirable.
>
>  * Compressed UTF-8

Binary data will be supported in v2, once the JS side supports it. Then
you can compress your data easily. If it turns out that short messages
would in fact benefit from compression (which personally I am skeptical
about), then we can easily add it in a future version as a dedicated
feature, too.

>  * UTF-16 for those whose language happens to use a lot of >2 byte
>    utf-8 characters

If we were worried about bandwidth, gzip compressing UTF-8 would give
better results across the board than allowing the author to switch
between UTF-8 and UTF-16.

>  * UTF-16 for those that can't deal with the uncertainty and/or
>    unpredictability of the length of a UTF-8 string

UTF-16 is far worse than UTF-8 when it comes to character-to-byte length
issues, in practice.

> If we consider what a browser itself might like to do with websocket, or
> a non-browser client might like, then we have:
>
>  * Mime encoded content for those that want to send other
>    content types down a stream. Images, sounds etc.

We have binary frames already staked out in the protocol, waiting for
support in JS for binary.

>  * Some other framed content for those that want to implement
>    multiplexing on top of websocket (as you advocate) but don't
>    want to have to do it with text based framing.

Again, binary data will be supported in v2.
The protocol already supports it, it's just not in the API.

>  * Anybody that wants to send any content with 0xFF in it
>    and does not want to base64 code their entire content
>    as a result

Same.

>  * A mobile phone that is restricted to a single outgoing
>    connection (not uncommon) so the browser wants to
>    transport HTTP over the websocket connection

Could you elaborate on this use case?

>  * Something that none of us has thought of yet

Once we think of it, we can address it. There's no point trying to
optimise for something we don't know about; we have no way to know if we
are optimising in the right way.

> You also say that future multiplexing (or other complex things), should
> be built on top of websocket. Yet when anybody asks for additional
> content types to help do that, you say they are not needed and it all
> can be done in UTF-8.

I stand by that. Multiplexing can be done in UTF-8 easily.

> I find this amusing, because Websocket has gone against the common
> convention for web protocols being humanly readable ascii encoded.
> Instead it is a byte squeezed binary protocol.

I'm glad you think IP, TCP, and DNS are unconventional. :-)

WebSocket's handshake is ASCII based. It makes no sense to make the
framing ASCII based; that would just make the error handling really
complicated. One byte to introduce the frame and one byte to end it is
simpler.

> Yet when it comes to building protocols on top of websocket, then all of
> a sudden you are an advocate of text encoded protocols.

WebSocket supports both text and binary equally. The only reason the
binary part is a second-class citizen right now is that JS doesn't
support binary yet.

> To send a stream of images with websocket, you will need to have a
> sentinel framed UTF-8 message that contains a JSON or mime header to
> give the content type of the base64 or hex encoded image. That's 3
> envelopes around each image! Ouch!!!

To send a stream of images with WebSocket, you will need to wait until
JavaScript gives you a way to get the image data in a usable form. I
certainly wouldn't recommend sending it as text, that would be stupid.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
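
As an illustration of the claim above that multiplexing can be done in
UTF-8 at the application level, here is one possible sketch in Python.
The "channel:message" convention, and every name in the code, is a
hypothetical choice made for this example; nothing in the protocol or
in this thread prescribes it. Each text message carries a decimal
channel number, a colon, and then the payload, so one connection can
carry several logical streams.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


def mux_encode(channel: int, message: str) -> str:
    """Prefix a message with its channel number (hypothetical convention)."""
    return f"{channel}:{message}"


def mux_decode(payload: str) -> Tuple[int, str]:
    """Split a received text message back into (channel, message)."""
    channel, sep, message = payload.partition(":")
    if not sep or not (channel.isascii() and channel.isdigit()):
        raise ValueError("malformed multiplexed message")
    return int(channel), message


class Demultiplexer:
    """Collects incoming messages into one queue per channel."""

    def __init__(self) -> None:
        self.queues: DefaultDict[int, List[str]] = defaultdict(list)

    def feed(self, payload: str) -> None:
        channel, message = mux_decode(payload)
        self.queues[channel].append(message)
```

Only the first colon is treated as the separator, so payloads may
contain colons themselves. Authors who don't need channels simply never
use such a layer.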