Re: [hybi] WS framing alternative

Ian Hickson <ian@hixie.ch> Fri, 30 October 2009 04:21 UTC

Date: Fri, 30 Oct 2009 04:22:06 +0000
From: Ian Hickson <ian@hixie.ch>
To: hybi@ietf.org
In-Reply-To: <4AEA5713.8020008@it.aoyama.ac.jp>
Message-ID: <Pine.LNX.4.62.0910300346010.25616@hixie.dreamhostps.com>
References: <8B0A9FCBB9832F43971E38010638454F0F1EA72C@SISPE7MB1.commscope.com> <Pine.LNX.4.62.0910270903080.9145@hixie.dreamhostps.com> <a9699fd20910270426u4aa508cepf557b362025ae5db@mail.gmail.com> <Pine.LNX.4.62.0910271824200.25616@hixie.dreamhostps.com> <4AE76137.8000603@webtide.com> <Pine.LNX.4.62.0910272118590.25608@hixie.dreamhostps.com> <20091029123121.GA24268@almeida.jinsky.com> <4AEA0E6C.1060607@webtide.com> <4AEA5713.8020008@it.aoyama.ac.jp>

On Wed, 28 Oct 2009, Thomson, Martin wrote:
> 
> > - It requires parsing using a presized buffer for variable-encoding 
> > text, which risks character/byte mismatches and thus buffer overruns.
> 
> That's untrue.
> 
> Or you've made an assumption about my proposal that I never stated: i.e. 
> that length is stated in characters.  Which would, of course, lead to 
> these conclusions.  My intent (which I incorrectly assumed would be 
> clear) was that length was octets.  It seems obvious to me - character 
> encoding changes with different content types, thus you can't use 
> character-based encodings.

You are proposing that given a length and a series of bytes encoding text 
in a variable-sized encoding (UTF-8), the application return a series of 
characters. My point is that this means that you have two lengths (the 
length of the string in characters and the length of the string in bytes), 
so you risk inexperienced software authors making elementary yet dangerous 
mistakes in terms of how to read (or write) data to the stream. WebSocket 
tries to avoid ever mixing the two (you either deal with bytes and byte 
lengths, or you use sentinel bytes and no lengths -- you never have 
characters and byte lengths mixed together).
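
To illustrate the two-lengths hazard, here is a minimal sketch in Python. 
It is not taken from either proposal; the sample string and the function 
names read_frame_wrong and read_frame_right are hypothetical.

```python
# Illustration only: one string, two different lengths once encoded as UTF-8.
text = "na\u00efve caf\u00e9 \u263a"   # includes 2-byte and 3-byte characters

char_len = len(text)                   # length in characters (code points)
payload = text.encode("utf-8")
byte_len = len(payload)                # length in bytes on the wire


def read_frame_wrong(stream: bytes, declared_len: int) -> str:
    # BUG (deliberate): the caller passes a character count, but slicing a
    # byte stream needs a byte count, so the message comes back truncated.
    return stream[:declared_len].decode("utf-8", errors="replace")


def read_frame_right(stream: bytes, declared_len: int) -> str:
    # declared_len is a byte count; decode only after slicing the bytes.
    return stream[:declared_len].decode("utf-8")
```

In a language without bounds checking, the same confusion in the other 
direction (a byte count used to size a character buffer, or vice versa) 
is what turns into an overrun rather than a truncation.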

This isn't a problem if you assume that there will be only a few people 
implementing this and that they will all be competent. This is not an 
assumption I believe one can make if one's goal is to enable even amateurs 
to make use of this technology.


Regarding the rest of your e-mail, I don't disagree with the facts, but I 
disagree with the cost-benefit analyses, with your opinion of what is and 
what is not an acceptable risk, and with some of your goals. I don't know 
of any objective way to argue those points, and I agree that if you start 
from your positions, you wouldn't develop WebSocket the way it is 
currently specced.

What is the process for proceeding in the IETF when people have 
fundamentally different and mutually exclusive opinions?


On Thu, 29 Oct 2009, Rory Byrne wrote:
> 
> I think that Greg Wilkins is right on the money in what he says about 
> managing server resources. A byte count which tells a server how much 
> data a client intends to push to it, is a critical piece of information. 
> We're talking about servers that are going to have many thousands of 
> clients permanently connected to them for weeks, or months, on end. To 
> stay alive, these servers will need to be excellent at resource 
> management. IMO, the protocol as it stands at the moment, is not giving 
> these servers the information they require to enable them to adequately 
> manage their resources.

The server will know what the clients will generally be sending -- e.g. a 
chat client is going to be expecting messages in the range of half a 
kilobyte or less, generally. So the server can just assume that that is 
the expected size, and only needs to do anything else if a message goes 
above that point (either bail, if the protocol the server supports will 
_never_ need such a large message, or switch to a bigger buffer).
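
As a rough sketch of that strategy in Python (the class name 
MessageBuffer, the 512-byte default and the hard_limit parameter are all 
assumptions made for illustration, not part of any spec):

```python
class MessageBuffer:
    """Accumulates one message; assumes a typical size and an optional cap."""

    def __init__(self, expected=512, hard_limit=None):
        self.expected = expected      # size the server normally plans for
        self.hard_limit = hard_limit  # None means "grow as far as needed"
        self.data = bytearray()       # bytearray grows on demand

    def feed(self, chunk: bytes) -> None:
        # Bail out if the application protocol never needs messages this big.
        if (self.hard_limit is not None
                and len(self.data) + len(chunk) > self.hard_limit):
            raise ValueError("message exceeds server limit")
        self.data.extend(chunk)

    def oversized(self) -> bool:
        # True once the message has gone past the expected size, i.e. the
        # point where a server might move it to a bigger buffer.
        return len(self.data) > self.expected
```

A chat server might construct this with a small hard_limit and drop the 
connection on ValueError; a server that does expect occasional large 
messages would leave hard_limit unset.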


> In a further effort to boost our chances of building robust WebSocket 
> servers, I would hope that we might consider adding some sort of maximum 
> frame length negotiation. Nothing fancy, there could be a suitable 
> default maximum, and a 'Max-Frame-Length' header which enables a client 
> to negotiate a higher maximum. Maybe something like this:
> 
>         GET /demo HTTP/1.1
>         Upgrade: WebSocket
>         Connection: Upgrade
>         Host: example.com
>         Origin: http://example.com
>         WebSocket-Protocol: sample
>         Max-Frame-Length: 2097152
> 
>         HTTP/1.1 101 Web Socket Protocol Handshake
>         Upgrade: WebSocket
>         Connection: Upgrade
>         WebSocket-Origin: http://example.com
>         WebSocket-Location: ws://example.com/demo
>         WebSocket-Protocol: sample
>         Max-Frame-Length: 1048576
> 
> The server would only send a 'Max-Frame-Length' header if it wanted to 
> set the maximum to be lower than the client suggested. Any thoughts?

It's not necessary. The person in charge of the server is the one who 
invents the application-level protocol, and so the server can just define 
what its maximum is, without any negotiation.

It'd be like an HTTP client and server negotiating the maximum size of a 
<form> POST. If the client sends too much data, the server can just refuse 
it. The server is the one who decides what is acceptable.


On Fri, 30 Oct 2009, Greg Wilkins wrote:
> 
> I think having the ability to negotiate such parameters is a good way to 
> fail fast if a server and client are not compatible (eg client needs to 
> send larger messages than server is willing to receive).

How could that ever happen? A client isn't ever going to just randomly 
connect to a WebSocket server and start sending messages -- it's only 
going to send messages the server wants to support.


> However, the problem with making this kind of negotiation optional (and 
> this goes for my earlier examples of a load balancer communicating SSL 
> info and/or node stickyness), is that the current WS protocol has no 
> place for meta data to be added in an optional manner - so that it can 
> be ignored by implementations that don't care.

Sure, you just put them in the frames, as part of the higher-level 
protocol.


> Ian has previously said that instead of adding headers to the Upgrade 
> request/response, messages should be injected into the stream to 
> communicate this meta data.  However, because there is no way to flag 
> these messages as meta data, simple implementations are going to try to 
> handle them as messages.

You can flag messages however you like. WebSocket is like TCP -- it's just 
a transport on which you define your own protocol. Is there a way in TCP 
to define the URL of a stream? Or a way to add metadata?
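
For instance, a higher-level protocol could tag each text payload with 
its kind on the first line. The following Python sketch is one 
hypothetical convention (the "meta"/"data" tags and the encode/decode 
names are invented here), not something WebSocket defines:

```python
KINDS = ("meta", "data")


def encode(kind: str, body: str) -> str:
    # First line names the kind of message; the rest is the body.
    if kind not in KINDS:
        raise ValueError("unknown kind: " + kind)
    return kind + "\n" + body


def decode(payload: str):
    # Split on the first newline only, so bodies may contain newlines.
    kind, _, body = payload.partition("\n")
    if kind not in KINDS:
        raise ValueError("unknown kind: " + kind)
    return kind, body
```

A receiver that doesn't care about metadata can then skip any payload 
whose kind is "meta".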


> This limitation of no standard meta-data paths, means that it is nigh 
> impossible to introduce features like multiplexing, load balancing, 
> fragmentations, HTTP transport etc. as optional additional 
> specifications built on top of ws.  Because there is no meta channel, 
> simple implementations will treat everything as a message and break if 
> there is any new protocol layered on top.

This is either completely false, or we have dramatically different goals.

It's easy to add multiplexing, load balancing, or fragmentation as layers 
above WebSocket. For example, with multiplexing, you could just define 
that your frames have payloads like:

   C->S: open channel

   S->C: opened channel 1

   C->S: channel 1
         bla bla bla...

   S->C: channel 1
         foo bar baz...

   C->S: close channel 1

...or whatever.
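
The server side of that exchange could be sketched in Python roughly as 
follows. This assumes each frame payload is a header line optionally 
followed by a newline and a body; the Mux class and its method names are 
hypothetical.

```python
from typing import Optional


class Mux:
    """Toy channel multiplexer layered on top of text frames."""

    def __init__(self):
        self.next_id = 1
        self.channels = {}  # channel id -> list of bodies received

    def handle(self, payload: str) -> Optional[str]:
        """Process one frame payload; return a reply payload, if any."""
        header, _, body = payload.partition("\n")
        if header == "open channel":
            cid = self.next_id
            self.next_id += 1
            self.channels[cid] = []
            return "opened channel %d" % cid
        if header.startswith("close channel "):
            self.channels.pop(int(header.rsplit(" ", 1)[1]), None)
            return None
        if header.startswith("channel "):
            cid = int(header.split(" ", 1)[1])
            if cid not in self.channels:
                raise ValueError("channel %d is not open" % cid)
            self.channels[cid].append(body)
            return None
        raise ValueError("unrecognised payload: " + header)
```

Nothing here touches the framing itself; the multiplexer only sees 
payloads that the WebSocket layer has already delimited.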

Given the goal of just providing TCP for Web pages, it doesn't really make 
sense to have a more involved protocol.

It *would* make sense if your goal is something else, like tunneling 
Jabber over HTTP or something. But that's a different project than 
WebSockets. It's probably a perfectly reasonable thing for Hybi to work 
on, but it should be articulated as its own project.


> It appears that the whatwg want layered protocols to be controlled by 
> the allocation of frame byte values to them and only officially 
> sanctified extensions will get the nod.

Layered protocols don't need frame types. In fact, frame types are 
completely transparent to layered protocols. If your protocol is just 
text-based, you'll never need anything beyond this version of WebSocket.


> The Whatwg needs to loosen up and realize that it should not try to 
> completely control how a protocol will be used in the wild (eg an not 
> force all servers to be 100 line perl scripts).  They have to allow it 
> some room to breath and grow.  They need to allocate some space for some 
> arbitrary meta data to be used or ignored as implementations see fit.

I think you are misunderstanding the point of the frame types.


On Fri, 30 Oct 2009, Jamie Lokier wrote:
> 
> That means any intermediary (which may be just code inside a client or 
> server framework, or a proxy) will have to pass all the bytes it 
> receives as it receives them, with little or no delays, because 
> intermediaries can't know where a message ends in general, except for 
> the frame types defined in the first version.

Actually the protocol defines how you parse (find message ends) for every 
frame type possible. (0x00-0x7F use 0xFF delimiters, 0x80-0xFF use a 
length declaration.)
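
A minimal Python sketch of such a generic parser follows. The split 
between sentinel and length-prefixed frame types is as stated above; the 
exact length encoding used here (7 bits per byte, high bit set meaning 
another length byte follows) is an assumption based on the drafts of the 
time and is not spelled out in this message. The name parse_frame is 
hypothetical.

```python
def parse_frame(buf: bytes):
    """Return (frame_type, payload, bytes_consumed), or None if incomplete."""
    if not buf:
        return None
    frame_type = buf[0]

    if frame_type < 0x80:
        # Sentinel framing: payload runs up to the next 0xFF byte.
        end = buf.find(b"\xff", 1)
        if end == -1:
            return None
        return frame_type, buf[1:end], end + 1

    # Length framing (assumed encoding): base-128 digits, most significant
    # first, with the high bit marking "more length bytes follow".
    length = 0
    i = 1
    while True:
        if i >= len(buf):
            return None
        b = buf[i]
        i += 1
        length = length * 128 + (b & 0x7F)
        if not b & 0x80:
            break
    if len(buf) < i + length:
        return None
    return frame_type, buf[i:i + length], i + length
```

An intermediary using this never needs to know what a given frame type 
means in order to find where the frame ends.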


> A curious irony of WebSocket in it's current form is that there's no 
> reason a particular implementation should bother to use WebSocket 
> framing, if it doesn't wish to be accessible to Javascript browsers.

There's no reason a particular implementation should bother to use 
WebSocket _at all_, if it doesn't wish to be accessible to Javascript 
browsers. Seriously, the whole point of WebSocket is to provide an API to 
scripts. That's it. If your use case is something else, then WebSockets is 
not the appropriate solution. 


> Since you must negotiate any custom protocol layered on top of
> WebSocket out of band anyway (e.g. by agreeing what is present on some
> URL)

Actually the WebSocket-Protocol header lets you negotiate that at 
connection time.


> and because message boundaries cannot be recognised for unknown frame 
> types,

Actually that's well-defined already for all possible frame types.


> and because there is no way to report per-frame errors,

That's an application-level concern; there's no concept of errors at the 
protocol level.


> in general all software which doesn't block WebSocket and intends to be 
> future compatible must pass the byte stream along verbatim, without 
> delays.

Right.


> That means when implementing clients and servers of custom protocols, 
> you have two choices:
> 
>     - You want it to be accessible to Javascript in web browsers (in
>       the short term future).  So you must frame everything into UTF-8
>       frames, whatever kind of data it is, and define your own text
>       protocol within that.

Right.


>     - You don't care about Javascript in web browsers.  Because you
>       know anything forwarding your WebSocket connection which is
>       future compatible must pass along arbitrary bytes without
>       delays, because there is no generic way for it to know frame
>       lengths, you might as well send frame type byte NN (pick a
>       number that is not used) followed by raw TCP with your own
>       protocol after the upgrade request, if that is convenient.
> 
>       There seems to be no reason to use the spec's binary frames, if
>       a TCP stream is more convenient for some application.

There seems to be no reason to use WebSocket at all in this situation. It 
doesn't provide anything more than TCP if you're not targeting JavaScript 
in Web browsers.


On Fri, 30 Oct 2009, "Martin J. Dürst" wrote:
> 
> That said, however, I also agree with some other people on this list that
> extension points should be considered carefully. Greg's proposal:
> 
> >   1) WS should allow arbitrary headers in the upgrade request/response
> >
> >   2) WS should define a frame type for transport meta data, whose
> >      content is defined only as Mime (or just mime headers).
> >      The semantics of these can be left to other protocols built
> >      on top of WS.
> 
> seems to be a very good starting point.

Could you elaborate on what problem this solves?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'