Re: [hybi] Frame size

Ian Hickson <ian@hixie.ch> Fri, 16 April 2010 19:52 UTC

Date: Fri, 16 Apr 2010 19:52:30 +0000
From: Ian Hickson <ian@hixie.ch>
To: "Thomson, Martin" <Martin.Thomson@andrew.com>
In-Reply-To: <8B0A9FCBB9832F43971E38010638454F03E3F313ED@SISPE7MB1.commscope.com>
Message-ID: <Pine.LNX.4.64.1004161940180.751@ps20323.dreamhostps.com>
References: <8B0A9FCBB9832F43971E38010638454F03E3F313ED@SISPE7MB1.commscope.com>
Content-Language: en-GB-hixie
Content-Style-Type: text/css
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Cc: Hybi <hybi@ietf.org>

On Fri, 16 Apr 2010, Thomson, Martin wrote:
> 
> Proposal:
> 
>  - Frame size is indicated up front.

This seems incompatible with the requirement that we not expose amateur 
programmers to the complications of measuring UTF-8 strings.


>  - Frames are binary.

Not sure what this means in this context, unless you mean the length is in 
bytes and not in characters; if that is what you mean, then I agree that 
any length in the protocol should be in bytes.


>  - Frame size be strictly limited (2 octets should suffice).
>  - Sub-protocol needed if messages are larger than the max frame.

This seems like it would make the API highly unpredictable. For example, 
it would mean that you could transmit 65,535 characters if they were the 
character "X", but only 21,845 if they were the character U+263A. That 
would be very confusing. Similarly, if we add compression, it would mean 
that how many bytes you could pass in the send() method would depend on 
how well it compressed. This would result in a very bad programming 
experience, especially for anyone new to this kind of thing. (I would 
expect most subprotocols to be ridiculously primitive; asking amateurs to 
figure out chunking mechanics is non-trivial, especially if they have to 
work out where to split a UTF-16 string as exposed in JS, before it is 
converted to UTF-8 and compressed, which is what matters for deciding what 
will fit in the frame.)
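
To make the arithmetic concrete, here is a minimal Python sketch. It assumes 
the frame limit is the largest value of a 2-octet length field (65,535 
bytes) and uses zlib as a stand-in for whatever compression might be added; 
the function names are hypothetical and not part of any proposal.

```python
import zlib
from typing import List

# Assumption: the limit is the largest value a 2-octet length field can hold.
MAX_FRAME_BYTES = 0xFFFF


def utf8_len(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def max_repeats(ch: str, limit: int = MAX_FRAME_BYTES) -> int:
    """How many copies of one character fit in a single frame."""
    return limit // utf8_len(ch)


def chunk_for_frames(text: str, limit: int = MAX_FRAME_BYTES) -> List[bytes]:
    """Split text into UTF-8 chunks of at most `limit` bytes each,
    never cutting a character's byte sequence in half."""
    if limit < 4:
        raise ValueError("limit must hold at least one 4-byte character")
    chunks: List[bytes] = []
    current = bytearray()
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(current) + len(encoded) > limit:
            chunks.append(bytes(current))
            current = bytearray()
        current += encoded
    if current:
        chunks.append(bytes(current))
    return chunks


def fits_compressed(payload: bytes, limit: int = MAX_FRAME_BYTES) -> bool:
    """Whether the payload fits in one frame after zlib compression."""
    return len(zlib.compress(payload)) <= limit


print(max_repeats("X"))       # 1 byte per character in UTF-8
print(max_repeats("\u263a"))  # 3 bytes per character in UTF-8
```

Python strings are sequences of code points, so this sketch sidesteps the 
surrogate-pair problem that a script author would face with UTF-16 strings 
in JS; there, the split point would also have to avoid landing between the 
two halves of a surrogate pair.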


> Binary frames isolate user-space (the payload) from the framing layer.
> We had this discussion in the meeting (cf. Pete Resnick's comments).
> Thus,
>  - binary frames are easier to process (intermediaries don't have to inspect every octet)
>  - kinder on implementations (framing component does not depend on UTF-8 component)
>  - isolates framing layer from the bugs of higher layers

If we were designing a protocol for experts, I would completely agree. 


> IF used for UTF-8 AND implementer counts characters instead of octets 
> THEN framing doesn't work. [...] One solution to this problem is to 
> start a frame with a known sequence of octets, so that this can be 
> detected.

That's an interesting approach, but I worry that anything hardcoded like 
this would be handled in ways that defeat the purpose. This is why, for 
instance, the proposed handshake uses unpredictable keys to force the 
server to prove it read the handshake -- otherwise, servers written by 
amateurs might not actually read the handshake but just send back the 
right response, simply assuming the handshake came from a Web Socket 
client. If it weren't for targeting amateur programmers, the handshake 
could be a heck of a lot shorter; we could even do away with the 
requirement for a round trip and for the client to test the handshake, 
since we could assume the server would only accept connections that were 
trusted.
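
The general idea of an unpredictable key can be sketched as follows. This is 
an illustration only, not the algorithm in the proposed handshake: the use 
of SHA-256 over a random hex token, and every function name, are 
assumptions made for the example.

```python
import hashlib
import secrets
from typing import Callable


def make_challenge() -> str:
    """Client side: a fresh, unpredictable key for each connection."""
    return secrets.token_hex(16)


def expected_response(challenge: str) -> str:
    """The reply only a server that actually read the challenge can produce.
    (Assumption: a plain SHA-256 digest stands in for the real derivation.)"""
    return hashlib.sha256(challenge.encode("ascii")).hexdigest()


def honest_server(challenge: str) -> str:
    """Reads the handshake and derives the reply from it."""
    return expected_response(challenge)


# A reply captured from some earlier exchange and then hardcoded.
CANNED_REPLY = expected_response("0" * 32)


def lazy_server(_challenge: str) -> str:
    """Ignores the handshake and always sends the same canned reply."""
    return CANNED_REPLY


def client_accepts(server: Callable[[str], str]) -> bool:
    """Client side: send a new challenge and verify the server's reply."""
    challenge = make_challenge()
    reply = server(challenge)
    return secrets.compare_digest(reply, expected_response(challenge))


print(client_accepts(honest_server))
print(client_accepts(lazy_server))
```

Because the challenge changes on every connection, a server that replies 
with a fixed string is rejected, which is the property a hardcoded frame 
prefix cannot provide.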

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'