Re: [hybi] WS framing alternative

Ian Hickson <> Tue, 27 October 2009 09:11 UTC

On Tue, 27 Oct 2009, Thomson, Martin wrote:
> WS is binary.  I believe that the benefits (reduced frame size, etc...) 
> are outweighed by the disadvantages (debugging, etc...).

WS is really a text/binary hybrid. The handshake is text-based (UTF-8), 
and the frames come in two types: one for pure binary data, delimited by a 
packed-integer length prefix, and one for text (UTF-8), delimited by a 
0x00 prefix and a 0xFF suffix.
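
As a rough illustration of the two frame types just described, here is a 
minimal Python sketch. The function names are hypothetical, and the 
7-bits-per-byte length encoding is an assumption about what "packed 
integer" means here (it agrees with the [0x80, 0x0d] example quoted later 
in this message), not something spelled out in this thread.

```python
def encode_text_frame(text: str) -> bytes:
    """Text frame: 0x00 prefix, UTF-8 payload, 0xFF suffix."""
    payload = text.encode("utf-8")
    # The byte 0xFF never occurs in valid UTF-8, so it can act as a terminator.
    return b"\x00" + payload + b"\xff"


def encode_packed_length(n: int) -> bytes:
    """ASSUMED encoding: big-endian base-128 digits, with the high bit set
    on every byte except the last one."""
    if n < 0:
        raise ValueError("length must be non-negative")
    groups = [n & 0x7F]              # last byte: high bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)  # earlier bytes: high bit set
        n >>= 7
    return bytes(reversed(groups))


def encode_binary_frame(data: bytes) -> bytes:
    """Binary frame: 0x80 type byte, packed length, then the raw octets."""
    return b"\x80" + encode_packed_length(len(data)) + data
```

Under these assumptions, a 13-byte binary payload gets the single length 
byte 0x0d, and the text frame needs no length at all.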

> There would be little cost in a minor change that would also enable use 
> of MIME.
>   WS-frame = WS-length CRLF WS-headers CRLF WS-body CRLF
>   WS-length = 1*DIGIT
>   WS-headers = WS-header-name ":" WS-header-value CRLF
>      ; add whatever MIME requires here, except the ugly stuff
>   WS-body = *OCTET

This would have some pretty major costs:

- It requires length delimiting for text frames, which is more complicated 
to implement (it's non-trivial to tell the difference between characters 
and bytes).

- It requires parsing variable-width-encoded text into a presized buffer, 
which risks character/byte mismatches and thus buffer overruns.

- It allows arbitrary extensions, which introduces an implementation, QA, 
documentation, and tutorial cost without providing any new features at the 
API level.

- It makes it possible for the headers to have errors in them, which 
requires us to define how to handle errors, and requires significantly 
more code to implement the parsing.
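
The character/byte distinction behind the first two points can be seen in 
a few lines of Python. This is only an illustrative sketch; the sample 
string and the helper name are made up for the example.

```python
def utf8_sizes(text):
    """Return (number of characters, number of UTF-8 octets) for text."""
    return len(text), len(text.encode("utf-8"))


sample = "na\u00efve \u263a"      # "naive" with a diaeresis, a space, U+263A
chars, octets = utf8_sizes(sample)

# A length prefix has to count octets. A buffer sized from the character
# count would be too small whenever the text contains non-ASCII characters.
buffer_too_small = chars < octets

# With sentinel framing the receiver just scans for 0xFF instead.
frame = b"\x00" + sample.encode("utf-8") + b"\xff"
payload = frame[1:frame.index(b"\xff")]
```

Here the two counts differ, which is exactly the kind of mismatch a 
length-delimited text frame has to get right on both ends.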

> Set some basic defaults for MIME headers (i.e. Content-Type = 
> text/plain;charset=utf-8) and the impact on existing implementations is 
> minimal.

I disagree that it's minimal.

> Rather than sending:
>   [0x80, 0x0d]Hello, World!

Currently you would send a text frame thus:

   [0x00] Hello, World! [0xFF]

> You send (indenting for readability):
>   13
>   Hello, World!
> The unknown length thing is harder to replicate, but you could do as 
> Greg suggests for BWTP, or something even simpler:
>   WS-length = 1*DIGIT [WS-incomplete-frame]
>   WS-incomplete-frame = "+"
> That is:
>   4+
>   Hell
>   9
>   o, World!

That seems like orders of magnitude more complexity than necessary, 
especially given that the only tangible benefit is a modicum of 
improvement in ease of debugging.
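
To give a feel for the receiver-side work, here is a hedged Python sketch 
of a reader for the proposed format. It follows one possible reading of 
the quoted grammar: each chunk is assumed to be 
`length ["+"] CRLF *(header CRLF) CRLF body CRLF`, with the length counted 
in octets. The function name, the header handling, and every error rule 
are assumptions, since the proposal does not define them.

```python
import io


def read_proposed_message(stream):
    """Read one message, joining "+"-continued chunks. Returns (headers, body)."""
    headers = {}
    body = b""
    while True:
        line = stream.readline()
        if not line.endswith(b"\r\n"):
            raise ValueError("truncated length line")
        token = line[:-2]
        more = token.endswith(b"+")          # "+" marks an incomplete frame
        digits = token[:-1] if more else token
        if not digits.isdigit():
            raise ValueError("malformed length")
        length = int(digits)

        # Header lines run until an empty line (assumed, MIME-style).
        while True:
            header = stream.readline()
            if not header.endswith(b"\r\n"):
                raise ValueError("truncated header")
            if header == b"\r\n":
                break
            name, sep, value = header[:-2].partition(b":")
            if not sep or not name:
                raise ValueError("malformed header")
            headers[name.strip().lower()] = value.strip()

        chunk = stream.read(length)
        if len(chunk) != length or stream.read(2) != b"\r\n":
            raise ValueError("body does not match declared length")
        body += chunk
        if not more:
            return headers, body


wire = b"4+\r\n\r\nHell\r\n9\r\n\r\no, World!\r\n"
headers, body = read_proposed_message(io.BytesIO(wire))
```

Each `raise` above is a case whose handling the spec would have to define.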

In fact, I would argue that the added complexity creates more need for 
debugging than the text-based format saves. Debugging binary protocols 
isn't especially difficult: you just pipe tcpdump through hexdump and 
less, and you're done. The protocol is so simple that you can hack 
together a read/write console in less than an hour.
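
As a sketch of how small the read side of such a console can be, the 
following Python generator splits a captured byte string into frames. It 
assumes that the high bit of the first byte selects length-prefixed 
framing and that lengths are packed 7 bits per byte; both are assumptions 
consistent with the examples in this thread rather than rules stated in 
it, and `iter_frames` is a hypothetical name.

```python
def iter_frames(data: bytes):
    """Yield (kind, payload) for each complete frame in data."""
    i = 0
    while i < len(data):
        frame_type = data[i]
        i += 1
        if frame_type & 0x80:
            # Length-prefixed frame: accumulate 7 bits per length byte
            # until a byte with the high bit clear is seen.
            length = 0
            while True:
                if i >= len(data):
                    raise ValueError("truncated length")
                b = data[i]
                i += 1
                length = length * 128 + (b & 0x7F)
                if not b & 0x80:
                    break
            if i + length > len(data):
                raise ValueError("truncated binary frame")
            yield "binary", data[i:i + length]
            i += length
        else:
            # Sentinel frame: everything up to the next 0xFF is UTF-8 text.
            end = data.find(b"\xff", i)
            if end < 0:
                raise ValueError("unterminated text frame")
            yield "text", data[i:end].decode("utf-8", errors="replace")
            i = end + 1


captured = b"\x00Hello, World!\xff\x80\x0dHello, World!"
frames = list(iter_frames(captured))
```

Printing each `(kind, payload)` pair gives a basic frame dump of a capture.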

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
                          U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'