Re: [hybi] how do we move forward on agreeing on framing?

John Tamplin <jat@google.com> Thu, 19 August 2010 17:28 UTC

In-Reply-To: <AANLkTi=tw7PPQzn4U0qEO7dDKncWUt0J8eUK2QBoHq90@mail.gmail.com>
References: <AANLkTineuhvGsC_vca6AiAX8OmHdkE-7s7rA1DQtjtMm@mail.gmail.com> <1282231803.22142.649.camel@vulcan.aspl.local> <AANLkTim44=x0BRpF3BYMqS9GNzHA+icG818JgfRRaFPT@mail.gmail.com> <AANLkTi=tw7PPQzn4U0qEO7dDKncWUt0J8eUK2QBoHq90@mail.gmail.com>
From: John Tamplin <jat@google.com>
Date: Thu, 19 Aug 2010 13:28:34 -0400
Message-ID: <AANLkTinu+3AYRuQJXXoUa6mw4UAuGxO1Wu5pOfs7NFCP@mail.gmail.com>
To: gustav trede <gustav.trede@gmail.com>
Cc: Hybi <hybi@ietf.org>
Subject: Re: [hybi] how do we move forward on agreeing on framing?

On Thu, Aug 19, 2010 at 1:02 PM, gustav trede <gustav.trede@gmail.com> wrote:

> Only one thing that I fail to understand, regarding the payload length
> encoding:
> What technical reason makes it so much better than the original binary
> frame length encoding (testing (byte[i] & 0x80) == 0x80 to detect the
> length bytes) that its rather massive waste of bandwidth is justified?
>

There are a couple of problems with it:

   - you can have an arbitrary number of leading 0x80 bytes, so there is no
   single representation of a length
   - you can have an arbitrary number of length bytes, which means that to
   properly support the spec you have to implement multi-precision arithmetic.
   In practice, implementations will likely use 32- or 64-bit ints, and then
   some attacker will exploit those implementations.  With this proposal, every
   implementor knows they must support 64-bit lengths.
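
To make the two problems concrete, here is a minimal sketch of a decoder
for the v76-style length, assuming each length byte carries 7 bits of the
value and a set high bit (0x80) means another length byte follows.  The
function name is made up for illustration, and Python's unbounded ints
hide the overflow that a 32- or 64-bit implementation would hit.

```python
def decode_v76_length(data):
    """Decode a 7-bits-per-byte length; return (length, bytes_used).

    Assumption: the high bit (0x80) of a byte means "more length bytes follow".
    """
    length = 0
    for used, b in enumerate(data, start=1):
        length = (length << 7) | (b & 0x7F)
        if (b & 0x80) == 0:
            return length, used
    raise ValueError("truncated length field")


# Problem 1: leading 0x80 bytes add nothing, so one length has many encodings.
canonical = bytes([0x81, 0x00])
padded = bytes([0x80, 0x80, 0x81, 0x00])

# Problem 2: nothing bounds the number of length bytes.  Eleven bytes carry
# 77 bits, which would silently overflow a 64-bit accumulator.
huge = bytes([0xFF] * 10 + [0x7F])

if __name__ == "__main__":
    print(decode_v76_length(canonical))
    print(decode_v76_length(padded))
    print(decode_v76_length(huge)[0].bit_length())
```

Both the canonical and the padded input decode to the same length, and
only the number of bytes consumed differs.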

If you fix those problems by disallowing leading 0x80 bytes and capping the
number of length bytes at 9 (for the same 63-bit maximum length), you
now have 9 steps in the length, and I don't think they are justified.  The
8 bytes of extended length are not significant for a larger message.  The
worst case is the smallest payload that needs them, 127 bytes: the overall
message length is 137 bytes (2 byte fixed header, 8 byte extended length,
payload), giving an overhead of 7.3%.  More typical payload sizes will result
in overheads from the extra length bytes of well under half a percent.
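
As a rough sketch of the two-step length and the arithmetic above: I am
assuming here that the length byte holds 0-126 directly and that the
value 127 marks an 8-byte extended length, sent big-endian.  The byte
order, the unused high bit, and the function names are assumptions for
illustration, not something the proposal text above pins down.

```python
import struct


def encode_length(n):
    """Two-step length: one byte for 0..126, else marker 127 + 8 bytes.

    Assumptions: 127 is the marker value and the extended length is
    big-endian; the maximum is 63 bits as discussed above.
    """
    if n < 0 or n >= 1 << 63:
        raise ValueError("length out of range")
    if n <= 126:
        return bytes([n])
    return bytes([127]) + struct.pack(">Q", n)


def frame_overhead(n):
    """Fraction of the frame that is not payload, counting one opcode/flags byte."""
    header = 1 + len(encode_length(n))
    return header / (header + n)


if __name__ == "__main__":
    # Worst case: the smallest payload that needs the extended length.
    print(round(100 * frame_overhead(127), 1))
    # A larger message, where the 8 extra bytes barely register.
    print(round(100 * frame_overhead(100000), 2))
```

For 127 payload bytes this gives 10 header bytes out of 137, the 7.3%
figure above.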

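The two length encoding mechanisms compare like this.  The byte counts cover
the length field only (ignoring the opcode/flags byte and any TCP/IP
headers); the overall overhead percentages for each (v76 first, then this
proposal) also count the opcode/flags byte: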

   - 0-126 octets - both take one additional byte
   - 127 octets - v76 takes 1 byte, this proposal takes 9 (1.6%, 7.3%)
   - 128-16383 octets - v76 takes 2 bytes, this proposal takes 9 (2.3%-.02%,
   7.2%-.06%)
   - 16384-2097151 octets - v76 takes 3 bytes, this proposal takes 9
   (.02%-.0002%, .06%-.0005%)
   - ...
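
The comparison can be reproduced with a short script.  This is only a
sketch: it assumes v76 uses 7 value bits per length byte, that this
proposal uses 1 length byte up to 126 octets and 9 beyond that, and that
every frame carries one opcode/flags byte.  The function names are
hypothetical.

```python
def v76_length_bytes(n):
    """Length bytes needed at 7 value bits per byte (v76-style encoding)."""
    count = 1
    while n >= 128:
        n >>= 7
        count += 1
    return count


def proposal_length_bytes(n):
    """One length byte up to 126 octets, otherwise 1 + 8 extended bytes."""
    return 1 if n <= 126 else 9


def overhead_percent(n, length_bytes):
    """Header share of the whole frame, counting one opcode/flags byte."""
    header = 1 + length_bytes
    return 100.0 * header / (header + n)


if __name__ == "__main__":
    # Payload sizes at the edges of each step; 2097151 is the largest
    # value that fits in three 7-bit groups.
    for n in (126, 127, 128, 16383, 16384, 2097151):
        v76 = v76_length_bytes(n)
        new = proposal_length_bytes(n)
        print(n, v76, new,
              "%.4f%%" % overhead_percent(n, v76),
              "%.4f%%" % overhead_percent(n, new))
```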

You can already see that there is no significant difference in overhead for
the larger packet sizes, yet we would carry the complexity of all these
different length sizes for little benefit.  When you consider TCP/IP headers
as well, the difference in overhead is negligible.

Yes, it is more efficient to have a 2-byte length format for small packets
just above 127 bytes, but is that worth the additional complexity of extra
length steps?  Profiling of real-world WebSocket apps (admittedly, there are
few currently) suggests that there will be lots of very small packets
(especially once compression is supported), and a few larger packets.  So I
think that matches well with the proposal to provide exactly two steps of
length sizes.

-- 
John A. Tamplin
Software Engineer (GWT), Google