Re: [hybi] deflate-stream and masking

Bjoern Hoehrmann <> Sun, 24 July 2011 20:22 UTC

From: Bjoern Hoehrmann <>
To: Greg Wilkins <>
Date: Sun, 24 Jul 2011 22:22:30 +0200
Cc: Hybi <>
Subject: Re: [hybi] deflate-stream and masking

* Greg Wilkins wrote:
>I took a day's worth of traffic from an IRC channel and wrapped it up
>as JSON messages sent as websocket frames.
>There were 487 messages that looked like:
>     {channel:"#webtide", username:"tbecker", text:"joakime: jenkins
>had issues pulling from github a couple of times  last week"}
>As an unmasked WS stream, it was 50675 bytes, and as a masked stream
>it was 52623 bytes.
>I then compressed both these streams with gzip and got 13306 bytes for
>unmasked and 51704 bytes for the masked!!!!

A deflate stream consists of blocks, and each block consists of tables
and symbols. A symbol represents either bits from the data being
compressed, most of which are affected by the mask, or a back-reference
that instructs the decoder to copy bytes, which is not affected by the
mask. If you create two files, one with "1234" repeating and the other
with "abcd" repeating, and `gzip` them, only a few bytes will differ.
You can trivially create a WebSocket stream in which some byte sequence
repeats.
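
The "1234" versus "abcd" comparison can be tried with a minimal sketch
like the one below. It assumes Python's zlib module (zlib framing rather
than the `gzip` tool, to avoid a timestamp in the header) and an
arbitrary repeat count; count_differences is a helper made up for this
illustration.

```python
import zlib

REPEATS = 1000  # arbitrary size chosen for illustration

plain_digits = b"1234" * REPEATS
plain_letters = b"abcd" * REPEATS

# Same deflate algorithm as gzip, but with the zlib wrapper.
packed_digits = zlib.compress(plain_digits)
packed_letters = zlib.compress(plain_letters)


def count_differences(a: bytes, b: bytes) -> int:
    """Count positions where a and b differ, plus any length difference."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


differing = count_differences(packed_digits, packed_letters)
print("input bytes:     ", len(plain_digits))
print("compressed bytes:", len(packed_digits), len(packed_letters))
print("differing bytes: ", differing)
```

Most of each compressed stream is back-references, so the differences
should be confined to the few literals and the trailing checksum.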

If you want to have "GET ..." on the wire, well, that can be just some

  <append x bytes from position y to the output>
  <append x bytes from position y to the output>
  <append x bytes from position y to the output>

bits in the deflate stream. That is not easy to force, because encoders
have many options in how they create the stream regardless of masking,
and masking does make it more difficult. But you are greatly aided by,
for instance, having all bytes in the mask be the same, which is not
generally the case if you try to subvert masking without compression.
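
To make the "append x bytes" instructions concrete, here is a toy
decoder, a sketch only and not real deflate: there are no Huffman
codes, the token format is invented for this illustration, and copies
use a distance back from the end of the output (as deflate does)
rather than an absolute position.

```python
def decode(tokens):
    """Expand a list of toy tokens into bytes.

    ("lit", data)              appends the literal bytes in data
    ("copy", distance, length) appends length bytes, each taken from
                               distance bytes before the current end
    """
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out += token[1]
        else:
            _, distance, length = token
            # Copy byte by byte so a copy may overlap its own output.
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)


# Hypothetical token stream: five literal bytes, then one copy.
tokens = [("lit", b"GET /"), ("copy", 5, 10)]
print(decode(tokens))
```

Only the literal token carries payload bytes; the copy token is just
two numbers, which is why such tokens are independent of the mask.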

Without some accepted proof to that effect, the Working Group cannot
claim that masking notably changes the security properties of the
protocol under deflate-stream. That means deflate-stream would be safe
to use only where it would be safe to use the same mask for all frames,
in which case you no longer have the "compresses poorly" problem, and
it would clearly rule out implementing it in web browsers that "need"
masking.
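
The effect of the mask on compression can be sketched as follows. This
is an illustration under stated assumptions, not a WebSocket
implementation: the messages are synthetic stand-ins for the IRC data,
frame headers are reduced to the 4-byte key, and the names mask and
build_stream are made up here.

```python
import random
import zlib


def mask(payload: bytes, key: bytes) -> bytes:
    """XOR the payload with a 4-byte key, restarting at each frame."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))


rng = random.Random(2011)  # seeded so runs are repeatable
users = ["alice", "bob", "carol"]  # hypothetical usernames
messages = [
    ('{channel:"#example", username:"%s", text:"message number %d"}'
     % (rng.choice(users), n)).encode("ascii")
    for n in range(500)
]


def build_stream(payloads, next_key):
    """Concatenate key + masked payload for every message."""
    parts = []
    for payload in payloads:
        key = next_key()
        parts.append(key + mask(payload, key))
    return b"".join(parts)


fixed_key = bytes([0x12, 0x34, 0x56, 0x78])
streams = {
    "unmasked": b"".join(messages),
    "same key for all frames": build_stream(messages, lambda: fixed_key),
    "fresh key per frame": build_stream(
        messages, lambda: bytes(rng.getrandbits(8) for _ in range(4))),
}
sizes = {name: len(zlib.compress(data)) for name, data in streams.items()}
for name, size in sizes.items():
    print(name, len(streams[name]), "->", size)
```

With one key for all frames, identical message prefixes stay identical
after masking, so the compressor can still find them.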

If the base protocol specification were to have a feature that cannot
be implemented by a very large segment of implementations for security
reasons, that feature would have to have extremely important benefits;
deflate-stream, however, does not have them. Moreover, there does not
seem to be any kind of consensus that extensions should be able to
arbitrarily modify the on-the-wire format. Either way, it should not be
in the base protocol specification.

Björn Höhrmann
Am Badedeich 7 · Telefon: +49(0)160/4415681
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78