[hybi] preliminary WebSockets compression experiments

John Tamplin <jat@google.com> Fri, 23 April 2010 19:48 UTC


[This is a resend of a message sent on Monday that never appeared to show
up - apologies if this is a duplicate]

I have run some experiments on existing WebSockets applications to see what
could be gained if compression were enabled.

To do this, I captured traces with tcpdump, then wrote a small libpcap-based
program to strip out individual TCP streams to separate files.  I then wrote
another program which read these files, removed the WebSocket framing, and
tried compressing them in various ways using zlib and analyzed the results.
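
A minimal sketch of the frame-stripping step, not the original tool.  It
assumes the text framing in use at the time (a 0x00 byte, the UTF-8 payload,
then a 0xFF terminator) and that the HTTP handshake has already been removed
from the stream; the function name split_text_frames is hypothetical.

```python
def split_text_frames(stream):
    """Return the payloads of 0x00 ... 0xFF delimited text frames in a byte stream."""
    payloads = []
    pos = 0
    while pos < len(stream):
        frame_type = stream[pos]
        if frame_type != 0x00:
            raise ValueError("unexpected frame type 0x%02x at offset %d" % (frame_type, pos))
        # 0xFF never occurs inside valid UTF-8, so the next 0xFF ends the frame.
        end = stream.find(b"\xff", pos + 1)
        if end == -1:
            break  # capture ended mid-frame; drop the partial frame
        payloads.append(stream[pos + 1:end])
        pos = end + 1
    return payloads


if __name__ == "__main__":
    sample = b'\x00{"op":"ping"}\xff\x00{"op":"pong"}\xff'
    for payload in split_text_frames(sample):
        print(len(payload), payload)
```

Only the payload bytes are kept, which matches the choice to leave framing
overhead out of the byte counts.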

The applications tested were:

   - A hacked version of Google Wave that used WebSockets for communication
   with the server, with Chrome 5 as the browser.  The messages in this case
   are JSON consisting of ASCII text.  (Note that this was a relatively small
   trace; I am getting a larger trace to make sure the results hold.)
   - GWT Quake, which is an HTML5 app that gets up to 60fps running in the
   browser using WebSockets for communication with the server.  The messages
   are largely binary, with individual bytes being encoded as UTF8 characters
   (so bytes 0x80-0xFF are encoded as a two-byte UTF8 character), and those
   binary values are largely IEEE floating point values.  The sample taken was
   from two players, one using Chrome 5 on Windows and the other using WebKit
   nightly on Mac playing a multiplayer game.
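
To illustrate the size cost of carrying binary data in text frames, here is
a small sketch.  It assumes each byte is mapped to the code point with the
same value (U+0000 to U+00FF) before UTF-8 encoding, which is consistent with
the description above but is not confirmed from the GWT Quake source; the
function names are hypothetical.

```python
import struct


def encode_binary_as_utf8(raw):
    """Map each byte to the code point of the same value, then UTF-8 encode."""
    return raw.decode("latin-1").encode("utf-8")


def decode_utf8_to_binary(wire):
    """Inverse of encode_binary_as_utf8."""
    return wire.decode("utf-8").encode("latin-1")


if __name__ == "__main__":
    value = struct.pack("<f", 1.5)  # a 4-byte IEEE single-precision float
    wire = encode_binary_as_utf8(value)
    print(len(value), "raw bytes ->", len(wire), "bytes on the wire")
```

Bytes below 0x80 stay one byte each, while every byte from 0x80 to 0xFF
doubles in size.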

The compression methods tried were:

   - GZIP - using zlib with a gzip header (i.e., deflateInit2(zstr,
   Z_DEFAULT_COMPRESSION, Z_DEFLATED, 15 | 16, 8, Z_DEFAULT_STRATEGY)) - this
   should closely model the gzip encoding used for HTTP responses.
   - DEFLATE - similar, but using deflateInit(zstr, Z_DEFAULT_COMPRESSION), so
   only the zlib header is included.
   - DEFLATE/STREAM - the two methods above compress each frame separately,
   so the compression dictionary has to be rebuilt for each message.  This
   method instead maintains compression state across messages in the same
   stream (using deflate(zstr, Z_SYNC_FLUSH) to finish an individual frame),
   so later messages can exploit redundancy from previous messages.  This does
   have the downside of maintaining state for the duration of the connection,
   but there is already significant state due to keeping a TCP connection up.
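
The three methods can be sketched with Python's zlib module, which wraps the
same library.  This is an illustration under that assumption, not the
analysis program itself; the function and class names are hypothetical.

```python
import zlib


def gzip_frame(payload):
    """Compress one frame on its own with a gzip header and trailer."""
    # wbits = 15 | 16 selects the gzip wrapper, as in the deflateInit2 call.
    c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED,
                         15 | 16, 8, zlib.Z_DEFAULT_STRATEGY)
    return c.compress(payload) + c.flush()


def deflate_frame(payload):
    """Compress one frame on its own with only the zlib header."""
    return zlib.compress(payload)


class StreamCompressor:
    """Keeps one deflate state for the whole connection."""

    def __init__(self):
        self._c = zlib.compressobj()

    def compress_frame(self, payload):
        # Z_SYNC_FLUSH ends the frame on a byte boundary without resetting
        # the dictionary, so later frames can refer back to earlier ones.
        return self._c.compress(payload) + self._c.flush(zlib.Z_SYNC_FLUSH)


if __name__ == "__main__":
    frames = [b'{"type":"update","id":%d,"body":"hello"}' % i for i in range(5)]
    stream = StreamCompressor()
    for f in frames:
        print(len(f), len(gzip_frame(f)), len(deflate_frame(f)),
              len(stream.compress_frame(f)))
```

The demo prints the original, GZIP, DEFLATE, and DEFLATE/STREAM sizes for a
few made-up JSON messages; the messages are invented for illustration and
are not taken from the traces.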

For each stream, I collected the following information:

   - total payload bytes transferred (note that I did not count TCP/IP
   overhead or WebSockets framing overhead)
   - packet sizes at the 25th, 50th, 75th, and 90th percentiles
   - percent reduction in total bytes
   - in each case, I assumed that if the compression resulted in an increase
   in the frame size, it would be sent uncompressed instead (but I include the
   count of such frames)
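
A sketch of how these per-stream figures could be computed.  The
nearest-rank percentile definition is an assumption, since the message does
not say which definition was used, and the names percentile,
percent_reduction and summarize are hypothetical.

```python
import math


def percentile(sorted_sizes, pct):
    """Nearest-rank percentile of an ascending list of sizes."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_sizes)))
    return sorted_sizes[rank - 1]


def percent_reduction(original_total, sent_total):
    """Percentage of bytes saved relative to the uncompressed total."""
    return 100.0 * (1.0 - sent_total / original_total)


def summarize(frames, compress):
    """Apply compress() to each frame, falling back to the raw frame if it grew."""
    sent_sizes = []
    grew = 0
    for frame in frames:
        size = len(compress(frame))
        if size > len(frame):
            grew += 1          # count it, but send the frame uncompressed
            size = len(frame)
        sent_sizes.append(size)
    original_total = sum(len(f) for f in frames)
    sent_total = sum(sent_sizes)
    ordered = sorted(sent_sizes)
    return {
        "total": sent_total,
        "percentiles": [percentile(ordered, p) for p in (25, 50, 75, 90)],
        "reduction": percent_reduction(original_total, sent_total),
        "grew": grew,
    }


if __name__ == "__main__":
    # Totals from the Wave client->server DEFLATE/STREAM result in this message.
    print("%.2f%%" % percent_reduction(9189, 1274))
```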

Google Wave

client->server:
   31 frames totalling 9189 bytes, percentiles: 248/249/396/397 bytes

   - GZIP: 6193 total bytes, percentiles: 183/184/225/228 bytes, 32.60%
   reduction
   - DEFLATE: 5937 total bytes, percentiles: 175/176/217/220 bytes, 35.39%
   reduction, none under 63 bytes
   - DEFLATE/STREAM: 1274 total bytes, percentiles: 26/28/36/76 bytes,
   86.14% reduction, 88.8% under 63 bytes

server->client:
   42 frames totalling 8962 bytes, percentiles: 90/91/111/657 bytes

   - GZIP: 5485 total bytes, percentiles: 87/88/109/282 bytes, 38.80%
   reduction
   - DEFLATE: 5149 total bytes, percentiles: 79/80/101/274 bytes, 42.55%
   reduction
   - DEFLATE/STREAM: 1427 total bytes, percentiles: 25/26/28/37 bytes,
   84.08% reduction

Conclusion:

   - Contrary to expectations, the wave protocol has sufficient redundancy
   to get savings from compression - even basic gzip compression provides
   significant benefits for this trace of the wave protocol.
   - Sharing compression state across messages results in 5-6x reduction in
   frame sizes, which would be very important in mobile environments


GWT Quake

client 1 -> server:
   3837 frames totalling 212653 bytes, percentiles: 50/54/60/66 bytes

   - GZIP: total 210242 bytes, percentiles: 50/54/60/66 bytes, 1.13%
   reduction, 3833 frames grew larger
   - DEFLATE: total 210184 bytes, percentiles: 50/54/60/66 bytes, 1.16%
   reduction, 3817 frames grew larger
   - DEFLATE/STREAM: total 94902 bytes, percentiles: 21/24/28/31 bytes,
   55.37% reduction

server -> client 1:
   2150 frames totalling 608658 bytes, percentiles: 120/163/476/478 bytes

   - GZIP: total 405572 bytes, percentiles: 112/142/281/282 bytes, 33.37%
   reduction
   - DEFLATE: total 388897 bytes, percentiles: 104/134/273/274 bytes, 36.11%
   reduction
   - DEFLATE/STREAM: total 96093 bytes, percentiles: 23/44/58/68 bytes,
   84.21% reduction

client 2 -> server:
   1996 frames totalling 103091 bytes, percentiles: 44/50/58/64 bytes

   - GZIP: total 101887 bytes, percentiles: 44/50/57/64 bytes, 1.17%
   reduction, 1982 frames grew larger
   - DEFLATE: total 101647 bytes, percentiles: 44/50/57/64 bytes, 1.40%
   reduction, 1938 frames grew larger
   - DEFLATE/STREAM: total 49229 bytes, percentiles: 21/24/28/32 bytes, 52.25%
   reduction, 99.9% under 63 bytes

server -> client 2:
   1880 frames totalling 423606 bytes, percentiles: 110/165/333/380 bytes

   - GZIP: total 303437 bytes, percentiles: 102/142/217/255 bytes, 28.37%
   reduction, 250 frames grew larger
   - DEFLATE: total 289864 bytes, percentiles: 94/134/209/247 bytes, 31.57%
   reduction, 77 frames grew larger
   - DEFLATE/STREAM: total 73246 bytes, percentiles: 18/27/53/67 bytes,
   82.71% reduction

Conclusion:

   - It is absolutely critical that uncompressed frames be allowed even when
   compression has been negotiated -- otherwise, the total bytes transferred
   would be much higher (either through loss of compression where useful, or
   by sending compressed data that is larger than the uncompressed data).
   - The client->server streams only compress if state is maintained across
   frames; doing so also gives roughly a 6x size reduction on the
   server->client streams (608658 to 96093 bytes and 423606 to 73246 bytes).
   - Even small packets benefit -- with persistent compression state,
   traffic that is 90% under 64 bytes still gets 2:1 compression.

Implications for Protocol Changes for Compression:

   - At least some apps will have different characteristics between the
   traffic in each direction.  For example, in GWT Quake, either the
   client->server traffic would pay a roughly 13% size penalty if it had to be
   compressed, or the server->client traffic would pay a 33%+ size penalty if
   the connection wasn't compressed at all.
      - The simplest approach is to allow compression to be optional for
      each frame and use it only if it reduces the size
      - This decision could be based on heuristics or by simply compressing
      and comparing the size, though the latter is likely to be inefficient for
      mobile devices.
      - A more complicated approach would be to allow asymmetric compression
      algorithms, though it isn't clear how the browser/server could take
      advantage of it without exposing some API for the application to describe
      the likely traffic.
   - Maintaining compression state across frames gives a large benefit, and
   likely overcomes the need to allow optional compression (though there are
   probably still pathological cases where compression results in a size
   increase).  However, this comes at the cost of additional state, so
   striking the proper balance on mobile devices that are constrained on
   both memory and network bandwidth may be difficult.
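
The per-frame "compress and compare" option, together with a size-based
heuristic that skips the attempt for short frames, might look like the sketch
below.  It covers only the stateless per-frame case; the function name
choose_encoding and the min_size threshold are hypothetical, and the
threshold value is an untuned placeholder.

```python
import zlib


def choose_encoding(payload, min_size=0):
    """Return (is_compressed, bytes_to_send) for one frame.

    Frames shorter than min_size are sent as-is without trying to compress
    them, which avoids wasted CPU work on devices where that matters.
    """
    if len(payload) < min_size:
        return False, payload
    candidate = zlib.compress(payload)
    if len(candidate) < len(payload):
        return True, candidate
    return False, payload  # compression did not help; send the raw frame


if __name__ == "__main__":
    for frame in (b"ab", b"a" * 200):
        compressed, data = choose_encoding(frame)
        print(len(frame), "->", len(data), "compressed" if compressed else "raw")
```

The receiver needs the returned flag (for example, as a distinct frame type)
to know whether to decompress.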

Next Steps

   - I will get longer Wave traces and verify the measurements made here
   still hold, especially during startup
   - If anyone else has packet traces of actual WebSocket traffic they can
   share (pcap format is fine, but please don't include any sensitive data)
   please email them to me and I can include them in the analysis.  If that is
   a problem I could also send the source for the analysis tools which should
   build on any Unix-based system.
   - I want to get some functional compression test going so I can measure
   actual latency gains on real apps running over real networks, which means
   defining some plausible framing format.  Probably the most straightforward
   would just be to define frame type 0x80 as compressed UTF-8 text, where the
   decompressed bytes are decoded and passed to the app just like the 0x00
   frame.  It looks like it would be easy to add to Jetty7 (although the lack
   of Z_SYNC_FLUSH in Java's Deflater makes it harder), but getting a
   functional browser implementation might be a lot of work.
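
A rough round-trip sketch of that idea, assuming one shared deflate state
per direction and Z_SYNC_FLUSH at each frame boundary.  The 0x80 type is only
the proposal from this message, the Sender and Receiver classes are
hypothetical, and the on-the-wire length or delimiter encoding is left out
entirely.

```python
import zlib

TEXT_FRAME = 0x00             # existing uncompressed text frame type
COMPRESSED_TEXT_FRAME = 0x80  # hypothetical type proposed in this message


class Sender:
    def __init__(self):
        self._c = zlib.compressobj()  # one state for the whole connection

    def encode(self, text):
        raw = text.encode("utf-8")
        packed = self._c.compress(raw) + self._c.flush(zlib.Z_SYNC_FLUSH)
        return COMPRESSED_TEXT_FRAME, packed


class Receiver:
    def __init__(self):
        self._d = zlib.decompressobj()  # mirrors the sender's state

    def decode(self, frame_type, payload):
        if frame_type == COMPRESSED_TEXT_FRAME:
            # After a sync flush, all of the frame's bytes are available.
            payload = self._d.decompress(payload)
        return payload.decode("utf-8")  # handed to the app like a 0x00 frame


if __name__ == "__main__":
    sender, receiver = Sender(), Receiver()
    for msg in ['{"x":1.0,"y":2.0}', '{"x":1.5,"y":2.0}', '{"x":2.0,"y":2.0}']:
        frame_type, payload = sender.encode(msg)
        assert receiver.decode(frame_type, payload) == msg
        print(len(msg.encode("utf-8")), "->", len(payload))
```

In this sketch every frame goes through the compressor, which keeps the two
ends' dictionaries in step.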

--
John A. Tamplin
Software Engineer (GWT), Google