[hybi] preliminary WebSockets compression experiments
John Tamplin <jat@google.com> Fri, 23 April 2010 19:48 UTC
[This is a resend of a message sent on Monday that never appeared to show up - apologies if this is a duplicate]

I have run some experiments on existing WebSockets applications to see what could be gained if compression were enabled. To do this, I captured traces with tcpdump, then wrote a small libpcap-based program to split individual TCP streams into separate files. I then wrote another program which read these files, removed the WebSocket framing, tried compressing the payloads in various ways using zlib, and analyzed the results.

The applications tested were:

- A hacked version of Google Wave that used WebSockets for communication with the server, with Chrome 5 as the browser. The messages in this case are JSON consisting of ASCII text. (Note that this was a relatively small trace; I am getting a larger trace to make sure the results hold.)
- GWT Quake, which is an HTML5 app that gets up to 60fps running in the browser using WebSockets for communication with the server. The messages are largely binary, with individual bytes being encoded as UTF-8 characters (so bytes 0x80-0xFF are encoded as a two-byte UTF-8 character), and those binary values are largely IEEE floating point values. The sample taken was from two players in a multiplayer game, one using Chrome 5 on Windows and the other using WebKit nightly on Mac.

The compression methods tried were:

- GZIP - using zlib with a gzip header (i.e., deflateInit2(zstr, Z_DEFAULT_COMPRESSION, Z_DEFLATED, 15 | 16, 8, Z_DEFAULT_STRATEGY)) - this should closely model the gzip encoding used for HTTP responses.
- DEFLATE - similar, but using deflateInit(zstr, Z_DEFAULT_COMPRESSION) so only the zlib header is included.
- DEFLATE/STREAM - the above methods compress each frame separately, so the compression dictionary has to be rebuilt for each message.
  However, this method maintains compression state across messages in the same stream (using deflate(Z_SYNC_FLUSH) to finish an individual frame), so later messages can exploit redundancy from previous messages. This does have the downside of maintaining state for the duration of the connection, but there is already significant state due to keeping a TCP connection up.

For each stream, I collected the following information:

- total payload bytes transferred (note that I did not count TCP/IP overhead or WebSockets framing overhead)
- packet sizes at the 25th, 50th, 75th, and 90th percentiles
- percent reduction - in each case, I assumed that if the compression resulted in an increase in the frame size, it would be sent uncompressed instead (but I include the count of such frames)

Google Wave

client->server: 31 frames totalling 9189 bytes, percentiles: 248/249/396/397 bytes
- GZIP: 6193 total bytes, percentiles: 183/184/225/228 bytes, 32.60% reduction
- DEFLATE: 5937 total bytes, percentiles: 175/176/217/220 bytes, 35.39% reduction, none under 63 bytes
- DEFLATE/STREAM: 1274 total bytes, percentiles: 26/28/36/76 bytes, 86.14% reduction, 88.8% under 63 bytes

server->client: 42 frames totalling 8962 bytes, percentiles: 90/91/111/657 bytes
- GZIP: 5485 total bytes, percentiles: 87/88/109/282 bytes, 38.80% reduction
- DEFLATE: 5149 total bytes, percentiles: 79/80/101/274 bytes, 42.55% reduction
- DEFLATE/STREAM: 1427 total bytes, percentiles: 25/26/28/37 bytes, 84.08% reduction

Conclusion:

- Contrary to expectations, the Wave protocol has sufficient redundancy to get savings from compression - even basic gzip compression provides significant benefits for this trace of the Wave protocol.
- Sharing compression state across messages results in 5-6x reduction in frame sizes, which would be very important in mobile environments.

GWT Quake

client 1 -> server: 3837 frames totalling 212653 bytes, percentiles: 50/54/60/66 bytes
- GZIP: total 210242 bytes, percentiles: 50/54/60/66 bytes, 1.13% reduction, 3833 frames grew larger
- DEFLATE: total 210184 bytes, percentiles: 50/54/60/66 bytes, 1.16% reduction, 3817 frames grew larger
- DEFLATE/STREAM: total 94902 bytes, percentiles: 21/24/28/31 bytes, 55.37% reduction

server -> client 1: 2150 frames totalling 608658 bytes, percentiles: 120/163/476/478 bytes
- GZIP: total 405572 bytes, percentiles: 112/142/281/282 bytes, 33.37% reduction
- DEFLATE: total 388897 bytes, percentiles: 104/134/273/274 bytes, 36.11% reduction
- DEFLATE/STREAM: total 96093 bytes, percentiles: 23/44/58/68 bytes, 84.21% reduction

client 2 -> server: 1996 frames totalling 103091 bytes, percentiles: 44/50/58/64 bytes
- GZIP: total 101887 bytes, percentiles: 44/50/57/64 bytes, 1.17% reduction, 1982 frames grew larger
- DEFLATE: total 101647 bytes, percentiles: 44/50/57/64 bytes, 1.40% reduction, 1938 frames grew larger
- DEFLATE/STREAM: total 49229 bytes, percentiles: 21/24/28/32 bytes, 52.25% reduction, 99.9% under 63 bytes

server -> client 2: 1880 frames totalling 423606 bytes, percentiles: 110/165/333/380 bytes
- GZIP: total 303437 bytes, percentiles: 102/142/217/255 bytes, 28.37% reduction, 250 frames grew larger
- DEFLATE: total 289864 bytes, percentiles: 94/134/209/247 bytes, 31.57% reduction, 77 frames grew larger
- DEFLATE/STREAM: total 73246 bytes, percentiles: 18/27/53/67 bytes, 82.71% reduction

Conclusion:

- It is absolutely critical that uncompressed frames be allowed even when compression has been negotiated -- otherwise, the total bytes transferred would be much higher (either through loss of compression where useful, or by sending compressed data that is larger than the uncompressed data).
- The client->server stream only compresses if state is maintained across frames, which also gives an order of magnitude size reduction on the server->client stream.
- Even small packets benefit -- with persistent compression state, traffic that is 90% under 64 bytes still gets 2:1 compression.

Implications for Protocol Changes for Compression:

- At least some apps will have different characteristics between the traffic in each direction. For example, in GWT Quake, the client->server traffic would pay a roughly 13% size penalty if it had to be compressed, or the server->client traffic would pay a 33%+ size penalty if the connection wasn't compressed at all.
- The simple approach is to allow compression to be optional for each frame and only use it if it reduces the size.
- This decision could be based on heuristics or by simply compressing and comparing the size, though the latter is likely to be inefficient for mobile devices.
- A more complicated approach would be to allow asymmetric compression algorithms, though it isn't clear how the browser/server could take advantage of it without exposing some API for the application to describe the likely traffic.
- Maintaining compression state across frames gives a large benefit, and likely overcomes the need to allow optional compression (though there are probably still pathological cases where compression results in a size increase). However, this comes at the cost of additional state, so striking the proper balance on mobile devices that are constrained on both memory and network bandwidth may be difficult.

Next Steps

- I will get longer Wave traces and verify the measurements made here still hold, especially during startup.
- If anyone else has packet traces of actual WebSocket traffic they can share (pcap format is fine, but please don't include any sensitive data) please email them to me and I can include them in the analysis.
  If that is a problem I could also send the source for the analysis tools, which should build on any Unix-based system.
- I want to get some functional compression test going so I can measure actual latency gains on real apps running over real networks, which means defining some plausible framing format. Probably the most straightforward would just be to define frame type 0x80 as compressed UTF-8 text, and the uncompressed bytes are decoded and passed to the app just like the 0x00 frame. It looks like it would be easy to add to Jetty7 (though lack of Z_SYNC_FLUSH from Java's Deflater makes it harder), though getting a functional browser implementation might be a lot of work.

-- 
John A. Tamplin
Software Engineer (GWT), Google
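
For readers who want to try the comparison on their own frames, here is a minimal sketch of the three methods described above (GZIP, DEFLATE, DEFLATE/STREAM) using Python's zlib module. It is not the analysis tool used for the measurements in this message, which was written against the C zlib API; the names gzip_frame, deflate_frame, StreamCompressor, percent_reduction and summarize are illustrative assumptions. It only measures sizes, with the same "send uncompressed if the frame grew" assumption used for the numbers above.

```python
import zlib


def gzip_frame(frame: bytes) -> bytes:
    """Compress one frame on its own, with a gzip header and trailer.

    Mirrors deflateInit2(..., Z_DEFLATED, 15 | 16, 8, Z_DEFAULT_STRATEGY).
    """
    c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED,
                         15 | 16, 8, zlib.Z_DEFAULT_STRATEGY)
    return c.compress(frame) + c.flush()


def deflate_frame(frame: bytes) -> bytes:
    """Compress one frame on its own, with only the zlib header and checksum."""
    c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION)
    return c.compress(frame) + c.flush()


class StreamCompressor:
    """Keeps one deflate state for the whole connection (DEFLATE/STREAM)."""

    def __init__(self) -> None:
        self._c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION)

    def compress_frame(self, frame: bytes) -> bytes:
        # Z_SYNC_FLUSH emits everything needed to decode this frame now,
        # while keeping the history window for later frames.
        return self._c.compress(frame) + self._c.flush(zlib.Z_SYNC_FLUSH)


def percent_reduction(original: int, sent: int) -> float:
    """Percent of payload bytes saved relative to the uncompressed total."""
    return 100.0 * (original - sent) / original


def summarize(frames):
    """Return {method: (bytes_sent, frames_that_grew, percent_reduction)}.

    A frame that grows under compression is counted at its uncompressed
    size. This is a size estimate only: a real protocol using shared state
    would also have to keep both ends' compression history in sync.
    """
    original = sum(len(f) for f in frames)
    stream = StreamCompressor()
    sent = {"GZIP": 0, "DEFLATE": 0, "DEFLATE/STREAM": 0}
    grew = {"GZIP": 0, "DEFLATE": 0, "DEFLATE/STREAM": 0}
    for f in frames:
        outputs = {
            "GZIP": gzip_frame(f),
            "DEFLATE": deflate_frame(f),
            "DEFLATE/STREAM": stream.compress_frame(f),
        }
        for name, out in outputs.items():
            if len(out) > len(f):
                grew[name] += 1
            sent[name] += min(len(f), len(out))
    return {name: (sent[name], grew[name],
                   percent_reduction(original, sent[name]))
            for name in sent}
```

Calling summarize() with a list of frame payloads (as bytes, with the WebSocket framing already removed) gives per-method totals comparable in form to the ones reported above. The figures will of course depend on the traffic.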