Re: [hybi] permessage-deflate performance tuning statistics
Tobias Oberstein <tobias.oberstein@tavendo.de> Tue, 15 October 2013 19:01 UTC
From: Tobias Oberstein <tobias.oberstein@tavendo.de>
To: Peter Thorson <webmaster@zaphoyd.com>, "hybi@ietf.org" <hybi@ietf.org>
Date: Tue, 15 Oct 2013 12:01:43 -0700
Thread-Topic: [hybi] permessage-deflate performance tuning statistics
Message-ID: <634914A010D0B943A035D226786325D44469B06DCF@EXVMBX020-12.exch020.serverdata.net>
References: <FD138330-7D7E-4450-B4F5-64551F92F26D@zaphoyd.com>
In-Reply-To: <FD138330-7D7E-4450-B4F5-64551F92F26D@zaphoyd.com>
Accept-Language: de-DE, en-US
Content-Language: de-DE
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [hybi] permessage-deflate performance tuning statistics
List-Id: Server-Initiated HTTP <hybi.ietf.org>
Hi Peter,

this is fantastic empirical evidence - very much appreciated!

As a first concrete action from this, I'll take the following issue to the Python community: "Expose the 'Memory Level' knob on the Python zlib wrapper"

Cheers,
Tobias

> -----Original Message-----
> From: hybi-bounces@ietf.org [mailto:hybi-bounces@ietf.org] On Behalf Of Peter Thorson
> Sent: Tuesday, 15 October 2013 20:34
> To: hybi@ietf.org
> Subject: [hybi] permessage-deflate performance tuning statistics
>
> Hi all,
>
> I've been doing a bit of research and testing on the compression performance and memory usage of permessage-deflate on WebSocket-like workloads. I plan to write about this with more final numbers once the spec is official, but some of the intermediate results and tools may be of interest to this group during the standardization process, so I'm sharing some of those notes here.
>
> Some highlights:
> - permessage-deflate offers significant bandwidth savings.
> - How the extension performs depends greatly on the type of data being compressed and the compression settings given to deflate.
> - Settings that work well for HTTP, and the zlib/permessage-deflate defaults, are inefficient for some common WebSocket workflows.
> - The two parameters presently in the draft specification both provide significant and meaningful options for tuning compression performance for those workflows. Implementations (especially browsers) are strongly encouraged to support all options.
>
> Details & Methods:
>
> My goal is to explore a number of the settings offered by deflate and determine what effect they have on compression performance, as well as CPU/memory usage. To this end I have written a tool (https://github.com/zaphoyd/ws-pmce-stats) that produces a report of compression-related statistics when fed a transcript of messages.
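[For reference, CPython's zlib wrapper does expose this knob: zlib.compressobj() accepts memLevel, along with wbits, as keyword arguments. A minimal sketch of a permessage-deflate-style compressor along these lines - the helper name and sample payload are made up for illustration:]

```python
import zlib

def make_deflater(wbits=11, mem_level=4):
    """Raw-deflate compressor tuned for small messages.

    Negative wbits asks zlib for a raw deflate stream (no zlib
    header/trailer), which is what permessage-deflate puts on the wire.
    """
    return zlib.compressobj(
        level=zlib.Z_DEFAULT_COMPRESSION,
        method=zlib.DEFLATED,
        wbits=-wbits,          # sliding window of 2**wbits bytes, raw stream
        memLevel=mem_level,    # 1 (least memory) .. 9 (most memory)
    )

deflater = make_deflater()
payload = b'{"symbol": "IBM", "bid": 186.40, "ask": 186.45}' * 20
# Z_SYNC_FLUSH ends the message on a byte boundary but keeps the
# compression context alive for the next message (context takeover).
wire = deflater.compress(payload) + deflater.flush(zlib.Z_SYNC_FLUSH)
```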
>
> The first workflow I have explored in detail is a WebSocket service that uses a JSON-based protocol to deliver short streaming updates. Some of my sample data is present in the datasets folder of the above git repository. Examples include a mock chat service seeded with data from a publicly logged MediaWiki IRC channel, and a mock stock ticker service seeded with historical stock quote data.
>
> I explored the effects of the context takeover and window bits settings from the permessage-deflate draft spec, as well as a few zlib settings that can be unilaterally specified without any extension negotiation. Some, but not all, of these are presently exposed in higher-level languages that use zlib as their underlying compression library. I looked at two of these settings in particular: the "Compression Level" and the "Memory Level". The former trades speed against compression ratio, the latter memory usage against compression ratio.
>
> Preliminary results for the JSON short message service workflow:
>
> Context Takeover
> ================
> Allowing context takeover drastically improves compression ratios. With other settings at defaults, no_context_takeover achieves a compression ratio of 0.84; with takeover, 0.30. This is a significant gain. Note: this gain comes at a fairly high cost. Enabling context takeover requires a separate context to be maintained for every connection, rather than a fixed number for all connections.
>
> Window Bits
> ===========
> Window bits has a sizable but well-distributed effect on ratios. It has a significant effect on memory usage, though.
> With all other settings at defaults:
> window bits = compression ratio / buffer size per connection
> 08 = 0.510 / 1+128=129KiB
> 09 = 0.510 / 2+128=130KiB
> 10 = 0.435 / 4+128=132KiB
> 11 = 0.384 / 8+128=136KiB
> 12 = 0.353 / 16+128=144KiB
> 13 = 0.330 / 32+128=160KiB
> 14 = 0.315 / 64+128=192KiB
> 15 = 0.304 / 128+128=256KiB
> Reducing window bits from the default (15) to 11 costs about 8 percentage points of compression ratio (0.384 vs 0.304) but saves nearly 50% of per-connection memory usage. Reducing window bits to very small values (8-9) also increases compression runtime by 40-50%. 10 is less slow; 11 and above all appear to be about the same speed.
>
> Compression Level
> =================
> Compression level does not have a material impact on performance or ratios for this workflow.
>
> Memory Level
> ============
> Memory level does not have a significant impact on compression ratios. A value of 9 produces the ratio 0.304 and a value of 1 produces the ratio 0.307. It does affect memory usage and compression speed, however:
> mem_level value = runtime / memory usage
> 1 = 13.05ms / 128+1=129KiB
> 2 = 10.41ms / 128+2=130KiB
> 3 = 10.15ms / 128+4=132KiB
> 4 = 8.18ms / 128+8=136KiB
> 5 = 7.63ms / 128+16=144KiB
> 6 = 7.69ms / 128+32=160KiB
> 7 = 7.92ms / 128+64=192KiB
> 8 = 7.69ms / 128+128=256KiB
> 9 = 7.84ms / 128+256=384KiB
>
> All of the stats above show the effects of changing one parameter in isolation. Additional gains, especially with respect to memory usage per connection, can be had by combining parameters. Many of the speed, compression, and memory effects of the parameters depend on each other. Two nice balances of all factors for the JSON short message service data set (vs defaults) are:
>
> context-takeover=on
> window bits=11
> memory level=4
> This provides memory usage of 16KiB/connection vs 256KiB, has no runtime speed penalty, and achieves a 0.384 vs 0.304 compression ratio.
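[A sweep like the window-bits table above can be approximated with plain zlib by varying wbits on a shared raw-deflate context. A rough sketch - this is not Peter's ws-pmce-stats tool, and the sample transcript is made up:]

```python
import zlib

def ratio_for_wbits(payloads, wbits):
    """Compressed/original byte ratio with one shared raw-deflate context."""
    comp = zlib.compressobj(wbits=-wbits)  # negative wbits = raw deflate
    compressed = sum(
        len(comp.compress(p) + comp.flush(zlib.Z_SYNC_FLUSH)) for p in payloads
    )
    return compressed / sum(len(p) for p in payloads)

# Made-up stand-in for a transcript of short JSON streaming updates.
sample = [('{"seq": %d, "msg": "price update"}' % i).encode() for i in range(200)]
for wbits in (9, 11, 13, 15):
    print(wbits, round(ratio_for_wbits(sample, wbits), 3))
```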
>
> context-takeover=on
> window bits=11
> memory level=1
> This provides memory usage of 5KiB/connection vs 256KiB, runs ~15% slower, and achieves a similar 0.385 vs 0.304 compression ratio.
>
> The ws-pmce-stats tool can help you plug in values and get a sense of which combinations of settings are optimal for your traffic mix. In general I have found that the shorter your messages, the less you benefit from high window bits and memory level values. If you routinely send WebSocket messages with payloads in the high hundreds of KB or MBs, you will benefit from higher values for memory level and window bits. If you have extremely limited memory, no context takeover will allow fixed memory usage for all connections. Its price is heavy for small messages and JSON protocols, but less problematic for large ones. I've found that 11 window bits and memory level 4 are still quite effective even up to message payloads of ~200KB.
>
> I'd love to hear any feedback anyone has about the methods or results. I am particularly interested in collecting more sample WebSocket workflows. I haven't run any numbers for binary connections yet. I'd love to hear details about other workflows that might have different properties than the ones studied here so far, especially if you have sample transcripts.
>
> I'd also be interested in any feedback on the ws-pmce-stats program. Is something like this useful to anyone else? It meets my needs right now, but I have a few ideas for how to expand it (binary message support, other compression algorithms, machine-readable output) if that sounds useful to others.
> _______________________________________________
> hybi mailing list
> hybi@ietf.org
> https://www.ietf.org/mailman/listinfo/hybi
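[The context-takeover gain reported above can be illustrated with plain zlib: no_context_takeover corresponds to a fresh compressor per message, while takeover corresponds to one long-lived compressor flushed with Z_SYNC_FLUSH between messages. A sketch with made-up quote messages standing in for a real transcript:]

```python
import zlib

# Made-up stand-in for a stream of similar stock-ticker messages.
messages = [
    ('{"type": "quote", "symbol": "IBM", "bid": %.2f}' % (186 + i / 100)).encode()
    for i in range(100)
]

def total_bytes(context_takeover):
    """Total wire bytes for the stream, with or without context takeover."""
    size = 0
    comp = zlib.compressobj(wbits=-15)  # raw deflate, 32KiB window
    for msg in messages:
        if not context_takeover:
            # no_context_takeover: a fresh dictionary for every message
            comp = zlib.compressobj(wbits=-15)
        size += len(comp.compress(msg) + comp.flush(zlib.Z_SYNC_FLUSH))
    return size

print("with takeover:   ", total_bytes(True))
print("without takeover:", total_bytes(False))
```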