Re: bohe implementation for compression tests

James M Snell <jasnell@gmail.com> Fri, 11 January 2013 18:46 UTC

Just continuing my investigation of various header compression strategies
around the BOHE mechanism. In my personal github fork, I have just checked
in two bohe variations: one implements selective-compression, the other
implements isolated-compression.

https://github.com/jasnell/compression-test/tree/master/compressor/bohe2
https://github.com/jasnell/compression-test/tree/master/compressor/bohe3

With bohe2 (selective-compression), a header block can consist of a
compressed set of headers and an uncompressed set of headers. Specific
headers such as Cookie, Set-Cookie, etc. can be marked as "Do Not Compress".
These are dropped into the frame as-is and thus avoid the CRIME issue
completely. The rest of the headers are compressed with gzip using the
existing spdy3 dictionary. Obviously this is not ideal because Cookie data
is then passed around without any compression at all, making it far less
efficient than any of the other options on the table.
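
To make the split concrete, here is a minimal sketch of the
selective-compression idea. It is not the bohe2 code from the repository:
the "Do Not Compress" list, the newline-delimited serialization, the
function names, and the short placeholder dictionary (standing in for the
real spdy3 dictionary) are all assumptions made for illustration.

```python
import zlib

# Assumed "Do Not Compress" list; the post names Cookie and Set-Cookie.
DO_NOT_COMPRESS = {"cookie", "set-cookie"}

# Placeholder only. The real spdy3 dictionary is much longer.
PLACEHOLDER_DICT = b"accept-encodinggzip,deflateuser-agenthostGETHTTP/1.1"


def serialize(headers):
    """Join (name, value) pairs into newline-delimited bytes (assumed format)."""
    return b"".join(f"{n}: {v}\n".encode("utf-8") for n, v in headers)


def encode_selective(headers, compressor):
    """Return (compressed_part, raw_part) for one header block."""
    raw = [(n, v) for n, v in headers if n.lower() in DO_NOT_COMPRESS]
    rest = [(n, v) for n, v in headers if n.lower() not in DO_NOT_COMPRESS]
    # Z_SYNC_FLUSH emits all pending output but keeps the stream state,
    # so later header blocks can still refer back to earlier ones.
    compressed = compressor.compress(serialize(rest))
    compressed += compressor.flush(zlib.Z_SYNC_FLUSH)
    return compressed, serialize(raw)


def decode_selective(compressed, raw, decompressor):
    """Inverse of encode_selective; returns the two serialized parts."""
    return decompressor.decompress(compressed), raw
```

Both ends would create their stream once per connection, for example
zlib.compressobj(zdict=PLACEHOLDER_DICT) on the sender and
zlib.decompressobj(zdict=PLACEHOLDER_DICT) on the receiver. The raw part
never touches the compressor, which is why its size cannot leak anything
through the compression ratio.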

In bohe3 (isolated-compression), a header block can consist of two separate
compressed blocks generated using two separate stream compressor instances.
Selected headers (like Cookie) can be placed in the secondary, isolated
block, which would never contain general user-provided header data.
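
A rough sketch of the two-context idea follows. Again, this is not the
bohe3 code: the class names, the routing list, and the line-based
serialization are hypothetical, and the sketch leaves out the preset
dictionary to stay short.

```python
import zlib

# Assumed list of headers routed to the isolated context.
ISOLATED = {"cookie", "set-cookie"}


def _line(name, value):
    """Serialize one header as 'Name: value\\n' (assumed format)."""
    return f"{name}: {value}\n".encode("utf-8")


class IsolatedEncoder:
    """Keeps two independent deflate streams for the life of a connection."""

    def __init__(self):
        self.general = zlib.compressobj()
        self.isolated = zlib.compressobj()

    def encode(self, headers):
        """Return (general_block, isolated_block) for one header set."""
        gen = b"".join(_line(n, v) for n, v in headers if n.lower() not in ISOLATED)
        iso = b"".join(_line(n, v) for n, v in headers if n.lower() in ISOLATED)
        return self._emit(self.general, gen), self._emit(self.isolated, iso)

    @staticmethod
    def _emit(stream, data):
        # Sync flush: output everything so far, keep the history window.
        return stream.compress(data) + stream.flush(zlib.Z_SYNC_FLUSH)


class IsolatedDecoder:
    """Mirror of IsolatedEncoder with two independent inflate streams."""

    def __init__(self):
        self.general = zlib.decompressobj()
        self.isolated = zlib.decompressobj()

    def decode(self, general_block, isolated_block):
        return (self.general.decompress(general_block),
                self.isolated.decompress(isolated_block))
```

Encoding the same request twice with one IsolatedEncoder should show the
second isolated block coming out much smaller than the first, because the
repeated Cookie is matched against the isolated stream's own history and
never against anything in the general stream.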

**These are only experiments right now and are not intended as serious
proposals for the spec**. Isolated-compression (bohe3) does show promise,
however. If we can successfully isolate potentially sensitive headers into
their own compression context, generated independently of any general
user-supplied data, we can effectively short-circuit the CRIME attack by
making it impossible for an attacker to compare values based on the
compression ratio. And since it still uses gzip compression, we achieve a
generally better compression ratio overall than we get with the proposed
delta encoding. For now, though, consider this all just fodder for
discussion. There are still MANY issues with these experimental approaches,
and I still need to go through delta in more detail to see if there is a
way bohe and delta can be used effectively together.
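
For anyone who wants to see the length side channel that isolation is
meant to close, here is a toy illustration. It is not an attack
implementation and not part of the test harness; the secret value, the
X-Guess header, and the function names are made up for the example.

```python
import zlib

# Made-up secret header; an attacker never sees it directly.
SECRET = "Cookie: session=7f3a9c1e5b2d4086a1c3\n"


def shared_size(guess):
    """Compressed size when the secret and attacker text share one context."""
    data = (SECRET + f"X-Guess: {guess}\n").encode("utf-8")
    return len(zlib.compress(data))


def isolated_sizes(guess):
    """(secret_size, attacker_size) when each side has its own context."""
    secret_block = zlib.compress(SECRET.encode("utf-8"))
    guess_block = zlib.compress(f"X-Guess: {guess}\n".encode("utf-8"))
    return len(secret_block), len(guess_block)


# Two guesses of equal length: one matches the secret, one does not.
right = "session=7f3a9c1e5b2d4086a1c3"
wrong = "session=Qz8LmX2RtVw9KpYs4HjN"

# Shared context: the matching guess is replaced by a back-reference to
# the secret, so the output is shorter and the attacker learns a match.
leaks = shared_size(right) < shared_size(wrong)

# Isolated context: the secret's block is compressed without the guess,
# so its size is the same whatever the attacker sends.
stable = isolated_sizes(right)[0] == isolated_sizes(wrong)[0]
```

With a shared context the size difference between the two guesses is the
signal CRIME relies on. With isolated contexts the attacker-controlled
text only ever influences the size of its own block.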

Just for example...

james-snells-macbook-pro:compression-test james$ ./compare_compressors.py -c bohe3 -c bohe2 -c bohe -c delta -t /Users/james/git/http_samples/mnot/amazon.com.har
732 req messages processed
             compressed | ratio min   max   std
req  bohe        26,035 | 0.13  0.03  0.68  0.08
req bohe2        44,195 | 0.23  0.07  0.71  0.13
req bohe3        30,944 | 0.16  0.05  0.74  0.08
req delta        33,955 | 0.17  0.02  0.71  0.09
req http1       195,386 | 1.00  1.00  1.00  0.00

732 res messages processed
             compressed | ratio min   max   std
res  bohe        39,525 | 0.25  0.04  0.67  0.07
res bohe2        47,157 | 0.29  0.12  0.71  0.08
res bohe3        44,843 | 0.28  0.06  0.70  0.07
res delta        44,499 | 0.28  0.02  0.65  0.09
res http1       159,968 | 1.00  1.00  1.00  0.00
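
As a sanity check on how to read these tables, the snippet below
recomputes the ratio column from the byte totals. It assumes the ratio is
simply total compressed bytes divided by the http1 total; the numbers
agree with that reading, but I have not confirmed it against the harness
code. The variable and function names are only for this example.

```python
# Byte totals copied from the amazon.com tables in this message.
HTTP1 = {"req": 195_386, "res": 159_968}
COMPRESSED = {
    "bohe":  {"req": 26_035, "res": 39_525},
    "bohe2": {"req": 44_195, "res": 47_157},
    "bohe3": {"req": 30_944, "res": 44_843},
    "delta": {"req": 33_955, "res": 44_499},
}


def ratio(name, direction):
    """Compressed size over the http1 baseline, rounded like the table."""
    return round(COMPRESSED[name][direction] / HTTP1[direction], 2)


def combined(name):
    """Request plus response bytes for one compressor."""
    return COMPRESSED[name]["req"] + COMPRESSED[name]["res"]


req_ratios = {name: ratio(name, "req") for name in COMPRESSED}
res_ratios = {name: ratio(name, "res") for name in COMPRESSED}
```

Read this way, bohe3 is smaller than delta on requests and slightly
larger on responses, and comes out ahead on the combined total for this
one sample (75,787 bytes versus 78,454).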

- James



On Thu, Jan 10, 2013 at 11:08 AM, James M Snell <jasnell@gmail.com> wrote:

> I have an initial bohe implementation for the compression tests... it's
> very preliminary and uses the same gzip compression as the current spdy3.
> I'm going to be playing around with the delta compression mechanism as well
> and see how much of an impact that has. Initial results are very promising
> but I haven't done much debugging yet. Just wanted folks to know that this
> work was underway...
>
> https://github.com/jasnell/compression-test/tree/master/compressor/bohe
>
> Some test runs....
>
> ./compare_compressors.py -c bohe -c spdy3 -c delta ../http_samples/mnot/amazon.com.har
> 732 req messages processed
>              compressed | ratio min   max   std
> req  bohe        26,122 | 0.13  0.04  0.70  0.08
> req delta        33,955 | 0.17  0.02  0.71  0.09
> req http1       195,386 | 1.00  1.00  1.00  0.00
> req spdy3        27,238 | 0.14  0.04  0.71  0.08
>
> 732 res messages processed
>              compressed | ratio min   max   std
> res  bohe        39,628 | 0.25  0.04  0.66  0.07
> res delta        44,499 | 0.28  0.02  0.65  0.09
> res http1       159,968 | 1.00  1.00  1.00  0.00
> res spdy3        41,325 | 0.26  0.04  0.67  0.08
>
>
> ./compare_compressors.py -c bohe -c spdy3 -c delta ../http_samples/mnot/craigslist.org.har
> 66 req messages processed
>              compressed | ratio min   max   std
> req  bohe         1,948 | 0.15  0.06  0.73  0.11
> req delta         2,036 | 0.16  0.07  0.71  0.11
> req http1        12,894 | 1.00  1.00  1.00  0.00
> req spdy3         2,016 | 0.16  0.07  0.75  0.11
>
> 66 res messages processed
>              compressed | ratio min   max   std
> res  bohe         1,786 | 0.18  0.07  0.77  0.13
> res delta         2,858 | 0.28  0.08  0.69  0.12
> res http1        10,147 | 1.00  1.00  1.00  0.00
> res spdy3         1,869 | 0.18  0.09  0.78  0.13
>
>
> ./compare_compressors.py -c bohe -c spdy3 -c delta ../http_samples/mnot/flickr.com.har
> 438 req messages processed
>              compressed | ratio min   max   std
> req  bohe        11,988 | 0.10  0.02  0.69  0.07
> req delta        26,372 | 0.22  0.01  0.71  0.14
> req http1       121,854 | 1.00  1.00  1.00  0.00
> req spdy3        12,550 | 0.10  0.02  0.71  0.07
>
> 438 res messages processed
>              compressed | ratio min   max   std
> res  bohe        13,073 | 0.09  0.05  0.66  0.06
> res delta        25,236 | 0.18  0.02  0.70  0.11
> res http1       140,457 | 1.00  1.00  1.00  0.00
> res spdy3        14,142 | 0.10  0.05  0.66  0.06
>
>
> ./compare_compressors.py -c bohe -c spdy3 -c delta ../http_samples/mnot/facebook.com.har
> 234 req messages processed
>              compressed | ratio min   max   std
> req  bohe         6,091 | 0.15  0.06  0.78  0.07
> req delta         7,800 | 0.19  0.02  0.70  0.07
> req http1        41,980 | 1.00  1.00  1.00  0.00
> req spdy3         6,301 | 0.15  0.06  0.77  0.07
>
> 234 res messages processed
>              compressed | ratio min   max   std
> res  bohe         9,458 | 0.23  0.07  0.68  0.07
> res delta        12,045 | 0.30  0.13  0.60  0.08
> res http1        40,252 | 1.00  1.00  1.00  0.00
> res spdy3         9,788 | 0.24  0.07  0.69  0.07
>
>