RE: Choosing a header compression algorithm

RUELLAN Herve <Herve.Ruellan@crf.canon.fr> Tue, 26 March 2013 09:14 UTC

From: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
To: Roberto Peon <grmocg@gmail.com>, "agl@google.com" <agl@google.com>
CC: Mark Nottingham <mnot@mnot.net>, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
Date: Tue, 26 Mar 2013 09:11:44 +0000
Message-ID: <6C71876BDCCD01488E70A2399529D5E5163F4263@ADELE.crf.canon.fr>
References: <254AABEE-22B9-418E-81B0-2729902C4413@mnot.net> <A14105FB-ED1A-4B70-8840-9648847BCC3A@mnot.net> <6C71876BDCCD01488E70A2399529D5E5163F3C67@ADELE.crf.canon.fr> <CAP+FsNfFohSwrX2DxthNcnn+wDj6T5W7xpcg4yA56Gvt_nP3_Q@mail.gmail.com> <6C71876BDCCD01488E70A2399529D5E5163F3D72@ADELE.crf.canon.fr> <7CA7F3EB-A492-471A-8AC4-23293DD10840@mnot.net> <6C71876BDCCD01488E70A2399529D5E5163F4076@ADELE.crf.canon.fr> <CAP+FsNdztfCJjvP58ryVXDRgGyGSPO-37gRMjAuwikz2eviBiw@mail.gmail.com> <CAP+FsNdQ7mNbsaAiUqEF22Oh8KMaK3UWUWFzWE=K0jQbkM7t1Q@mail.gmail.com>
In-Reply-To: <CAP+FsNdQ7mNbsaAiUqEF22Oh8KMaK3UWUWFzWE=K0jQbkM7t1Q@mail.gmail.com>
Subject: RE: Choosing a header compression algorithm
Archived-At: <http://www.w3.org/mid/6C71876BDCCD01488E70A2399529D5E5163F4263@ADELE.crf.canon.fr>

Roberto,

Prefix matching can be disabled by using the "delta=false" option.

There are two new strategies for prefix matching. The first one limits which shared prefixes can be used: the last character of a shared prefix must belong to a predefined set of characters. An example of such a set is "/?=, " (with a space as the last character). For example, take the following URLs:
http://www.example.com/path/first/myfile
http://www.example.com/path/final/otherfile
With the default strategy for prefix matching, the shared prefix would be:
http://www.example.com/path/fi
With the constraint on the last character of the shared prefix, the shared prefix is:
http://www.example.com/path/
This prevents CRIME-like attacks from guessing a value character by character. We therefore think that, with this new strategy, prefix matching is not vulnerable to CRIME-like attacks. In effect, this strategy enables a kind of fine-grained indexing of values, where well-defined parts of a value can be referred to.
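
To illustrate the first strategy, here is a minimal Python sketch. It is not the HeaderDiff code: the function names and the DELIMITERS constant are made up for this example, and the delimiter set is simply the one quoted above. It computes the plain shared prefix, then backs up until the prefix ends on one of the allowed characters.

```python
import os

# Assumed delimiter set, copied from the example above: "/?=, " (space last).
DELIMITERS = set("/?=, ")


def shared_prefix_len(a, b):
    """Length of the longest common prefix (the default strategy)."""
    return len(os.path.commonprefix([a, b]))


def constrained_prefix_len(a, b, delimiters=DELIMITERS):
    """Length of the longest common prefix ending on a delimiter, or 0."""
    n = shared_prefix_len(a, b)
    # Shrink the prefix until its last character is an allowed ending.
    while n > 0 and a[n - 1] not in delimiters:
        n -= 1
    return n


first = "http://www.example.com/path/first/myfile"
final = "http://www.example.com/path/final/otherfile"

print(first[:shared_prefix_len(first, final)])       # http://www.example.com/path/fi
print(first[:constrained_prefix_len(first, final)])  # http://www.example.com/path/
```

With this constraint, a guess only changes the shared prefix length when a whole delimited segment matches, not when a single extra character does.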

The second strategy is to limit the number of times a given header value can be used as a reference for prefix matching. This limit can be very low: allowing a value to be used as a reference only 2 times is already sufficient to get most of the performance benefit of prefix matching. We think that with this strategy, prefix matching is mostly protected from CRIME-like attacks.
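
The second strategy can be sketched in the same way. Again, this is only an illustration under assumptions, not the HeaderDiff code: the ReferenceBudget class and its try_use method are hypothetical names, and the limit of 2 is the value mentioned above.

```python
from collections import defaultdict


class ReferenceBudget:
    """Counts how often each stored header value served as a prefix reference."""

    def __init__(self, max_uses=2):
        self.max_uses = max_uses
        self.uses = defaultdict(int)

    def try_use(self, reference):
        """Return True and consume one use if the reference is still allowed."""
        if self.uses[reference] >= self.max_uses:
            return False  # budget exhausted: encode the value without a prefix
        self.uses[reference] += 1
        return True


budget = ReferenceBudget(max_uses=2)
ref = "http://www.example.com/path/first/myfile"
results = [budget.try_use(ref) for _ in range(3)]
print(results)  # [True, True, False]
```

An attacker who needs many probes against the same stored value gets at most max_uses of them, which is why a very low limit matters here.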

The first strategy can be selected using the "delta_type='/?= \coma'" option (the set of prefix-ending characters can be changed).
The second strategy can be selected using the "delta_type=2" option (or another value for the limit).

I'm planning to update the HeaderDiff codec to use the first strategy by default.

Hervé. 

From: Roberto Peon [mailto:grmocg@gmail.com] 
Sent: Tuesday, 26 March 2013 01:23
To: RUELLAN Herve; agl@google.com
Cc: Mark Nottingham; ietf-http-wg@w3.org Group
Subject: Re: Choosing a header compression algorithm

Herve--

We need an option which disables prefix matching on the HeaderDiff compressor. The strategies I see in the code still allow many headers to be attacked (if they include commas).
I believe that it is still possible to probe interesting data out of various fields of the URL, for example, or even cookies, assuming they aren't B64 encoded.

-=R

On Mon, Mar 25, 2013 at 11:38 AM, Roberto Peon <grmocg@gmail.com> wrote:
There are two obvious strategies here: What we do now, and using what SPDY does today (share connections if the certs match and DNS resolution of the new hostname overlaps with those of the current connection).

-=R

On Mon, Mar 25, 2013 at 10:21 AM, RUELLAN Herve <Herve.Ruellan@crf.canon.fr> wrote:
> -----Original Message-----
> From: Mark Nottingham [mailto:mnot@mnot.net]
> Sent: Monday, 25 March 2013 06:56
> To: RUELLAN Herve
> Cc: Roberto Peon; ietf-http-wg@w3.org Group
> Subject: Re: Choosing a header compression algorithm
>
>
> On 23/03/2013, at 5:04 AM, RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
> wrote:
>
> > I think it would be good to move this from the compressors to the
> streamifier. In addition, it would be interesting to look at a more realistic
> streamifier that could for example unshard hosts (expecting that HTTP/2.0
> will remove the sharding currently done by server developers).
>
> Right now, it combines all requests to the same TLD (according to the Public
> Suffix List) into a single "connection." Do you have a suggestion for how to do
> it better?
I think this should provide some "realistic" results as a starting point. Depending on what we want to measure, we may want to refine this a bit.

Hervé.

> I've just pushed a quick and dirty fix to use a new instance of each
> compressor for each connection; the results are pretty even between
> headerdiff and delta2, with a small increase in each:
>
> * TOTAL: 5948 req messages
>                                        size  time | ratio min   max   std
>                         http1     3,460,925  0.18 | 1.00  1.00  1.00  0.00
>   delta2 (max_byte_size=4096)       707,901 11.87 | 0.20  0.03  0.83  0.15
>      headerdiff (buffer=4096)       960,106  1.65 | 0.28  0.01  0.96  0.23
>
> * TOTAL: 5948 res messages
>                                        size  time | ratio min   max   std
>                         http1     2,186,162  0.28 | 1.00  1.00  1.00  0.00
>   delta2 (max_byte_size=4096)       622,837 12.86 | 0.28  0.02  1.22  0.13
>      headerdiff (buffer=4096)       596,290  3.65 | 0.27  0.02  0.92  0.18
>
> Cheers,
>
>
> --
> Mark Nottingham   http://www.mnot.net/
>
>