Re: Choosing a header compression algorithm

Roberto Peon <grmocg@gmail.com> Fri, 29 March 2013 18:42 UTC

Return-Path: <ietf-http-wg-request@listhub.w3.org>
X-Original-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Delivered-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2660F21F8F07 for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Fri, 29 Mar 2013 11:42:13 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -11.59
X-Spam-Level:
X-Spam-Status: No, score=-11.59 tagged_above=-999 required=5 tests=[AWL=1.008, BAYES_00=-2.599, GB_I_LETTER=-2, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id pzICuke0kyBA for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Fri, 29 Mar 2013 11:42:12 -0700 (PDT)
Received: from frink.w3.org (frink.w3.org [128.30.52.56]) by ietfa.amsl.com (Postfix) with ESMTP id 08F1A21F915B for <httpbisa-archive-bis2Juki@lists.ietf.org>; Fri, 29 Mar 2013 11:42:10 -0700 (PDT)
Received: from lists by frink.w3.org with local (Exim 4.72) (envelope-from <ietf-http-wg-request@listhub.w3.org>) id 1ULeDT-0001wk-76 for ietf-http-wg-dist@listhub.w3.org; Fri, 29 Mar 2013 18:39:55 +0000
Resent-Date: Fri, 29 Mar 2013 18:39:55 +0000
Resent-Message-Id: <E1ULeDT-0001wk-76@frink.w3.org>
Received: from maggie.w3.org ([128.30.52.39]) by frink.w3.org with esmtp (Exim 4.72) (envelope-from <grmocg@gmail.com>) id 1ULeDF-0001vB-Q7 for ietf-http-wg@listhub.w3.org; Fri, 29 Mar 2013 18:39:41 +0000
Received: from mail-bk0-f52.google.com ([209.85.214.52]) by maggie.w3.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.72) (envelope-from <grmocg@gmail.com>) id 1ULeDE-0001Jq-5r for ietf-http-wg@w3.org; Fri, 29 Mar 2013 18:39:41 +0000
Received: by mail-bk0-f52.google.com with SMTP id it16so261340bkc.25 for <ietf-http-wg@w3.org>; Fri, 29 Mar 2013 11:39:13 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=C4BLibVaUTz29X5avTg7QKxJHEuIPv2m0pMA03ojves=; b=EY6D4heagQ5axSXO5Nba3K+GlrUnKVlHsjzw4+e856hK1z4/f6VEEqSbmR1j203ocn kTuqOrfx6vFLCCO1j2bvciF1oHVBYlCLe5cVnrzTEOXKEIdfwjm2gTg1TaRVkCkgIHAX SvjBd70jDMAjuTGAoH3rP+wJpogNCdBFGw2kCWI/lLtf6ebhrd4OlgdljSg7aCT4fE6O CrdjBBbassUGhSRk3BmtXOmhIYSvS9vbipUHry/ErICN1iUqU5syreAMiL6/S+T/h3Qx OGjqOba6pnBkGHraVRq7Z0DLaaPujqxvUvSeyEl9YXvtS2DkvE8WaGWVos2UPn7fatzq 63EQ==
MIME-Version: 1.0
X-Received: by 10.205.19.74 with SMTP id qj10mr1398880bkb.61.1364582353735; Fri, 29 Mar 2013 11:39:13 -0700 (PDT)
Received: by 10.205.8.7 with HTTP; Fri, 29 Mar 2013 11:39:13 -0700 (PDT)
In-Reply-To: <6C71876BDCCD01488E70A2399529D5E5163F6C3B@ADELE.crf.canon.fr>
References: <254AABEE-22B9-418E-81B0-2729902C4413@mnot.net> <A14105FB-ED1A-4B70-8840-9648847BCC3A@mnot.net> <6C71876BDCCD01488E70A2399529D5E5163F3C67@ADELE.crf.canon.fr> <CAP+FsNfFohSwrX2DxthNcnn+wDj6T5W7xpcg4yA56Gvt_nP3_Q@mail.gmail.com> <6C71876BDCCD01488E70A2399529D5E5163F3D72@ADELE.crf.canon.fr> <7CA7F3EB-A492-471A-8AC4-23293DD10840@mnot.net> <6C71876BDCCD01488E70A2399529D5E5163F4076@ADELE.crf.canon.fr> <CAP+FsNdztfCJjvP58ryVXDRgGyGSPO-37gRMjAuwikz2eviBiw@mail.gmail.com> <CAP+FsNdQ7mNbsaAiUqEF22Oh8KMaK3UWUWFzWE=K0jQbkM7t1Q@mail.gmail.com> <6C71876BDCCD01488E70A2399529D5E5163F4263@ADELE.crf.canon.fr> <CAP+FsNeXvn6UasR6e56A7pHtrkXg6A0NnrVGmRYNW49Qu2vxqg@mail.gmail.com> <CAP+FsNfy0ognTBaF7USBcP9dTL5bQsruav30FmGB70m_0JfjyA@mail.gmail.com> <6C71876BDCCD01488E70A2399529D5E5163F68FE@ADELE.crf.canon.fr> <CAP+FsNee45EUbzRyF+58gkytX1WvD_cPJu_GW7N7YKE5NhszVw@mail.gmail.com> <6C71876BDCCD01488E70A2399529D5E5163F6C3B@ADELE.crf.canon.fr>
Date: Fri, 29 Mar 2013 11:39:13 -0700
Message-ID: <CAP+FsNddv3EHwJ+v+74ycGXc2-8BV-w6KkC1zGDSnaY60uGJow@mail.gmail.com>
From: Roberto Peon <grmocg@gmail.com>
To: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
Cc: "agl@google.com" <agl@google.com>, Mark Nottingham <mnot@mnot.net>, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
Content-Type: multipart/alternative; boundary="20cf302239fd764c1204d9149692"
Received-SPF: pass client-ip=209.85.214.52; envelope-from=grmocg@gmail.com; helo=mail-bk0-f52.google.com
X-W3C-Hub-Spam-Status: No, score=-3.5
X-W3C-Hub-Spam-Report: AWL=-2.681, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001
X-W3C-Scan-Sig: maggie.w3.org 1ULeDE-0001Jq-5r 8e5a8851f74815f543e69c307427cdaa
X-Original-To: ietf-http-wg@w3.org
Subject: Re: Choosing a header compression algorithm
Archived-At: <http://www.w3.org/mid/CAP+FsNddv3EHwJ+v+74ycGXc2-8BV-w6KkC1zGDSnaY60uGJow@mail.gmail.com>
Resent-From: ietf-http-wg@w3.org
X-Mailing-List: <ietf-http-wg@w3.org> archive/latest/17178
X-Loop: ietf-http-wg@w3.org
Resent-Sender: ietf-http-wg-request@w3.org
Precedence: list
List-Id: <ietf-http-wg.w3.org>
List-Help: <http://www.w3.org/Mail/>
List-Post: <mailto:ietf-http-wg@w3.org>
List-Unsubscribe: <mailto:ietf-http-wg-request@w3.org?subject=unsubscribe>

Herve--

With the way you've implemented prefix matching, there are no guarantees on
security whatsoever.
In order to provide a guarantee of at least a minimum amount of effort on
the part of the attacker, you MUST provide guarantees on the minimum
subsequence sizes, as this relates directly to the amount of tries...
As an example,
  foo.com/a/b/c
will be exceptionally easy to figure out, as each subsequence will be
guessed in a very small amount of time.

There are many caveats and gotchas in this space, and I feel uncomfortable
that you're making assertions about safety unless you've got some security
folks looking at it critically.
I've done that for the delta compressor, and I suggest you do the same. If
it turns out that we do get consensus from security folks that some kind of
prefix matching is safe, I'm happy to add it back into delta. Until then, I
don't want to go down the rabbit hole!
-=R


On Fri, Mar 29, 2013 at 8:26 AM, RUELLAN Herve
<Herve.Ruellan@crf.canon.fr>wrote:

>
>
> > -----Original Message-----
> > From: Roberto Peon [mailto:grmocg@gmail.com]
> > Sent: jeudi 28 mars 2013 20:17
> > To: RUELLAN Herve
> > Cc: agl@google.com; Mark Nottingham; ietf-http-wg@w3.org Group
> > Subject: Re: Choosing a header compression algorithm
> >
> > The net effect of the character-delimited prefix match you propose is
> that an
> > attacker must only brute force each subsequence which matches [^/&=,
> > ]*([^/&=, ]|$) ,
> >
> > Running the numbers, that is:
> >
> >       k^s * m
> >
> > where:
> >
> >       k = possible characters
> >       s = subsequence length or characters not ending in [^/&= ,] or end-
> > of-line
> >       m = number of subsequences
> >
> >
> >
> > instead of full atom matching, which is (is a good special case of above
> where
> > m is always 1 and s is always n):
> >
> >       k^n
> >
> >
> > where:
> >
> >
> >       k = possible characters
> >       n = length of field
> >
> >
> >
> > In other words, full atom matching is *exponentially* more difficult (a
> > desirable trait)!
> >
> > In the example above, assuming 27 commonly used characters per entry, I'm
> > very likely to completely discover that data in:
> >
> >       http://www.example.com/path/first/myfile
> >
> > in:
> >
> >       27^4 + 27^5 + 27^6 tries
>
> This is roughly 400 million tries. And this figure is obtained using
> several assumptions that may not hold:
> - The number of possible characters is only 27.
> - The length of each chunk is known.
>
> > Note that it is likely that an attacker would be able to do
> substantially better
> > than above if they know the letter frequency (very likely) or have
> similar
> > data:
> > In the case of a whole atom match, I'd discover the data in:
> >
> >
> >       27^(17) tries
> >
> >
> > So, full atom matching is ~ 5 quadrillion (5,353,441,417,462,320) times
> more
> > difficult... that is a lot.
>
> True full atom matching is much harder, but limited prefix matching is
> already very hard.
>
> We must also keep in mind that each try correspond to a request that the
> attacker for the client to do. Such a huge number of requests should be
> detected somewhere, just in order to prevent DoS attacks.
>
> > When I originally considered prefix matching in the past, this was the
> math
> > that I did, and I decided that I was not confident that compressor
> > implementors would ensure that their encoder prevents prefix matching
> > when the subsequence lengths are too small (and thus easily attacked).
> The
> > same consideration applies to dynamic huffman coding. It *CAN* be safe,
> > but it requires complexity and computation, and I think there is
> significant
> > risk that implementors may knowing or unknowingly sacrifice security. It
> is
> > actually more safe to ignore delimiters and allow for a match of a
> prefix so
> > long as it is greater than a certain length. At least in that case there
> is
> > guarantee of the number of times that it must be brute forced...
> >
> > Full-atom matching is simply much more difficult to get wrong from a
> security
> > perspective, and it results in pretty decent compression...
>
> I think that prefix matching with constrained ending is fairly secure: it
> compels an attacker to used brute force to break it. In addition, it
> provides very interesting results regarding compression.



>
> Hervé.
>
> > -=R
>
>