Re: Choosing a header compression algorithm

Stephen Farrell <stephen.farrell@cs.tcd.ie> Thu, 21 March 2013 15:42 UTC


Hiya,

I've a question. It might be silly or maybe just one to
punt on for later, but just in case...

If HTTP/2.0 does header compression, and if some form of
header authentication (e.g. a DKIM-like thing, as recently
proposed for iSchedule) were to be standardised, should
the authentication cover the compressed or uncompressed
headers?

The former would seem to be bad when considering APIs,
but the latter might mean that canonicalisation needs to
be considered when picking a compression alg.

The kind of canonicalisation requirement might be that
you need to ensure one can define a reasonable c14n
function such that c14n(X)=c14n(uncompress(compress(X))).

Mark's item 3 below triggered this. I guess one could
argue that it might be a requirement for compression
that it not break higher level canonicalisation which
isn't quite the same as being able to reconstitute the
semantics. (For example, with timestamps that specify
a zero TZ offset, or list ordering maybe.)
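
To make the property concrete, here is a rough Python sketch of
the check. Everything in it is an assumption for illustration
only: the toy c14n() rules, the use of zlib as a stand-in for a
lossless codec, and the hypothetical rewriting_uncompress() decoder.
None of it is taken from Delta2 or HeaderDiff.

```python
import zlib

def c14n(headers):
    """Toy canonical form (assumed rules): lowercase names, collapse
    runs of whitespace in values, sort by name. The sort is stable, so
    repeated fields with the same name keep their relative order."""
    canon = [(name.strip().lower(), " ".join(value.split()))
             for name, value in headers]
    return sorted(canon, key=lambda pair: pair[0])

def zlib_compress(headers):
    # Stand-in lossless codec: serialise as "name:value" lines, then deflate.
    text = "\r\n".join(f"{name}:{value}" for name, value in headers)
    return zlib.compress(text.encode("iso-8859-1"))

def zlib_uncompress(blob):
    text = zlib.decompress(blob).decode("iso-8859-1")
    # Split on the first ":" only, since values may contain colons.
    return [tuple(line.partition(":")[::2]) for line in text.split("\r\n")]

def rewriting_uncompress(blob):
    # Hypothetical decoder that reconstitutes the semantics but not the
    # bytes: it lowercases names and emits a zero offset as "GMT".
    return [(name.lower(), value.replace("+0000", "GMT"))
            for name, value in zlib_uncompress(blob)]

def survives_round_trip(headers, compress, uncompress):
    """True if c14n(X) == c14n(uncompress(compress(X)))."""
    return c14n(headers) == c14n(uncompress(compress(headers)))

headers = [
    ("Date", "Thu, 21 Mar 2013 15:40:46 +0000"),
    ("Accept", "text/html,  text/plain"),
]

# The byte-preserving codec trivially satisfies the property.
print(survives_round_trip(headers, zlib_compress, zlib_uncompress))
# The rewriting decoder keeps the meaning of the Date value, but this
# c14n does not normalise zero offsets, so a signature would not verify.
print(survives_round_trip(headers, zlib_compress, rewriting_uncompress))
```

The name-case change is absorbed by this c14n, but the timestamp
rewrite is not, which is the sort of mismatch I have in mind.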

Ta,
S.

PS: Apologies if this is all obvious when one reads
the algorithm descriptions, which I've not;-)


On 03/21/2013 07:11 AM, Mark Nottingham wrote:
> Previously, we've talked about starting with just a delta-encoding approach for our first implementation draft. In Orlando, we focused primarily on two proposals:
> 
> * Delta2
>   Draft: http://tools.ietf.org/html/draft-rpeon-httpbis-header-compression-03
>   Python Implementation: https://github.com/http2/compression-test/tree/master/compressor/delta2
> 
> * HeaderDiff
>   Draft: http://tools.ietf.org/html/draft-ruellan-headerdiff-00
>   Python Implementation: https://github.com/http2/compression-test/tree/master/compressor/headerdiff
> 
> As I understand it, Herve et al want to work on making HeaderDiff more resistant to CRIME, and hopefully we'll see the results of that in the very near future. 
> 
> In the meantime, I'd like everyone to become familiar with both drafts and the characteristics of their implementations, so that we can have an informed discussion of them.
> 
> I'd like to see a few things happen while we do this:
> 
> 1) We need to do apples-to-apples comparison of these compressors to see how they behave under a range of constraints (especially, memory).
> 
> 2) I'd like us to verify that they are respecting those constraints, and that they're implemented in an equivalent way (this is likely to be manual).
> 
> 3) It would be very good to have a test suite that verifies that they correctly reconstitute the semantically significant parts of the headers; in particular, large/unusual values, ordering where appropriate, etc. Our current header corpus undoubtedly has holes in this regard.
> 
> If you make any progress along these lines (dare I ask for volunteers?), please share with the list.
> 
> Looking at our issues list, this is one of the major items preventing us from getting to a first implementation draft, so I'd like to choose a way forward soon -- especially since we're choosing a starting point, and the approach we take can evolve or, if necessary, be replaced.
> 
> Regards,
> 
> --
> Mark Nottingham   http://www.mnot.net/