Re: Choosing a header compression algorithm

Roberto Peon <grmocg@gmail.com> Thu, 21 March 2013 21:08 UTC

Return-Path: <ietf-http-wg-request@listhub.w3.org>
X-Original-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Delivered-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A89CC21F8BC5 for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Thu, 21 Mar 2013 14:08:13 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -10.598
X-Spam-Level:
X-Spam-Status: No, score=-10.598 tagged_above=-999 required=5 tests=[BAYES_00=-2.599, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 0CuiGimPcWm9 for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Thu, 21 Mar 2013 14:08:12 -0700 (PDT)
Received: from frink.w3.org (frink.w3.org [128.30.52.56]) by ietfa.amsl.com (Postfix) with ESMTP id 40D2221F8BBB for <httpbisa-archive-bis2Juki@lists.ietf.org>; Thu, 21 Mar 2013 14:08:07 -0700 (PDT)
Received: from lists by frink.w3.org with local (Exim 4.72) (envelope-from <ietf-http-wg-request@listhub.w3.org>) id 1UImi3-0004Fm-1J for ietf-http-wg-dist@listhub.w3.org; Thu, 21 Mar 2013 21:07:39 +0000
Resent-Date: Thu, 21 Mar 2013 21:07:39 +0000
Resent-Message-Id: <E1UImi3-0004Fm-1J@frink.w3.org>
Received: from maggie.w3.org ([128.30.52.39]) by frink.w3.org with esmtp (Exim 4.72) (envelope-from <grmocg@gmail.com>) id 1UImhr-0004Bs-Gp for ietf-http-wg@listhub.w3.org; Thu, 21 Mar 2013 21:07:27 +0000
Received: from mail-oa0-f52.google.com ([209.85.219.52]) by maggie.w3.org with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.72) (envelope-from <grmocg@gmail.com>) id 1UImhm-0001CP-AG for ietf-http-wg@w3.org; Thu, 21 Mar 2013 21:07:27 +0000
Received: by mail-oa0-f52.google.com with SMTP id k14so3569171oag.39 for <ietf-http-wg@w3.org>; Thu, 21 Mar 2013 14:06:56 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=2wPn6if6vnepn6tfmq/iV4sgNyxtf4ulAFmYnoWqSKw=; b=x9b9ZRDfa6k8QV8Lex0y4Wzfufhl3xt7HrI1OekIzKllYvAOjoODR51LCNSOvdxHYR JYt6CGSGgUQWs6CfQ5vhm1izJFY3GQS6N/D81m+yXvv/WZqTBfnZK5Ir7zxuE3ki5FuL /0bgUHGRMEQ+UhYKzLdNiPgpLtgU2mcW/OWFMz5otimCqKMDg/9PrhJIrpisAF4BPy5v RrmWfxLWAP4DKbK2X4dYbz4b6W5Cy5ixLxUr9odv0ng6sqLpPKuWLsezluG+sNdusiQG 9XvKf2eEIxbGM+04eA1S7kVYWmlVr4W9TXriftMPsrDWDIGBtXU8AoDbNVJWzDaWRKVy iQMg==
MIME-Version: 1.0
X-Received: by 10.182.132.43 with SMTP id or11mr7827639obb.67.1363900016284; Thu, 21 Mar 2013 14:06:56 -0700 (PDT)
Received: by 10.76.109.72 with HTTP; Thu, 21 Mar 2013 14:06:56 -0700 (PDT)
In-Reply-To: <6C71876BDCCD01488E70A2399529D5E5163F393F@ADELE.crf.canon.fr>
References: <254AABEE-22B9-418E-81B0-2729902C4413@mnot.net> <6C71876BDCCD01488E70A2399529D5E5163F393F@ADELE.crf.canon.fr>
Date: Thu, 21 Mar 2013 17:06:56 -0400
Message-ID: <CAP+FsNfPog7+c7jKg=NE=JaKexaWbrFZCryPCn9fY19u_X=Bkw@mail.gmail.com>
From: Roberto Peon <grmocg@gmail.com>
To: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
Cc: Mark Nottingham <mnot@mnot.net>, "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
Content-Type: multipart/alternative; boundary="14dae93a0c17fadd0d04d875b73b"
Received-SPF: pass client-ip=209.85.219.52; envelope-from=grmocg@gmail.com; helo=mail-oa0-f52.google.com
X-W3C-Hub-Spam-Status: No, score=-3.5
X-W3C-Hub-Spam-Report: AWL=-2.662, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001
X-W3C-Scan-Sig: maggie.w3.org 1UImhm-0001CP-AG be3bf6d15a8b002012b902526119d2a3
X-Original-To: ietf-http-wg@w3.org
Subject: Re: Choosing a header compression algorithm
Archived-At: <http://www.w3.org/mid/CAP+FsNfPog7+c7jKg=NE=JaKexaWbrFZCryPCn9fY19u_X=Bkw@mail.gmail.com>
Resent-From: ietf-http-wg@w3.org
X-Mailing-List: <ietf-http-wg@w3.org> archive/latest/17106
X-Loop: ietf-http-wg@w3.org
Resent-Sender: ietf-http-wg-request@w3.org
Precedence: list
List-Id: <ietf-http-wg.w3.org>
List-Help: <http://www.w3.org/Mail/>
List-Post: <mailto:ietf-http-wg@w3.org>
List-Unsubscribe: <mailto:ietf-http-wg-request@w3.org?subject=unsubscribe>

That is basically correct, assuming we're talking about delta2.
-=R


On Thu, Mar 21, 2013 at 7:35 AM, RUELLAN Herve
<Herve.Ruellan@crf.canon.fr>wrote:

> To help everyone better understand the two proposals, I tried to make a
> comparison of the different mechanisms they use.
>
> * Buffers:
> Delta has a static pre-filled (name, value) buffer and a dynamic (name,
> value) buffer. The size of the dynamic buffer can be limited.
> HeaderDiff has a pre-filled name buffer (which is extendable to include
> new header names) and a dynamic (name, value) buffer. The size of the
> dynamic buffer can be limited.
>
> * Buffer management for additions:
> Delta: the encoder decides whether a new entry is added to the buffer or
> not.
> HeaderDiff: the encoder decides whether a new entry is added to the buffer
> or not.
>
> * Buffer management for removals:
> Delta: the buffer uses a LRU mechanisms to find which entry is removed
> from the buffer to fulfill the size limitation when a new entry is added to
> the buffer.
> HeaderDiff: the removals of entries from the buffer are at encoder
> discretion, as long as size limitation are fulfilled (the algorithm used in
> the implementation is loosely based on LRU).
>
> * Buffer management synchronization between Encoder and Decoder:
> Delta: the same algorithm is used on both sides.
> HeaderDiff: all buffer updates are transmitted on the wire.
>
> * Default for a new header set:
> Delta: the default for a new header set is the previous header set. A
> header can be removed from the set by encoding it as an index. Delta
> supports up to 256 different header sets.
> HeaderDiff: the default for a new header set is the empty set.
>
> * Encoding possibilities:
> Delta supports the following encoding possibilities:
> - (Name, Value) index;
> - (Name, Value) index list;
> - Name index and literal Value;
> - Literal Name and literal Value.
> HeaderDiff supports the following encoding possibilities:
> - (Name, Value) index;
> - Name index and literal Value;
> - Literal Name and literal Value;
> - delta-encoding: (Name, Value) index, with a shared prefix and a new
> suffix for the Value.
>
> * Encoding choice representation:
> Delta uses opcodes, that are encoded as the opcode identifier (1 byte) and
> the number of encoded headers (1 byte).
> HeaderDiff uses a few bits (2-4) in the encoding of each header.
> Both proposals are byte-aligned at the header level.
>
> * String encoding:
> Delta uses a static Huffman encoding.
> HeaderDiff uses a raw literal encoding.
> Both proposals could easily support several types of encoding: raw
> literal, static Huffman, typed codecs...
>
>
> I hoped I capture correctly the differences and similarities between the
> two proposals.
> Roberto, if you see any misunderstanding of Delta, you're welcome to
> correct it.
>
> I'm going to details the reasons of our design choices in further e-mails.
>
> Regards,
>
> Hervé.
>
>
> > -----Original Message-----
> > From: Mark Nottingham [mailto:mnot@mnot.net]
> > Sent: jeudi 21 mars 2013 08:11
> > To: ietf-http-wg@w3.org Group
> > Subject: Choosing a header compression algorithm
> >
> > Previously, we've talked about starting with just a delta-encoding
> approach
> > for our first implementation draft. In Orlando, we focused primarily on
> two
> > proposals:
> >
> > * Delta2
> >   Draft:
> http://tools.ietf.org/html/draft-rpeon-httpbis-header-compression-
> > 03
> >   Python Implementation: https://github.com/http2/compression-
> > test/tree/master/compressor/delta2
> >
> > * HeaderDiff
> >   Draft: http://tools.ietf.org/html/draft-ruellan-headerdiff-00
> >   Python Implementation: https://github.com/http2/compression-
> > test/tree/master/compressor/headerdiff
> >
> > As I understand it, Herve et al want to work on making HeaderDiff more
> > resistant to CRIME, and hopefully we'll see the results of that in the
> very near
> > future.
> >
> > In the meantime, I'd like everyone to become familiar with both drafts
> and
> > the characteristics of their implementations, so that we can have an
> > informed discussion of them.
> >
> > I'd like to see a few things happen while we do this:
> >
> > 1) We need to do apples-to-apples comparison of these compressors to see
> > how they behave under a range of constraints (especially, memory).
> >
> > 2) I'd like us to verify that they are respecting those constraints, and
> that
> > they're implemented in an equivalent way (this is likely to be manual).
> >
> > 3) It would be very good to have a test suite that verifies that they
> correctly
> > reconstitute the semantically significant parts of the headers; in
> particular,
> > large/unusual values, ordering where appropriate, etc. Our current header
> > corpus undoubtedly has holes in this regard.
> >
> > If you make any progress along these lines (dare I ask for volunteers?),
> pleas
> > share with the list.
> >
> > Looking at our issues list, this is one of the major items preventing us
> from
> > getting to a first implementation draft, so I'd like to chose a way
> forward
> > soon -- especially since we're choosing a starting point, and the
> approach we
> > take can evolve or, if necessary, be replaced.
> >
> > Regards,
> >
> > --
> > Mark Nottingham   http://www.mnot.net/
> >
> >
> >
>
>
>