Reference set in HPACK

Kazu Yamamoto ( 山本和彦 ) <kazu@iij.ad.jp> Wed, 02 July 2014 05:32 UTC

Return-Path: <ietf-http-wg-request@listhub.w3.org>
X-Original-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Delivered-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C398B1B28CB for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Tue, 1 Jul 2014 22:32:04 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.453
X-Spam-Level:
X-Spam-Status: No, score=-4.453 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, CHARSET_FARAWAY_HEADER=3.2, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, RCVD_IN_DNSWL_HI=-5, RP_MATCHES_RCVD=-0.651, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id naTU79O99rd2 for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Tue, 1 Jul 2014 22:32:00 -0700 (PDT)
Received: from frink.w3.org (frink.w3.org [128.30.52.56]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 606D61B28CC for <httpbisa-archive-bis2Juki@lists.ietf.org>; Tue, 1 Jul 2014 22:32:00 -0700 (PDT)
Received: from lists by frink.w3.org with local (Exim 4.72) (envelope-from <ietf-http-wg-request@listhub.w3.org>) id 1X2D88-0005pi-Qr for ietf-http-wg-dist@listhub.w3.org; Wed, 02 Jul 2014 05:30:52 +0000
Resent-Date: Wed, 02 Jul 2014 05:30:52 +0000
Resent-Message-Id: <E1X2D88-0005pi-Qr@frink.w3.org>
Received: from lisa.w3.org ([128.30.52.41]) by frink.w3.org with esmtp (Exim 4.72) (envelope-from <kazu@iij.ad.jp>) id 1X2D7z-0005ot-DX for ietf-http-wg@listhub.w3.org; Wed, 02 Jul 2014 05:30:46 +0000
Received: from mo30.iij.ad.jp ([202.232.30.71] helo=omgo.iij.ad.jp) by lisa.w3.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from <kazu@iij.ad.jp>) id 1X2D7x-0004ZB-Tk for ietf-http-wg@w3.org; Wed, 02 Jul 2014 05:30:43 +0000
DKIM-Signature: v=1;a=rsa-sha256;c=relaxed/simple;d=iij.ad.jp;h=Date: Message-Id:To:Subject:From:Mime-Version:Content-Type: Content-Transfer-Encoding; i=kazu@iij.ad.jp; s=omgo2; t=1404279019; x=1405488619; bh=X4lKF2c71KTiHDq8S3z1sGdY8aoYBT9A2ieEK6qZmgo=; b=tkcpic13QfvTPdfsQaMJsq6MBWN Buax547rI5jYX/PYOPcAbGstnFTO+Q8sBmPrJ6T6urIzor70ZS2wcmwsHkmPgUHoCj41F72VIqH9d pIWCfbw5GfGixMHcUloqQxUN9IXGHHMmCDxGATYUq3RDO+sRhGDH8z3L1loee+2DIn2Fdxmi4aI6t +p9zdWi9F2UaRKAfOHZLlrZ1gHKQ/4ta9KG1faOwrzpZJpU8lL3dq1MGO/q7S15MW9c9mMkOLiEPA ckomz0FnSZ9W5TPvQ2jvVhkH93r7a5d7AWu6oMWUU6x1yJHLXLkXHt/UBFcBI7rTtkZoeBJoe+tBV Fq90WTg==;
Received: by omgo.iij.ad.jp (mo30) id s625UI9b023820; Wed, 2 Jul 2014 14:30:18 +0900
X-MXL-Hash: 53b398ea25800e2b-514ca56a461b6f2614b78568753ddc376a85bfb3
Date: Wed, 02 Jul 2014 14:30:41 +0900
Message-Id: <20140702.143041.283993814131065692.kazu@iij.ad.jp>
To: ietf-http-wg@w3.org
From: Kazu Yamamoto <kazu@iij.ad.jp>
X-Mailer: Mew version 6.6 on Emacs 24.3.92 / Mule 6.0 (HANACHIRUSATO)
Mime-Version: 1.0
Content-Type: Text/Plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received-SPF: pass client-ip=202.232.30.71; envelope-from=kazu@iij.ad.jp; helo=omgo.iij.ad.jp
X-W3C-Hub-Spam-Status: No, score=-3.6
X-W3C-Hub-Spam-Report: AWL=-3.445, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01
X-W3C-Scan-Sig: lisa.w3.org 1X2D7x-0004ZB-Tk e0a5b112eb4fdd48adba53d0420b31a1
X-Original-To: ietf-http-wg@w3.org
Subject: Reference set in HPACK
Archived-At: <http://www.w3.org/mid/20140702.143041.283993814131065692.kazu@iij.ad.jp>
Resent-From: ietf-http-wg@w3.org
X-Mailing-List: <ietf-http-wg@w3.org> archive/latest/25024
X-Loop: ietf-http-wg@w3.org
Resent-Sender: ietf-http-wg-request@w3.org
Precedence: list
List-Id: <ietf-http-wg.w3.org>
List-Help: <http://www.w3.org/Mail/>
List-Post: <mailto:ietf-http-wg@w3.org>
List-Unsubscribe: <mailto:ietf-http-wg-request@w3.org?subject=unsubscribe>

Hi,

As you may remember, I implemented several HPACK *encoding* algorithms
and calculated compression ratio. I tried it again based on HPACK
08. I have 8 algorithms.

- Naive    -- No compression
- Naive-H  -- Using Huffman only
- Static   -- Using static table only
- Static-H -- Using static table and Huffman
- Linear   -- Using header table
- Linear-H -- Using header table and Huffman
- Diff     -- Using header table and reference set
- Diff-H   -- Using header table, reference set and Huffman

The implementations above pass all test cases in
https://github.com/http2jp/hpack-test-case/.  Using this test cases as
input, I calculated compression ratio again. The ratio is calculated
by dividing the number of bytes after compression by that before
compression.

Here is results:

Naive     1.10
Naive-H   0.86
Static    0.84
Static-H  0.66
Linear    0.39 
Linear-H  0.31
Diff      0.39
Diff-H    0.31

Linear-H and Diff-H results in almost the same. To my calculation,
Diff-H is only 1.6 byte shorter than Linear-H in average. This means
that reference set does NOT much contribute to compress headers
although it is very difficult to implement.

I have NOT seen any header examples for which reference set work
effectively so far.

So, if the authors of HPACK want to retain reference set, I would like
to see evidence that there are some cases in which reference set
contributes the compression ratio. HPACK 08 says "Updated Huffman
table, using data set provided by Google". So, I guess that the
authors can calculate the compression ratio based on this data.

If there is not such an evidence, I would like to strongly recommend
to remove reference set from HPACK. This makes HPACK much simpler, so
implementations gets bug less and inter-operability is improved. Plus,
the order of headers is reserved always.

Regards,

--Kazu