Re: Reference set in HPACK

Michael Sweet <msweet@apple.com> Wed, 02 July 2014 10:56 UTC

From: Michael Sweet <msweet@apple.com>
In-reply-to: <CAP+FsNexzVzt+YV7oBeMdGrMoajbMVj1Z90XvQfaCuNMDjYdHg@mail.gmail.com>
Date: Wed, 02 Jul 2014 06:52:57 -0400
Cc: Kazu Yamamoto <kazu@iij.ad.jp>, HTTP Working Group <ietf-http-wg@w3.org>
Message-id: <1F0B6FCE-9143-42C2-AB92-500D266C1BE7@apple.com>
References: <20140702.143041.283993814131065692.kazu@iij.ad.jp> <CAP+FsNexzVzt+YV7oBeMdGrMoajbMVj1Z90XvQfaCuNMDjYdHg@mail.gmail.com>
To: Roberto Peon <grmocg@gmail.com>
Subject: Re: Reference set in HPACK
Archived-At: <http://www.w3.org/mid/1F0B6FCE-9143-42C2-AB92-500D266C1BE7@apple.com>

Roberto,

On Jul 2, 2014, at 1:39 AM, Roberto Peon <grmocg@gmail.com> wrote:
> You're basing conclusions on today's data, instead of looking forward as to what might happen when the set of headers sent adapts to the compression method, making it significantly more likely for items in the reference set to be emitted.

Isn't that basically confirming what Kazu found: the reference set doesn't help with today's headers?

We have running code here that demonstrates that the reference set does not contribute significantly to the performance of HPACK. Unless you can demonstrate a significant improvement from (simple) server/client changes, your assertion that things will improve has no evidence to support it.
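
One way to see why the gap could be small is to count index references under a deliberately simplified reading of HPACK 08. The sketch below is a toy model, not Kazu's implementation and not a conforming encoder: it assumes every index reference costs about one byte, ignores literals, Huffman coding, the static table and table eviction, and the function names and sample headers are made up for illustration. In this model a repeated header costs one reference with the header table alone and nothing with the reference set, while every header that drops out of the reference set costs one extra reference to remove it.

```python
# Toy model only: counts index references and nothing else.
# All names and sample headers here are hypothetical.

def linear_cost(requests):
    """Header table only: a header already in the table is sent
    as one index reference in every request that carries it."""
    table = set()
    refs = 0
    for headers in requests:
        for h in headers:
            if h in table:
                refs += 1        # indexed representation
            else:
                table.add(h)     # first sight: literal, not counted here
    return refs


def diff_cost(requests):
    """Header table plus reference set: headers carried over from the
    previous request cost nothing; each header leaving or re-entering
    the reference set costs one index reference."""
    table = set()
    refset = set()
    refs = 0
    for headers in requests:
        current = set(headers)
        refs += len(refset - current)    # toggle stale entries off
        for h in current - refset:
            if h in table:
                refs += 1                # toggle a known entry on
            else:
                table.add(h)             # literal, joins the reference set
        refset = current
    return refs


requests = [
    [(":method", "GET"), (":path", "/"), ("user-agent", "x")],
    [(":method", "GET"), (":path", "/style.css"), ("user-agent", "x")],
    [(":method", "GET"), (":path", "/"), ("user-agent", "x")],
]
print(linear_cost(requests), diff_cost(requests))  # prints "5 3"
```

On these made-up requests the header table alone needs 5 index references and the reference set version needs 3, so the saving is a couple of bytes over three requests. That is only an illustration of the mechanism, not a measurement.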

My observation is that the headers emitted by most web sites are not controlled by the web site developer; developers rely on the underlying web server and scripting engine (PHP, Perl, Python, Ruby, etc.) to generate them.  The only header they generally do control is Set-Cookie, and then only for their own site (i.e. not for the advertising networks that are used).  What changes on the server side would be useful here to get the full benefit of the reference set?

(And IMHO if we do have this information then it should be in the HPACK spec...)



> 
> You may want to look at how many of those entries would be regularized if HPACK were in use and servers/clients intended to send headers that were similar.
> -=R
> 
> 
> On Tue, Jul 1, 2014 at 10:30 PM, Kazu Yamamoto <kazu@iij.ad.jp> wrote:
> Hi,
> 
> As you may remember, I implemented several HPACK *encoding* algorithms
> and calculated their compression ratios. I tried it again based on HPACK
> 08. I have 8 algorithms.
> 
> - Naive    -- No compression
> - Naive-H  -- Using Huffman only
> - Static   -- Using static table only
> - Static-H -- Using static table and Huffman
> - Linear   -- Using header table
> - Linear-H -- Using header table and Huffman
> - Diff     -- Using header table and reference set
> - Diff-H   -- Using header table, reference set and Huffman
> 
> The implementations above pass all test cases in
> https://github.com/http2jp/hpack-test-case/.  Using these test cases as
> input, I calculated the compression ratio again. The ratio is calculated
> by dividing the number of bytes after compression by the number before
> compression.
> 
> Here are the results:
> 
> Naive     1.10
> Naive-H   0.86
> Static    0.84
> Static-H  0.66
> Linear    0.39
> Linear-H  0.31
> Diff      0.39
> Diff-H    0.31
> 
> Linear-H and Diff-H give almost the same result. By my calculation,
> Diff-H is only 1.6 bytes shorter than Linear-H on average. This means
> that the reference set does NOT contribute much to header compression,
> although it is very difficult to implement.
> 
> I have NOT seen any header examples for which the reference set works
> effectively so far.
> 
> So, if the authors of HPACK want to retain the reference set, I would like
> to see evidence that there are some cases in which the reference set
> improves the compression ratio. HPACK 08 says "Updated Huffman
> table, using data set provided by Google". So, I guess that the
> authors can calculate the compression ratio based on this data.
> 
> If there is no such evidence, I would strongly recommend
> removing the reference set from HPACK. This makes HPACK much simpler, so
> implementations have fewer bugs and interoperability is improved. Plus,
> the order of headers is always preserved.
> 
> Regards,
> 
> --Kazu
> 
> 
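
For reference, the ratio Kazu describes can be computed as in the sketch below. It assumes the ratio is total compressed bytes over total uncompressed bytes across all header blocks (the message does not say whether it is that or a mean of per-block ratios); the function name and the sample sizes are hypothetical, while the `reported` values are copied from the table above.

```python
def compression_ratio(original_sizes, compressed_sizes):
    """Total bytes after compression divided by total bytes before.
    Values below 1.0 mean the encoding saved space."""
    before = sum(original_sizes)
    after = sum(compressed_sizes)
    if before == 0:
        raise ValueError("no input bytes to compare against")
    return after / before


# Made-up sizes for two header blocks, in bytes.
example = compression_ratio([100, 200], [40, 50])

# Ratios as reported in the quoted results.
reported = {
    "Naive": 1.10, "Naive-H": 0.86,
    "Static": 0.84, "Static-H": 0.66,
    "Linear": 0.39, "Linear-H": 0.31,
    "Diff": 0.39, "Diff-H": 0.31,
}

# At two decimal places the reference set makes no visible difference.
gap = reported["Linear-H"] - reported["Diff-H"]
print(gap)  # prints "0.0"
```

A ratio above 1.0, as for Naive, means the encoded form is larger than the input.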

_________________________________________________________
Michael Sweet, Senior Printing System Engineer, PWG Chair