RE: Straw Poll: Restore Header Table and Static Table Indices

RUELLAN Herve <Herve.Ruellan@crf.canon.fr> Wed, 15 October 2014 11:36 UTC

From: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
To: Greg Wilkins <gregw@intalio.com>, Roberto Peon <grmocg@gmail.com>
CC: Willy Tarreau <w@1wt.eu>, Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>, Jeff Pinner <jpinner@twitter.com>
Date: Wed, 15 Oct 2014 11:31:45 +0000
Message-ID: <6C71876BDCCD01488E70A2399529D5E53BF5950E@ADELE.crf.canon.fr>
References: <CA+pLO_jkN67HLT7oup+FcYVY+RZ7ckhpY2gGy=TAsr2UUMnVVA@mail.gmail.com> <987FB86A-EF8B-4CD1-A9A7-52A9163E8CB3@mnot.net> <EBB30C88-7EBD-400F-9591-B646B4D3687B@mnot.net> <CAP+FsNeJU6aciA+UV3sQ318e4=fXxv9zZbsDZ1jXmYstz6XwaQ@mail.gmail.com> <E465C1C7-20DF-4F78-9936-9C914042920A@mnot.net> <20141013012326.GD13217@1wt.eu> <CAP+FsNci+YbQ9fP9LiJ1BBUSDryWOqi4A4YsKyORskY7pK0Fmg@mail.gmail.com> <CAH_y2NEfOXWRtEbO+uUCKroW+NPGtyjqxNan3p5G+uFzuxxnCA@mail.gmail.com>
In-Reply-To: <CAH_y2NEfOXWRtEbO+uUCKroW+NPGtyjqxNan3p5G+uFzuxxnCA@mail.gmail.com>
Subject: RE: Straw Poll: Restore Header Table and Static Table Indices
Archived-At: <http://www.w3.org/mid/6C71876BDCCD01488E70A2399529D5E53BF5950E@ADELE.crf.canon.fr>

I dug into the numbers myself to find which of the removed static headers had a strong influence on the compaction results.
For requests, there are two: ":scheme:http" and "accept-charset". ":scheme:http" is probably not a problem, as more traffic is expected to move to https. "accept-charset" is apparently no longer commonly used, but it is still present in the test data.

For responses, there are also two: "date" and "via". Their impact is roughly the same, and it is significant only for 1K or 2K dynamic tables. Whether to keep or remove them depends on the tradeoff we want to make: for generic web usage, keep them to gain a small compaction increase; for more versatile usage, remove them to reduce the size of the static table.

Hervé.

> -----Original Message-----
> From: Greg Wilkins [mailto:gregw@intalio.com]
> Sent: mercredi 15 octobre 2014 04:37
> To: Roberto Peon
> Cc: Willy Tarreau; Mark Nottingham; HTTP Working Group; Jeff Pinner
> Subject: Re: Straw Poll: Restore Header Table and Static Table Indices
> 
> 
> On 15 October 2014 04:17, Roberto Peon <grmocg@gmail.com> wrote:
> 
> 
> 	One byte more overhead is significant when the amount of overhead
> was typically one byte to begin with,
> 
> 
> 
> Well, if looked at in that way, removing the reference set was an increase from 0
> bytes to 1 byte for many headers, so an infinite increase in size.    Yet when
> looked at overall, the impact was negligible.
> 
> 
> I agree that we need good efficient compression for dynamic headers, and am
> willing to consider changes to ensure we get the balance right (such as
> reducing the static table).  But I think that switching the indices back adjusts the
> balance too far away from the very common use case of static headers and
> static header names.
> 
> 
> Here are my numbers for the test data run with h2-14, then removing the
> suggested static headers (except Date), then adding in the suggested static
> values:
> 
> 
> size	 H2-14	 reduced	 values
> 0	 63.50%	 64.40%	 62.10%
> 4096	 34.50%	 34.20%	 33.50%
> 8192	 33.20%	 32.70%	 32.60%
> 12288	 33.30%	 32.60%	 32.50%
> 
> So for the test data, reducing the static table size has hardly any effect, and
> adding in values often has a slightly positive effect.    So such a change looks to
> at least do no harm.     The question is whether it does any good for lots of dynamic
> headers. I'm happy to try to answer that, but can do so only if somebody can give
> me some test data to run against.
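
The differences in the table above can be read off more easily as percentage-point deltas against the H2-14 column. The sketch below only redoes that arithmetic on the quoted figures. It assumes the percentages are compressed size relative to uncompressed size (so a negative delta is an improvement), which the quoted message does not state explicitly; the names ROWS and deltas are illustrative.

```python
# Figures copied from the quoted table: dynamic table size -> (H2-14, reduced, values).
# Assumption: each figure is compressed size as a percentage of uncompressed size.
ROWS = {
    0: (63.5, 64.4, 62.1),
    4096: (34.5, 34.2, 33.5),
    8192: (33.2, 32.7, 32.6),
    12288: (33.3, 32.6, 32.5),
}


def deltas(rows):
    """Return, per table size, the percentage-point change of the
    'reduced' and 'values' columns relative to the H2-14 baseline."""
    return {
        size: (round(reduced - base, 1), round(values - base, 1))
        for size, (base, reduced, values) in rows.items()
    }


if __name__ == "__main__":
    for size, (d_reduced, d_values) in deltas(ROWS).items():
        print(f"{size:>6}  reduced {d_reduced:+.1f}  values {d_values:+.1f}")
```

Under that reading, the largest change in either direction is 1.4 points, at a dynamic table size of 0.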
> 
> 
> 
> 
> 	The worst possible impact of this means that one cannot bit-blit a large
> number of static headers-- one must add them one at a time or fix them up.
> 
> 
> Not just static headers, but dynamic headers that use static names.
> 
> 
> 
> 	I'll bet that I can show that this makes almost zero difference in CPU
> when implemented properly (there is little magical about a bit-blit to begin
> with)-- I'd be shocked if we couldn't do 100s of millions of header-sets per
> second on a single core.
> 
> 
> While I'm sure that dedicated code can be written to generate h2 headers very
> quickly, the reality is that servers are not written to be dedicated to h2.   The
> vast bulk of the header generation and handling code is written protocol neutral
> and has to work for  http, spdy, h2, fastcgi, etc.   In that environment we have
> found it extraordinarily difficult to introduce header pre-generation when the
> bytes generated are a function of the connection and the sequence in which the header
> will be sent.   For example, we regenerate headers when we move a static
> resource into the cache, which is done without reference to any particular
> protocol or connection.      While the actual operation of customising a header
> for a particular connection might just be a lookup and an add, there is a lot of
> complexity required to work out when and if this addition is needed and
> bringing all the required information to the correct point in the code.
> 
> 
> I know this is a specific example, but I am sure that in general far better and
> simpler optimisations can be done with encodings that are immutable rather
> than with ones that are a function of connection and time.
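
To make the point about immutable encodings concrete, here is a minimal sketch of how the byte encoding of a static entry behaves under the two index layouts being discussed. It uses the HPACK prefix-integer encoding and the indexed header field representation (a leading 1 bit followed by a 7-bit prefix integer). The function names static_first and dynamic_first are illustrative, and static index 2 is just a sample value; nothing here is taken from any implementation mentioned in this thread.

```python
def encode_integer(value, prefix_bits, flags=0):
    """HPACK prefix-integer encoding: the value goes in the low
    `prefix_bits` bits of the first byte, with continuation bytes
    (7 bits each, high bit set on all but the last) if it does not fit."""
    limit = (1 << prefix_bits) - 1
    if value < limit:
        return bytes([flags | value])
    out = [flags | limit]
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)
        value //= 128
    out.append(value)
    return bytes(out)


def indexed_field(index):
    """Indexed header field: bit pattern '1' plus a 7-bit prefix index."""
    return encode_integer(index, 7, 0x80)


def static_first(static_index, dynamic_len):
    """Layout with static indices first: the dynamic table length is
    irrelevant, so the bytes for a static entry never change."""
    return indexed_field(static_index)


def dynamic_first(static_index, dynamic_len):
    """Layout with dynamic entries first: a static entry's index is
    offset by the current number of dynamic entries."""
    return indexed_field(dynamic_len + static_index)


if __name__ == "__main__":
    for dyn_len in (0, 10, 200):
        print(dyn_len,
              static_first(2, dyn_len).hex(),
              dynamic_first(2, dyn_len).hex())
```

With static indices first, the same pre-generated byte can be reused on any connection at any time. With dynamic entries first, the encoder has to know the current dynamic table length for that connection before it can emit even a fully static header, and the encoding grows to two bytes once the offset index reaches 127.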
> 
> 
> regards
> 
> --
> 
> Greg Wilkins <gregw@intalio.com>  @  Webtide - an Intalio subsidiary
> http://eclipse.org/jetty HTTP, SPDY, Websocket server and client that scales
> http://www.webtide.com  advice and support for jetty and cometd.