RE: Header Stats

RUELLAN Herve <Herve.Ruellan@crf.canon.fr> Wed, 23 January 2013 08:57 UTC

From: RUELLAN Herve <Herve.Ruellan@crf.canon.fr>
To: James M Snell <jasnell@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Date: Wed, 23 Jan 2013 08:55:37 +0000
Subject: RE: Header Stats

James,

This is interesting and useful information.

A few suggestions:
- Sorting the header names would make them easier to find.
- The header values could also be sorted, either alphabetically or by frequency.
- The "ratio" for encoded values appears to be the percentage of size reduction; it may be worth renaming that line accordingly.
- Adding the explanations from your email directly to the files would make them easier to understand.
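
To illustrate the sorting and "size reduction" suggestions, here is a minimal Python sketch. It is not taken from the compression-test code: the function names, the sample header values, and the 4-byte binary date size are all assumptions chosen for illustration.

```python
from collections import Counter


def size_reduction_percent(text_bytes: int, encoded_bytes: int) -> float:
    """Percentage of bytes saved by the encoded form relative to the text form."""
    if text_bytes <= 0:
        raise ValueError("text_bytes must be positive")
    return 100.0 * (text_bytes - encoded_bytes) / text_bytes


def sorted_by_frequency(values):
    """Return (value, count) pairs, most frequent first, ties in alphabetical order."""
    counts = Counter(values)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data: header name -> observed values.
observed = {
    "content-type": ["text/html", "image/png", "text/html"],
    "accept-encoding": ["gzip", "gzip", "gzip, deflate"],
}

# Header names in alphabetical order, values by descending frequency.
for name in sorted(observed):
    print(name)
    for value, count in sorted_by_frequency(observed[name]):
        print(f"  {count:3d}  {value}")

# A 29-byte HTTP date string versus an assumed 4-byte binary encoding.
print(f"size reduction: {size_reduction_percent(29, 4):.1f}%")
```

Labelling the figure "size reduction" rather than "ratio" makes it clear that a higher number means more bytes saved.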

Hervé.

> -----Original Message-----
> From: James M Snell [mailto:jasnell@gmail.com]
> Sent: Wednesday, 23 January 2013 05:34
> To: ietf-http-wg@w3.org
> Subject: Re: Header Stats
> 
> OK, I have updated the calculations to show:
> 
>   1. General variability of header values. The lower the number, the more
> redundant the value tends to be.
>   2. A frequency distribution of specific values per header. This is rather
> verbose but extremely informative.
>   3. For date and numeric header values, comparison values between the
> text-value and optimized binary encoding value.
>   4. A summation of the total bytes saved by using the optimized binary
> encoding for dates and numeric headers.
> 
> 
> TODO:
> 
> 
>   - Implement experimental Set-Cookie, Cookie and Cache-Control headers
> to see the difference for binary encoding
>   - Properly handle null-separated value lists
> 
> 
> If there are other interesting calculations you'd like to see, let me know...
> 
> 
> The updated output is here:
> https://github.com/jasnell/compression-test/tree/master/counts
> 
> 
> On Tue, Jan 22, 2013 at 1:24 PM, James M Snell <jasnell@gmail.com> wrote:
> 
> 
> 	I've started working on generating stats for individual headers within
> messages. Rather than take up too much space here on the list for the
> results, I am keeping the results in my github fork [1] of the compression-test
> code and will be posting summaries of the results periodically on my personal
> blog [2]. I will be putting together a summary of my findings in time
> for the interim meeting next week. Unfortunately, however, I will not be
> able to attend the meeting.
> 
> 
> 	[1] https://github.com/jasnell/compression-test/blob/master/counts/
> 	[2] http://chmod777self.blogspot.com/2013/01/http-20-header-stats.html
> 
> 
> 
> 	- James
>