Re: Delta Compression and UTF-8 Header Values

Roberto Peon <grmocg@gmail.com> Wed, 13 February 2013 00:02 UTC

In-Reply-To: <m38v6t6qxd.fsf@carbon.jhcloos.org>
References: <CABP7RbfRLXPpL4=wip=FvqD3DM7BM8PXi7uRswHAusXUmPO_xw@mail.gmail.com> <CE65E38D-A482-4EA9-BAF4-F6498F643A78@mnot.net> <CABP7RbcRrjV7EhwoGbkWbYJEXeWOwH4gQuaCG7N0siQqeMtcag@mail.gmail.com> <m3sj52a61n.fsf@carbon.jhcloos.org> <CAP+FsNf2x-K0OFQVLOKsc+ZM+BUDJGygcnUH=buQm4yA2Su2cw@mail.gmail.com> <m38v6t6qxd.fsf@carbon.jhcloos.org>
Date: Tue, 12 Feb 2013 16:01:04 -0800
Message-ID: <CAP+FsNeU6g7K3WQdA4V+=PhxOyzup6Ajp8NJ4asVaric1Uktww@mail.gmail.com>
From: Roberto Peon <grmocg@gmail.com>
To: James Cloos <cloos@jhcloos.com>
Cc: James M Snell <jasnell@gmail.com>, Mark Nottingham <mnot@mnot.net>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Archived-At: <http://www.w3.org/mid/CAP+FsNeU6g7K3WQdA4V+=PhxOyzup6Ajp8NJ4asVaric1Uktww@mail.gmail.com>

Agreed.
-=R


On Tue, Feb 12, 2013 at 3:41 PM, James Cloos <cloos@jhcloos.com> wrote:

> >>>>> "RP" == Roberto Peon <grmocg@gmail.com> writes:
>
> RP> The header names are almost completely handled with the pre-seeded
> RP> dictionary, so they really don't affect the character frequency
> RP> count, and thus the Huffman encoding.
>
> RP> Arithmetic coding gets better compression ratios, at the expense of
> RP> gobs of CPU and complexity. I don't think that is a good tradeoff :/
>
> It is sometimes hard to guess whether Huffman is chosen due to inertia,
> arithmetic-coding patent angst, or good technical reasons.  It is good to
> know that in this case it is the last.
>
> I may not have expressed my primary point quite well enough though:
>
> Although I doubt that right now there is any text in the headers which
> is both common enough to warrant inclusion in a static table and not
> seven-bit clean, my point was that even if such text shows up over time,
> the fact that it is not seven-bit clean should not prevent its inclusion
> in future, extended versions of the static table.  As such, specifying
> that text is defined to be UTF-8 and using a static Huffman table should
> not contraindicate each other.
>
> -JimC
> --
> James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6
>
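
For illustration only, here is a rough sketch of why the two need not
conflict: a static Huffman table built over octets rather than characters
can carry UTF-8 text unchanged, with non-ASCII bytes simply receiving
longer codes. Nothing here is taken from the draft. The training sample,
the add-one weighting, and the function names (build_code, encode,
decode) are all assumptions made for the example, and bit strings are
kept as Python str for readability rather than packed into bytes.

```python
import heapq
from collections import Counter

# Hypothetical training sample; a real static table would come from
# measured header traffic.
SAMPLE = (b"accept-encoding: gzip, deflate\r\n"
          b"content-type: text/html; charset=utf-8\r\n") * 4


def build_code(sample: bytes) -> dict:
    """Build a Huffman code over all 256 octet values."""
    counts = Counter(sample)
    # Every octet gets a weight of at least 1 so any byte stays encodable.
    heap = [(counts.get(b, 0) + 1, b, (b,)) for b in range(256)]
    heapq.heapify(heap)
    codes = {b: "" for b in range(256)}
    tie = 256  # unique tiebreaker so the symbol tuples are never compared
    while len(heap) > 1:
        w0, _, syms0 = heapq.heappop(heap)
        w1, _, syms1 = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every symbol below them.
        for s in syms0:
            codes[s] = "0" + codes[s]
        for s in syms1:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (w0 + w1, tie, syms0 + syms1))
        tie += 1
    return codes


def encode(text: str, codes: dict) -> str:
    """Encode text as a bit string, one Huffman code per UTF-8 octet."""
    return "".join(codes[b] for b in text.encode("utf-8"))


def decode(bits: str, codes: dict) -> str:
    """Invert encode(); relies on the code being prefix-free."""
    inverse = {code: b for b, code in codes.items()}
    out = bytearray()
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out.decode("utf-8")


CODES = build_code(SAMPLE)
value = "text/html; title=caf\u00e9"
assert decode(encode(value, CODES), CODES) == value
```

Under these assumptions the non-seven-bit octets of the accented character
round-trip like any other byte; they just cost more bits than octets that
are common in the sample.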