Delta Compression and UTF-8 Header Values

James M Snell <jasnell@gmail.com> Fri, 08 February 2013 19:30 UTC

To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Archived-At: <http://www.w3.org/mid/CABP7RbfRLXPpL4=wip=FvqD3DM7BM8PXi7uRswHAusXUmPO_xw@mail.gmail.com>

Just going through more implementation details of the proposed delta
encoding... one of the items that had come up previously in early
http/2 discussions was the possibility of allowing for UTF-8 header
values. Doing so would allow us to move away from things like
punycode, pct-encoding, Q and B-Codecs, RFC 5987 mechanisms, etc., but
it would bring along a range of other issues we would need to deal with.
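
For a concrete sense of the mechanisms listed above, here is a minimal
Python sketch that encodes one value each way using only the standard
library. The header value and host name are hypothetical examples, not
taken from any proposal.

```python
from email.header import Header, decode_header
from urllib.parse import quote

value = "naïve file.txt"   # hypothetical header value (assumption)
host = "bücher.example"    # hypothetical host name (assumption)

# pct-encoding: UTF-8 bytes outside the unreserved set become %XX.
pct_encoded = quote(value, safe="")

# RFC 5987 extended value: charset, empty language tag, pct-encoded text.
rfc5987 = "UTF-8''" + quote(value, safe="")

# Punycode via Python's built-in "idna" codec (which implements IDNA 2003).
punycode_host = host.encode("idna").decode("ascii")

# RFC 2047 encoded-word (the Q and B codecs); the library picks Q or B.
rfc2047 = Header(value, "utf-8").encode()

# What a raw UTF-8 header value would carry instead.
raw_utf8 = value.encode("utf-8")

for label, encoded in [
    ("pct-encoding", pct_encoded),
    ("RFC 5987", rfc5987),
    ("punycode", punycode_host),
    ("RFC 2047", rfc2047),
    ("raw UTF-8", raw_utf8),
]:
    print(f"{label:12} {encoded!r}")
```

Every ASCII-safe form is longer than the raw UTF-8 bytes and needs its
own decoder on the receiving side.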

One key challenge with allowing UTF-8 values, however, is that it
conflicts with the use of the static Huffman encoding in the proposed
Delta Encoding for header compression. If we allow for non-ASCII
characters, the static Huffman coding simply becomes too inefficient
and unmanageable to be useful. There are a few ways around it, but none
of the strategies are all that attractive.
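
To illustrate the inefficiency, here is a rough Python sketch, not the
table from the proposed Delta Encoding. It builds a static byte-level
Huffman code from a small, made-up sample of ASCII header text, gives
every other byte value a minimal weight so it stays encodable, and
compares the encoded size of an ASCII value with a UTF-8 one. The
sample text, the weighting, and all names are assumptions chosen for
illustration.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    tiebreak = count()  # unique counter so the heap never compares lists
    heap = [(f, next(tiebreak), [sym]) for sym, f in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (f1 + f2, next(tiebreak), syms1 + syms2))
    return lengths


# Hypothetical training sample of typical ASCII header values (assumption).
SAMPLE = b"".join([
    b"text/html; charset=utf-8",
    b"gzip, deflate",
    b"Mozilla/5.0 (X11; Linux x86_64)",
    b"max-age=3600, public",
    b"application/json",
    b"keep-alive",
    b"en-US,en;q=0.8",
    b"text/html,application/xhtml+xml",
])

# Static table: all 256 byte values get weight 1, and bytes seen in the
# sample get a large bonus so that they receive the short codes.
freqs = {b: 1 for b in range(256)}
for byte, n in Counter(SAMPLE).items():
    freqs[byte] += 1000 * n

LENGTHS = huffman_code_lengths(freqs)


def encoded_bits(value):
    """Bits needed to Huffman-code the UTF-8 bytes of value."""
    return sum(LENGTHS[b] for b in value.encode("utf-8"))


for value in ("text/html", "日本語"):
    raw_bits = 8 * len(value.encode("utf-8"))
    print(f"{value!r}: raw {raw_bits} bits, coded {encoded_bits(value)} bits")
```

With a table like this, the ASCII value shrinks while the UTF-8 value
grows beyond its raw size, because every byte of a multi-byte sequence
falls in the rarely used part of the table.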

So the question is: do we want to allow UTF-8 header values? Is it
worth the trade-off in less-efficient header compression? Or put
another way, is increased compression efficiency worth ruling out
UTF-8 header values?

(Obviously there are other issues with UTF-8 values we'd need to
consider, such as http/1 interop.)

- James