Re: Delta Compression and UTF-8 Header Values
Willy Tarreau <w@1wt.eu> Sat, 09 February 2013 13:35 UTC
Date: Sat, 09 Feb 2013 14:33:41 +0100
From: Willy Tarreau <w@1wt.eu>
To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Cc: Mark Nottingham <mnot@mnot.net>, James M Snell <jasnell@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <20130209133341.GA8712@1wt.eu>
References: <CABP7RbfRLXPpL4=wip=FvqD3DM7BM8PXi7uRswHAusXUmPO_xw@mail.gmail.com> <CE65E38D-A482-4EA9-BAF4-F6498F643A78@mnot.net> <511642E9.9010607@it.aoyama.ac.jp>
Subject: Re: Delta Compression and UTF-8 Header Values
Archived-At: <http://www.w3.org/mid/20130209133341.GA8712@1wt.eu>

On Sat, Feb 09, 2013 at 09:36:57PM +0900, "Martin J. Dürst" wrote:
> On 2013/02/09 8:53, Mark Nottingham wrote:
> >My .02 -
> >
> >RFC2616 implies that the range of characters available in headers is
> >ISO-8859-1
>
> That's a leftover from the *very* early 1990s, when ISO-8859-1 was
> actually a step forward from the various 'national' ISO-646 7-bit
> encodings. It was not a bad idea at that time by TimBL to make the Web
> work throughout Western Europe. UTF-8 wasn't even invented then.
> (see http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt)
>
> The IETF understood the advantages of UTF-8 in the late 1990s, see
> http://tools.ietf.org/html/rfc2277#section-3.1
>
> These days, UTF-8 isn't a step forward, it's just plain obvious. The
> browser folks at WHATWG would prefer not to have any Web pages in
> anything else than UTF-8 anymore. That will take quite some time yet,
> but the trend is very clear. See e.g.
> http://googleblog.blogspot.jp/2010/01/unicode-nearing-50-of-web.html and
> http://w3techs.com/technologies/details/en-utf8/all/all. Websockets was
> designed with UTF-8 and binary built in from the start. For all kinds of
> other protocols, UTF-8 is a non-brainer, too.
>
> It would be a good idea to try hard to make the new protocol forward
> looking (or actually just acknowledge the present, rather than stay
> frozen in the past) for the next 20 years or so in terms of character
> encoding, too, and not only in terms of CPU/network performance.

Well, don't confuse UTF-8 and Unicode. UTF-8 is just a space-efficient
way of transporting Unicode characters for western countries. The encoding
can become inefficient to transport for other charsets by inflating data
by up to 50% and may make compression less efficient. Also, processing it
is particularly inefficient as you have to parse each and every byte to
find a length, making string comparisons quite slow.
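
To make the size and length points above concrete, here is a minimal Python sketch. It assumes the "up to 50%" figure compares UTF-8 against a two-byte-per-character encoding such as UTF-16; the message does not name the baseline, so that is a guess. The function names `encoded_sizes` and `utf8_char_count` are hypothetical, chosen only for this illustration.

```python
def encoded_sizes(text: str) -> dict:
    """Return the character count and the encoded byte sizes of text."""
    return {
        "chars": len(text),
        "utf8": len(text.encode("utf-8")),
        # utf-16-le is used so that no byte-order mark is counted.
        "utf16": len(text.encode("utf-16-le")),
    }


def utf8_char_count(data: bytes) -> int:
    """Count code points in UTF-8 data by inspecting every byte.

    Continuation bytes have the bit pattern 10xxxxxx; every other byte
    starts a new code point, so the whole buffer has to be scanned.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


if __name__ == "__main__":
    # Assumed sample values: an ASCII header name, a Latin-1 name, CJK text.
    for label, sample in [("ascii", "Content-Type"),
                          ("latin", "Dürst"),
                          ("cjk", "日本語")]:
        print(label, encoded_sizes(sample),
              utf8_char_count(sample.encode("utf-8")))
```

Under that assumption, the CJK sample takes 9 bytes in UTF-8 against 6 in UTF-16, which is the 50% inflation, while the ASCII sample takes 12 bytes against 24. The character count requires a pass over every byte, whereas the byte length is known up front.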

I'm not saying I'm totally against UTF-8 in HTTP/2 (even though I hate
using it), I'm saying that it's not *THE* solution to every problem. It's
just *A* solution to *A* problem: "how to extend character sets in
existing documents without having to re-encode them all". I don't think
this specific problem is related to the scope of the HTTP/2 work, so at
first glance, I'd say that UTF-8 doesn't seem to solve a known problem
here.

Regards,
Willy