Re: Delta Compression and UTF-8 Header Values

"Poul-Henning Kamp" <phk@phk.freebsd.dk> Sat, 09 February 2013 14:06 UTC

To: Willy Tarreau <w@1wt.eu>
cc: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, Mark Nottingham <mnot@mnot.net>, James M Snell <jasnell@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
In-reply-to: <20130209133341.GA8712@1wt.eu>
From: Poul-Henning Kamp <phk@phk.freebsd.dk>
References: <CABP7RbfRLXPpL4=wip=FvqD3DM7BM8PXi7uRswHAusXUmPO_xw@mail.gmail.com> <CE65E38D-A482-4EA9-BAF4-F6498F643A78@mnot.net> <511642E9.9010607@it.aoyama.ac.jp> <20130209133341.GA8712@1wt.eu>
Date: Sat, 09 Feb 2013 14:04:30 +0000
Message-ID: <62760.1360418670@critter.freebsd.dk>
Subject: Re: Delta Compression and UTF-8 Header Values
Archived-At: <http://www.w3.org/mid/62760.1360418670@critter.freebsd.dk>

Content-Type: text/plain; charset=ISO-8859-1
--------
In message <20130209133341.GA8712@1wt.eu>, Willy Tarreau writes:

>I'm not saying I'm totally against UTF-8 in HTTP/2 [...]

What and where do you mean when you say "UTF-8" in HTTP/2 ?

I think we need to be more precise, to avoid misunderstandings.

In HTTP/1, there is a peculiar mix of protocol mechanics and
metadata: if I add a custom bit of metadata, it must follow certain
rules, since otherwise it will break the protocol mechanics.

For instance, I cannot define a custom header called:

	 "FOO" CRNL CRNL ": " [8 zero bytes]

If we define HTTP/2 as a "binary" protocol in some sensible way,
this restriction could go away, and we'd just move something like:

	<HDR nlen=7,blen=8> "FOO" CRNL CRNL \0\0\0\0\0\0\0\0

down the wire, and not care about what it is, what it means or
what character set, if any, it is encoded in.
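
To make the contrast concrete, here is a minimal sketch in Python.
It is not a proposed wire format: the 2-byte network-order length
fields and all function names are assumptions made for illustration;
the only things taken from the example above are the name (7 bytes)
and the body (8 zero bytes). The text-based parser loses the header,
while the length-prefixed one carries it through untouched.

```python
import struct

CRLF = b"\r\n"

def http1_serialize(headers):
    """Naive HTTP/1-style text form: 'name: value' lines, blank line ends the block."""
    return b"".join(n + b": " + v + CRLF for n, v in headers) + CRLF

def http1_parse(data):
    """Naive line-based parse that stops at the first empty line."""
    headers = []
    for line in data.split(CRLF):
        if not line:
            break
        name, _, value = line.partition(b": ")
        headers.append((name, value))
    return headers

def binary_serialize(headers):
    """Hypothetical framing: 2-byte name length, 2-byte body length, then raw bytes."""
    out = bytearray()
    for n, v in headers:
        out += struct.pack("!HH", len(n), len(v)) + n + v
    return bytes(out)

def binary_parse(data):
    """Read (name, body) pairs back using only the length fields."""
    headers = []
    pos = 0
    while pos < len(data):
        nlen, blen = struct.unpack_from("!HH", data, pos)
        pos += 4
        name = data[pos:pos + nlen]
        pos += nlen
        body = data[pos:pos + blen]
        pos += blen
        headers.append((name, body))
    return headers

# The header from the example: name is "FOO" plus two line endings, body is 8 zero bytes.
odd = [(b"FOO\r\n\r\n", b"\x00" * 8)]

text_result = http1_parse(http1_serialize(odd))      # the embedded blank line ends the block early
binary_result = binary_parse(binary_serialize(odd))  # bytes are never interpreted, only counted
print(text_result == odd, binary_result == odd)
```

The binary parser never looks inside the name or the body, which is
why the question of character set does not arise for it at all.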

We only need to decide about UTF-8 for the metadata that needs
inspection along the way, and that really isn't much.

Host:
	Why would we care about the character set ?  We're
	just going to pass it to DNS anyway.

URI:
	At least the query strings, possibly all of it ?
	But do we really care ?  Provided we take the Host
	part out, as proposed, we treat this as a unit.

Cache-Control:
	And what good would UTF-8 do here in the first place ?
	

So where is it you want UTF-8, and what difference will it make ?

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.