Re: Unicode escape sequence | Re: draft-ietf-httpbis-header-structure-00, unicode range

Julian Reschke <julian.reschke@gmx.de> Wed, 14 December 2016 11:56 UTC

To: Martin Thomson <martin.thomson@gmail.com>, Poul-Henning Kamp <phk@phk.freebsd.dk>
References: <20161213173327.C1F7D1714B@welho-filter2.welho.com> <20161213175419.GA7943@LK-Perkele-V2.elisa-laajakaista.fi> <25434.1481665395@critter.freebsd.dk> <201612140628.uBE6SO3L025885@shell.siilo.fmi.fi> <36792.1481701328@critter.freebsd.dk> <CACweHNDKgWQewZHb=Kz3_2=41M58sY5472Q5OwpqPLxorvkzHQ@mail.gmail.com> <37223.1481707288@critter.freebsd.dk> <3a65ca44-f652-3b14-6d64-46f35b32df57@isode.com> <55880.1481711031@critter.freebsd.dk> <95057a05-6714-9154-8cf8-7cd302c86715@gmx.de> <60914.1481712680@critter.freebsd.dk> <CABkgnnWzOhkznH2HzweNegYo4dDHE+DT0PM=eCSvVr+-Wkup1A@mail.gmail.com>
Cc: Alexey Melnikov <alexey.melnikov@isode.com>, Matthew Kerwin <matthew@kerwin.net.au>, Kari Hurtta <hurtta-ietf@elmme-mailer.org>, Ilari Liusvaara <ilariliusvaara@welho.com>, HTTP working group mailing list <ietf-http-wg@w3.org>, Poul-Henning Kamp <phk@varnish-cache.org>
From: Julian Reschke <julian.reschke@gmx.de>
Message-ID: <36d4efd6-0287-8770-b40a-89729a7c53f6@gmx.de>
Date: Wed, 14 Dec 2016 12:53:27 +0100
In-Reply-To: <CABkgnnWzOhkznH2HzweNegYo4dDHE+DT0PM=eCSvVr+-Wkup1A@mail.gmail.com>
Subject: Re: Unicode escape sequence | Re: draft-ietf-httpbis-header-structure-00, unicode range
Archived-At: <http://www.w3.org/mid/36d4efd6-0287-8770-b40a-89729a7c53f6@gmx.de>

On 2016-12-14 12:37, Martin Thomson wrote:
> On 14 December 2016 at 21:51, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
>> Well, UTF-8 would also go through HPACK, but by eye-ball it seems
>> that it would be more efficient.
>
> If you have lots of ASCII still, you can probably Huffman encode,
> though if you have lots of non-ASCII, you need to watch out: a three
> octet UTF-8 encoded codepoint turns into (worst case) 82 bits.  Best
> case is 58 bits (both of which are invalid, so maybe not).
>
> I can't remember, is there actually a good reason why we can't just
> start shoving UTF-8 in header fields?  I mean, h2 is probably OK with
> this.

Some APIs assume ISO-8859-1 when decoding header field values, so UTF-8
octets may be silently misinterpreted (of course that's independent of
the actual transport).
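
To illustrate the ISO-8859-1 concern, here is a minimal Python sketch of what could happen when a header value sent as UTF-8 passes through an API that decodes octets as ISO-8859-1. The function name `decode_header_value` and the sample value are hypothetical and chosen only for illustration; they do not describe any particular library.

```python
def decode_header_value(raw: bytes, charset: str = "iso-8859-1") -> str:
    """Decode raw header octets the way a hypothetical API might.

    ISO-8859-1 maps every octet 0x00-0xFF to a code point, so this
    never raises; wrong input is silently misread instead.
    """
    return raw.decode(charset)


# Sender puts UTF-8 on the wire (assumed sample value).
original = "caf\u00e9"              # "café"
wire = original.encode("utf-8")     # b'caf\xc3\xa9'

# Receiver-side API assumes ISO-8859-1.
seen_by_api = decode_header_value(wire)

# The two UTF-8 octets for U+00E9 show up as two separate characters.
print(repr(seen_by_api))

# The damage is reversible only if the caller knows what happened.
recovered = seen_by_api.encode("iso-8859-1").decode("utf-8")
print(recovered == original)
```

Because the ISO-8859-1 decode cannot fail, nothing signals the mismatch; the application simply sees a longer, different string.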

Best regards, Julian