Re: draft-ietf-httpbis-header-structure-00, unicode range

"Poul-Henning Kamp" <phk@phk.freebsd.dk> Tue, 13 December 2016 21:46 UTC

To: Ilari Liusvaara <ilariliusvaara@welho.com>
cc: Kari Hurtta <hurtta-ietf@elmme-mailer.org>, HTTP working group mailing list <ietf-http-wg@w3.org>, Poul-Henning Kamp <phk@varnish-cache.org>
In-reply-to: <20161213175419.GA7943@LK-Perkele-V2.elisa-laajakaista.fi>
From: Poul-Henning Kamp <phk@phk.freebsd.dk>
References: <20161213173327.C1F7D1714B@welho-filter2.welho.com> <20161213175419.GA7943@LK-Perkele-V2.elisa-laajakaista.fi>
Date: Tue, 13 Dec 2016 21:43:15 +0000
Message-ID: <25434.1481665395@critter.freebsd.dk>
Subject: Re: draft-ietf-httpbis-header-structure-00, unicode range
Archived-At: <http://www.w3.org/mid/25434.1481665395@critter.freebsd.dk>

--------
In message <20161213175419.GA7943@LK-Perkele-V2.elisa-laajakaista.fi>, Ilari Liusvaara writes:

>> 3.  HTTP/1 Serialization of HTTP Header Common Structure
>> https://tools.ietf.org/html/draft-ietf-httpbis-header-structure-00#section-3


>Well, that production lists UTF8-4, which is presumably 4-byte UTF-8
>sequences, and all valid ones are astral plane codepoints.

My impression was that UTF8 and 8-bit clean HTTP/1 got shot down
in previous discussions, but I left UTF8 in here for now, pending a
more structured decision on this.

I see us having four options, in my order of preference:

1) Forbid Unicode in headers.

2) Take UTF8 out and leave all (non-ASCII) Unicode to the \uxxxx
   escape mechanism.

3) Leave UTF8 in, and make it clear that it may or may not work, so
   that people can use it in controlled environments.

4) Leave UTF8 in, and specify how to indicate/negotiate if it can be used.
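
To make the trade-off behind option 2 concrete, here is a minimal Python
sketch. It assumes the \uxxxx escape carries exactly four hex digits; the
function names are hypothetical and nothing here is taken from the draft.
It shows that 4-byte UTF-8 sequences (UTF8-4) are exactly the code points
above U+FFFF, which a fixed four-digit escape cannot express directly.

```python
def utf8_len(cp: int) -> int:
    """Number of bytes UTF-8 needs for one code point."""
    return len(chr(cp).encode("utf-8"))


def escape_non_ascii(text: str) -> str:
    """Hypothetical option-2 style escaping (an assumption, not the draft's
    rule): ASCII passes through, other BMP code points become \\uXXXX.
    A literal backslash in the input is not handled here."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                  # plain ASCII, sent as-is
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)      # fits in four hex digits
        else:
            # Astral plane: needs more than four hex digits.
            raise ValueError("U+%X does not fit in \\uxxxx" % cp)
    return "".join(out)
```

With this sketch, escape_non_ascii("caf\u00e9") gives the ASCII string
caf\u00e9, while any code point at or above U+10000 is rejected, so the
astral planes need some additional rule.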

>astral planes (and I hope the escape system there would be more sane
>than the one JSON has...)

Any suggestions?
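
For reference, the JSON behaviour mentioned in the quoted text can be
sketched in Python as below. JSON escapes an astral code point as a UTF-16
surrogate pair, i.e. two \uxxxx escapes. The braced_escape form is purely
a hypothetical illustration of a variable-length alternative; it has not
been proposed in this thread and is not in the draft.

```python
import json


def surrogate_pair(cp: int) -> tuple[int, int]:
    """Split an astral code point into its UTF-16 high/low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not an astral code point: U+%X" % cp)
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def json_style_escape(cp: int) -> str:
    """Escape one code point the way JSON does (lowercase hex)."""
    if cp <= 0xFFFF:
        return "\\u%04x" % cp
    hi, lo = surrogate_pair(cp)
    return "\\u%04x\\u%04x" % (hi, lo)


def braced_escape(cp: int) -> str:
    """Hypothetical variable-length escape; an assumption for illustration."""
    return "\\u{%x}" % cp


# Python's json module escapes non-ASCII by default (ensure_ascii=True),
# so it produces the same surrogate pair for U+1F600.
assert json.dumps("\U0001F600") == '"' + json_style_escape(0x1F600) + '"'
```

The surrogate-pair form forces every parser to recombine two escapes and
to decide what to do with unpaired halves; a single escape per code point
avoids that at the cost of a variable-length syntax.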

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.