Re: draft-ietf-httpbis-header-structure-00, unicode range

"Poul-Henning Kamp" <phk@phk.freebsd.dk> Tue, 13 December 2016 21:32 UTC

To: Kari Hurtta <hurtta-ietf@elmme-mailer.org>
cc: HTTP working group mailing list <ietf-http-wg@w3.org>, Poul-Henning Kamp <phk@varnish-cache.org>
In-reply-to: <20161213173327.C1F7D1714B@welho-filter2.welho.com>
From: Poul-Henning Kamp <phk@phk.freebsd.dk>
References: <20161213173327.C1F7D1714B@welho-filter2.welho.com>
Date: Tue, 13 Dec 2016 21:28:47 +0000
Message-ID: <25384.1481664527@critter.freebsd.dk>
Archived-At: <http://www.w3.org/mid/25384.1481664527@critter.freebsd.dk>

--------
In message <20161213173327.C1F7D1714B@welho-filter2.welho.com>,
Kari Hurtta writes:

>2.  Definition of HTTP Header Common Structure
>https://tools.ietf.org/html/draft-ietf-httpbis-header-structure-00#section-2
>
>|     unicode_string = * unicode_codepoint
>|             # XXX: Is there a place to import this from ?
>|             # Unrestricted unicode, because there is no sane
>|             # way to restrict or otherwise make unicode "safe".
>
>What is range of unicode_codepoint ?

As far as I know, Unicode does not have a firm upper end, but
everybody _expects_ 32 bits to be enough for everybody.

Since section two is the abstract data model, that's the best we can
do there.

>3.  HTTP/1 Serialization of HTTP Header Common Structure
>https://tools.ietf.org/html/draft-ietf-httpbis-header-structure-00#section-3
>[...]
>Or is unicode values > 0xFFFF
>encoded with surrogates  (values 0xd8000 - 0xdffff) ?
>( UCS-2 or UTF-16 is used )

That was the plan.

Not a particularly good plan, as evidenced by the fact that I forgot
to write it down, and that JSON has seen interop issues with parsers
missing that detail.

I will add text about it.
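
For reference, here is a minimal sketch of the surrogate arithmetic in
question. It is not text from the draft. It assumes UTF-16 style pairs,
where the surrogate range is 0xD800 to 0xDFFF and the Unicode codespace
as currently defined ends at 0x10FFFF. The function names are
hypothetical, chosen only for illustration.

```python
from typing import Tuple

# Sketch only: UTF-16 surrogate-pair arithmetic for code points above 0xFFFF.
# Function names are hypothetical and do not come from the draft.


def to_surrogates(cp: int) -> Tuple[int, int]:
    """Split a code point in 0x10000..0x10FFFF into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    v = cp - 0x10000  # 20-bit value: top 10 bits go high, bottom 10 go low
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def from_surrogates(high: int, low: int) -> int:
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

A parser that treats each 16-bit unit as a complete code point, without
the recombination step in from_surrogates(), is the kind of parser that
misses this detail.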

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.