Re: [I18ndir] Working Group Last Call: Structured Headers for HTTP

"Patrik Fältström " <patrik@frobbit.se> Tue, 04 February 2020 22:35 UTC

From: Patrik Fältström <patrik@frobbit.se>
To: John C Klensin <john-ietf@jck.com>
Cc: John R Levine <johnl@taugh.com>, i18ndir@ietf.org
Date: Tue, 04 Feb 2020 23:35:11 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/nm8RrvfbAnOqe3s-onUHbN60fVI>

On 4 Feb 2020, at 20:24, John C Klensin wrote:

>> I think we agree, normalization is a level up from what
>> they're describing here.
>
> Ack.

Ok, I have tried to really understand what they are doing here, and it feels as if Unicode was pasted in after they had designed the whole thing.

Look for example at B.1, where they compare with JSON. They say one advantage of their format is that JSON allows Unicode data, which causes interoperability issues, but they allow it themselves as well.

Then this is 3.3.3:

> Unicode is not directly supported in strings, because it causes a
> number of interoperability issues, and - with few exceptions - header
> values do not require it.

How do they know header values do not need it -- with a few exceptions?

> When it is necessary for a field value to convey non-ASCII content, a
> byte sequence (Section 3.3.5) SHOULD be specified, along with a
> character encoding (preferably [UTF-8]).

I think the encoding MUST either default to UTF-8 or be specified explicitly.
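
To make the mechanism concrete, here is a minimal sketch of what "non-ASCII content as a byte sequence" amounts to if UTF-8 is the agreed encoding. The colon delimiters around the base64 text are an assumption taken from the byte sequence syntax as later published; earlier drafts used a different delimiter. The function names are hypothetical and this is not a full Structured Headers parser.

```python
import base64


def serialize_utf8_as_byte_sequence(text: str) -> str:
    """Encode text as UTF-8 and wrap the base64 form in colons.

    Assumption: ':' is the byte sequence delimiter.
    """
    raw = text.encode("utf-8")
    return ":" + base64.b64encode(raw).decode("ascii") + ":"


def parse_byte_sequence_as_utf8(item: str) -> str:
    """Reverse of the above: strip delimiters, decode base64, decode UTF-8.

    The generic parser only yields bytes; interpreting them as UTF-8 is a
    decision the individual header definition has to make.
    """
    if len(item) < 2 or not (item.startswith(":") and item.endswith(":")):
        raise ValueError("not a byte sequence")
    raw = base64.b64decode(item[1:-1], validate=True)
    return raw.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8


wire = serialize_utf8_as_byte_sequence("Fältström")
name = parse_byte_sequence_as_utf8(wire)
```

The wire form is pure ASCII, but nothing in it tells the receiver that the bytes are UTF-8, which is why the encoding has to be a default or be stated explicitly.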

I further think it should be noted that comparison of parameter values is NOT specified in this base specification, as normalization might create non-interoperability. If comparison is needed, the specification of the header must say how it is done.

I also think 4.2.7 should mention, with a reference to 3.3.3, that the byte sequence resulting from parsing a binary structure might be a UTF-8 encoded string.

I.e. I find it hard to read (and I understand why I misunderstood this at first): the only mention of Unicode and UTF-8 is under "string", but all that text does is reference byte sequences, which in turn never talk about it, neither at serialization nor at deserialization. So in the place where UTF-8 strings are actually carried, they are not mentioned.
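
A small illustration of why comparison cannot be left implicit, using only Python's standard `unicodedata` module. The helper names are hypothetical; the point is just that the same visible text can arrive as different byte sequences depending on normalization form.

```python
import unicodedata

# The same visible name in precomposed (NFC) and decomposed (NFD) form.
nfc = unicodedata.normalize("NFC", "Fältström")
nfd = unicodedata.normalize("NFD", "Fältström")


def bytes_equal(a: str, b: str) -> bool:
    """Octet-by-octet comparison of the UTF-8 encodings."""
    return a.encode("utf-8") == b.encode("utf-8")


def equal_after_nfc(a: str, b: str) -> bool:
    """Comparison after normalizing both sides to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


byte_result = bytes_equal(nfc, nfd)            # False
normalized_result = equal_after_nfc(nfc, nfd)  # True
```

Two implementations that each pick one of these rules will disagree on whether the values match, so a header that needs comparison has to state which rule applies.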

But ok...you are right.

   Patrik