Re: [I18ndir] Working Group Last Call: Structured Headers for HTTP

"Patrik Fältström " <patrik@frobbit.se> Wed, 05 February 2020 13:55 UTC

From: Patrik Fältström <patrik@frobbit.se>
To: John C Klensin <john-ietf@jck.com>
Cc: John R Levine <johnl@taugh.com>, i18ndir@ietf.org, art-ads@ietf.org, "Murray S. Kucherawy" <superuser@gmail.com>
Date: Wed, 05 Feb 2020 14:55:50 +0100
X-Mailer: MailMate (1.13.1r5676)
Message-ID: <1C6CA47A-58ED-435B-9AA9-2B022185AA13@frobbit.se>
In-Reply-To: <74CA67892D7FCDB725238B1B@PSB>
References: <20200203173404.88EE813AA055@ary.qy> <E2361F8BA970A15043416C2D@PSB> <alpine.OSX.2.21.99999.374.2002031653540.31381@ary.qy> <D03AE38116EF15538E10CFAF@PSB> <7D31FE0A-D4EC-4096-83FE-97D2BF4908F5@frobbit.se> <alpine.OSX.2.21.99999.374.2002041007110.33467@ary.qy> <47AEE7D582019051ACF36647@PSB> <alpine.OSX.2.21.99999.374.2002041149130.34062@ary.qy> <4A65258034E64E1A97EFDF7A@PSB> <E3DA1665-DB13-46D2-9212-33E647D92716@frobbit.se> <3E143C646E27AEB48F08B065@PSB> <572D8717-3545-4F37-8EC9-194D6A5A0E3A@frobbit.se> <74CA67892D7FCDB725238B1B@PSB>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=_MailMate_04337115-CD88-47DE-810B-00F14D1E6AF2_="; micalg="pgp-sha1"; protocol="application/pgp-signature"
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/UMrozkPMsDzWh0AeYxgAH8N60_0>
Subject: Re: [I18ndir] Working Group Last Call: Structured Headers for HTTP

On 5 Feb 2020, at 13:50, John C Klensin wrote:

> Is that about right?

Yes

> If it is, I think it is time for two directorate
> recommendations.  One is that this document, with respect to i18n issues, is not ready for IETF LC.  It may even be that the HTTP WG needs to clean up its act and, for i18n issues, may be in need of adult supervision because they are contradicting
> existing standards track documents within their own scope,
> apparently without noticing.

Yes

> The second is that it may be time to update the recommendations about character sets, Internationalization Considerations, etc.
> I don't know whether the revised recommendation should find
> expression in an IESG statement, a new BCP, or some other form of guidance or direction but it is now clear to me that, without it, the IETF is just going to dig itself in deeper and deeper...
> even more so if there is not a functioning directorate or
> equivalent than if there is.

This is something the IESG has to give advice on.

> Those recommendations would be, approximately:
>
> (i) For any document that discusses or allows the use of
> non-ASCII characters, UTF-8 is mandatory to implement.  Other encodings of Unicode or use of other coded character sets may be justified by circumstances but, if they are allowed, the CCS and any additional encoding information required with it must be identified, e.g., by the mechanisms defined in RFCs 2231 and 8187.

Yes
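
To make the "identified, e.g., by the mechanisms defined in RFCs 2231 and 8187" part of (i) concrete, here is a minimal sketch of the RFC 8187 extended-value form, in which the charset (and optionally a language tag) travels in front of the percent-encoded octets. The helper names encode_ext_value and decode_ext_value are made up for illustration and belong to no library, and the sketch percent-encodes more characters than RFC 8187 strictly requires, which is allowed.

```python
from urllib.parse import quote, unquote


def encode_ext_value(text, charset="UTF-8", language=""):
    """Build charset'language'pct-encoded-octets (hypothetical helper).

    safe="" makes quote() escape everything except ASCII letters, digits
    and "_.-~", so an apostrophe in the text can never be mistaken for
    one of the two delimiters.
    """
    return f"{charset}'{language}'{quote(text, safe='', encoding=charset)}"


def decode_ext_value(ext_value):
    """Split on the first two apostrophes and decode with the named charset."""
    charset, language, encoded = ext_value.split("'", 2)
    return unquote(encoded, encoding=charset), charset, language


euro = encode_ext_value("\u20ac rates")            # Euro sign, no language tag
pound = encode_ext_value("\u00a3 rates", language="en")
round_trip = decode_ext_value(euro)
```

The point is only that the receiver is told which coded character set to apply instead of having to guess.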

> (ii) For any document that allows or specifies the use of
> non-ASCII characters in a protocol context (even in what is
> nominally free text), neither "use UTF-8" nor "Use UTF-8 and specify a normalization" are sufficient.  The presence of such statements without further explanation is usually an indication that an author or WG has paid insufficient attention to i18n issues but is, instead and intentionally or not, trying to blow them off.  In almost all cases that will be visible to users, character strings will eventually be compared to others, sorted, searched and/or rendered.  For one or more languages or scripts, each of those operations requires treatment that goes beyond simply examining octets.   Documents allowing such strings must either specify how those issues are addressed or must explain why they are not applicable.  For characters or strings than are protocol elements not visible to users, specific justification is required if non-ASCII characters are to be allowed.

There is one sentence that I think could be removed. I also think there is a misspelling in the last sentence, where "than" should be "that". With both changes, the recommendation becomes:

(ii) For any document that allows or specifies the use of non-ASCII characters in a protocol context (even in what is nominally free text), neither "use UTF-8" nor "Use UTF-8 and specify a normalization" are sufficient.  In almost all cases that will be visible to users, character strings will eventually be compared to others, sorted, searched and/or rendered.  For one or more languages or scripts, each of those operations requires treatment that goes beyond simply examining octets.   Documents allowing such strings must either specify how those issues are addressed or must explain why they are not applicable.  For characters or strings that are protocol elements not visible to users, specific justification is required if non-ASCII characters are to be allowed.
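
As a small illustration of why comparison "goes beyond simply examining octets", here is a sketch using only the Python standard library. The sample strings are my own choice, and a real protocol would still have to say which normalization and case rules apply, or explain why none are needed.

```python
import unicodedata

# The same name twice: precomposed letters vs. base letter + combining diaeresis.
composed = "F\u00e4ltstr\u00f6m"
decomposed = unicodedata.normalize("NFD", composed)

# A reader sees identical text, but the UTF-8 octets differ.
same_octets = composed.encode("utf-8") == decomposed.encode("utf-8")

# Normalizing both sides to one form (NFC here) makes them compare equal.
same_after_nfc = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

# Normalization alone does not cover case: case folding maps the German
# sharp s to "ss", so these two spellings match only after casefold().
case_match = "STRASSE".casefold() == "stra\u00dfe".casefold()
```

Sorting and searching raise further language-specific questions that this sketch does not touch.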

   Patrik