Re: [I18ndir] I18ndir last call review of draft-ietf-regext-dnrd-objects-mapping-06

John C Klensin <john-ietf@jck.com> Fri, 06 March 2020 05:46 UTC

Date: Fri, 06 Mar 2020 00:46:48 -0500
From: John C Klensin <john-ietf@jck.com>
To: Asmus Freytag <asmusf@ix.netcom.com>, i18ndir@ietf.org
Message-ID: <78B490AE833098E23541E672@PSB>
In-Reply-To: <2cb9e78f-32dc-3e2f-ba1a-6ae0218f3ef9@ix.netcom.com>
References: <158343520135.15044.10991712449156105132@ietfa.amsl.com> <9CD56DEFBC9108D9620ED61E@PSB> <2cb9e78f-32dc-3e2f-ba1a-6ae0218f3ef9@ix.netcom.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/n70q4o8SP-bEq_5LY1w7mjtCgXI>
Subject: Re: [I18ndir] I18ndir last call review of draft-ietf-regext-dnrd-objects-mapping-06

Asmus,

--On Thursday, March 5, 2020 16:14 -0800 Asmus Freytag
<asmusf@ix.netcom.com> wrote:

> On 3/5/2020 12:47 PM, John C Klensin wrote:
>> (All lists other than i18ndir removed)
>> 
>> General observations about substance:
>> 
>> (1) I completely agree with Marc about the UTF-8 issue.
>> Unless there is a really good reason for allowing either other
>> normalization forms of Unicode or any other CCS, new protocols
>> and IETF-specified formats should be restricted to UTF-8.  I
>> would go a half-step further and suggest that any deviations
>> from that principle should be explicitly discussed and
>> justified in an Internationalization Considerations section,
>> i.e., that section should not be just a description of what
>> XML allows.
> 
> +1
 
> (In fact, the i18n considerations in the draft were defective
> in not discussing the CSV mode at all.)

An issue I did not catch.

> About your suggested "half-step further": do we need to write
> a short RFC to explicitly put that on the record? Or how would
> you think such a mandate would be expressed?

I'll turn that one back over to Barry.  We could certainly
produce an Internet Draft that said:

	(i) IETF specifications that use Unicode SHOULD use
	UTF-8, be explicit about that, and not suggest the use
	of other encoding forms.
	
	(ii) If there are reasons why a particular specification
	must deviate from that requirement, it MUST be very
	specific about which forms are allowed and MUST explain,
	in an Internationalization Considerations section,
	exactly why allowing alternate encoding forms is
	necessary.

But, given the fate of i18n documents in the last year or so, I
wouldn't be optimistic about getting it processed, so maybe
there is a more efficient way to proceed.

If we do need such an I-D, I'm willing to volunteer to do a
first draft, but I think starting on it is pointless without a
clear plan for how it will be progressed, and progressed
without, e.g., a lengthy discussion about why we don't want to
allow UTF-7 or UTF-5.

>> (2) I have not studied this spec (or RFC 5732) carefully enough
>> to be sure, but it appears to me that, with four normalization
>> forms specified in The Unicode Standard and periodic disputes
>> about preferences between NF[K]C and NF[K]D,
>> "normalizedString" may be under-specified.  Clearly actual
>> IDN labels should (SHOULD?) be in NFC form, but most other
>> fields are less obvious.
> 
> I would say: IDN labels MUST be in NFC.

Absolutely.  My concern was that
draft-ietf-regext-dnrd-objects-mapping-06 specifies a large
number of textual fields and, as far as I could tell from fast
skimming, most or all are specified as "normalizedString" and
very few of them are IDN labels.

> There's no reason for formal records of IDN U-labels not to be
> in the form specified by IDNA.

Of course.
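
To make the NFC requirement concrete, here is a minimal sketch of
what a check on a label might look like.  It uses Python's standard
unicodedata module; the helper name is_nfc and the sample strings
are illustrative assumptions, not anything taken from the draft, and
the check covers NFC only, not the rest of what IDNA requires of a
U-label.

```python
import unicodedata

def is_nfc(label: str) -> bool:
    """Return True if the string is already in Normalization Form C.

    Hypothetical helper for illustration; it says nothing about
    whether the string is otherwise a valid IDNA U-label.
    """
    return unicodedata.normalize("NFC", label) == label

# Same visible text, two different code point sequences.
composed = "caf\u00e9"     # precomposed LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"  # 'e' followed by COMBINING ACUTE ACCENT

print(is_nfc(composed))
print(is_nfc(decomposed))
```

A record format that requires NFC for IDN labels could reject, or
flag, any label for which such a check fails.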

>> In addition, given the increasing trend on the web (at
>> least) to do exactly what TUS says to do, which is to
>> normalize only at comparison time rather than trying to carry
>> strings around in normalized form, the application of that
>> attribute to almost all text-type values may be
>> inappropriate. I can find no evidence in the I-D that those
>> issues were considered; the document should not progress
>> until they are.
> 
> Right, there are many other contexts where it makes sense to
> keep the data as submitted and not to force normalization.
 
> However, this does not appear to be one of them.

For the many fields that do not appear to be IDN labels, why
not?  From skimming the document, I'd assume that the IDN labels
should be required to be in NFC but that a normalization
requirement is probably inappropriate for most other fields,
especially those fields that appear to be free text
descriptions.  Determining which ones probably requires a
field-by-field analysis.
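
As a rough illustration of the "normalize only at comparison time"
approach for free-text fields, the sketch below keeps a value
exactly as submitted and applies NFC only when two values are
compared.  The function name equal_under_nfc and the sample data
are assumptions made for the example; which fields should be
handled this way is precisely the open question.

```python
import unicodedata

def equal_under_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC.

    Neither input is modified, so the stored value stays exactly
    as it was submitted.  Hypothetical helper for illustration.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Free-text value stored as submitted (decomposed accents).
stored = "Re\u0301sume\u0301 Ltd"
# The same text as typed elsewhere (precomposed accents).
query = "R\u00e9sum\u00e9 Ltd"

print(stored == query)                 # raw code point comparison
print(equal_under_nfc(stored, query))  # comparison-time normalization
```

The stored string is never rewritten; only the comparison sees
normalized forms, which is the behavior a blanket normalization
requirement on every text field would rule out.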

> We may need a statement specific to IETF that captures the
> suggested policy.

Indeed.

>...

best,
   john