Re: [I18nrp] Last Call: <draft-faltstrom-unicode11-05.txt> (IDNA2008 and Unicode 11.0.0) to Informational RFC

"Paul Hoffman" <paul.hoffman@vpnc.org> Tue, 04 December 2018 14:59 UTC

From: Paul Hoffman <paul.hoffman@vpnc.org>
To: i18nrp@ietf.org
Date: Tue, 04 Dec 2018 06:59:19 -0800
X-Mailer: MailMate (1.12.2r5568)
Message-ID: <D81CDFF3-8CDF-4168-9CEA-E8DC3A133B73@vpnc.org>
In-Reply-To: <CC73FC25-92FC-4822-B267-15C41CE450F2@frobbit.se>
References: <154385119878.18333.5085298134102919486.idtracker@ietfa.amsl.com> <FF6F9EB9-C73B-4EC0-AC4F-3E3BFBABA0AB@vpnc.org> <8E20D432-01B0-4B52-80BB-3348C5FE73AF@vpnc.org> <CC73FC25-92FC-4822-B267-15C41CE450F2@frobbit.se>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18nrp/K_1x2yV1ltl-CVn4CMOChDodKpc>
Subject: Re: [I18nrp] Last Call: <draft-faltstrom-unicode11-05.txt> (IDNA2008 and Unicode 11.0.0) to Informational RFC

On 3 Dec 2018, at 22:59, Patrik Fältström wrote:

> Good morning!
>
> I see there are a few comments on this draft, I will try to respond to 
> each one of them.
>
> On 4 Dec 2018, at 1:02, Paul Hoffman wrote:
>
>> Before I go to the ietf@ietf.org mailing list with my concerns about 
>> this draft, I hope it is OK to bounce them off people here in case 
>> I'm wildly off track.
>>
>> =====
>>
>> In Section 1:
>>    Specifically, the Internet Architecture Board did issue a statement
>>    [IAB] which requested IETF to resolve the issues related to the code
>>    point ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1), introduced in
>>    Unicode 7.0.0 [Unicode-7.0.0].  This document resolves this issue and
>>    suggests IDNA2008 standard is to follow the Unicode Standard and not
>>    update RFC 5892 [RFC5892] or any other IDNA2008 RFCs.
>>
>> In Section 4.1:
>>    The discussion in the IETF concluded that although it is possible to
>>    create "the same" character in multiple ways, the issue with U+08A1
>>    is not unique.  In the case of U+08A1, it can be represented with the
>>    sequence ARABIC LETTER BEH (U+0628) and ARABIC HAMZA ABOVE (U+0654).
>>    Just like LATIN SMALL LETTER A WITH DIAERESIS (U+00E4) can be
>>    represented via the sequence LATIN SMALL LETTER A (U+0061), and
>>    COMBINING DIAERESIS (U+0308).  One difference between these sequences
>>    is how they are treated in the normalization forms specified by the
>>    Unicode Consortium.
>>
>> This sounds like the IETF is saying that if the Unicode Consortium 
>> changes how a character appears in a normalization form other than 
>> for case folding (Section 2.2 of RFC 5892), that change does not 
>> affect the tables for IDNA2008. Is that correct?
>
> First of all, this document evaluates the individual changes made up 
> until and including Unicode 11. Sure, one could say this has 
> implications on the IETF view of existence of normalization rules (or 
> not) but that is not the intention here. The result of this review 
> should neither be extrapolated to future versions of Unicode nor to 
> future evolutions of normalizations.

This is good to know. In that case, could you either remove the "One 
difference between these sequences..." sentence, or add a sentence to 
the Introduction section that says "The result of this review should 
not be extrapolated to future versions of Unicode." Either action 
would clear up this confusion.
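
For readers who want to see the normalization difference that the quoted 
Section 4.1 text refers to, here is a minimal sketch. It assumes Python 3 
and its standard unicodedata module; the helper name nfc_equivalent is 
hypothetical and is not taken from the draft or from any IDNA2008 
implementation. It only shows that the two Latin sequences become 
identical under NFC while the two Arabic sequences do not.

```python
import unicodedata


def nfc_equivalent(a: str, b: str) -> bool:
    """Return True if the two strings are identical after NFC normalization.

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# LATIN SMALL LETTER A WITH DIAERESIS vs. "a" + COMBINING DIAERESIS
latin_precomposed = "\u00e4"
latin_sequence = "\u0061\u0308"

# ARABIC LETTER BEH WITH HAMZA ABOVE vs. BEH + ARABIC HAMZA ABOVE
arabic_precomposed = "\u08a1"
arabic_sequence = "\u0628\u0654"

# U+00E4 has a canonical decomposition, so NFC unifies the two Latin forms.
print(nfc_equivalent(latin_precomposed, latin_sequence))

# U+08A1 has no canonical decomposition, so the two Arabic forms stay distinct.
print(nfc_equivalent(arabic_precomposed, arabic_sequence))
```

Since IDNA2008 requires labels to be in NFC, the Latin pair collapses to a 
single form, while the Arabic pair remains two different code point 
sequences.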

>> =====
>>
>> In Section 4.1:
>>    As U+08A1 is discussed in draft-freytag-troublesome-characters
>>    [I-D.freytag-troublesome-characters] and elsewhere.  Regardless of
>>    whether those discussions ends in recommending including the code
>>    point in the repertoire of characters permissable for registration or
>>    not, it is acceptable to allow the code point to have a derived
>>    property value of PVALID.
>>
>> This sounds like it is saying that even though 
>> draft-freytag-troublesome-characters is meant for standards track, 
>> because it is not yet finished, this document (which is 
>> informational) can ignore the other document and make changes to the 
>> IANA registry. If that's correct, it concerns me because it could 
>> make the IANA registry unstable for characters that we know about and 
>> are actively discussing. If I'm not correct, I'd like to hear why so 
>> that maybe this document can be reworded.
>
> First of all, this document is (as it seems to me now) to be Standards 
> Track. So that issue is taken care of.

Procedurally, it is not, I believe. A new draft needs to be issued, and 
the IETF Last Call has to start again. Fortunately, this IETF Last Call 
is only a few days old, so this should not delay anything much.

> Regarding the reference to the troublesome characters, I also have 
> referenced a document by Klensin. I do not mind referencing other 
> documents as well regarding U+08A1, but I do NOT want to explain 
> everything in this document (again), i.e. no repeat please.

Agreed.

> The klensin document unfortunately have expired but look at diff 
> between -04 and -05 and you see the reference. This is also why I in 
> this version wrote "and elsewhere".
>
> If you have better more references, send them in my direction.

This misses my concern. There is an active draft 
(draft-freytag-troublesome-characters) that seems to want to change the 
IANA registry. Your draft (draft-faltstrom-unicode11) also wants to 
change the registry, but in a different way. My question is whether we 
should be making the registry unstable in this way.

--Paul Hoffman