Re: [I18nrp] Last Call: <draft-faltstrom-unicode11-05.txt> (IDNA2008 and Unicode 11.0.0) to Informational RFC

John C Klensin <john-ietf@jck.com> Tue, 04 December 2018 01:21 UTC

Date: Mon, 03 Dec 2018 20:21:33 -0500
From: John C Klensin <john-ietf@jck.com>
To: Paul Hoffman <paul.hoffman@vpnc.org>
cc: i18nrp@ietf.org
Message-ID: <A5B69D318689A6515CCB4883@PSB>
In-Reply-To: <8E20D432-01B0-4B52-80BB-3348C5FE73AF@vpnc.org>
References: <154385119878.18333.5085298134102919486.idtracker@ietfa.amsl.com> <FF6F9EB9-C73B-4EC0-AC4F-3E3BFBABA0AB@vpnc.org> <8E20D432-01B0-4B52-80BB-3348C5FE73AF@vpnc.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18nrp/X4cBDzW6htt2Isk4klKbZnvJ6aM>

Paul,

This is a very difficult situation and I want to be clear that I
understand Patrik has done what he has been asked to do, that I
am not blaming him for accepting those requests, and that I
always (this case included) believe we are better off with a
document on the table to which people can react than by trying
to make progress by collective handwaving.

Let me state a variation on your two concerns much more broadly
and add a third...

(1) When the IAB issued its statement in January 2015, the
problem was believed to be about U+08A1.   At about that time or
soon thereafter, we learned more about the issues and started
posting versions of draft-klensin-idna-5892upd-unicode70 that
explored the generalization of that problem into what some of us
have come to call the "non-decomposing code point" problem and a
few related issues.  That I-D, which is not referenced, does not
actually propose solutions although it goes much further than
the IAB Statement in exploring and explaining the issues.  The
IAB issued a subsequent statement on the subject (also not
referenced) that can be interpreted in a variety of ways, one of
which is fairly close to "the IETF isn't engaging on this, so
never mind".  Describing this document as having resolved the
issues is true iff "resolved" in the IETF now means "blown off
without any meaningful solution" or something close to it.

In addition, I personally read "suggests IDNA2008 standard is to
follow the Unicode Standard and not update RFC 5892 [RFC5892] or
any other IDNA2008 RFCs" as effectively eliminating the
version-by-version review process called for in Section 5.1 of
RFC 5892 (and followed by Patrik and yourself to produce RFC
6452) in favor of uncritically accepting changes in The Unicode
Standard.  Insofar as that reading is correct, this is
necessarily a Standards Track document that updates RFC 5892 and
makes what I believe is a very substantive change to it.

(2) Your question about normalization is a bit different.  At
least AFAICT, the question of UTC changing the normalization
properties of a given code point is a non-issue because those
properties are covered by stability rules and won't change.  If
they were to change in a way that altered whether a putative
label was NFC-compliant, that would certainly be reflected in
IDNA2008 processing because that processing requires strings in
NFC form.  However, we made assumptions, reflected in the design
of IDNA2008, that no code point would exist that could also be
constructed as a combining sequence of two or more other code
points with both that code point and the combining sequence
being NFC-valid.  See draft-klensin-idna-5892upd-unicode70 for more
discussion on that point and its implications.
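
To make the distinction in (2) concrete, here is a minimal sketch,
assuming Python's standard unicodedata module built against Unicode
7.0 or later.  The helper names (nfc, is_nfc, both_forms_nfc_stable)
are hypothetical and are not taken from any of the drafts; this is an
illustration, not a proposed test for IDNA2008.

```python
import unicodedata


def nfc(s: str) -> str:
    """Return the NFC form of s."""
    return unicodedata.normalize("NFC", s)


def is_nfc(s: str) -> bool:
    """True if s is unchanged by NFC normalization."""
    return nfc(s) == s


def both_forms_nfc_stable(single: str, sequence: str) -> bool:
    """True if the single code point and the combining sequence
    both survive NFC unchanged (hypothetical helper)."""
    return is_nfc(single) and is_nfc(sequence)


# LATIN SMALL LETTER A WITH DIAERESIS vs. a + COMBINING DIAERESIS
latin_single = "\u00E4"
latin_sequence = "\u0061\u0308"

# ARABIC LETTER BEH WITH HAMZA ABOVE vs. BEH + ARABIC HAMZA ABOVE
arabic_single = "\u08A1"
arabic_sequence = "\u0628\u0654"

# The Latin sequence composes to U+00E4 under NFC, so only one
# of the two forms can appear in an NFC string.
latin_collapses = nfc(latin_sequence) == latin_single

# U+08A1 has no canonical decomposition, so the Arabic sequence is
# left alone by NFC and both forms are NFC-valid at the same time.
arabic_coexists = both_forms_nfc_stable(arabic_single, arabic_sequence)

print("Latin sequence collapses under NFC:", latin_collapses)
print("Arabic forms both NFC-stable:", arabic_coexists)
```

In the Latin case the NFC requirement leaves a single valid
spelling of the label; in the U+08A1 case it does not.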

(3) The entanglements between this document and several others
(including draft-klensin-idna-5892upd-unicode70,
draft-freytag-troublesome-characters (which is either the basis
for modifying 5892 as I think your comments suggest or which
complements the clarification about registry responsibilities in
draft-klensin-idna-rfc5891bis)), as well as the IETF's inability
to act on any of them, were the motivation for the BOF at IETF
102 that led to this mailing list.  I am told, although I wasn't
able to find it quickly, that there is now a notation from the
AD in the file that this document must be reviewed by the
directorate that the BOF decided should be created (and with which
the AD(s) agreed).  I believe that directorate should be
launched and its charter and scope be made clear and that it
should then look at this document in context with the others,
examining the questions you have raised and others, and then
make recommendations as to how to proceed.  I believe that the
plan that came out of the BOF was completely consistent with
that view.  Only after the directorate's recommendations are
available to the community for review and consideration do I
believe that an IETF Last Call on this document is appropriate.


And, yes, I have expressed that view to Alexey (in his
Responsible AD role).

best,
    john



--On Monday, December 3, 2018 16:02 -0800 Paul Hoffman
<paul.hoffman@vpnc.org> wrote:

> Before I go to the ietf@ietf.org mailing list with my concerns
> about this draft, I hope it is OK to bounce them off people
> here in case I'm wildly off track.
> 
> =====
> 
> In Section 1:
>     Specifically, the Internet Architecture Board did issue a statement
>     [IAB] which requested IETF to resolve the issues related to the code
>     point ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1), introduced in
>     Unicode 7.0.0 [Unicode-7.0.0].  This document resolves this issue and
>     suggests IDNA2008 standard is to follow the Unicode Standard and not
>     update RFC 5892 [RFC5892] or any other IDNA2008 RFCs.
> 
> In Section 4.1:
>     The discussion in the IETF concluded that although it is possible to
>     create "the same" character in multiple ways, the issue with U+08A1
>     is not unique.  In the case of U+08A1, it can be represented with the
>     sequence ARABIC LETTER BEH (U+0628) and ARABIC HAMZA ABOVE (U+0654).
>     Just like LATIN SMALL LETTER A WITH DIAERESIS (U+00E4) can be
>     represented via the sequence LATIN SMALL LETTER A (U+0061), and
>     COMBINING DIAERESIS (U+0308).  One difference between these sequences
>     is how they are treated in the normalization forms specified by the
>     Unicode Consortium.
> 
> This sounds like the IETF is saying that if the Unicode
> Consortium changes how a character appears in a normalization
> form other than for case folding (Section 2.2 of RFC 5892),
> that change does not affect the tables for IDNA2008. Is that
> correct?
> 
> =====
> 
> In Section 4.1:
>     As U+08A1 is discussed in draft-freytag-troublesome-characters
>     [I-D.freytag-troublesome-characters] and elsewhere.  Regardless of
>     whether those discussions ends in recommending including the code
>     point in the repertoire of characters permissable for registration or
>     not, it is acceptable to allow the code point to have a derived
>     property value of PVALID.
> 
> This sounds like it is saying that even though
> draft-freytag-troublesome-characters is meant for standards
> track, because it is not yet finished, this document (which is
> informational) can ignore the other document and make changes
> to the IANA registry. If that's correct, it concerns me
> because it could make the IANA registry unstable for
> characters that we know about and are actively discussing. If
> I'm not correct, I'd like to hear why so that maybe this
> document can be reworded.
> 
> --Paul Hoffman
> 