[I18ndir] draft-faltstrom-unicode12 (was: Re: Getting restarted and triage)

John C Klensin <john-ietf@jck.com> Sat, 22 June 2019 00:39 UTC

Date: Fri, 21 Jun 2019 20:39:07 -0400
From: John C Klensin <john-ietf@jck.com>
To: Pete Resnick <resnick@episteme.net>, Alexey Melnikov <alexey.melnikov@isode.com>
cc: i18ndir@ietf.org, Peter Saint-Andre <stpeter@mozilla.com>, art-ads@ietf.org
Message-ID: <FE907EC05D207D554919CBBE@PSB>
In-Reply-To: <774C5663-F336-4F5E-B4D6-2CD7C85FAD8E@episteme.net>
References: <F2B84580-7E5A-4B86-BF9C-0205D4E6121D@episteme.net> <843EAB4535391A494DA216CC@PSB> <774C5663-F336-4F5E-B4D6-2CD7C85FAD8E@episteme.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/wn-LpHngoH2MEv1-2ntaj6ugXxk>

As indicated in my earlier note, this probably deserves a
message, and thread, of its own...

--On Friday, June 21, 2019 14:12 -0500 Pete Resnick
<resnick@episteme.net> wrote:

>...

> (I'm of course leaving aside draft-faltstrom-unicode12, which
> is on Alexey's plate.)

But you shouldn't because there are two special issues with it,
especially given that it has waited this long, that probably
justify (require?) opening a Last Call as if it were completely
new and reviewing the document.  The causes for both are
addressed in draft-klensin-idna-unicode-review but, of course,
it, and its conclusions, are not addressed in
draft-faltstrom-unicode12.  At the risk of introducing actual
technical topics, they are:

(1) If we are going to do a new review for every new version of
Unicode -- which is more or less what I think the community was
promised and RFC 5892 implies -- then Unicode 12.1 is out and
another review and set of tables are required.  The current
version of draft-klensin-idna-unicode-review makes that
requirement apply only to major versions, i.e., 12.0 and then
13.0 when it arrives, but, unless it is put out there and gets
community consensus, the fact that the current version of
draft-faltstrom-unicode12 does not address 12.1 is a known
technical omission.  The good news is that 12.1 is trivial, with
one new code point that, incidentally, decomposes the way RFC
5892 anticipates.  But still (and unrelated to 12.1)...

(2) RFC 5892 is reasonably clear that the intended normal
resolution when a code point is discovered whose Unicode
properties have changed enough between Unicode versions to alter
its derived property status under IDNA is to list that code
point as an exception, thereby being sure that any possible
label that was valid earlier remains valid and that any possible
label that was invalid earlier (with the obvious exception of
ones containing previously-unassigned code points) remains
invalid.  We didn't do it that way with the Unicode 6.0 review
reflected in RFC 6452.  Instead, we decided to just accept the
new properties and stay consistent with calculations on the
current Unicode version.  We didn't really explain why, nor did
we change 5892 to encourage that choice.  I presume the reason
(or at least our excuse in retrospect) was that Unicode 6.0
followed quickly enough on the core IDNA2008 documents and came
before IDNA2008 was widely used, that consistency with Unicode
was reasonable and would cause no damage (especially given the
code points involved).
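As a rough illustration of the two approaches contrasted in (2), the
sketch below compares derived-property tables calculated from an old
and a new Unicode version.  Under the label-stability approach, any
previously assigned code point whose derived property changed is
pinned to its old value through an exception entry; under the
follow-Unicode approach (what the 6.0 review did), the newly
calculated value is simply accepted.  The names `Policy` and `review`
are hypothetical, and the toy tables are modelled on the three code
points discussed in RFC 6452 plus one newly assigned code point; none
of this is taken from the IANA registry or from either draft.

```python
from enum import Enum


class Policy(Enum):
    LABEL_STABILITY = "label-stability"  # pin old values via exceptions
    FOLLOW_UNICODE = "follow-unicode"    # accept newly calculated values


def review(old, new, policy):
    """Return (table, exceptions) for the new Unicode version.

    old, new: dict mapping code point (int) -> derived property (str),
    each calculated directly from one Unicode version.  Code points
    missing from `old` are treated as previously UNASSIGNED.
    """
    table = dict(new)
    exceptions = {}
    for cp, new_prop in new.items():
        old_prop = old.get(cp, "UNASSIGNED")
        # Newly assigned or unchanged code points need no action.
        if old_prop == "UNASSIGNED" or old_prop == new_prop:
            continue
        if policy is Policy.LABEL_STABILITY:
            # Keep the earlier value so old labels stay valid/invalid.
            exceptions[cp] = old_prop
            table[cp] = old_prop
    return table, exceptions


# Toy tables (illustrative only).
old = {0x0061: "PVALID", 0x0CF1: "DISALLOWED", 0x19DA: "PVALID"}
new = {0x0061: "PVALID", 0x0CF1: "PVALID", 0x19DA: "DISALLOWED",
       0x0620: "PVALID"}  # 0x0620: newly assigned in the new version

stable_table, stable_exc = review(old, new, Policy.LABEL_STABILITY)
unicode_table, unicode_exc = review(old, new, Policy.FOLLOW_UNICODE)
print(sorted(hex(cp) for cp in stable_exc))
```

With LABEL_STABILITY the two changed code points end up in the
exception list with their old values; with FOLLOW_UNICODE the
exception list is empty and the table equals the new calculation.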

We didn't follow the RFC 5892 requirement in the review for
Unicode 11.0 or the retrospective reviews for 7.x, 8.x, 9.x, and
10.x either (some of those did not contain incompatibilities that
would raise any concerns -- see draft-faltstrom-unicode12 for
discussion).  Our reason for that, or at least my understanding
of the IAB's reasoning in telling Patrik to go ahead and get the
tables out, was that we concluded that, after so much time had
passed, there was significant risk that tables that had been
calculated directly from the relevant Unicode versions would
diverge from the ones IANA was about to publish, creating a
different and harmful kind of incompatibility.
draft-faltstrom-unicode12 does discuss this (in Section 5), but
the explanation is, IMO, less than clear in the context of the
strong preference in 5892. 

There is no such excuse for 12.0 (much less 12.1): the reviews
occurred more or less contemporaneously with the release of the
new versions of Unicode and, if we were going to return to the
specification in RFC 5892, that probably deserves some comment
(keep reading before telling me what I/we already know).

But, in reviewing the discussion in the first part of the year,
the reason why 5892 had the default it did, and the various
promises the IETF made to the domain name registry and user
communities about stability, Patrik and I concluded that, rather
than change 5892 to prefer Unicode conformance, it was time to
return to the consensus at the time IDNA2008 was adopted and
prefer label stability for future reviews, at least unless there
was a really strong reason (that we were willing to document)
for doing otherwise.  Obviously, that conclusion is subject to
discussion, but I'm now convinced the argument for doing what we
concluded should be done in 2010 is quite strong.

With 12.0 (and 12.1), we lucked out and there are no code points
whose properties have changed in a way that requires action
-- either staying with Unicode, its properties, and their
implications or preserving compatibility via the backwards
compatibility table in IDNA2008.  But the text of
draft-faltstrom-unicode12 implies that the way we look at and
handle code points is unchanged from 6.0 through 12.0 and it
isn't.  So the I-D probably needs either a normative reference
to draft-klensin-idna-unicode-review, a bit of discussion that
isn't there yet, or both.
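On the 12.1 remark in (1): the message does not name the new code
point, but the single character added in Unicode 12.1 is U+32FF
(SQUARE ERA NAME REIWA), which has a compatibility decomposition.
Presumably "decomposes the way RFC 5892 anticipates" refers to the
Unstable category, toNFKC(toCaseFold(toNFKC(cp))) != cp, which
yields DISALLOWED.  The sketch below checks only that one category,
not the full RFC 5892 algorithm, and uses Python's str.casefold() as
a stand-in for toCaseFold.  The function names are hypothetical, and
the result depends on the Unicode version bundled with the Python
runtime.

```python
import unicodedata


def is_unstable(cp: int) -> bool:
    """Check the RFC 5892 'Unstable' condition for one code point."""
    ch = chr(cp)
    step1 = unicodedata.normalize("NFKC", ch)
    step2 = step1.casefold()  # assumption: stand-in for toCaseFold
    step3 = unicodedata.normalize("NFKC", step2)
    return step3 != ch


def runtime_unicode_version():
    """Unicode version of this Python's character database, as a tuple."""
    return tuple(int(p) for p in unicodedata.unidata_version.split("."))


REIWA = 0x32FF  # SQUARE ERA NAME REIWA, added in Unicode 12.1

if runtime_unicode_version() >= (12, 1):
    print(hex(REIWA), "unstable:", is_unstable(REIWA))
else:
    # Older databases treat U+32FF as unassigned, so the check
    # would say nothing useful about 12.1.
    print("runtime Unicode data predates 12.1; check skipped")
```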

So, while we can sit on that until Alexey finishes his writeup
if you (Pete, Peter, Alexey, directorate participants) like, it
will come back to get us sooner or later.  And this is another
reason why these documents are fairly seriously intertwined.

best,
    john