Re: [Gen-art] Genart telechat review of draft-faltstrom-unicode11-08

John C Klensin <john-ietf@jck.com> Mon, 18 March 2019 17:21 UTC

Date: Mon, 18 Mar 2019 13:21:37 -0400
From: John C Klensin <john-ietf@jck.com>
To: Dan Romascanu <dromasca@gmail.com>, gen-art@ietf.org
cc: draft-faltstrom-unicode11.all@ietf.org, Alissa Cooper <alissa@cooperw.in>, Barry Leiba <barryleiba@computer.org>, idna-update@ietf.org, ietf@ietf.org
Message-ID: <458987D953A5B3227D3A791F@PSB>
In-Reply-To: <155289429627.26188.2047331005281292889@ietfa.amsl.com>
References: <155289429627.26188.2047331005281292889@ietfa.amsl.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/gen-art/V0AsKghvDXLi3j8OoDItbI6IZi0>

Dan,

I have read Barry's two responses, Alissa's response, and your
response to Barry, but this note seems the easiest one to which
to respond.  This particular draft is, as I understand it, OBE,
but there are (still) a few things going on that are probably
not visible from this particular I-D alone and that were even
less visible in draft-faltstrom-unicode11-07.   The draft and
those issues have been discussed extensively in the
recently-created I18n directorate and many, perhaps most, of the
changes between -07 and -08 are the result of those discussions
(the summary below might be consistent with those directorate
discussions, but I'm not sure there is sufficient consensus
there to make such a claim, so take the summary as my personal
opinion).   A bit of that perspective follows for those who have
not been following IDNA or i18n issues closely; the IDNA-update
list is copied in the hope that I (and Patrik and others) won't
need to go through this too many more times.

(1) This review of new Unicode versions and these IANA tables
are inherently odd.  Usually, when information is sent to IANA
and put, by them, in a registry, the contents of that registry
are considered the authoritative source of the information.  As
one of the older examples still in use, if there is a list of
assigned port numbers floating around the Internet somewhere and
the values in it are different from the values in the IANA
registry (
https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml
), the IANA registry is authoritative and that list is simply
wrong.   That is not the case for these IDNA-related tables.
For them, according to the IDNA2008 specs (and RFC 5892 in
particular), a particular application or implementation gets its
authoritative information by taking the version of Unicode it is
using (in theory, at a given instant if that is changing),
calculating the derived properties as specified in RFC 5892, and
then using that information (if the implementation needs a
table, that is how it is supposed to get the table).  The WG was
persuaded, after what, IIRC, was considerable demand, to create
the IANA registry and put tables in it, but that does not make
the IANA tables authoritative, nor do they distinguish between
supported code points or supported versions of Unicode, etc.
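
To make "calculate the derived properties from whatever version
of Unicode you are using" concrete, here is a rough sketch.  It
assumes Python's standard unicodedata module as the local
Unicode database, and it is deliberately NOT the full RFC 5892
algorithm: it keeps only the Unassigned, Unstable, and
LetterDigits rules, in that order, and omits Exceptions,
BackwardCompatible, LDH, JoinControl, and the Ignorable and
OldHangulJamo rules (so, for example, it gets HYPHEN-MINUS
wrong).  The function names are illustrative only.

```python
import unicodedata

# General categories that RFC 5892 groups as "LetterDigits".
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def is_noncharacter(cp: int) -> bool:
    """U+FDD0..U+FDEF plus the last two code points of every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_unstable(cp: int) -> bool:
    """True if NFKC(casefold(NFKC(cp))) differs from cp itself."""
    s = chr(cp)
    folded = unicodedata.normalize("NFKC", s).casefold()
    return unicodedata.normalize("NFKC", folded) != s


def derived_property(cp: int) -> str:
    """Simplified, partial approximation of the derived property.

    Assumption: only three of the RFC 5892 rules are applied, so the
    result can differ from a real IDNA2008 table for some code points.
    """
    category = unicodedata.category(chr(cp))
    if category == "Cn" and not is_noncharacter(cp):
        return "UNASSIGNED"
    if is_unstable(cp):
        return "DISALLOWED"
    if category in LETTER_DIGITS:
        return "PVALID"
    return "DISALLOWED"


if __name__ == "__main__":
    # The answers depend on the Unicode version this runtime ships.
    print("Unicode version in use:", unicodedata.unidata_version)
    for cp in (0x0061, 0x0041, 0x0378, 0xFFFF):
        print(f"U+{cp:04X} {derived_property(cp)}")
```

The point of the sketch is that the result is a function of the
local Unicode data, not of whatever table IANA happens to hold.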

(2) In a more perfect world, the review called for by RFC 5892
and reflected in RFC 6452 and this I-D would not be necessary at
all.  Unicode would announce a new version, assorted entities
would calculate tables for their use with that version, and
everyone would go merrily on their way.  The IDNA-update WG was
painfully aware of the imperfection of the world and, for this
particular case, was worried about two possibilities (both of
which we hoped would be very, very rare).  One was that, for a
new version of Unicode, the Unicode Consortium would make
changes in the properties or category values they assigned to
particular code points in a way that would change the derived
properties calculated by the RFC 5892 algorithm for those code
points.  This I-D primarily reflects a review designed to detect
such changes.  As I read RFC 5892, if such cases are detected,
the normal response is to introduce additional backward
compatibility exceptions into the list in RFC 5892 to preserve
stability (Patrik reads that a bit differently than I do).  The
other option is to leave the exception list in 5892 unchanged,
but that means that an application using Unicode version YY and
tables corresponding to it (and possibly a current set of IANA
tables) is not going to behave in a way that is completely
consistent (except for code points not allowed earlier because
they were unassigned) with an application using Unicode version
XX and corresponding IDNA tables.   Ultimately, the choice is
whether, for a particular code point whose Unicode properties
have changed, to maintain consistency with Unicode changes or to
maintain consistency within implementations that calculated IDNA
derived properties based on an earlier Unicode version and are
now recalculating.   
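
The check that review performs can be sketched as a comparison
of two derived property tables, one calculated for Unicode
version XX and one for version YY.  This is an illustration
under assumptions, not the procedure the Designated Expert
actually follows: the table representation (a dict from code
point to property value), the function name, and the sample
values are all hypothetical.  Code points that were UNASSIGNED
in the older table are skipped, since becoming assigned is the
expected kind of change.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: code point -> derived property value.
Table = Dict[int, str]


def incompatible_changes(old: Table, new: Table) -> List[Tuple[int, str, str]]:
    """List code points whose derived property changed between tables.

    Code points that were UNASSIGNED in the old table are ignored,
    because moving out of UNASSIGNED is not a compatibility problem.
    A code point missing from the new table is treated as UNASSIGNED.
    """
    changes = []
    for cp, old_value in sorted(old.items()):
        if old_value == "UNASSIGNED":
            continue
        new_value = new.get(cp, "UNASSIGNED")
        if new_value != old_value:
            changes.append((cp, old_value, new_value))
    return changes


# Toy tables with made-up values, not taken from any real Unicode version.
table_xx = {0x1000: "PVALID", 0x1001: "DISALLOWED", 0x1002: "UNASSIGNED"}
table_yy = {0x1000: "DISALLOWED", 0x1001: "DISALLOWED", 0x1002: "PVALID"}

for cp, before, after in incompatible_changes(table_xx, table_yy):
    print(f"U+{cp:04X}: {before} -> {after}")
```

Each entry such a comparison reports is a code point for which
the choice described above has to be made: add a backward
compatibility exception, or accept the change.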

The second possible problem, to which this I-D pays little
attention, is one in which it is discovered that some action
(such as assigning a new code point) or other discovery exposes
an issue or other behavior that was not anticipated by IDNA2008.
This is obviously a much more complicated problem, harder to
review for and detect and much harder to figure out what to do
with.  It was, however, precisely the sort of situation that we
discovered when reviewing the additions in Unicode 7.0.  That
review led to the 2015 IAB statement and to
draft-klensin-idna-5892upd-unicode70.   If the issue were just
about one code point, we could either make it an exception or
ignore it, and move on.  However, it isn't, and examining it led
to a series of still-unresolved issues about assumptions
IDNA2008 made about Unicode and its evolution.  Since that
discovery in 2014, time and Unicode versions have moved on and
applications and libraries that are based on doing their own
derived property calculations (as IDNA2008 requires) have
diverged more and more from any assumption that the tables
available from IANA are somehow "current" or that other
inferences can be drawn from them and the version of Unicode
they reference.

(3) As long as there are no changes or issues of either of the
types described in (2) above, a reasonable reading of RFC 5892
is that the Designated Expert shrugs his or her shoulders,
passes a new table off to IANA (or validates a new table that
IANA calculates) and moves on, with no report to IESG or I-D
that is expected to become an RFC required.   Viewed that way,
RFC 6452 was published only to confirm that, for the first
version of Unicode to appear after the IDNA2008 docs were
finalized, things were working as predicted and no IETF action
was required.  draft-faltstrom-unicode11 is significant, less
because several years have passed and we need to catch up (and
maybe explain) than because, between Unicode 6.x and 11.0 (or
12.0), we've discovered examples of both changed derived
properties for code points that existed before the relevant
Unicode version and a case of discovering an "other problem"
that opens one or more cans of worms.

(4) Because the IDNA requirement is that applications calculate
derived properties (or find and use an appropriate table) that
corresponds to whatever version of Unicode they are using,
IANA's having a registry that consists only of the table
corresponding to one particular version of Unicode seems
inappropriate, unwise, and maybe inconsistent with that
provision of the standard.

It is fairly clear to me that sorting the above out and
clarifying things to the point that confusion is considerably
reduced and expectations of the Designated Expert(s) are clear
should now be a reasonably high priority.   It is almost equally
clear that putting those clarifications in the critical path in
front of publication of draft-faltstrom-unicode11 (or
-unicode12), much less getting relatively more current tables
into IANA's hands, would be unwise pragmatically, however much
more satisfying it might be procedurally.

Specific suggestions about how this might be untangled, text and
ideas about where to put it, etc., would be very welcome (at
least by me).

best,
    john



--On Monday, March 18, 2019 00:31 -0700 Dan Romascanu via
Datatracker <noreply@ietf.org> wrote:

> Reviewer: Dan Romascanu
> Review result: Ready with Issues
> 
> I am the assigned Gen-ART reviewer for this draft. The General
> Area Review Team (Gen-ART) reviews all IETF documents being
> processed by the IESG for the IETF Chair. Please wait for
> direction from your document shepherd or AD before posting a
> new version of the draft.
> 
> For more information, please see the FAQ at
> 
> <https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.
> 
> Document: draft-faltstrom-unicode11-08
> Reviewer: Dan Romascanu
> Review Date: 2019-03-18
> IETF LC End Date: 2019-03-05
> IESG Telechat date: Not scheduled for a telechat
> 
> Summary:
> 
> This document is Ready. However, I have signaled one minor
> issue in my LC review which was not addressed in the revision,
> neither have I seen an answer to the clarification that I
> requested (unless I missed some mail).
> 
> Major issues:
> 
> Minor issues:
> 
> (from my previous review, not addressed)
> 
> The IANA Considerations text is not clear to me.
> 
>> IANA is requested to update the IDNA Parameters registry of
>> derived
>    property values, after the expert reviewer validates that
> the derived    property values are calculated correctly.
> 
> What does this exactly say? Are these recommendations for
> future actions, or do they refer specifically to property
> values described in this document? Are the expert reviewer
> validation and IDNA Parameters registry update required to be
> performed prior to approval of this document?
> 
> Nits/editorial comments:
> 
>