Re: [I18ndir] In-depth analysis of draft-faltstrom-unicode11-07

"Patrik Fältström " <paf@frobbit.se> Sun, 10 March 2019 15:16 UTC

From: Patrik Fältström <paf@frobbit.se>
To: John C Klensin <john-ietf@jck.com>
Cc: i18ndir@ietf.org
Date: Mon, 11 Mar 2019 00:16:21 +0900
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/I0pyLCvstXP-aJLJi7LibflihOc>

FWIW, I have taken care of these things: I have rewritten text and, in many cases, removed text from the draft, specifically where it was suggested that the draft was moving out of scope.

Regarding the one incompatible change made, some new text was created based on the application for the code point that I found: <https://www.unicode.org/L2/L2012/12322-n4330-sharada-sandhi-mark.pdf>

   Patrik

On 10 Mar 2019, at 19:29, John C Klensin wrote:

> Hi.  I've avoided turning cryptic notes on paper into this note so far because of guidance from Peter and Pete.  However, I seem to be seeing things in draft-faltstrom-unicode11-07 that others don't (and vice versa), and Harald's draft of a few hours ago doesn't begin to cover this span of issues, so maybe it is time to post a relatively careful analysis.  I've been writing continuously for many hours and my eyes are fading to the point that I cannot reliably see the screen, so I have decided to
> post this now (early in Harald's 24 hours) rather than
> carefully proofread, so please forgive typos and bad sentences.
>
> This note is necessarily long (I think longer than the document prior to references and tables); my apologies to those who find long analyses hard to take.  For those who like their
> conclusions at the beginning, it is that, for multiple reasons, the document is not ready for publication (not even "Almost
> ready, with caveats").  Those reasons include a few technical errors; asserting IETF discussions and consensus that have not occurred; tutorial material that is not really part of the
> scope of the document as described in the Abstract or
> Introduction (or borrowed from RFC 6452) including a discussion of deployment that is inadequate and possibly incorrect; making assertions about why things are being done and the possible
> effects of some actions that are unsupported by observable
> reality; and assuming that the IETF is shifting authority
> boundaries and taking on (or endorsing) efforts that have not even been seriously discussed.
>
> In addition, I believe that the
> intent of the IDNAbis WG was that the review prior to moving forward with a new version of Unicode was to consist of two
> parts -- a check for older code points whose properties had
> changed in ways that might affect IDNA and at least a
> superficial examination of newly-added code points to see if any stuck out as needing special treatment.   That two-part
> review model was followed in the work leading to RFC 6452.  The second element of review was what first identified the likely issue with U+08A1 (aka the Hamza mess).   I'm not aware of that type of review being conducted (or even seriously attempted)
> for Unicode versions 8 through 11 nor any IETF discussion about abandoning it; the I-D appears to assume it is no longer
> relevant.
>
> Where I could, I've suggested text to at least mitigate the
> problems.   My personal preference continues to be that we at least figure out how we are going to address the issues raised by other i18n documents (and, if appropriate, another one
> identified below) before coming back to review this I-D, but, because others seem to be feeling more urgency (or are
> convinced that we will never get to the other documents and
> issues), I've tried to focus the comments below on getting a revision of this document together and getting it published.
>
> Issues are discussed below in the order in which the relevant topics are covered in the I-D, not in order of importance.
> Some (although few) of the comments are nit-picking but no one else seemed to have done so prior to last week so they seemed
> worth including.  I have not, however, identified some problems that are so obvious that it is clear that the RFC Editor would fix them (most of which were pointed out in Martin's review
> anyway) such as "also recomments" in the Abstract.
>
>
>     ---- Issues and Detailed Comments ----
>
> (1) Meta-comment: This document jumps back and forth between tutorial materials that explain the context and applicability of IDNA (a few of which might be considered as updates to
> 5894), statements that essentially clarify the IDNA2008 base documents (and arguably update 5892 or 5891), and the normative statements about the disposition of new Unicode versions and code points that the document is nominally about.  That is not necessarily a problem, but we should be aware that it is
> happening and should try to avoid any statements we'd be likely to change on further review.
>
> (2) Nit (but substantive, not editorial).  The first sentence of Section 1 is simply wrong.  According to the tracker (as well as my memory) the IDNAbis WG was not chartered until April 2008 and, despite the wildly optimistic schedule shown at the bottom of https://datatracker.ietf.org/wg/idnabis/about/, the
> documents did not really start to come together enough to make a "largely completed" statement plausible until well into 2009.
> According to my notes, IETF Last Call stretched into January 2010 and the documents, of course, bear August 2010 publication dates.  Some readers of this note will recall that there was even a fairly extended discussion as to whether "IDNA2008" was appropriate for 2010 documents.  Suggestion:
>
> Old:
> 	...was largely completed in 2008, and is thus known within
> New:
>     ...was initiated in 2008 and, despite not being completed
> 	until 2010, is widely known
>
> (3) Section 1, bullets.  The two bullets miss an important issue that
> lies at the root of the non-decomposing character problem and other potential issues, including some speculation about how the various emoji issues might eventually play out (details on request, but not critical to this analysis).   Suggest adding a third bullet, to read something like
>
>   o Problems can also be created if the properties assigned to
>     those code points are inconsistent with IDNA2008 assumptions
>     about how properties are assigned and/or about how code
>     points with those properties are used or behave.
>
> In the last sentence of the second bullet the phrase "a code point that was not allowed (and thus is blocked in some
> situations" appears.  Noting that, under IDNA2008 a code point that is DISALLOWED MUST NOT be stored in a label and that
> lookup applications MUST reject a label containing such a code point (and not look it up), I'm not sure where "some
> situations" comes from and what it is expected to imply.  If the intent was to say that implementations that had not been updated to the latest version of Unicode (and set of tables)
> would reject the code point while more recently updated ones would allow it, there are probably clearer ways to say that.  I also note that the WG explicitly discussed the implications of these sorts of transitions and accepted them, despite arguments both that no changes should be allowed and that DISALLOWED -> PVALID changes should be allowed but not
> PVALID -> DISALLOWED.
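
To make the version-skew reading concrete, here is a minimal sketch of a lookup-side check run against two table versions.  It assumes two tiny, hypothetical tables (TABLE_UNICODE_5_2 and TABLE_UNICODE_6_0) that hold only the code points discussed in RFC 6452; the function name lookup_accepts is likewise made up for illustration, and contextual rules are ignored entirely.

```python
# Hypothetical, heavily truncated derived-property tables keyed by code point.
# Real tables come from the IANA IDNA registry or from running the RFC 5892
# algorithm against a specific Unicode version.
TABLE_UNICODE_5_2 = {0x0CF1: "DISALLOWED", 0x19DA: "PVALID"}
TABLE_UNICODE_6_0 = {0x0CF1: "PVALID", 0x19DA: "DISALLOWED"}

# Simplification: only lowercase ASCII letters, digits and hyphen are treated
# as always valid; contextual rules (CONTEXTJ/CONTEXTO) are not modelled.
ASCII_OK = set("abcdefghijklmnopqrstuvwxyz0123456789-")


def lookup_accepts(label, table):
    """Return True only if every code point in the label is PVALID."""
    for ch in label:
        if ch in ASCII_OK:
            continue
        # Code points missing from the toy table are treated as not valid.
        if table.get(ord(ch), "UNASSIGNED") != "PVALID":
            return False
    return True
```

With these toy tables, a label containing U+0CF1 is rejected by a client still using the older table and accepted by an updated one, while a label containing U+19DA goes the other way.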
>
> (4) Section 1, Introduction, paragraph 3 (starting
> "Historically...").
>
> IMO, this is very misleading.  As far as I
> can tell, "accepted all implications of changes in the Unicode Standard" encompasses exactly the three code points called out in RFC 6452 and described in the last bullet of Section 3.1 of the present I-D (more on that below).  Those three code points were examined very carefully and broadly enough to make a claim of IETF rough consensus plausible (even then, the second
> paragraph of its Acknowledgements section may be important).
> While not clearly called out by the RFC, there was also at
> least a superficial review of all of the new code points in the hope of catching any of those that might pose special problems.
> U+19DA got special attention in that regard because any labels that incorporated it would be rendered invalid; the discussion, IIRC, included the likelihood that such labels might actually exist.
>
> There is little evidence that, with a very small number of
> exceptions, the individual code points and changes identified in the current I-D have received a nearly equivalent level of review.  Patrik can speak for himself but, at least for me, one of the reasons we asked for the I18NRP BOF, as well as having had several other discussions, was to fix our apparent
> inability to review documents like this the way the IDNA2008 specs intended.
>
> That intent is another core issue both generally with IDNA2008 and with this document.   My understanding in the 2009-2010
> period was that the running of the tables and checking for, and then carefully examining, the code points for which the derived properties had changed was only part of the process.  The other part was to do at least a sanity check on all of the
> new code points in the hope of detecting anything weird going on.   It was precisely the latter type of review on Unicode 7.0 that turned up the issue with U+08A1 and that led to the
> improved understanding that followed: it was not a change in the derived property of an existing code point; it was an
> entirely newly-assigned code point in 7.0.0.   Had the intent of IDNA2008 been to accept new code points without review or possible criticism, the 2015 IAB "hold" request would have been unjustified and completely unreasonable.
>
> I believe we also have criteria for when a newly-assigned code point (or one whose properties have changed) is problematic
> enough that special treatment is justified.  That is the list of code points that the IETF singled out for special,
> exceptional, treatment in Section 2.6 of RFC 5892.  The mere existence of that list contradicts what I think many people
> will read into "staying with the Unicode Standard has been
> viewed as important", i.e., that we have always accepted
> decisions of the Unicode Standard.
>
> I believe (and believed when 6452 was under development and may have said so explicitly then) that the list in Section 2.6 was not a "one time and never again" list but rather that it set a reasonable criterion for evaluating new code points and changes: whether the issues with those code points are less serious than, more serious than, or equivalent to those with the code points already in Section 2.6.  If less, we accept them and, at most, grumble a bit.  If more, we talk about them very carefully and with the full understanding that letting something go by that would move the "minimum problem level to justify an exception" needle
> upward sets a precedent that will be difficult to avoid in the future when Unicode does something equally or slightly less
> egregious.
>
> There is also an issue with the last sentence of that
> paragraph.   There is zero evidence that this is the primary reason for the choice.  There was, by definition, no
> "diversity of implementations" when the IDNA2008 specs were
> published.  I don't even know if that statement would have
> been true when 6452 was published.  If it represents a change of policy with this draft, that policy implicitly updates the IDNA2008 specs (again, noting that those specs call out a list of differences from the set of code points that would have been determined by Unicode properties alone) and that should be
> noted and a serious discussion held about it.  Certainly there is a preference for staying as close to the Unicode standard as possible consistent with IDNA2008 principles, but it has little or nothing to do with the number of different things claiming to be IDNA that are out there (not really a diversity of
> implementations of IDNA2008 either).  Suggestion:
>
> Old:
>    Historically, the IETF has accepted all implications of
>    changes in the Unicode Standard even though the changes have
>    resulted in problematic changes in the derived property
>    value.  The primary reason for that choice is that staying
>    with the Unicode Standard has been viewed as important
>    because of the diversity of implementations already existing
>    in the wild.
>
> New:
>    <<>>>
>
> (5) Section 1, paragraph 4 (starting "As described in Section 4")
>
> Halfway through this paragraph, the sentence "If a change occurs, and it is between any of the derived property values except
> DISALLOWED, there is not a problem." appears.   I am supposed to be an expert on IDNA2008 and I have no idea what that sentence means.
> I'm not completely sure about the first sentence of the
> paragraph either, but at least there I can guess the intent.
>
> (6) Section 1, paragraph 6 (starting "In 2015, the Internet")
>
> The last sentence claims that the issue is resolved by the
> current document.  First of all, while the IAB statement was never updated to reflect discussions and a more general
> understanding of what became known as the "non-decomposing
> character problem", including the attempted LUCID BOF
> discussions, we quickly learned that the issue was not just
> about U+08A1 or the closely-related problem of
> language-dependent code points within a script (the latter was called out even before the IAB statement).  So while the
> document may resolve the BEH WITH HAMZA ABOVE issue (I contend that it does not, see below), concentrating on the IAB
> statement exclusively is deeply misleading relative to
> discussions in the IETF (and the late IAB Program) about the issue, essentially blowing them off.
>
> (7) Section 3.1 (Editorial)
>
> This section would be far easier to read and understand if
> document titles, or at least abbreviated ones, were given.  For example:
>
> Old:
>   o  RFC 5890 [RFC5890], informally
> New:
>   o  Internationalized Domain Names for Applications (IDNA):
>      Definitions and Document Framework [RFC5890].  This
> 	 document is informally
>
> The RFC Editor may have better suggestions and may prefer that the title be in quotes.  Similarly for the other documents
> listed.  Note that, because the sentence structure for the
> final document in the set (RFC 6452) is different from the
> others and it does not have an informal name, it will need a slightly different change.
>
> (8) Section 3.2 Deployment
>
> While I think this is extremely useful information (at least after some small corrections), it does not appear to be part of the mandate of this document and is certainly not reflected in the abstract.  If one were to follow the structure of the
> existing IDNA2008 documents, this Section is really an update to RFC 5894.  Consideration should be given to removing it
> entirely and putting the information somewhere else, perhaps in an expanded discussion of those variations on the IDNA
> approach.  The following assumes that the section is,
> nonetheless, retained.
>
> New (add a new sentence to the Abstract, reading something
> like):
> 	To improve understanding, this document describes systems
> 	that are being used as alternatives to those that conform
> 	to IDNA2008.
>
> Note that the Abstract is getting long for RFC Editor (and good sense) preferences and might be in need of rethinking.
>
> The list itself is not complete and is arguably a bit
> misleading.  In particular, a fully-conforming implementation of IDNA2003 is bound to Stringprep as described in RFC 3454 and Nameprep as described in RFC 3491 (and hence is tied to Unicode 3.2).  Because each of the documents that describe IDNA2003
> (including Stringprep, on which Nameprep is normatively
> dependent) has been obsoleted by later work, the status of
> IDNA2003 is a little weird, but I don't think this document
> needs to get into that.  However, there are a number of things out there that effectively claim to be IDNA2003 implementations but that use extrapolations (by their developers) of what
> Stringprep and Nameprep would look like if the IETF had updated them to reflect various versions of Unicode.
>
> That this distinction is important is illustrated by a recent on-list discussion.   If "IDNA2003" really means "IDNA2003" as specified in the cited standards, then all of the code points introduced since those specifications were published are
> invalid for use in stored labels.   If it means "the rules of IDNA2003 applied to whatever tables an implementer thinks are appropriate but informed mostly by toCaseFold and NFKC", then, well, who knows but probably, e.g., emoji are valid in domain name labels.
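
As a rough illustration of the second reading, the sketch below applies nothing but case folding and NFKC, using whatever Unicode tables the local Python runtime ships.  The function name loose_idna2003_style_map is hypothetical, and this is deliberately not Nameprep: there are no Stringprep mapping or prohibition tables and no bidi checks, which is exactly the gap being described.

```python
import unicodedata


def loose_idna2003_style_map(label):
    """Case-fold, then NFKC-normalize, with no validity table at all.

    This is NOT RFC 3491 Nameprep; it only mimics the "toCaseFold and
    NFKC" shortcut, so it depends on the runtime's Unicode version
    rather than on Unicode 3.2.
    """
    return unicodedata.normalize("NFKC", label.casefold())
```

Under this shortcut, fullwidth "EXAMPLE" maps to "example" and "Straße" maps to "strasse", while an emoji such as U+1F600 passes through unchanged because nothing in the procedure ever rejects a code point.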
>
> Old:
>
>   o  IDNA2003 as specified in RFC 3490 [RFC3490] and RFC 3491
> 	 [RFC3491] which implies using a table within which it is
> 	 said whether code points are allowed to be used or not,
> 	 after doing the normalization specified in IDNA2003.
>
> New:
>
>   o  IDNA2003 as specified in RFC 3490 [RFC3490] and RFC 3491
> 	 [RFC3491].  Those specifications are dependent on case
> 	 folding and NFKC normalization and on tables that specify
> 	 for each code point whether it is allowed to be used or
> 	 not, with a distinction made between use for "stored
> 	 strings" and "query strings".  The tables themselves are
> 	 dependent on version 3.2 of Unicode [Unicode-3.2.0].
>
>   o  A number of variations on IDNA2003, sometimes presented as
>      "updated IDNA2003" or the like, which follow the
> 	 principles of IDNA2003 as understood by the implementers
> 	 but that use tables that represent how the implementers
> 	 believe Stringprep [RFC3454] and Nameprep [RFC3491] would
> 	 have evolved had the IETF not moved in the direction of
> 	 IDNA2008 instead.
>
> Note that the second bullet in that section in
> draft-faltstrom-unicode11-07 is a specific IDNA2003
> non-conforming variation in which validity of post-Unicode 3.2 code points is calculated according to IDNA2008 rules but
> codepoints actually shown in Stringprep have their IDNA2003
> validity values.  Such an implementation would be wildly
> inconsistent internally, enough so that user confusion and
> complaints would be nearly certain.   For example, it would
> treat symbols that appear in Unicode 3.2 as valid but those
> that were added later as invalid and would, I think, treat
> characters removed by NFKC but that appeared in Unicode 3.2 as invalid while such characters added later than Unicode 3.2 (but unaffected by case folding) would be valid.
>
> I don't believe I've seen such an implementation in the wild but will take Patrik's word for their existence.
>
> (9) Section 3.2, first paragraph.
>
> As a general observation about terminology, the I-D (and our other discussions) should probably use "implementation" much more carefully.    As an example, the first sentence of this paragraph talks about the "level of deployment of IDNA2008", while the second talks about "existing implementations" without specifying of what, hence implying that they are
> implementations of IDNA2008.   That is fairly close to deadly in some contexts, e.g., if statements were made requiring use of IDNA2008 in some contexts, having an IETF document that
> seems to claim that all of these variations are
> "implementations" of IDNA2008 would almost certainly be
> exploited by assorted bad actors.  Those who are at the ICANN meeting will almost certainly know what I'm talking about;
> others should consider themselves lucky.
>
> Suggestion:
> Old:
> 	that existing implementations are known
>
> New:
> 	that implementations that claim to be IDNA or variations on
> 	IDNA are known
>
> (10) Section 3.2, Last bullet (starting "A mix between
> IDNA2003 and IDNA2008 according")
>
> This mischaracterizes the present state of UTS#46.  Yes, there is flexibility based on how one selects so-called transition options that can produce many different outcomes for assorted edge cases, but its core in all recent versions (probably going all the way back, but I haven't checked) is a normative table that is much more Stringprep-like than anything having to do with IDNA2008.  See comment 17 below.
>
> (11) Section 3.2, Second paragraph (starting "The issue is
> further...")
>
> This paragraph seems a bit out of place.  RFC 5894 is
> informational and does not, formally, have requirements.  It is not clear what DNS registry operators have to do with this
> document, which is about new and changed code points, not
> IDNA2008 operations.  If the intention is to slide in the subject matter of draft-klensin-idna-rfc5891bis, that document should probably be referenced explicitly and the paragraph rewritten.
>
> (12) Section 3.2, third paragraph (starting "In practice, the Unicode")
>
> I find this paragraph a little bit confusing as written, in
> part because it isn't clear whether unassigned code points are part of that Unicode Consortium maximum.   However IDNA2008
> actually creates several subsets, each associated with a
> derived property value (remembering that there are four, not two).
>
> Suggestion:
>
> Old:
>
> 	The IDNA2008 rules based on the Unicode Standard create a
> 	subset of these by assigning the PVALID derived property
> 	value to them.
>
> New:
>
> 	The IDNA2008 rules use the Unicode Standard to create a
> 	further subset of code points and context that are
> 	permitted in DNS labels associated with its PVALID,
> 	CONTEXTJ, and CONTEXTO derived property values.
>
> I think that can be said better, but I'm out of ideas at the moment.
>
>
> (13) Section 4.1, paragraph 1.
>
> As noted above, that extensive discussion in the IETF rapidly moved beyond U+08A1 and to the more general issues it raised.
> Even where the IAB statement was concerned, I believe it would have been completely unreasonable for the IAB to issue that
> precautionary "just wait" statement over a single code point that our research suggested was part of a writing system that was not in significant contemporary use, especially on the
> Internet (even though "contemporary use" was never a criterion for IDNA2008 acceptability).  I don't believe that was the
> IAB's motivation.  Instead, the "Hamza" discussion was almost entirely about the two issues that it raised and how they might have led to misunderstandings in the design of IDNA2008.  Those two issues were that some code points, including newly-added ones, appeared to be composable by combining sequences of
> earlier code points without NFC decomposing to those sequences (something the IDNAbis design team believed we were promised would never occur and designed IDNA2008 accordingly) and the presence of different ways to code an abstract character or
> character sequence within the same script, essentially
> asserting that they were different abstract characters because they were used differently in different languages.
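
The first of those two issues can be observed directly with a normalization library.  The sketch below uses Python's standard unicodedata module; the constant names and the helper same_after_nfc are made up for illustration, and ALEF WITH HAMZA ABOVE (U+0623) is included only as a contrasting case that does have a canonical decomposition.

```python
import unicodedata

BEH_WITH_HAMZA = "\u08A1"        # ARABIC LETTER BEH WITH HAMZA ABOVE
BEH_PLUS_HAMZA = "\u0628\u0654"  # BEH followed by combining HAMZA ABOVE
ALEF_WITH_HAMZA = "\u0623"       # ARABIC LETTER ALEF WITH HAMZA ABOVE
ALEF_PLUS_HAMZA = "\u0627\u0654" # ALEF followed by combining HAMZA ABOVE


def same_after_nfc(a, b):
    """Return True if the two strings are identical once NFC-normalized."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def decomposes(ch):
    """Return True if NFD changes the string, i.e. it has a canonical decomposition."""
    return unicodedata.normalize("NFD", ch) != ch
```

Here same_after_nfc(ALEF_WITH_HAMZA, ALEF_PLUS_HAMZA) holds, but same_after_nfc(BEH_WITH_HAMZA, BEH_PLUS_HAMZA) does not: the precomposed and combining-sequence forms of BEH WITH HAMZA ABOVE stay distinct under normalization, so IDNA2008's reliance on NFC does not unify them.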
>
> The real discussion of the issues that the U+08A1 discovery
> opened up is in draft-klensin-idna-5892upd-unicode70, while
> draft-freytag-troublesome-characters just identifies that code point and a few others (and points to
> draft-klensin-idna-5892upd-unicode70).  While I hope no one
> could claim consensus for either, the draft-klensin document was on the agenda of a BOF and therefore has received some, I hope serious, discussion in the IETF as well as in the IAB
> Program and elsewhere (my recollection is that it even came up in PRECIS discussions, but Peter can confirm).  The
> draft-freytag document has, if I recall, been discussed only among a small circle of friends (most of whom are now on the directorate) and has never been on an IETF or WG agenda for
> substantive discussion.
>
> The final sentence of the paragraph has no claim whatsoever on IETF Consensus although, if the document is approved, it is
> certain that claims will be made that it represents such
> consensus.  See the comments above about the bar for actually making an addition to the exception list in the IDNA2008
> specifications.  Saying here that "it is still acceptable to allow the code point to have a derived property value of
> PVALID", without further explanation, is equivalent to
> concluding that all (present, past, and future) code points and code point sequences within a script that are distinguished
> only by language and/or do not decompose to reasonable and
> plausible combining sequences should be treated as PVALID if Unicode considers them letters and that either disallowing them or classifying them as CONTEXTO by exception will not be
> considered.   The alternative is to walk us into either
> significant inconsistency or retroactive incompatible changes.
> I believe that Asmus's recent observations about combining
> characters that are not intended for use with letters further complicates these problems.
>
> For whatever it is worth, we are talking about IDNA2008 here.
> "recommendations to include the code point in the repertoire of characters permissible for registration or not" are not part of the IDNA2008 vocabulary and have no place in this document
> without significant IETF protocol specification work.  IDNA2008 itself does not anticipate such recommendations: a code point is either PVALID (or CONTEXTx and meeting the contextual rules as needed) or it isn't.  Much as I'm sympathetic to the
> draft-freytag work (although I see significant problems with it), we should not be using it to get the IETF into the
> recommendation business as the result of a casual remark in
> this I-D.  In the longer term, it poses another alternative, which is to decide that all issues involving such types of
> characters and maybe others should be dealt with simply by
> calling the attention of registries to them, hoping (even in the presence of evidence to the contrary) that all registries will be responsible about their delegations, and then move on.
> But the IETF has not agreed to any such thing and, if that is what we intend, it should be explicit.  It would also be a
> sufficiently important decision that it should not be slipped into a document after IETF LC has nominally closed.  That is especially true if the intent is to effectively eliminate the importance of future reviews of this sort because we are just going to leave all decisions about new code points or changed properties to lists like the "troublesome" one and
> registries rather than --even as a possibility-- adding
> additional cases to the IDNA2008 exception list.  I've been
> assured several times that the I-D does not intend to preempt future actions by the IETF but, insofar as consistency with
> prior actions is an important criterion and we continue to have a high level of reluctance to make incompatible changes, if
> this code point is going to be accepted, especially with the sort of statement made in that section, solid explanation as to how such precedent-setting is to be avoided is appropriate,
> necessary, and certainly does not appear in the document.
>
> It would be future work but, if the plan is really to shift
> responsibilities that now belong at the IDNA2008 protocol level to registries, we should examine whether the CONTEXTO, and
> maybe even the CONTEXTJ, categories are needed or whether we should reduce the complexity of IDNA2008 by eliminating them in favor of more advice to (and responsibility for) registries.
>
> I want to stress that I don't believe there is IETF consensus for how to handle language-sensitive (within script) or
> non-decomposing code points; saying that there is when there
> has been little discussion, none of it conclusive, seems to me to be inappropriate and risky.  I do see two ways out of
> there if the directorate and community believe that there is urgency about processing and publishing this I-D or its
> successor.   One would be to update IDNA2008 to create a
> protocol-level derived value we might call "tentative",
> assigned only by exception (as CONTEXTO is now assigned) that would be treated the same as PVALID except with the caution
> that the classification is, well, tentative and that users of labels containing such characters and registries delegating
> labels containing them should be aware that, as knowledge
> improves, they might be DISALLOWED without our paying much
> attention to cries of agony about incompatible changes.
> Advising them to seek additional advice from the "troublesome"
> list and elsewhere would be entirely appropriate in that
> context.
>
> The other would be to simply exclude those non-decomposing code points from the decisions and tables of this document, treating them as if they were unassigned until (and unless) we can come up with better and more definitive answers.
>
> (14) Section 4.2
>
> While I believe the statement that no changes have been made that change the derived property value for code points that
> existed earlier, as far as I know, no one has done a review of the new code points similar to the reviews conducted for
> versions 6.x and 7.0 of Unicode.  If anyone has done such a
> review in an IDNA2008 context, they should identify themselves.
>
> This omission presumably also applies to version 11.0 because all
> that the second paragraph, first bullet of Section 4.2 says is that we should go off and calculate, presumably without examining the code points even superficially.
>
> If the description in Section 4.3 includes mention of how many code points were added (something that has nothing to do with changes) why is that information not present for versions 7
> through 9?
>
> (15) Section 4.3, paragraph 2, bullet 3 (starting "The
> U+111C9...")
>
> After heated debate, especially with Mark Davis, who insisted that changes from DISALLOWED to PVALID were OK but changes from PVALID to DISALLOWED should never be allowed, the WG very
> clearly concluded that both types of changes needed to be
> examined and treated carefully.   Part of the reason was the requirement that lookup applications that conform to
> IDNA2008 check labels for code point validity and reject any labels containing DISALLOWED code points without actually getting near the DNS (another requirement to which
> Mark objected strenuously but was far out in the rough), with the result that a change from DISALLOWED to PVALID could result in false negatives.
>
> So, on what basis is this "an acceptable change"?  I think it probably is (and that the deviation isn't worth the trouble)
> but, unless some IETF process reversed that clear consensus in the WG, stating it as acceptable as if we had adopted Mark's model doesn't fly.  And, again, if there has been any IETF
> discussion of this issue, or even a serious attempt at such a discussion, I haven't been aware of it and would be happy to be pointed to email messages, minutes, etc.
>
>
> (16) Section 5, paragraph 1.
>
> I note that this paragraph talks about "the conclusion of this document", which is entirely reasonable whether one agrees with it or not, and avoids the language about IETF conclusions
> which, as noted above, I do not believe can be substantiated.
>
> (17) Section 5, paragraph 2 (starting "To increase overall
> harmonization")
>
> This gets back to the "implementation" question discussed
> above, the issue of whether the IETF should be making
> recommendations about how to make non-conforming systems work better (or less badly), and the unproven hypothesis that there are a non-trivial number of implementations out there that are using IDNA2008 calculations for new values in local extensions of Stringprep and Nameprep tables.  Certainly the most
> important and widely-cited of the alternative IDNA protocols,
> UTS#46, does not do so: it is based on a normative table of its own,
> https://unicode.org/Public/idna/11.0.0/IdnaMappingTable.txt, and examination of that table and the text that describes it in the UTS#46 base document gives no hint of paying attention to IDNA2008's calculations.
>
> Then the text says "derived property values MUST be calculated as specified in the documents listed in section Section 3.1"
> (noting "section Section" as an editorial nit).  This is either a new requirement, in which case it must update RFC 5892 or
> potentially 5890, or (as I believe) it isn't, unless this
> document is really slipping over into the "recommendations for registries" business.  To me, IDNA2008 is perfectly clear: one either calculates property values as the IDNA2008 specifications provide or it isn't IDNA2008.
>
> Recommendation: either drop or drastically rewrite this
> paragraph.
>
> (18) Section 5, paragraph 3 (starting "All DNS registries (and other organizatios)" (sic))
>
> If this recommendation is worth making or repeating here, what are this document and the IANA tables for?  That is a
> rhetorical question, but this paragraph should either be
> rewritten to explain the relationship or should be dropped.
>