Re: [I18ndir] Finding a way to conclude the review of draft-faltstrom
John C Klensin <john-ietf@jck.com> Tue, 19 February 2019 21:46 UTC
Date: Tue, 19 Feb 2019 16:46:36 -0500
From: John C Klensin <john-ietf@jck.com>
To: Asmus Freytag <asmusf@ix.netcom.com>
cc: i18ndir@ietf.org
Message-ID: <D35AAAD52083CD14E6E5ACF1@PSB>
In-Reply-To: <bcf6de35-e1db-5022-5109-76764357c8b0@ix.netcom.com>
References: <d3b501b1-dfcf-debf-b256-a0642ff560e3@alvestrand.no> <3E8A24AA2DB42A56A429133E@PSB> <bcf6de35-e1db-5022-5109-76764357c8b0@ix.netcom.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/uYfw1RgmMWYxG6qPSSbOSOTII-I>
Subject: Re: [I18ndir] Finding a way to conclude the review of draft-faltstrom

--On Tuesday, February 19, 2019 09:38 -0800 Asmus Freytag
<asmusf@ix.netcom.com> wrote:

>...
>> That position is that there
>> have been a rather long series of issues raised, in multiple
>> documents, about IDNA2008 and the appropriateness of
>> uncritically applying its procedures to new versions of
>> Unicode.
>
> And some of the many positions were concerned with the fact
> that the other ones were focused on the wrong problem.

Perhaps. Or perhaps that is part of the key problem.

> Given the design of the protocol, moving forward is the
> appropriate option.

Ok. While some of the differences in positions and perspective
are, IMO, important and should be explored further, many, as I
suggested was possible early on, were just differences in
vocabulary and focus. I think the difference of opinion between
the two of us comes down to two issues.

As an example of the vocabulary issue, I have heard no arguments
against "moving forward". The question is whether the best way
to move forward is to publish draft-faltstrom now, more or less
as it appears in -07, and then deal with the other stuff [a] or
whether there is other stuff that should be addressed and
possibly resolved before draft-faltstrom is published (with
draft-faltstrom reflecting that work if appropriate). I think
that choice comes down to two questions, the first tied closely
to "the design of the protocol".

Background: Many details aside, the basic design target for what
IDNA2008 was intended to allow was, quite consciously, the same
as the target for pre-DNS host names, the syntax allowed in the
host-part of domain-part by SMTP, and the "preferred syntax" of
RFC 1034, i.e., letters [b], digits, and as little punctuation
as feasible. From that very high conceptual level, the only
difference between the target criteria of IDNA2008 and those
predecessors is expansion from the ASCII repertoire to the range
of abstract characters supported by Unicode.
The question of how that conceptual definition is turned into
rules, and ultimately into lists of code points, is an
implementation detail. All of the more specific issues,
including whether labels are conceivably words in some languages
(or obey other orthographic conventions) and whether
mixed-script labels are allowed, are left to the registry, and
registries are expected to behave responsibly (and have been
since at least RFC 1591 -- that wasn't an IDN innovation
either).

Now if, for example, some combining marks were never intended
for use as letters (or, if one prefers, parts of words) and were
assigned code points for other purposes or notations, then that
letter-digit model dictates that they should not be permitted in
labels. If a particular version of the specification allows
them, it is what is called a "mistake", arguably an IDNA
protocol mistake due to Unicode's not making a distinction
between non-spacing marks used with letters (even sometimes,
which might have been a case for more CONTEXTO classifications)
and non-spacing marks used only in non-letter contexts. Even the
non-decomposing character issue is arguably a mistake because
inferences were made about what normalization would accomplish
that turned out to be incorrect (and, incidentally, the slash
characters you mentioned were found in the subsequent
investigation and are called out in
draft-klensin-idna-5892upd-unicode70).

(1) The first question is tied critically to "the design of the
protocol" and what to do about mistakes. The tradition in a
number of communities is to view them as irreversible and just
try to warn about them or work around them. Unicode's view of
stability of code point assignments and a number of properties
is consistent with that model (maybe even an extreme version of
it) but they are by no means the only one. The Internet
(including IETF) tradition has been to fix mistakes but to be
very pragmatic about it (see the second question).
I observe that no one is running IPv1 these days. For the
specific case of IDNA2008, the observation that the review and
update procedure specifically allows additions to the exception
list and context rules strongly suggests that the design of the
protocol contemplated fixing mistakes, not just digging in and
moving forward because nothing about the rules could be changed.

In any event, the question of whether and how to fix mistakes
gives us three general options:

   (1.1) For anything that appears to be a mistake, create
   exceptions or new rules for the "new" code points affected
   _and_ for all (or at least some) code points that were
   assigned earlier that are shown to share the same problems.

   (1.2) For anything that appears to be a mistake, create
   exceptions or new rules for the "new" code points affected,
   but let the old ones go in the interest of stability, not
   invalidating existing labels, etc. IMO, if we are smart and
   don't want to invite confusion, we explain the situation
   --and why categories for older code points are not changed--
   somewhere, but that is a separate judgment call.

   (1.3) We take the position that old decisions are immutable
   and that the best thing to do with new code points that may
   duplicate older mistakes (or omissions from the rules, or
   whatever more kind terminology we choose) is to be
   mistake-consistent. Perhaps we say, somewhere, "well this is
   a problem but it is better to make it a problem for the
   registries and count on their good behavior rather than
   making a change to the protocol".

Of those three choices, the first two seem to me to require
delaying processing and publication of draft-faltstrom until we
can devise the right fixes and language to describe them ... or
at least until we can devise a strategy and language that avoids
having draft-faltstrom say "this code point is PVALID" only to
have some subsequent, mistake-fixing, document come along in (I
would hope) a short time and say "no it isn't".
For the third, I'd still prefer waiting to process and publish
draft-faltstrom until we can explain what the problem was and
why we decided to ignore it, push it off, or otherwise refrain
from changing RFC 5892. However, there are other ways to handle
that and they depend considerably on the second question.

(2) How critical is it that we get draft-faltstrom out in a
hurry? Who needs it, for what, and why should the IETF make
whatever compromises are necessary to prioritize getting it out?
I've been asking those questions on and off for some time and
still don't think I've seen an answer. However, assuming it
really is critical (in ways that the IETF recognizes), then I
think we should look for ways to get on with publication while
compromising our ability to make the decisions under (1) as
little as possible. That does not imply rushing the document to
publication in a way that effectively preempts that discussion,
leaving us with either 1.3 or trying to deceive the community
about the timing associated with 1.2 (although those are
options) because it would be awkward, but not difficult, to
insert text into draft-faltstrom explaining the loose ends,
indicating that the IETF was going to take the other issues up
as soon as possible, and that people should treat all (or at
least some if we can describe them without normative references
[c]) of the newly-added code points with special care until
those issues are sorted out. I assume Patrik could write such a
paragraph quickly; if not, I'll volunteer.

But, again, if getting the document out is not critical enough
to allow careful consideration of what might actually be a
mistake that should be considered as input to a protocol update,
I think prudence and the design of the protocol require that we
consider the above issues (and, ideally, the others in the
queue) carefully before moving forward with publication.

best,
   john


[a] I am deliberately using obviously vague terminology because
I do not believe there is demonstrated consensus about what the
list of other stuff is (although there may be consensus about
some of the items on that list), nor demonstrated consensus
about what work is the IETF's responsibility and what should be
done elsewhere if at all, nor demonstrated consensus about what
conclusions the IETF should accept uncritically from other
bodies (and which bodies). But, as long as the list of "other
stuff" is non-null, the immediate question about publication of
draft-faltstrom does not require getting agreement on any of
that.

[b] I hope we have general agreement on what that term means and
that it is closely bound to abstract characters used to write
the words (or other conceptual units) of human languages. For
example, mathematical symbols are not "letters".

[c] That comment hides an interesting procedural issue. Assume
some of the possible changes mentioned above result in text for
draft-faltstrom that includes normative forward references to
documents we aren't ready to process and approve yet. That would
put draft-faltstrom into the RFC Editor queue with a normative
reference hold so that it would not actually be published until
the other document(s) were ready. Under normal circumstances,
that would be more than adequate -- while it would be unusual,
draft-faltstrom could be pulled out of the queue and revised if
that later work actually required that. However, it isn't clear,
absent other instructions --instructions that we have not
discussed-- whether IANA would act to update its tables at the
time of IESG approval of the document or whether they would wait
for publication. If the former, we would have exactly the same
issues with defining a code point as PVALID and then changing
that which are part of the reason to delay processing and
approval.
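
Editorial illustration (not part of the original message): the
three options (1.1) through (1.3) above can be read as three
policies for overriding whatever the unmodified derivation rules
produce. The sketch below is only a toy model under stated
assumptions. The names Policy, CodePoint and derived_property are
hypothetical, the code point values are private-use stand-ins
rather than real assignments, and DISALLOWED is just one possible
outcome of an exception entry (a context rule would be another).

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    FIX_ALL = "1.1"             # fix new and previously assigned code points
    FIX_NEW_ONLY = "1.2"        # fix new code points, leave old ones alone
    MISTAKE_CONSISTENT = "1.3"  # change nothing, stay consistent with the past


@dataclass(frozen=True)
class CodePoint:
    value: int         # stand-in value, not a real assignment
    is_new: bool       # assigned in the Unicode version under review
    rule_result: str   # what the unmodified rules yield, e.g. "PVALID"
    problematic: bool  # hypothetical reviewer judgment: shares the mistake


def derived_property(cp: CodePoint, policy: Policy) -> str:
    """Return the property a code point would carry under a given policy."""
    if not cp.problematic or policy is Policy.MISTAKE_CONSISTENT:
        return cp.rule_result
    if policy is Policy.FIX_ALL or cp.is_new:
        # Assumption: the fix is modeled as an exception entry
        # that makes the code point DISALLOWED.
        return "DISALLOWED"
    return cp.rule_result


old_bad = CodePoint(0xF0000, is_new=False, rule_result="PVALID", problematic=True)
new_bad = CodePoint(0xF0001, is_new=True, rule_result="PVALID", problematic=True)
new_ok = CodePoint(0xF0002, is_new=True, rule_result="PVALID", problematic=False)

for policy in Policy:
    results = [derived_property(cp, policy) for cp in (old_bad, new_bad, new_ok)]
    print(policy.value, results)
```

In this model only (1.1) changes the value of a code point that
earlier tables already listed, which is the stability cost the
message weighs against leaving known problems in place.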