[I18ndir] I18ndir last call review of draft-faltstrom-unicode11-07

Harald Alvestrand via Datatracker <noreply@ietf.org> Wed, 27 March 2019 12:42 UTC

Reviewer: Harald Alvestrand
Review result: On the Right Track

Overall conclusion: Not ready yet, needs some updates. New I-D recommended.
[Note: As part of the discussion that resulted in this text, a new I-D has been
issued.]

Context issues
==============
The discussion of draft-faltstrom-unicode11 in the directorate has shown that
the directorate members share a number of concerns about the current state of
IDNA, only some of which are directly relevant to this memo.

IDNA2008 considered limits to what was reasonable to register and use in the
DNS at a number of levels:

- A level of “don’t register stuff that causes confusion”. This requires human
  judgment, and reasonable people may disagree about what causes confusion.
- A level of “don’t register stuff that is structurally invalid under the
  relevant writing system”. Aspects of this can be captured in rulesets
  (ICANN’s RZ-LGR efforts fall into this category), but doing so requires deep
  expertise; this is captured in IDNA2008 as the “don’t register what you
  don’t understand” rule.
- A level of “this is stuff that you should never register, and applications
  can reasonably choose to treat it as an error or an attack if it ever shows
  up”. This is the distinction that is captured in the classification of
  codepoints as DISALLOWED, and where IDNA2008 (with updates) gives precise
  rules.

The current document focuses on the last level only - the maintenance of the
distinction between PVALID and DISALLOWED. (It also considers whether new
CONTEXTO and CONTEXTJ rules are needed).

It is clear from directorate discussion that work needs to be done at the other
levels outlined above too, but it is not clear from the discussion what form
that work should take or what fora that work is reasonably performed in; the
work may or may not involve a revision of the basic IDNA2008 specifications.

We suggest inserting a paragraph in the document that describes the context of
the state of IDNA2008 and explains which issues this document does not attempt
to address. Specifically, it should state that the document's conclusion
concerns what to do regarding Unicode versions up to and including 11, and that
this conclusion should not be read as setting expectations for future versions
of Unicode.

In addition, it’s become clear that IDNA2008 does not specify the mechanisms
and expectations of the review of new versions of Unicode in enough detail;
with the review of a number of versions of Unicode behind us, we should be able
to describe those procedures and expectations better than IDNA2008 does.
However, this may need to happen in a document other than this one.

Content issues
==============
Section 4.1 does not specify where to find the conclusion of the IETF
discussion on U+08A1. It is not easy to see from the text whether the
algorithms and procedures will render U+0628 U+0654 an illegal sequence or a
legal sequence. No matter what the resolution is, the document should make it
obvious what the conclusion is (and why).
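
As a minimal sketch of why the status of the sequence is not obvious, the
following Python snippet checks how the standard library's `unicodedata` module
treats the two spellings. It assumes a Python build whose bundled Unicode data
includes U+08A1 (Unicode 7.0 or later); the variable names are illustrative
only, and the snippet says nothing about what IDNA should conclude.

```python
import unicodedata

BEH = "\u0628"             # ARABIC LETTER BEH
HAMZA_ABOVE = "\u0654"     # ARABIC HAMZA ABOVE (combining mark)
BEH_WITH_HAMZA = "\u08A1"  # ARABIC LETTER BEH WITH HAMZA ABOVE

sequence = BEH + HAMZA_ABOVE

# NFC is the normalization form IDNA2008 labels are required to be in.
nfc_of_sequence = unicodedata.normalize("NFC", sequence)
nfc_of_precomposed = unicodedata.normalize("NFC", BEH_WITH_HAMZA)

# An empty string here means the code point has no decomposition mapping.
has_decomposition = unicodedata.decomposition(BEH_WITH_HAMZA) != ""

print("sequence unchanged by NFC:   ", nfc_of_sequence == sequence)
print("U+08A1 unchanged by NFC:     ", nfc_of_precomposed == BEH_WITH_HAMZA)
print("U+08A1 has a decomposition:  ", has_decomposition)
print("the two spellings compare equal:", nfc_of_sequence == nfc_of_precomposed)
```

Because normalization leaves both spellings untouched, they remain distinct
strings after NFC, which is why the document needs to say explicitly how each
one is to be treated.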

RFC 5892 states that SPHERICAL ANGLE OPENING UP is DISALLOWED, not PVALID:
27D0..2B4C  ; DISALLOWED - the draft says it’s PVALID; this needs changing.
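
A check of this kind can be sketched in Python. The helper `parse_range_line`
is hypothetical, and the only table line used is the one quoted above; the
character is looked up by name through `unicodedata` rather than by a
hard-coded code point.

```python
import unicodedata

# The range line exactly as quoted in this review.
RFC5892_LINE = "27D0..2B4C  ; DISALLOWED"


def parse_range_line(line):
    """Split 'START..END ; PROPERTY' (or 'CP ; PROPERTY') into a tuple.

    Hypothetical helper: returns (start, end, property) with the code
    points as integers. Anything after '#' is treated as a comment.
    """
    codepoints, prop = (part.strip() for part in line.split("#")[0].split(";"))
    start, _, end = codepoints.partition("..")
    return int(start, 16), int(end or start, 16), prop


start, end, prop = parse_range_line(RFC5892_LINE)
char = unicodedata.lookup("SPHERICAL ANGLE OPENING UP")
in_range = start <= ord(char) <= end

print(f"U+{ord(char):04X} in {start:04X}..{end:04X} ({prop}): {in_range}")
```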

Section 4.1 ought to include numbers for how many characters ended up in
DISALLOWED vs PVALID - ideally, for each Unicode version since IDNA2008 was
issued. This may also be something that is recommended for the IANA tables
rather than this document.
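
One way such numbers could be produced is sketched below. The input format
(`START..END ; PROPERTY`, with optional `#` comments) and the three-line
`SAMPLE_TABLE` are assumptions made for illustration; the sample is not a
complete table, and the published IANA tables may need a different parser.

```python
from collections import Counter

# Illustrative three-line excerpt only, NOT a complete derived-property table.
SAMPLE_TABLE = """\
0061..007A  ; PVALID      # LATIN SMALL LETTER A..Z
00B7        ; CONTEXTO    # MIDDLE DOT
27D0..2B4C  ; DISALLOWED  # range quoted in this review
"""


def count_by_property(table_text):
    """Count code points per derived property value (hypothetical helper)."""
    counts = Counter()
    for line in table_text.splitlines():
        line = line.split("#")[0].strip()  # drop comments and padding
        if not line:
            continue
        codepoints, prop = (part.strip() for part in line.split(";"))
        start, _, end = codepoints.partition("..")
        # A single code point has no '..', so it counts as a range of one.
        counts[prop] += int(end or start, 16) - int(start, 16) + 1
    return counts


counts = count_by_property(SAMPLE_TABLE)
for prop, total in sorted(counts.items()):
    print(f"{prop}: {total}")
```

Running the same tally over the full table for each Unicode version would give
the per-version figures suggested above.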

Given the time that has passed since this work started, we should consider
whether or not to include Unicode 12.

Nits
====
These have been submitted separately to the author, and are not enumerated here.