Heads-up on a possible Unicode 7.0.0 issue

John C Klensin <klensin@jck.com> Mon, 21 July 2014 19:55 UTC

Date: Mon, 21 Jul 2014 15:38:50 -0400
From: John C Klensin <klensin@jck.com>
To: idna-update@alvestrand.no
Subject: Heads-up on a possible Unicode 7.0.0 issue

Hi.

Just as a heads-up to those who are interested in IDNA protocol
issues but are not following Internet-Draft announcements or
some other IETF-related IDN work...

In general, we depend on Unicode normalization (and the IDNA
requirement that the putative labels be in NFC form before
processing) to be sure that two ways to represent the same
character compare equal (that is "same [abstract] character",
not a question about, e.g., visual similarities such as those
between two characters from different scripts).
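
As a rough illustration of the comparison described above, here is a
minimal Python sketch using the standard unicodedata module.  The
helper name same_after_nfc is made up for this example and is not part
of IDNA or of any library; the code points are just a familiar Latin
case.

```python
import unicodedata


def same_after_nfc(a: str, b: str) -> bool:
    """Return True if the two strings are identical once both are in NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Precomposed U+00E9 versus "e" followed by U+0301 COMBINING ACUTE ACCENT:
# two representations of the same abstract character.
precomposed = "\u00e9"
sequence = "e\u0301"

print(precomposed == sequence)                 # raw code point comparison
print(same_after_nfc(precomposed, sequence))   # comparison after NFC

# Visually similar characters from different scripts (Latin "a" and
# Cyrillic U+0430) are a separate question; normalization does not merge them.
print(same_after_nfc("a", "\u0430"))
```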

A new precomposed character was added in 7.0.0 that could
previously be represented by a sequence of a base character and
a combining one.  For that character,
   NFC (new character) .neq. NFC (combining sequence)   and
   NFD (new character) .neq. NFD (combining sequence)
Since there are important stability rules for normalization,
neither NFC(combining sequence) nor NFD(combining sequence) can
change because a new character is added (and they haven't).
But, normally, when a
situation like this arises, the normalized forms of the new
character (both NFC and NFD) decompose it back to the earlier
combining sequence so that the results compare equal.

In this case (and a few others that no one noticed before) the
precomposed character does not decompose to the combining
sequence, for reasons that, to the degree they can be summarized
in one line, have to do with different uses of the character, not
its name or concrete or abstract forms.
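
The difference between the usual mechanism and this case can be
checked with a short Python sketch.  Two assumptions, since the
message above does not spell out any code points: U+0958 is used only
because it shows the usual mechanism (a canonical decomposition that
is never recomposed), and U+08A1 against U+0628 U+0654 is assumed as
the pair at issue.  The helper name normalizes_to is hypothetical, and
the result depends on the Unicode version bundled with the Python
build.

```python
import unicodedata


def normalizes_to(char: str, sequence: str) -> bool:
    """True if NFC and NFD each map `char` and `sequence` to the same string."""
    return all(
        unicodedata.normalize(form, char) == unicodedata.normalize(form, sequence)
        for form in ("NFC", "NFD")
    )


# Usual mechanism: U+0958 has a canonical decomposition and is excluded
# from recomposition, so it always normalizes to U+0915 U+093C.
print(normalizes_to("\u0958", "\u0915\u093c"))

# ASSUMPTION: U+08A1 versus U+0628 U+0654 stands in for the case in the
# message, which does not name the code points.
print(normalizes_to("\u08a1", "\u0628\u0654"))

# An empty string here means no decomposition mapping is defined.
print(repr(unicodedata.decomposition("\u08a1")))
```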

From some quite reasonable perspectives, that makes a lot of
sense, but it appears to us that it is unfortunate and
dangerous for IDNA.   Consequently, the document announced below
makes the new character DISALLOWED in IDNA.
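
To make "DISALLOWED" concrete, the following Python sketch shows one
way an implementation might refuse a label that contains an excluded
code point.  It is not the IDNA algorithm: the names EXTRA_DISALLOWED
and label_is_acceptable are hypothetical, the single table entry
(U+08A1) is an assumption made for illustration, and a real
implementation would apply the full derived property tables and
contextual rules.

```python
import unicodedata

# Hypothetical exception table: code points treated as DISALLOWED no matter
# what their Unicode properties would otherwise derive.
EXTRA_DISALLOWED = frozenset({0x08A1})  # ASSUMPTION: illustrative entry only


def label_is_acceptable(label: str) -> bool:
    """Reject a label that is not in NFC or that contains an excluded code point."""
    if unicodedata.normalize("NFC", label) != label:
        return False
    return not any(ord(ch) in EXTRA_DISALLOWED for ch in label)


print(label_is_acceptable("\u08a1"))        # contains the excluded code point
print(label_is_acceptable("\u0628\u0654"))  # the combining sequence
print(label_is_acceptable("e\u0301"))       # not in NFC form
```

With the precomposed form excluded, only the combining sequence can
appear in a label, so the two spellings can no longer yield two
distinct labels.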

Whether or not that is the conclusion the IETF ultimately reaches
(there are some tradeoffs that the draft tries to identify), it
is probably important to understand that there are such
character combinations out there, preferably before the bad guys
figure it out.

best,
    john

---------- Forwarded Message ----------
Date: Monday, 21 July, 2014 04:03 -0700
From: internet-drafts@ietf.org
To: i-d-announce@ietf.org
Subject: I-D Action: draft-klensin-idna-5892upd-unicode70-00.txt


A New Internet-Draft is available from the on-line
Internet-Drafts directories.


        Title           : IDNA Update for Unicode 7.0.0
        Authors         : John C Klensin
                          Patrik Faltstrom
        Filename        : draft-klensin-idna-5892upd-unicode70-00.txt
        Pages           : 10
        Date            : 2014-07-21

Abstract:
   The current version of the IDNA specifications anticipated that each
   new version of Unicode would be reviewed to verify that no changes
   had been introduced that required adjustments to the set of rules
   and, in particular, whether new exceptions or backward compatibility
   adjustments were needed.  That review was conducted for Unicode 7.0.0
   and identified a problematic new code point.  This specification
   updates RFC 5892 to disallow that code point and provides information
   about the reasons why that exclusion is appropriate.  It also applies
   an editorial clarification that was the subject of an earlier
   erratum.


The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-klensin-idna-5892upd-unicode70/

There's also a htmlized version available at:
http://tools.ietf.org/html/draft-klensin-idna-5892upd-unicode70-00


Please note that it may take a couple of minutes from the time
of submission until the htmlized version and diff are available
at tools.ietf.org.

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/


---------- End Forwarded Message ----------