Heads-up on a possible Unicode 7.0.0 issue
John C Klensin <klensin@jck.com> Mon, 21 July 2014 19:55 UTC
Date: Mon, 21 Jul 2014 15:38:50 -0400
From: John C Klensin <klensin@jck.com>
To: idna-update@alvestrand.no
Subject: Heads-up on a possible Unicode 7.0.0 issue
Hi.

Just as a heads-up to those who are interested in IDNA protocol issues but are not following Internet-Draft announcements or some other IETF-related IDN work...

In general, we depend on Unicode normalization (and the IDNA requirement that the putative labels be in NFC form before processing) to be sure that two ways to represent the same character compare equal (that is "same [abstract] character", not a question about, e.g., visual similarities such as those between two characters from different scripts).

A new precomposed character was added in 7.0.0 that could previously be represented by a sequence of a base character and a combining one. For that character,

   NFC (new character) .neq. NFC (combining sequence)
and
   NFD (new character) .neq. NFD (combining sequence)

Since there are important stability rules for normalization, neither NFC (combining sequence) nor NFD (combining sequence) can change because a new character is added (and they haven't). But, normally, when a situation like this arises, the normalized forms of the new character (both NFC and NFD) decompose it back to the earlier combining sequence so that the results compare equal.

In this case (and a few others that no one noticed before) the precomposed character does not decompose to the combining sequence, for reasons that, to the degree they can be summarized in one line, have to do with different uses of the character, not its name or concrete or abstract forms.

From some quite reasonable perspectives, that makes a lot of sense, but it appears to us that it is unfortunate and dangerous for IDNA. Consequently, the document announced below makes the new character DISALLOWED in IDNA. Whether or not that is the conclusion the IETF ultimately reaches --there are some tradeoffs that the draft tries to identify-- it is probably important to understand that there are such character combinations out there, preferably before the bad guys figure it out.
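The comparison described above can be reproduced with Python's standard `unicodedata` module. This is only a sketch. The message does not name the new code point, so the example assumes it is U+08A1 (ARABIC LETTER BEH WITH HAMZA ABOVE) and that the earlier combining sequence is U+0628 U+0654. The helper name `same_after_normalization` is made up for illustration. For contrast, the sketch also shows the usual case, where a precomposed character and its combining sequence normalize to the same string.

```python
import unicodedata


def same_after_normalization(a: str, b: str) -> dict:
    """Report whether two strings compare equal under NFC and under NFD."""
    return {
        form: unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
        for form in ("NFC", "NFD")
    }


# Usual case: the precomposed character and the combining sequence
# normalize to the same string in both forms.
e_acute = "\u00E9"        # LATIN SMALL LETTER E WITH ACUTE
e_sequence = "e\u0301"    # "e" + COMBINING ACUTE ACCENT
usual = same_after_normalization(e_acute, e_sequence)

# ASSUMPTION: the code point discussed in the message is U+08A1 and the
# earlier sequence is BEH (U+0628) + HAMZA ABOVE (U+0654). The message
# itself does not identify either.
beh_hamza = "\u08A1"
beh_sequence = "\u0628\u0654"
problem = same_after_normalization(beh_hamza, beh_sequence)

print("usual case:  ", usual)
print("assumed case:", problem)
```

Under those assumptions the first pair compares equal in both forms and the second pair compares equal in neither, which is the situation the message describes: normalization alone will not make the two representations match.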
best,
   john

---------- Forwarded Message ----------
Date: Monday, 21 July, 2014 04:03 -0700
From: internet-drafts@ietf.org
To: i-d-announce@ietf.org
Subject: I-D Action: draft-klensin-idna-5892upd-unicode70-00.txt

A New Internet-Draft is available from the on-line Internet-Drafts directories.

   Title    : IDNA Update for Unicode 7.0.0
   Authors  : John C Klensin
              Patrik Faltstrom
   Filename : draft-klensin-idna-5892upd-unicode70-00.txt
   Pages    : 10
   Date     : 2014-07-21

Abstract:
   The current version of the IDNA specifications anticipated that each new version of Unicode would be reviewed to verify that no changes had been introduced that required adjustments to the set of rules and, in particular, whether new exceptions or backward compatibility adjustments were needed. That review was conducted for Unicode 7.0.0 and identified a problematic new code point. This specification updates RFC 5892 to disallow that code point and provides information about the reasons why that exclusion is appropriate. It also applies an editorial clarification that was the subject of an earlier erratum.

The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-klensin-idna-5892upd-unicode70/

There's also a htmlized version available at:
http://tools.ietf.org/html/draft-klensin-idna-5892upd-unicode70-00

Please note that it may take a couple of minutes from the time of submission until the htmlized version and diff are available at tools.ietf.org.

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/

_______________________________________________
I-D-Announce mailing list
I-D-Announce@ietf.org
https://www.ietf.org/mailman/listinfo/i-d-announce
Internet-Draft directories: http://www.ietf.org/shadow.html
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt

---------- End Forwarded Message ----------