[precis] the Exceptions category

Peter Saint-Andre <stpeter@stpeter.im> Tue, 08 May 2012 02:50 UTC

Message-ID: <4FA88A10.2030903@stpeter.im>
Date: Mon, 07 May 2012 20:50:56 -0600
From: Peter Saint-Andre <stpeter@stpeter.im>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: "precis@ietf.org" <precis@ietf.org>

RFC 5892 defines a category called Exceptions (Section 2.6), which lists
codepoints whose assignment differs from what it would have been based
solely on the core property value. For example, while working on the
PRECIS codepoint table I just found U+0F0B (TIBETAN MARK INTERSYLLABIC
TSHEG), which has a core property value of Po ("Punctuation, other") and
which thus would have been DISALLOWED in IDNA2008 if it had not been
explicitly placed in the Exceptions category.
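
To make the override concrete, here is a minimal Python sketch. It is only
an illustration under stated assumptions: EXCEPTIONS holds a small subset of
the RFC 5892 Exceptions table, and category_only() is a hypothetical
stand-in that looks at the General_Category alone, skipping the other
RFC 5892 rules (Unstable, IgnorableProperties, and so on).

```python
import unicodedata

# Partial copy of the RFC 5892 Exceptions table; not the full list.
EXCEPTIONS = {
    0x00DF: "PVALID",      # LATIN SMALL LETTER SHARP S
    0x03C2: "PVALID",      # GREEK SMALL LETTER FINAL SIGMA
    0x0F0B: "PVALID",      # TIBETAN MARK INTERSYLLABIC TSHEG
    0x00B7: "CONTEXTO",    # MIDDLE DOT
    0x0640: "DISALLOWED",  # ARABIC TATWEEL
}

# General_Category values that RFC 5892 groups as LetterDigits.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def category_only(cp):
    """Hypothetical stand-in: classify by General_Category alone."""
    if unicodedata.category(chr(cp)) in LETTER_DIGITS:
        return "PVALID"
    return "DISALLOWED"


def derived_property(cp):
    """An Exceptions entry, if present, overrides the category-based result."""
    if cp in EXCEPTIONS:
        return EXCEPTIONS[cp]
    return category_only(cp)


tsheg = 0x0F0B
print(unicodedata.category(chr(tsheg)))  # Po
print(category_only(tsheg))              # DISALLOWED
print(derived_property(tsheg))           # PVALID
```

The lookup order is the point: the table is consulted first, so an entry
there wins regardless of what the core property value would suggest.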

Unfortunately, the decisions of the IDNA2008 team with regard to these
exceptions are not documented in RFC 5892 or elsewhere (AFAIK), so it's
not easy to understand whether it would be best for PRECIS to follow
IDNA2008 here or instead to base our assignments on the core property
values for some or all of the codepoints in the Exceptions category.
(Using the same example, if we follow IDNA2008 then U+0F0B would be
PVALID, whereas if we base assignment on the core property value then
this codepoint would be FREE_PVAL and NAME_DIS.)
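
The two outcomes for U+0F0B can be put side by side in a short Python
sketch. This is a hypothetical illustration, not text from any PRECIS
draft: precis_assignment() and its follow_idna flag are names invented
here, the table holds only this one codepoint, and only the Po case is
handled.

```python
import unicodedata

# Assumption: only U+0F0B is listed, with its IDNA2008 Exceptions value.
IDNA_EXCEPTIONS = {0x0F0B: {"PVALID"}}


def precis_assignment(cp, follow_idna):
    """Return the assignment labels for cp under one of the two policies."""
    if follow_idna and cp in IDNA_EXCEPTIONS:
        return IDNA_EXCEPTIONS[cp]
    if unicodedata.category(chr(cp)) == "Po":
        # Outcome stated above for a Po codepoint judged by category alone.
        return {"FREE_PVAL", "NAME_DIS"}
    raise NotImplementedError("other categories are outside this sketch")


print(sorted(precis_assignment(0x0F0B, follow_idna=True)))
print(sorted(precis_assignment(0x0F0B, follow_idna=False)))
```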

I understand the reasoning behind codepoints like sharp S and Greek
final sigma because they were extensively discussed on the IDNA list,
but other codepoints were not as controversial, so the reasoning behind
them is harder to reconstruct.

I suppose the safest course would be to follow IDNA2008 here. The
second-safest course would be to base all assignments on the core
property value. The least safe course would be to revisit each codepoint
individually and thus define a PrecisExceptions table that differs in
subtle ways from the IDNA2008 Exceptions table.


Peter Saint-Andre