Re: [precis] Ambiguity in specification of case mapping in RFC 7613 and draft-ietf-precis-nickname

John C Klensin <> Wed, 04 November 2015 20:05 UTC

Date: Wed, 04 Nov 2015 15:04:46 -0500
From: John C Klensin <>
To: Peter Saint-Andre <>, Tom Worster <>, Alexey Melnikov <>
List-Id: Preparation and Comparison of Internationalized Strings <>

--On Wednesday, November 04, 2015 08:37 -0700 Peter Saint-Andre
<> wrote:

>> I could take a shot at a paragraph explaining that if it is
>> what people want.  Otherwise, I'd be very careful about
>> getting further into that space than the present text goes.
>> Personally, I think the text is still dancing around the
>> issue too much, rather than addressing it, but I may be in
>> the rough.
> That may be. We could simply remove the offending sentence:
>     3.  Case Mapping Rule: Unicode Default Case Folding MUST be
>         applied, as defined in the Unicode Standard [Unicode] (at
>         the time of this writing, the algorithm is specified in
>         Chapter 3 of [Unicode7.0]).  In applications that prohibit
>         conflicting nicknames, this rule helps to reduce the
>         possibility of confusion by ensuring that nicknames
>         differing only by case (e.g., "stpeter" vs. "StPeter")
>         would not be presented to a human user at the same time.
> If we do that, we're no longer dancing around issues.

Different dance, I think, but we may have different objectives.
If we keep the explanatory text, it should be accurate.  The
sentence you removed is, by itself, misleading, so, in that
sense, getting rid of it improves accuracy.

On the other hand, if we remove the explanation, we encourage
those who build libraries to simply apply toCaseFold and believe
they don't need to think further about the situation and those
who build user-facing systems to call those libraries and do the
same thing, secure in the belief that the IETF told them things
would be ok.

The problem remains that, while toCaseFold works well and as
predicted in most locales, in "minority" locales and script and
language contexts, it produces results that are surprising and
that may be bizarre.  It fails to match all of the cases one
would want matched, and it also matches some things that
shouldn't be (probably the more harmless of the two types of
error).
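
To make the two kinds of mismatch concrete, here is a minimal sketch in Python.  It assumes that Python's built-in str.casefold() is an acceptable stand-in for Unicode default (full) case folding as shipped with the interpreter's Unicode data; the helper names nickname_key and collide are hypothetical and are not taken from RFC 7613 or the nickname draft.  The Turkish dotted/dotless "i" is used only as one well-known example of a locale where default folding surprises users.

```python
def nickname_key(nick: str) -> str:
    """Hypothetical comparison key: locale-independent case folding only."""
    return nick.casefold()


def collide(a: str, b: str) -> bool:
    """True if two nicknames would be treated as conflicting."""
    return nickname_key(a) == nickname_key(b)


# The case the draft's example has in mind: these fold to the same key.
print(collide("stpeter", "StPeter"))

# Turkish distinguishes dotted i (i / U+0130) from dotless i (U+0131 / I).
upper = "DI\u015e"        # uppercase form a Turkish user would write for the next string
lower = "d\u0131\u015f"   # lowercase word with dotless i (U+0131)
other = "di\u015f"        # a different Turkish word, with dotted i

# A match a Turkish user would expect, but default folding misses it,
# because "I" folds to "i" and U+0131 folds to itself.
print(collide(upper, lower))

# A match a Turkish user would not expect, but default folding produces it.
print(collide(upper, other))

# Capital I with dot above (U+0130) folds to "i" plus a combining dot
# (U+0307), so it does not match plain "i" either.
print(collide("\u0130stanbul", "istanbul"))
```

A library that stops at the folding step has no way to notice any of this; the locale knowledge has to come from the application or its users, which is the reason for wanting a warning paragraph in the text.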

Let me say this strongly: unless the IETF is willing to take the
position that we are interested in, and supporting, an Internet
only for "majority" languages and writing systems ("majority"
defined by "what works well with Unicode generally and
toCaseFold in particular"), then a document like this almost
certainly needs a paragraph warning about local issues and
sensitivities.   I can imagine several ways to do that and have
suggested outlines for some of them.   But just dropping text to
avoid saying anything wrong except by implication doesn't do
much for me, largely because I believe that an IETF that takes
such a position would have outlived its usefulness to the global
Internet community.