Re: i18n requirements (was: Re: NF* (Re: PKCS#11 URI slot attributes & last call))

Jan Pechanec <> Thu, 08 January 2015 18:45 UTC

Date: Thu, 8 Jan 2015 10:45:38 -0800 (PST)
From: Jan Pechanec <>
To: John C Klensin <>
Subject: Re: i18n requirements (was: Re: NF* (Re: PKCS#11 URI slot attributes & last call))
Cc: Darren J Moffat <>, Stef Walter <>, Jaroslav Imrich <>,, Patrik Fältström <>, Shawn Emery <>,, Christian Huitema <>, Nikos Mavrogiannopoulos <>

On Thu, 8 Jan 2015, John C Klensin wrote:

>> On Sat, 3 Jan 2015, Jan Pechanec wrote:
>> 	hi, I haven't received any other comments on the draft 
>> recently (I know the LC already ended on Dec 29 though) so I
>> think I  can file changes discussed and drafted in this thread
>> as draft 18 on  Friday.  Thank you all for feedback, I really
>> appreciate it.
>> 	one more change for the draft 18 (v2 attached) is to spell 
>> "NFC" and reference the Unicode Annex on normalization based
>> on  comments from Jaroslav and Christian.
>I don't have a lot of time to spend on this and am not an expert
>on either X.509 or PKCS (#11 or otherwise).  At least the first
>may be unfortunate, but it is what it is.
	hi John, I very much appreciate the time you have already spent 
on this.  Please see my comments inline.

>While I think the changes you have made are definitely
>improvements, this i18n stuff is complicated.  As with Security,
>there is a completely inadequate supply of magic pixie dust that
>can be thrown at problems to make them go away.  "Normalize to
>NFC" (with spelling-out and references) is a vast improvement over
>"use [valid] UTF-8" but there are many other issues.  You have
>noted some and omitted others.  For example, case-independent
>matching is a very simple and completely deterministic issue for
>ASCII (one essentially just masks off one bit within a certain
>range), it can get very messy if one tries to be sensitive to
>different locales that have different conventions about what to
>do with diacritical marks when lower-case characters are
>converted to upper case.  There are Unicode "CaseFold" rules

	I understand from the previous discussion that the topic is a 
very complex one and that my draft needed to acknowledge that.
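
	to illustrate the case-folding point quoted above, here is a 
minimal Python sketch.  It is not part of the draft; the helper name 
ascii_fold is hypothetical, and the German example is only one of many 
cases where simple per-character lower-casing and Unicode case folding 
disagree:

```python
def ascii_fold(s: str) -> str:
    """Lower-case ASCII letters only, by setting bit 0x20 (hypothetical helper)."""
    return "".join(chr(ord(c) | 0x20) if "A" <= c <= "Z" else c for c in s)

# ASCII: deterministic, one bit differs between 'A'..'Z' and 'a'..'z'.
ascii_result = ascii_fold("Token-A1")

# Beyond ASCII: lower-casing alone does not make these two spellings equal,
# while Unicode case folding (str.casefold) maps the sharp s to "ss".
lower_equal = "Straße".lower() == "STRASSE".lower()
casefold_equal = "Straße".casefold() == "STRASSE".casefold()
```

	note that str.casefold() is locale-independent, so it does not 
address the locale-specific conventions mentioned in the quoted text.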


>I don't know how far in explaining this your document should go.
>I would urge, as I think I did before, some fairly strong
>warnings that, at least until the issues are clarified in
>PKCS#11 itself, one should be very certain one knows what one is
>doing (and what the consequences of one's choices will be) if
>one decides to move beyond the safety and general understanding
>of the ASCII/ ISO 646/ IA5 letter and digit repertoire.  That
>sort of warning should supplement your NFC language, not replace
>it-- neither is a substitute for the other.   Whether you
>incorporate it or not, your I-D should not assume that, by
>saying "NFC", you have somehow resolved the full range of issues
>in this area, any more than saying "UTF-8" did. 

	I understand that.  The note about spelling out NFC came on top of 
the first changes I incorporated.  I don't know if you saw those; I 
know there were many emails and the time you can spend on this is 
very limited.  So, in the section on URI matching, I tried to be very 
explicit and based the warning I added on one of your comments from 
the previous discussion:

+   As noted in Section 6, the PKCS#11 specification is not clear about
+   how to normalize UTF-8 encoded Unicode characters [RFC3629].  Those
+   who discover a need to use characters outside the ASCII repertoire
+   should be cautious, conservative, and expend extra effort to be sure
+   they know what they are doing and that failure to do so may create
+   both operational and security risks.  It means that when matching
+   UTF-8 string based attributes (see Table 1) with such characters,
+   normalizing all UTF-8 strings before string comparison may be the
+   only safe approach.  For example, for objects (keys) it means that
+   PKCS#11 attribute search template would only contain attributes that
+   are not UTF-8 strings and another pass through returned objects is
+   then needed for UTF-8 string comparison after the normalization is
+   applied.

	do you suggest a stronger warning than that?
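
	the two-pass matching described in that paragraph could look 
roughly like the following Python sketch.  It is an illustration 
only: plain dicts stand in for PKCS#11 objects, and the first list 
comprehension stands in for a real attribute search (C_FindObjectsInit 
and C_FindObjects in the PKCS#11 API).  The names match_objects, 
template and label are hypothetical.

```python
import unicodedata

def nfc(s: str) -> str:
    """Normalize a string to Normalization Form C."""
    return unicodedata.normalize("NFC", s)

def match_objects(objects, template, label):
    """Two-pass match: non-string attributes first, then the NFC-normalized label."""
    # Pass 1: the search template contains no UTF-8 string attributes.
    candidates = [o for o in objects
                  if all(o.get(k) == v for k, v in template.items())]
    # Pass 2: compare the UTF-8 string attribute only after normalization.
    wanted = nfc(label)
    return [o for o in candidates if nfc(o["label"]) == wanted]

# Stand-in objects; the two key labels differ only in Unicode form.
objects = [
    {"class": "private-key", "id": b"\x01", "label": "cl\u00e9"},   # precomposed e-acute
    {"class": "private-key", "id": b"\x02", "label": "cle\u0301"},  # e + combining acute
    {"class": "certificate", "id": b"\x03", "label": "cl\u00e9"},
]

matches = match_objects(objects, {"class": "private-key"}, "cl\u00e9")
matched_ids = [o["id"] for o in matches]
```

	both private keys match here, although a byte-wise comparison of 
the labels would find only the first one.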

	more on that was incorporated into a new Internationalization 
Considerations section, based on new text drafted by Nico:

+6.  Internationalization Considerations
+   The PKCS#11 specification does not specify a canonical form for
+   strings of characters of the CK_UTF8CHAR type.  This presents the
+   usual false negative and false positive (aliasing) concerns that
+   arise when dealing with unnormalized strings.  Because all PKCS#11
+   items are local and local security is assumed, these concerns are
+   mainly about usability.
+   In order to improve the user experience, applications that create
+   PKCS#11 objects or label tokens SHOULD normalize labels to
+   Normalization Form C (NFC) [UAX15].  For the same reason PKCS#11
+   libraries, slots (token readers), and tokens SHOULD normalize their
+   names to NFC.  When listing PKCS#11 libraries, slots, tokens, and/or
+   objects, an application SHOULD normalize their names to NFC.  When
+   matching PKCS#11 URIs to libraries, slots, tokens, and/or objects,
+   applications MAY use form-insensitive Unicode string comparison for
+   matching, as those might pre-date these recommendations.  See also
+   Section 3.5.
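
	as a small illustration of what the section recommends (a 
sketch, not text from the draft), the following Python snippet 
normalizes a label to NFC when it is created and compares two labels 
in a form-insensitive way.  The label values and the function name 
form_insensitive_equal are made up for the example.

```python
import unicodedata

composed = "Tok\u00e9n"      # e-acute as the single code point U+00E9
decomposed = "Toke\u0301n"   # 'e' followed by combining acute accent U+0301

def form_insensitive_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The two spellings render the same but are different UTF-8 byte sequences.
bytes_differ = composed.encode("utf-8") != decomposed.encode("utf-8")

# A plain comparison gives a false negative; the normalized one does not.
raw_equal = composed == decomposed
normalized_equal = form_insensitive_equal(composed, decomposed)

# What an application creating the object would store as the label.
stored_label = unicodedata.normalize("NFC", decomposed)
```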

	and a new paragraph was also added to the existing Security 
Considerations section:

+   The PKCS#11 specification does not provide means to authenticate
+   devices to users; it only allows to authenticate users to tokens.
+   Instead, local and physical security are demanded: the user must be
+   in possession of their tokens, and system into whose slots the users'
+   tokens are inserted must be secure.  As a result, the usual security
+   considerations regarding normalization do not arise.  For the same
+   reason, confusable script issues also do not arise.  Nonetheless, it
+   is best to normalize to NFC all strings appearing in PKCS#11 API
+   elements.  See also Section 6.

	I think these new paragraphs convey the message that users 
should be very careful when using characters outside ASCII, and what 
to do to mitigate problems that can arise from such use.  Do you think 
more should be said in the draft itself?

>For more information, you might have a look at some of the
>PRECIS work, notably draft-ietf-precis-framework.
>I also remain convinced that the best place to fix this is in
>the PKCS#11 spec itself.  One is always at a disadvantage when
>trying to work around an inadequate specification in a different
>specification that has to depend on it and your work is no
>exception.  I wish there were whatever liaison arrangements
>between the IETF and others (presumably notably RSA) to be sure
>that happened or at least there was clear awareness on the PKCS
>side of the deficiencies.

	last week I contacted the OASIS PKCS 11 TC, which is where 
PKCS#11 has been hosted since 2013.  However, even if the issue is 
going to be fixed there, I don't think it will be in the new version 
2.40, which is close to being published.

	Happy New Year to you, too.

	best regards, Jan.

Jan Pechanec <>