Re: [drinks] on internationalization.

Peter Saint-Andre <stpeter@stpeter.im> Wed, 07 September 2011 15:37 UTC


Here is a start at some discussion of the topic...

On 9/7/11 9:21 AM, Alexander Mayrhofer wrote:
> One of the action items from the interim was to look at
> Internationalization in SPPP. 

First, what are you trying to achieve? Saying "internationalization" is
a bit like saying "security": it has many facets and often means
different things to different people.

> I've looked at the Internationalization
> text from EPP, and have only found general text, e.g. in Section 5 of RFC
> 5731:
> 
> http://tools.ietf.org/html/rfc5733#section-5
> 
> That section explains that since EPP is based on XML, native Unicode
> support is included, and also has some text on time/date values.
> 
> However, there are no additional constraints in EPP regarding the
> contents of any field in the protocol. That means that - unless I've
> overlooked something - EPP object data can include arbitrary Unicode
> characters. 

To be precise, UTF-8-encoded Unicode code points. :)

See http://tools.ietf.org/html/draft-ietf-appsawg-rfc3536bis-06 for a
helpful explanation of various i18n terms.
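
To make the distinction between code points and their UTF-8 encoding concrete, here is a minimal Python sketch. The helper name describe() and the sample string are hypothetical and chosen only for illustration; nothing here comes from the EPP or SPPP specs.

```python
def describe(text):
    """Return (code point label, UTF-8 bytes) for each character in text."""
    # A code point is an abstract number; UTF-8 is one way to serialize it.
    return [("U+%04X" % ord(ch), ch.encode("utf-8")) for ch in text]


# Sample input (an assumption for illustration): 'a', e-acute, euro sign.
pairs = describe("a\u00e9\u20ac")

for label, raw in pairs:
    # Prints the code point, its UTF-8 bytes in hex, and the byte count.
    print(label, raw.hex(), len(raw))
```

The three characters take one, two, and three bytes respectively, so a length limit stated in "characters" and one stated in "octets" are not the same constraint.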

> Repository object identifiers are, however, limited to "Word" characters
> (using the XSD pattern facet limited to "\w" characters). The set of
> characters that matches is pretty broad, though:
> 
> [#x0000-#x10FFFF]-[\p{P}\p{Z}\p{C}] (all characters except the set of
> "punctuation", "separator" and "other" characters)
> 
> (from http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#dt-regex)
> 
> Therefore, to conclude, EPP allows for a broad range of characters in
> identifiers, 

Earlier you talked about fields in the protocol; now you are talking
about identifiers. Do you care more about identifiers than any other
kind of field? Is there a distinction here? What do you *do* with these
various fields and identifiers? Do you, for example, compare them? If
so, for what purposes (authentication, authorization, etc.)? What are
the implications if a comparison operation yields a false positive or a
false negative?

See http://tools.ietf.org/html/draft-iab-identifier-comparison-00 for a
detailed discussion of these issues.

As to comparison of internationalized strings in application protocols,
please consult the work of the PRECIS WG to see if it is relevant here:

http://tools.ietf.org/html/draft-ietf-precis-problem-statement-03

http://tools.ietf.org/html/draft-ietf-precis-framework-00

If you are going to be comparing identifiers for purposes that have
security implications, then I think you will need to say something more
than "just use UTF-8".
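
As a rough illustration of why "just use UTF-8" is not enough for comparison, the Python sketch below builds two identifiers that render identically, both consist only of characters outside the P, Z and C categories (an approximation of the XSD "\w" class quoted above), and yet differ code point by code point. The function names are hypothetical, and NFC normalization is used only as one example of a comparison rule; it is an assumption here, not something EPP, SPPP, or PRECIS is known to mandate for these fields.

```python
import unicodedata


def is_xsd_word(s):
    """Approximate XSD "\\w": every char is outside categories P*, Z*, C*."""
    return len(s) > 0 and all(
        unicodedata.category(ch)[0] not in ("P", "Z", "C") for ch in s
    )


def nfc_equal(a, b):
    """Compare after NFC normalization (illustrative choice of rule)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "caf\u00e9"     # e-acute as a single code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

# Both pass the "\w"-style check, but a raw comparison is a false negative.
both_words = is_xsd_word(composed) and is_xsd_word(decomposed)
raw_equal = composed == decomposed
normalized_equal = nfc_equal(composed, decomposed)

# Normalization does not help with look-alikes from different scripts:
# Latin 'a' (U+0061) versus Cyrillic 'a' (U+0430).
lookalike_equal = nfc_equal("paypal", "p\u0430ypal")

print(both_words, raw_equal, normalized_equal, lookalike_equal)
```

The first pair shows a false negative that normalization can fix; the second shows a visual false positive that normalization alone cannot detect, which is where the security implications come in.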

> and since EPP has gone through IESG review successfully 3
> times, I would suggest that we include a similar "Internationalization
> Considerations" section in the SPPP protocol document, and not try to
> restrict fields more than EPP does.

What is the relationship between EPP and SPPP? Does SPPP essentially
emulate EPP? If so, it might make sense to re-use some text from the EPP
spec. On the other hand, it's possible that i18n issues were not really
addressed in the EPP spec (despite surviving IESG review three times)
and that you might want to think about these issues anew for SPPP.

HTH,

Peter

-- 
Peter Saint-Andre
https://stpeter.im/