Re: [drinks] on internationalization.

Alexander Mayrhofer <alexander.mayrhofer@nic.at> Fri, 09 September 2011 13:25 UTC

Date: Fri, 09 Sep 2011 15:27:38 +0200
Message-ID: <8BC845943058D844ABFC73D2220D46650AD2A024@nics-mail.sbg.nic.at>
In-Reply-To: <4E678FF8.9030503@stpeter.im>
References: <8BC845943058D844ABFC73D2220D46650AD29D98@nics-mail.sbg.nic.at> <4E678FF8.9030503@stpeter.im>
From: Alexander Mayrhofer <alexander.mayrhofer@nic.at>
To: Peter Saint-Andre <stpeter@stpeter.im>
Cc: drinks@ietf.org
Subject: Re: [drinks] on internationalization.

> > One of the action items from the interim was to look at
> > Internationalization in SPPP.
> 
> First, what are you trying to achieve? Saying "internationalization" is a bit like
> saying "security": it has many facets and often means different things to
> different people.

Peter, that's an excellent comment and comparison. I think the considerations are as follows:

- We have "data fields" and "identifiers" in the protocol: "data fields" are just passed through the protocol / registry ("garbage in / garbage out" style), while "identifiers" need to be compared, e.g. against parameters in commands, because they identify the objects that are provisioned.

- I understand (and agree) that modern IETF protocols should not be limited to pure ASCII.

- "data fields" are probably easy to internationalize, since the components handling the protocol do not need to examine their contents. However, there might be issues when such fields are displayed (for example, when LTR and RTL strings are mixed within a field). Are there any recommendations or existing text regarding such issues?

- "identifiers" are more complicated, because they're affected by all the obvious normalization / comparison issues. A quick win would be to restrict such fields to ASCII only; however, I understand that might be unacceptable from the internationalization perspective.
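
To make the identifier problem concrete, here is a minimal Python sketch. It is only an illustration, not text from the SPPP or EPP documents: the function names are hypothetical, and the use of NFC is an assumption made for the example rather than a proposal. It shows two UTF-8 encodings of the same visible string that a plain comparison treats as different, plus the "ASCII only" quick win as a simple check.

```python
import unicodedata

def naive_equal(a: str, b: str) -> bool:
    # Code point by code point comparison, i.e. "just use UTF-8".
    return a == b

def nfc_equal(a: str, b: str) -> bool:
    # Assumption for this sketch: normalize both sides to NFC first.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def is_ascii_only(identifier: str) -> bool:
    # The "quick win": reject any identifier outside ASCII.
    return identifier.isascii()

# Same visible string, two different code point sequences.
composed = "caf\u00e9"      # 'e with acute' as one code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

print(composed.encode("utf-8"))            # bytes on the wire differ ...
print(decomposed.encode("utf-8"))
print(naive_equal(composed, decomposed))   # ... so the naive comparison fails
print(nfc_equal(composed, decomposed))     # normalized comparison matches
print(is_ascii_only(composed))             # rejected under the ASCII-only rule
```

A pass-through "data field" would never hit this, since nothing compares it; an identifier hits it on every command that references an object.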

[..]
> > However, there are no additional constraints in EPP regarding the
> > contents of any field in the protocol. That means that - unless i've
> > overlooked something - EPP object data can include arbitrary Unicode
> > characters.
> 
> To be precise, UTF-8-encoded Unicode code points. :)

Agreed, and that's the option we would go for in SPPP as well.

> See http://tools.ietf.org/html/draft-ietf-appsawg-rfc3536bis-06 for a helpful
> explanation of various i18n terms.

Ah, thanks for the pointer - great!

[..]
> > Therefore, to conclude, EPP allows for a broad range of characters in
> > identifiers,
> 
> Earlier you talked about fields in the protocol, now you are talking about
> identifiers. Do you care more about identifiers than any other kind of field? Is
> there a distinction here? What do you *do* with these various fields and
> identifiers? Do you, for example, compare them? If so, for what purposes
> (authentication, authorization, etc.)? What are the implications if a
> comparison operation yields a false positive or a false negative?

As outlined above, I think the identifiers are the crucial part (which doesn't mean that other fields don't matter, but they're typically not used to identify objects).

Comparison operations are important since those identifiers are used to express relations between objects, and also to "address" the objects affected by a certain command. False positives and/or negatives could lead to losing data integrity between objects, or to commands affecting the wrong objects.
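
As a rough illustration of what a false negative looks like in practice, consider the toy registry below. Everything in it is hypothetical (the `ToyRegistry` class, its methods, and the choice of NFC are assumptions for the sketch, not SPPP behaviour): without normalization, a delete command using a differently encoded identifier misses its target, and an add creates a second object that looks identical to the first.

```python
import unicodedata

class ToyRegistry:
    """Hypothetical in-memory registry keyed by object identifier."""

    def __init__(self, normalize: bool = False):
        self.normalize = normalize
        self.objects = {}

    def _key(self, identifier: str) -> str:
        # Assumption: NFC is used when normalization is switched on.
        if self.normalize:
            return unicodedata.normalize("NFC", identifier)
        return identifier

    def add(self, identifier: str, data: str) -> None:
        self.objects[self._key(identifier)] = data

    def delete(self, identifier: str) -> bool:
        # Returns True only if an object was actually removed.
        return self.objects.pop(self._key(identifier), None) is not None

composed = "caf\u00e9"      # precomposed form
decomposed = "cafe\u0301"   # base letter plus combining accent

raw = ToyRegistry(normalize=False)
raw.add(composed, "route group A")
missed = raw.delete(decomposed)        # false negative: nothing is deleted
raw.add(decomposed, "route group B")   # a visually identical second object

norm = ToyRegistry(normalize=True)
norm.add(composed, "route group A")
hit = norm.delete(decomposed)          # both forms address the same object

print(missed, len(raw.objects), hit, len(norm.objects))
```

Normalization alone does not settle the question (case folding, width mapping and confusable characters remain), which is presumably where the PRECIS work comes in.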
 
> See http://tools.ietf.org/html/draft-iab-identifier-comparison-00 for a
> detailed discussion of these issues.
>
> As to comparison of internationalized strings in application protocols, please
> consult the work of the PRECIS WG to see if it is relevant here:
> 
> http://tools.ietf.org/html/draft-ietf-precis-problem-statement-03
> 
> http://tools.ietf.org/html/draft-ietf-precis-framework-00
> 
> If you are going to be comparing identifiers for purposes that have security
> implications, then I think you will need to say something more than "just use
> UTF-8".

I will look at the PRECIS work. 

> > and since EPP has gone through IESG review successfully 3 times, I
> > would suggest that we include a similar "Internationalization
> > Considerations" section in the SPPP protocol document, and not try to
> > restrict fields more than EPP does.
> 
> What is the relationship between EPP and SPPP? Does SPPP essentially
> emulate EPP? If so, it might make sense to re-use some text from the EPP
> spec. On the other hand, it's possible that i18n issues were not really
> addressed in the EPP spec (despite surviving IESG review three times) and
> that you might want to think about these issues anew for SPPP.

EPP and SPPP serve similar purposes (provisioning objects in a "registry"), but in different domains. SPPP is not an emulation of EPP, but has some similar properties. The impact of comparison operations on identifiers in EPP is similar to that in SPPP, which is why I was looking for the respective text in those RFCs.

Alex