Re: [apps-discuss] i18n intro, Sunday 14:00-16:00

"Martin J. Dürst" <> Thu, 21 July 2011 13:30 UTC

Date: Thu, 21 Jul 2011 16:03:12 +0900
From: =?UTF-8?B?Ik1hcnRpbiBKLiBEw7xyc3Qi?= <>
Organization: Aoyama Gakuin University
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv: Gecko/20100722 Eudora/3.0.4
MIME-Version: 1.0
To: Peter Saint-Andre <>
References: <> <>
In-Reply-To: <>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Subject: Re: [apps-discuss] i18n intro, Sunday 14:00-16:00
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: General discussion of application-layer protocols <>

Some comments:

Slide 7: there are thousands of languages and scripts:
Thousands of languages: yes; thousands of scripts: no
( currently has 95 
(including 'Common'), so "well over a hundred" would be much more 

Slide 10: we need to encode more than [A-Z][a-z]:
This should probably be: [A-Za-z]

Slide 22: Why not UPPER vs. lower vs. Title?

Slide 24: You could do full, half

Slide 55: There actually is U+1E9E LATIN CAPITAL LETTER SHARP S. But 
case conversion only goes downwards, not upwards.
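To illustrate the one-way mapping, here is a minimal sketch using Python's 
built-in str.lower() and str.upper(), which follow the default Unicode case 
mappings. The variable names are only for illustration and are not taken 
from the slides.

```python
# Default (locale-independent) Unicode case mappings via Python's str methods.
capital_sharp_s = "\u1e9e"  # LATIN CAPITAL LETTER SHARP S
small_sharp_s = "\u00df"    # LATIN SMALL LETTER SHARP S

# Lowercasing the capital form yields the small form ...
lowered = capital_sharp_s.lower()

# ... but uppercasing the small form yields "SS", not U+1E9E.
uppered = small_sharp_s.upper()

print(lowered == small_sharp_s)
print(uppered == capital_sharp_s)
```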

Slide 113: There is no such thing as Compat Recomp!
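As a sketch of why this is so: NFKC is defined as compatibility 
decomposition followed by canonical composition, so nothing ever composes 
back to a compatibility character. The example below uses Python's 
unicodedata module; the sample string is an assumption chosen for 
illustration.

```python
import unicodedata

# U+FB01 LATIN SMALL LIGATURE FI followed by U+00E9 (e with acute).
sample = "\ufb01\u00e9"

nfkd = unicodedata.normalize("NFKD", sample)  # fully decomposed: f, i, e, U+0301
nfkc = unicodedata.normalize("NFKC", sample)  # f, i, U+00E9

# Canonical composition applied to the NFKD form gives the NFKC form.
recomposed = unicodedata.normalize("NFC", nfkd)

# The accent is recomposed canonically; the ligature never comes back.
print(recomposed == nfkc)
print("\ufb01" in nfkc)
```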

Slide 114: "Fastest processing": That depends on various assumptions, in 
particular on the assumption that you actually implement processing as 
in slide 113 (which typically is not done).

Slide 117: "Compared to NFKC: Produces more false positives during
comparison operations": This is confusing/wrong. If matches are 
positive, then NFKC will match more than NFC, and if some of these 
matches are considered false, then NFKC will produce more false positives.
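A small sketch of that point, assuming comparison means "normalize both 
sides, then compare for equality". The helper equal_under and the sample 
pairs are hypothetical and not taken from the slides.

```python
import unicodedata

def equal_under(form: str, a: str, b: str) -> bool:
    """Compare two strings after applying the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

pairs = [
    ("\u00e9", "e\u0301"),  # precomposed vs. decomposed e with acute
    ("\ufb01", "fi"),       # ligature fi vs. plain "fi"
    ("\u2163", "IV"),       # ROMAN NUMERAL FOUR vs. plain "IV"
]

nfc_matches = [equal_under("NFC", a, b) for a, b in pairs]
nfkc_matches = [equal_under("NFKC", a, b) for a, b in pairs]

# Every pair that matches under NFC also matches under NFKC, and NFKC
# matches more, so any additional false positives are on the NFKC side.
print(nfc_matches)
print(nfkc_matches)
```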

Slides 114-118: Some of the arguments given for a single NF apply to a 
certain aspect of NFs, e.g. C or D or K.

Slide 123: Good to see that. By the way, I seem to remember both John 
and me begging you for an explanation of why Jabber wants to use NFD a 
few months ago, and I'm not sure I have seen an answer. Now might be a 
good time (if you already sent one, a pointer would be appreciated).

Slide 131: "UTF-8 is the preferred IETF encoding (RFC 3629)":
RFC 3629 is the reference for UTF-8 per se, the IETF preference is 
expressed in RFC 2277. So the text should say
"UTF-8 (RFC 3629) is the preferred IETF encoding (RFC 2277)"
(or some such), and add RFC 2277 to the references.

Slide 132: integers -> bytes (or octets)
(we are now really on a lower, somewhat more physical level, and 
byte/octet is completely adequate here; indeed, anything else would be 
needlessly confusing).
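To make the distinction concrete, a short sketch in Python: the code point 
is an abstract integer, while the UTF-8 encoded form is a sequence of 
octets. The character chosen is just an example.

```python
# One character, one code point (an abstract integer) ...
char = "\u00fc"  # LATIN SMALL LETTER U WITH DIAERESIS
code_point = ord(char)

# ... but two octets once encoded as UTF-8.
octets = char.encode("utf-8")

print(hex(code_point))
print(octets.hex())
print(len(octets))
```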

Slide 135: This should be placed close to normalization. I'd move the part 
on encoding earlier, but maybe that's just me.

Slide 168: Fussball vs. Fußball isn't a normalization issue (not even 
NFKC). Of the two HenryIV, IDNA only allows one (2008) or maps to one 
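A quick check of the first claim with Python's unicodedata module: none of 
the four normalization forms turns the sharp s into "ss". The case-folding 
line at the end is an addition for contrast only and is not something the 
slides are claimed to say.

```python
import unicodedata

with_sharp_s = "Fu\u00dfball"
with_double_s = "Fussball"

# No normalization form changes U+00DF, so the two spellings never match.
forms = ["NFC", "NFD", "NFKC", "NFKD"]
normalized = {f: unicodedata.normalize(f, with_sharp_s) for f in forms}
matches_by_normalization = any(v == with_double_s for v in normalized.values())

# Case folding (a different operation from normalization) does map sharp s to "ss".
matches_by_casefold = with_sharp_s.casefold() == with_double_s.casefold()

print(matches_by_normalization)
print(matches_by_casefold)
```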

Slide 184, middle bullet: I'd add 'occasionally' to put that issue in 
perspective.

I remember having seen this talk before, but I don't know where. I 
thought it was very good. I'd be a bit worried if I had to spend 2h 
presenting it; it's written for a fast pace.

Regards,   Martin.

On 2011/07/20 4:20, Peter Saint-Andre wrote:
> On 7/19/11 12:48 PM, Peter Saint-Andre wrote:
>> You might have noticed a curious item on the agenda at 14:00 on Sunday:
>> "Apps Area Preparatory Meeting for Internationalization Working Groups".
>> At that time, I will present an introduction to internationalization,
>> assisted by Pete Resnick (who will correct me where I go wrong). The
>> intent of this session is to help apps-area folks learn more about
>> internationalization, especially in preparation for the PRECIS WG
>> meeting on Thursday. The room we've been assigned (2103) holds up to 60
>> people so we should have plenty of space, and there is no need to sign
>> up if you want to attend.
>> If this session goes well, Pete and I might offer a more general
>> tutorial at a future IETF meeting. Consider Sunday's session a dry run.
> Sorry, I neglected to provide a pointer to my slides:
> Peter