Re: [precis] local case mapping
Peter Saint-Andre <stpeter@stpeter.im> Tue, 08 October 2013 19:55 UTC
Message-ID: <5254632F.6060106@stpeter.im>
Date: Tue, 08 Oct 2013 13:55:27 -0600
From: Peter Saint-Andre <stpeter@stpeter.im>
To: Andrew Sullivan <ajs@anvilwalrusden.com>
Cc: precis@ietf.org
Subject: Re: [precis] local case mapping

Hi Andrew, thanks for helping to move things forward.

On 10/4/13 9:17 PM, Andrew Sullivan wrote:
> Dear colleagues,
>
> I reviewed draft-ietf-precis-mappings-03 today, at long last. I
> apologise for being so late.

No worries. Better that we get it right than that we finish it
quickly. Or maybe that's just a convenient excuse. :-)

> I put together a number of incoherent questions about the case folding
> stuff, but fortunately I re-read the mailing list archives on this
> topic before posting a long message. I agree with Peter: I find this
> section of the document very confusing, and I think it may be wrong.
> In particular …
>
> On Thu, Sep 19, 2013 at 04:39:12PM +0900, Takahiro Nemoto wrote:
>
>> Considering the maintenance and preservation of the document,
>> leaving it the way it is now is not a bad idea.
>
> …I am pretty sure it shouldn't be left the way it is.
>
> It seems to me that our principle generally needs to be that Unicode
> is the thing we use, and if Unicode is broken it's Not Our Problem.
> So we should figure out how to say, "Do the Unicode-y right thing
> here," and then put that in. I especially don't want to get into
> specifying special language-specific tables ourselves: we don't have
> the expertise, I think.
>
> I can't think of any better suggested text than what Peter already
> sent, so I think that's the right direction. In the unlikely event
> something clearer comes to me in the night, I promise to write it
> down.

I suggested text to clear up the first paragraph. However, my message
merely asked the key question, but left it unanswered: what are we
trying to accomplish here?

As I noted, Appendix B.1 simply matches the Language-Sensitive Mappings
from the SpecialCasing.txt file in the Unicode Character Database. If
that's *all* we're trying to accomplish, then we could simply say
"apply the Language-Sensitive Mappings in SpecialCasing.txt".
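
To make "language-sensitive" concrete, here is a rough sketch of the
Turkish/Azerbaijani dotted and dotless I rules. It is an illustration
only: the helper name lower_turkic is hypothetical, it handles only a
combining dot that directly follows the I, and it ignores the Lithuanian
entries entirely. It is not what the draft specifies.

```python
def lower_turkic(text: str) -> str:
    """Hypothetical helper: lowercase with the tr/az rules for dotted/dotless I."""
    # U+0049 U+0307 (I + COMBINING DOT ABOVE) -> U+0069; the dot is dropped.
    # Simplification: only handles the dot when it immediately follows the I.
    text = text.replace("I\u0307", "i")
    # U+0130 (LATIN CAPITAL LETTER I WITH DOT ABOVE) -> U+0069.
    text = text.replace("\u0130", "i")
    # Any remaining U+0049 -> U+0131 (LATIN SMALL LETTER DOTLESS I).
    text = text.replace("I", "\u0131")
    # Everything else falls back to the language-insensitive default.
    return text.lower()


# Language-insensitive lowercasing maps "I" to "i" regardless of locale,
# whereas the Turkic rules map it to dotless "ı".
default_result = "ISPARTA".lower()
turkic_result = lower_turkic("ISPARTA")
```

The point of the sketch is only that the result depends on knowing the
language of the string, which a default lowercasing operation does not.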

However, I get the sense that we're actually trying to accomplish more,
e.g., applying at least the context-sensitive mapping for Greek final
sigma -- in my example, a nickname of "ΦΙΛΟΣ ΜΟΙ" would be case folded
to "φιλος μοι" (with a Greek final sigma, which is correct in Greek) and
not to "φιλοσ μοι" (with a Greek medial sigma, which is incorrect in
Greek).

It's also not clear to me if we have a position on full case folding vs.
simple case folding (e.g., ẞ = U+1E9E to "ss" instead of "ß" = U+00DF).
It seems to me that we might want to suggest a consistent approach here
so that we have improved interoperability.

So IMHO one approach would be:

1. Apply the language-sensitive mappings from SpecialCasing.txt

2. Apply the context-sensitive (i.e., "language-insensitive") mappings
from SpecialCasing.txt

I'm still not sure what to do about full vs. simple case mapping, but I
see no strong reason to prefer simple case mapping because I don't see a
problem with our algorithm resulting in two characters (e.g., "ss")
instead of one.

Peter

-- 
Peter Saint-Andre
https://stpeter.im/
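
As an illustration of the two distinctions raised in the message above
(context-sensitive final sigma, and full case folding that may yield two
characters), the following minimal sketch uses Python 3's built-in string
methods. This shows how one existing implementation behaves, on the
assumption of CPython 3.3 or later; it is not a statement of what PRECIS
should require.

```python
nickname = "ΦΙΛΟΣ ΜΟΙ"

# str.lower() applies the context-sensitive Final_Sigma rule, so the
# capital sigma at the end of the first word becomes U+03C2 (final sigma).
lowered = nickname.lower()

# str.casefold() applies full case folding, which is context-free and
# maps every sigma (capital or final) to U+03C3 (medial sigma).
folded = nickname.casefold()

# Full case folding can expand one character into two:
# U+1E9E (capital sharp s) folds to "ss", while lowercasing yields U+00DF.
sharp_s_folded = "\u1e9e".casefold()
sharp_s_lowered = "\u1e9e".lower()

print(lowered, folded, sharp_s_folded, sharp_s_lowered)
```

Note that lowercasing and case folding disagree on the sigma, so two
implementations that pick different operations would produce strings
that do not compare equal.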