Re: [precis] toLower() vs. toCaseFold()

Martin J. Dürst <duerst@it.aoyama.ac.jp> Mon, 09 May 2016 04:17 UTC

To: John C Klensin <john-ietf@jck.com>, Peter Saint-Andre <stpeter@stpeter.im>, <precis@ietf.org>
Archived-At: <http://mailarchive.ietf.org/arch/msg/precis/wToosAUhJr-eMIcBLLeUpPf3mtw>

On 2016/05/07 11:40, John C Klensin wrote:

> --On Friday, May 06, 2016 15:54 +0900 "Martin J. Dürst"
> <duerst@it.aoyama.ac.jp> wrote:

>> On 2016/05/05 07:43, Peter Saint-Andre wrote:

>>> Suggestions for improvement are welcome, especially from
>>> John. (E.g., we might want to more explicitly call out
>>> comparison vs. other contexts in the normative text elsewhere
>>> in §5.2.3).
>>
>> I think 'compare' should be changed to 'search'. That's the
>> prototypical use case for CaseFold.
>
> Hmm.  If we have to choose, I think I prefer "compare".  I just
> looked at the subsections on "Default Case Folding" and "Default
> Caseless Matching" in Section 3.13 of TUS 8.0 and it says a lot
> about comparison and nothing about search.   Recommended
> compromise:  Make the relevant sentence fragment read "most
> appropriate when an application needs to compare two strings
> such as in search operations."

Fine by me.
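
To make the "compare two strings such as in search operations" case concrete, here is a minimal Python sketch. It assumes Python's str.casefold() as a stand-in for toCaseFold() and wraps it in NFD normalization, following the canonical caseless matching definition in Section 3.13 of TUS. The helper names canonical_caseless_key and caseless_find are hypothetical and appear in neither PRECIS nor Unicode.

```python
import unicodedata
from typing import List


def canonical_caseless_key(text: str) -> str:
    """Return NFD(toCasefold(NFD(text))), a key for canonical caseless matching."""
    nfd = unicodedata.normalize("NFD", text)
    # Case folding can produce non-normalized output, so normalize again.
    return unicodedata.normalize("NFD", nfd.casefold())


def caseless_find(haystack: List[str], needle: str) -> List[str]:
    """Return the entries of haystack that caselessly match needle."""
    key = canonical_caseless_key(needle)
    return [entry for entry in haystack if canonical_caseless_key(entry) == key]


# Hypothetical sample data: three spellings that fold to the same key, one that does not.
names = ["Straße", "STRASSE", "strasse", "Strasbourg"]
matches = caseless_find(names, "strasse")
print(matches)
```

The folded key is only used for matching here; the stored strings are left untouched, which is what sets the search use case apart from identifier preparation.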

> I'd still prefer to denounce toCaseFold completely, especially
> where identifiers are concerned.

I didn't know which direction we were leaning, but if that's where we are
moving, that would be fine by me, too.

> It just has far too much
> potential for being destructive and creating false results
> (either positive or negative) when the language context is
> unknown.  People/designers/implementers who are not prepared to
> understand those issues and their implications should really not
> be using the thing.

Agreed.
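
A small Python sketch of the kind of result John describes, assuming str.lower() and str.casefold() as stand-ins for toLower() and toCaseFold(). Both are language-neutral and consult no locale. Whether a given match counts as "false" depends on the language context, which is exactly the information the application does not have.

```python
# German: "Maße" (measurements) and "Masse" (mass) are different words,
# but case folding maps ß to "ss", so they compare equal after folding.
fold_merges_german = "Maße".casefold() == "Masse".casefold()    # True
lower_merges_german = "Maße".lower() == "Masse".lower()         # False

# Turkish: capital I pairs with dotless ı (U+0131), yet the default
# (non-Turkic) folding pairs I with dotted i instead.
fold_pairs_i_with_dotless = "I".casefold() == "\u0131".casefold()  # False
fold_pairs_i_with_dotted = "I".casefold() == "i".casefold()        # True

print(fold_merges_german, lower_merges_german)
print(fold_pairs_i_with_dotless, fold_pairs_i_with_dotted)
```

Read with German in mind, the first comparison is a possible false positive for identifiers; read with Turkish in mind, the third is a possible false negative.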

>> Also, the language in the "Therefore" sentence is somewhat
>> convoluted. It's unclear which alternative this text prefers.
>> I suggest that if we want to put the two alternatives on an
>> equal footing (i.e. make sure the application designer thinks
>> carefully), then a more parallel sentence structure, avoiding
>> words such as "carefully", "truly", and "would", would be more
>> appropriate. What about:
>>
>>     Therefore, application developers are advised to carefully
>>     consider whether toCaseFold() or toLower() is more appropriate.
>
> For the reasons above, I'm not sure that an even footing is
> appropriate.  I'd rather have the guidance be closer to "use
> toLowerCase, which your users are likely to understand, unless
> you need CaseFolding for some particular reason and understand
> its implications"

I'm fine with that. I just had difficulties understanding which way the 
bias in the "Therefore" sentence was going, if any. And my guess is that 
others may have the same difficulties.

Regards,   Martin.