Re: [precis] Enforcement as an Idempotent operation

William Fisher <william.w.fisher@gmail.com> Mon, 13 February 2017 07:05 UTC

In-Reply-To: <15c31273-c278-af61-2a01-0b68ab8af182@stpeter.im>
References: <CAHVjMKHVvmS6jty3-jwnnuqy-xdw-xY2j+5ExLRr6tXCMRbC2Q@mail.gmail.com> <f9b49a96-2189-bccd-5dc0-a4dc8146cbcc@stpeter.im> <CAHVjMKEVTOCV68OTfXnXhWKiXT798m2osGkwHVRhw4Cs0RLw0w@mail.gmail.com> <15c31273-c278-af61-2a01-0b68ab8af182@stpeter.im>
From: William Fisher <william.w.fisher@gmail.com>
Date: Mon, 13 Feb 2017 00:04:59 -0700
Message-ID: <CAHVjMKHXL_gHrQ1+jC2T4VrJj5n+mRB5j7iD7kGHc06wpq+PEA@mail.gmail.com>
To: Peter Saint-Andre <stpeter@stpeter.im>
Content-Type: text/plain; charset=UTF-8
Archived-At: <https://mailarchive.ietf.org/arch/msg/precis/GI5spt9n6ZwSvKJQ0nAHq_ch-hU>
Cc: precis@ietf.org
Subject: Re: [precis] Enforcement as an Idempotent operation

On Sun, Feb 12, 2017 at 12:27 PM, Peter Saint-Andre <stpeter@stpeter.im> wrote:
> Did you mean U+212A (KELVIN SIGN)? That decomposes to U+004B (LATIN CAPITAL
> LETTER K).
>
>> The full example is:
>> "\U0001f11aevin" => "(K)evin" => "(k)evin"

I'm talking about 'PARENTHESIZED LATIN CAPITAL LETTER K' (U+1F11A).
Sorry, it wasn't clear that the A is part of the Unicode escape.

With casefold or tolower, the result is the same for these Nicknames:

Not idempotent: "\U0001f11A" => "(K)" => "(k)"
Not idempotent: "\U0001f13A" => "K" => "k"
Not idempotent: "\u210c" => "H" => "h"
Not idempotent: "\u210d" => "H" => "h"
Not idempotent: "\u20a8" => "Rs" => "rs"

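When you apply the comparison steps from RFC 7700, Section 2.4, once,
you still get something that is uppercase: case mapping is applied
before NFKC normalization, and the uppercase letter only appears after
normalization. If you apply the comparison steps again, you get
lowercase.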
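
The two passes can be reproduced with a short Python sketch. It is a
simplification, not a full RFC 7700 implementation: the helper name
nickname_compare_once is made up for illustration, space handling is
left out, and str.lower() stands in for the case mapping step (swapping
in str.casefold() should give the same results for these inputs).

```python
import unicodedata


def nickname_compare_once(s: str) -> str:
    """One simplified pass: case mapping first, then NFKC.

    Hypothetical helper; space mapping and other PRECIS checks are omitted.
    """
    return unicodedata.normalize("NFKC", s.lower())


if __name__ == "__main__":
    for s in ["\U0001f11a", "\U0001f13a", "\u210c", "\u210d", "\u20a8"]:
        once = nickname_compare_once(s)
        twice = nickname_compare_once(once)
        status = "idempotent" if once == twice else "Not idempotent"
        print(f"{status}: U+{ord(s):04X} => {once!r} => {twice!r}")
```

The first pass leaves each of these characters untouched in the case
mapping step, since none has a lowercase mapping, and only NFKC then
exposes the uppercase letter.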

>> I wrote a program to categorize characters that are not idempotent
>> under Nickname "ToLower" (ignoring white space). The numbers are the
>> same for Unicode 6.3, 8.0 and 9.0.
>>
>> {
>>   '<font>': 467,
>>   '<square>': 90,
>>   '<compat>': 35,
>>   '<super>': 27,
>>   '<circle>': 4
>> }
>
>
> Would you mind sending me your list of characters?

I will send it to you in a separate email.
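
A rough sketch of how such a categorization could be reproduced follows.
It is not the program that produced the numbers quoted above. It assumes
one pass is lower() followed by NFKC, checks single code points only,
makes no attempt at the white space exclusion, and uses whatever Unicode
version the local Python ships, so the counts may differ. The names
compare_once and non_idempotent_by_tag are made up for illustration.

```python
import sys
import unicodedata
from collections import Counter


def compare_once(s: str) -> str:
    # Assumed simplification of one comparison pass: lowercase, then NFKC.
    return unicodedata.normalize("NFKC", s.lower())


def non_idempotent_by_tag() -> Counter:
    """Count code points whose second pass differs from the first,
    grouped by the compatibility tag of their decomposition."""
    counts = Counter()
    for cp in range(sys.maxunicode + 1):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip surrogate code points
        ch = chr(cp)
        once = compare_once(ch)
        if compare_once(once) != once:
            decomp = unicodedata.decomposition(ch)
            # Compatibility decompositions start with a tag such as <font>.
            tag = decomp.split()[0] if decomp.startswith("<") else "<none>"
            counts[tag] += 1
    return counts


if __name__ == "__main__":
    for tag, n in non_idempotent_by_tag().most_common():
        print(f"{tag}: {n}")
```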

Thanks,
Bill