Re: [precis] Applying the rules three times to get a stable output string?

Christian Schudt <christian.schudt@gmx.de> Sat, 09 December 2017 21:27 UTC

From: Christian Schudt <christian.schudt@gmx.de>
In-Reply-To: <CAHVjMKGmZK1DQJmbM-4Gb6W8NUbzG-qQXnXBScr6Yh+o==wxuw@mail.gmail.com>
Date: Sat, 09 Dec 2017 22:27:49 +0100
Cc: precis@ietf.org
Message-Id: <C31DFCC1-31BB-49E4-A9BD-071BF5AC6C02@gmx.de>
References: <C64B78C6-8109-4F36-BB76-EA8AB229FCE2@gmx.de> <CAHVjMKGmZK1DQJmbM-4Gb6W8NUbzG-qQXnXBScr6Yh+o==wxuw@mail.gmail.com>
To: William Fisher <william.w.fisher@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/precis/4EbKhDLBOy3buLlamvCD6VKF6qc>

Great, thanks! These code points revealed some bugs :-). They should have been included in the Examples.
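
For anyone who wants to reproduce the effect, here is a minimal Python sketch of how the four code points quoted below behave under repeated rule application. It is not taken from any real PRECIS library: the names apply_once and enforce_until_stable are made up for illustration, the case mapping is assumed to be Unicode lowercasing, and the FreeformClass validity check is left out entirely.

```python
import re
import unicodedata


def apply_once(s, case_map=True):
    """One pass of simplified Nickname-style rules (hypothetical helper)."""
    # Additional mapping: treat every Unicode space separator (Zs) as U+0020,
    # strip leading/trailing spaces, collapse interior runs of spaces.
    s = "".join(" " if unicodedata.category(c) == "Zs" else c for c in s)
    s = re.sub(" +", " ", s.strip(" "))
    # Case mapping (assumed here to be lowercasing, NicknameCaseMapped only).
    if case_map:
        s = s.lower()
    # Normalization: NFKC.
    return unicodedata.normalize("NFKC", s)


def enforce_until_stable(s, case_map=True, max_rounds=3):
    """Re-apply the rules until the output stops changing, at most max_rounds times."""
    for _ in range(max_rounds):
        out = apply_once(s, case_map)
        if out == s:
            return out
        s = out
    raise ValueError("output string not stable after %d applications" % max_rounds)


# Case mapping: the first pass leaves an uppercase letter behind.
print(repr(apply_once("\u210c")), repr(enforce_until_stable("\u210c")))
print(repr(apply_once("\u20a8")), repr(enforce_until_stable("\u20a8")))
# Spaces: NFKC introduces a leading space that only the next pass strips.
print(repr(apply_once("\u00a8", case_map=False)),
      repr(enforce_until_stable("\u00a8", case_map=False)))
print(repr(apply_once("\u02dc", case_map=False)),
      repr(enforce_until_stable("\u02dc", case_map=False)))
```

In each case the output of the first pass differs from the final result, so a single application is not enough for these inputs.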

Are there any known code points like these for the IdentifierClass / Usernames as well?
It seems that all of these code points are disallowed there anyway.

If not, implementations could save one or two iterations and apply the "three times" rule only for FreeformClass.
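
On the IdentifierClass question, a quick and partial check, assuming that RFC 8264's HasCompat category (code points whose NFKC form differs from the code point itself) is disallowed in IdentifierClass. The helper name has_compat is made up, and the sketch tests only this one category, not the full derived-property algorithm.

```python
import unicodedata


def has_compat(ch):
    # Hypothetical helper: True if NFKC normalization changes the code point,
    # i.e. it has a compatibility equivalent.
    return unicodedata.normalize("NFKC", ch) != ch


# The four code points from the quoted message.
for cp in ("\u210c", "\u20a8", "\u00a8", "\u02dc"):
    print("U+%04X %s: HasCompat=%s" % (ord(cp), unicodedata.name(cp), has_compat(cp)))
```

All four report HasCompat=True, which matches the impression that they would already be rejected for usernames. That is not proof that no unstable input exists for IdentifierClass.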



> On 09.12.2017 at 20:34, William Fisher <william.w.fisher@gmail.com> wrote:
> 
> Where it makes a difference for NicknameCaseMapped:
> 
> "\u210c"
> "\u20a8"
> 
> Where it makes a difference for Nickname due to spaces:
> 
> "\u00a8"
> "\u02dc"
> 
> 
> On Sat, Dec 9, 2017 at 8:37 AM, Christian Schudt
> <christian.schudt@gmx.de> wrote:
>> Hi,
>> 
>> RFC 8264 introduced these new sentences:
>> 
>>   under certain circumstances, such as when Unicode
>>   Normalization Form KC is used, performing Unicode normalization after
>>   case mapping can still yield uppercase characters for certain code
>>   points
>> 
>>   Therefore, an implementation SHOULD apply the rules
>>   repeatedly until the output string is stable
>> 
>> 
>> I could imagine these sentences refer to code points of the "Unstable" category, but this category is unused.
>> 
>> Are there any concrete code points or input strings which show this unstable behaviour?
>> I am asking for some test vectors, i.e. input strings that yield the expected output string only after the second rule application, not after the first.
>> 
>> Thanks,
>> — Christian