Re: [precis] Enforcement as an Idempotent operation

Peter Saint-Andre <stpeter@stpeter.im> Wed, 22 March 2017 01:30 UTC

To: William Fisher <william.w.fisher@gmail.com>
References: <CAHVjMKHVvmS6jty3-jwnnuqy-xdw-xY2j+5ExLRr6tXCMRbC2Q@mail.gmail.com> <f9b49a96-2189-bccd-5dc0-a4dc8146cbcc@stpeter.im> <CAHVjMKEVTOCV68OTfXnXhWKiXT798m2osGkwHVRhw4Cs0RLw0w@mail.gmail.com> <15c31273-c278-af61-2a01-0b68ab8af182@stpeter.im> <CAHVjMKHXL_gHrQ1+jC2T4VrJj5n+mRB5j7iD7kGHc06wpq+PEA@mail.gmail.com> <0f5b55f8-5fcb-2a61-435e-7b93d2d8f9e6@stpeter.im>
Cc: precis@ietf.org
From: Peter Saint-Andre <stpeter@stpeter.im>
Message-ID: <6df28263-cdfa-cc61-4ba9-1bdae17bcca8@stpeter.im>
Date: Tue, 21 Mar 2017 19:30:02 -0600
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <0f5b55f8-5fcb-2a61-435e-7b93d2d8f9e6@stpeter.im>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/precis/58j_lgKQpSeJQ2j-FU5-odOoKAQ>
Subject: Re: [precis] Enforcement as an Idempotent operation

On 2/26/17 5:48 PM, Peter Saint-Andre wrote:
> On 2/13/17 12:04 AM, William Fisher wrote:
>> On Sun, Feb 12, 2017 at 12:27 PM, Peter Saint-Andre <stpeter@stpeter.im> wrote:
>>> Did you mean U+212A (KELVIN SIGN)? That decomposes to U+004B (LATIN CAPITAL
>>> LETTER K).
>>>
>>>> The full example is:
>>>> "\U0001f11aevin" => "(K)evin" => "(k)evin"
>>
>> I'm talking about 'PARENTHESIZED LATIN CAPITAL LETTER K' (U+1F11A).
>> Sorry it's not clear that the A is part of the unicode escape.
> 
> Thanks for the clarification.
> 
>> With casefold or tolower, the result is the same for these Nicknames:
>>
>> Not idempotent: "\U0001f11A" => "(K)" => "(k)"
>> Not idempotent: "\U0001f13A" => "K" => "k"
>> Not idempotent: "\u210c" => "H" => "h"
>> Not idempotent: "\u210d" => "H" => "h"
>> Not idempotent: "\u20a8" => "Rs" => "rs"
>>
>> When you apply the comparison steps from RFC 7700, Section 2.4, you
>> still get something that is upper case. If you apply the comparison
>> steps again, you now get lower case.
> 
> I see what you mean. I'm now leaning toward moving the case mapping rule
> after the normalization rule, but first I want to think about the
> implications for all of the PRECIS profiles (e.g., when using NFC vs.
> using NFKC). If we go down this road, we will also want to describe the
> reasoning in Section 5.2.3 of 7564bis.

Thinking about this further, I now lean against making this change in
the PRECIS processing rules, for several reasons:

1. Existing PRECIS implementations would need to be modified, resulting
in a behavioral difference between older and newer implementations (or
older and newer versions of the same implementation).

2. The order of operations in PRECIS was intended to be consistent with
IDNA2008 (where case mapping is performed before normalization,
albeit in the application before the protocol is invoked), and with
IDNA2003 and Stringprep prior to IDNA2008 (note also that several PRECIS
profiles were designed as modernized replacements for Stringprep
profiles). Making PRECIS inconsistent with IDNA might make it harder to
reuse code and might lead to unexpected and undesirable consequences.

3. Idempotence, although a desirable quality, in my opinion falls into
the category of "nice but not necessary". (If we were designing PRECIS
anew, my opinion might be different.)

A safer approach would be to add an implementation note to the effect
that a single pass of PRECIS processing might not be idempotent, and
that implementations might need to apply the rules to the same string
repeatedly until the output stops changing.
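
As a rough illustration of such a note, here is a minimal Python sketch.
It is not taken from any PRECIS implementation. It models only two of
the Nickname comparison steps (case mapping, then NFKC) and skips the
rest. The function names compare_pass and stabilize and the max_passes
limit are made up for this example, and Python's str.lower() is used as
an approximation of the case mapping rule.

```python
import unicodedata


def compare_pass(s: str) -> str:
    """One simplified pass: case mapping first, then NFKC normalization.

    Assumption: str.lower() stands in for the profile's case mapping rule;
    the other Nickname rules (e.g. space handling) are omitted here.
    """
    return unicodedata.normalize("NFKC", s.lower())


def stabilize(s: str, max_passes: int = 4) -> str:
    """Re-apply compare_pass until the output no longer changes.

    max_passes is an arbitrary safety limit chosen for this sketch.
    """
    for _ in range(max_passes):
        t = compare_pass(s)
        if t == s:
            return s
        s = t
    raise ValueError("string did not stabilize within max_passes")


if __name__ == "__main__":
    nickname = "\U0001f11aevin"  # U+1F11A PARENTHESIZED LATIN CAPITAL LETTER K
    first = compare_pass(nickname)
    second = compare_pass(first)
    print(ascii(first), ascii(second), ascii(stabilize(nickname)))
```

With the U+1F11A example from earlier in the thread, the first pass
leaves an upper-case "K" behind because the parenthesized character has
no lowercase mapping of its own; the second pass lowercases the "K"
that NFKC exposed, and the result is stable after that.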

Peter