Re: [precis] Enforcement as an Idempotent operation

Peter Saint-Andre, Thu, 13 October 2016 04:03 UTC


On 10/12/16 1:56 PM, William Fisher wrote:
> Should enforcing a string using PRECIS be idempotent?

As far as I know, that was not a design criterion for PRECIS. Naturally, 
it might be a desirable property nonetheless.

> If I apply the
> enforce operation to a string twice, should I get the same result as
> applying it just once?
> The nickname profile is NOT idempotent for some inputs.
> 1. Certain characters are NFKC normalized to sequences with ASCII
> spaces. This can lead to nicknames that begin with a space or contain
> adjacent interior spaces that are removed if you apply the nickname
> profile again.
>   U+00A8  =>  U+0020 U+0308  =>  U+0308

That's a good example.
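
The sequence above can be sketched in Python. This is a minimal sketch, not a PRECIS implementation: it assumes a simplified stand-in for the nickname enforcement rules (space handling followed by NFKC, with no string class checks), and the name enforce_nickname_sketch is hypothetical.

```python
import re
import unicodedata


def enforce_nickname_sketch(s: str) -> str:
    """Simplified stand-in for nickname enforcement (assumption, not RFC-complete)."""
    # Additional mapping rule: map any Unicode space separator to ASCII space.
    s = "".join(" " if unicodedata.category(c) == "Zs" else c for c in s)
    # Remove leading/trailing spaces and collapse interior runs of spaces.
    s = re.sub(" +", " ", s.strip(" "))
    # Normalization rule: NFKC is applied after the space handling.
    return unicodedata.normalize("NFKC", s)


once = enforce_nickname_sketch("\u00a8")
twice = enforce_nickname_sketch(once)

# Print code points so the leading space and combining mark are visible.
print([f"U+{ord(c):04X}" for c in once])
print([f"U+{ord(c):04X}" for c in twice])
print("idempotent for this input:", once == twice)
```

Because NFKC runs after the space rules in this sketch, the space it introduces is only removed on the next pass, which is why the second application changes the result.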

One could argue that the leading/trailing space and adjacent interior 
space rules are application-specific and don't really belong in the 
nickname profile (indeed, I seem to recall a message to the WG about 
that years ago). I've been on the fence about that several times.

>  2. Some characters can be further case mapped after NFKC normalization.
>   U+1F11  => (K) => (k)

It's not clear to me that U+1F11 has the problem you describe; perhaps 
you could sketch it out further?

> I also noticed that the RFC 7700 has case-mapping defined only when
> comparing nicknames.  I thought this was confusing. I didn't understand
> why username is split into two profiles (CasePreserved and CaseMapped),
> but nickname is not.

We try really hard not to multiply profiles beyond necessity. In this 
instance, we deemed it acceptable not to apply the case mapping rule for 
enforcement (e.g., "StPeter" is a fine nickname) but would like to avoid 
nicknames in the same address space (e.g., a chatroom) that differ only 
by case because that would be confusing (e.g., "StPeter" and "stpeter"). 
As you suggest, we could have accomplished the same result by defining 
two separate profiles.
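
The split between enforcement and comparison can be illustrated with a small Python sketch. It assumes that str.casefold() followed by NFKC is an adequate stand-in for the comparison rules; the function names are hypothetical and the space handling and string class checks are omitted.

```python
import unicodedata


def comparison_key_sketch(nickname: str) -> str:
    """Key used only for comparison; the stored nickname keeps its case."""
    # Assumption: str.casefold() stands in for the profile's case mapping rule.
    return unicodedata.normalize("NFKC", nickname.casefold())


def nicknames_collide(a: str, b: str) -> bool:
    """True if two nicknames differ at most by case (under this sketch)."""
    return comparison_key_sketch(a) == comparison_key_sketch(b)


# "StPeter" is kept as-is when stored, but collides with "stpeter" in a room.
print(nicknames_collide("StPeter", "stpeter"))
print(nicknames_collide("StPeter", "StPaul"))
```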

> If not all PRECIS profiles are idempotent, it would help to mention this
> in the IANA Profile registry, e.g.
>    Idempotent:  No.
> As an implementer, I would prefer profiles that are idempotent.

Thanks for your input. Personally I will think about it further and post 
again after I do so.