Re: [precis] WGLC: draft-ietf-precis-framework-09.txt

Florian Zeitz <florob@babelmonkeys.de> Sat, 12 October 2013 01:40 UTC

Message-ID: <5258A86C.7080708@babelmonkeys.de>
Date: Sat, 12 Oct 2013 03:39:56 +0200
From: Florian Zeitz <florob@babelmonkeys.de>
To: Peter Saint-Andre <stpeter@stpeter.im>, precis@ietf.org
References: <20130828154603.a94201dea74f29229b4767b2@jprs.co.jp> <522FD033.3070001@babelmonkeys.de> <5254CB86.40706@stpeter.im>
In-Reply-To: <5254CB86.40706@stpeter.im>
Subject: Re: [precis] WGLC: draft-ietf-precis-framework-09.txt

On 09.10.2013 05:20, Peter Saint-Andre wrote:
> Hi Florian, thanks for the review! Comments inline.
> 
> On 09/10/2013 08:06 PM, Florian Zeitz wrote:
>> The major thing that bothers me about this draft is that string classes
>> IMHO conflate two separate concepts. On the one hand they specify valid
>> and disallowed codepoints. On the other hand they specify (or rather,
>> let the application protocol specify) mappings and normalization.
>> The problem I have with this is that it makes it unclear which strings
>> are valid in a certain class.
> 
> You are correct. Validity really applies at the level of a profile, not
> a class.
>>
>> E.g. consider an application protocol that specifies FreeformClass
>> mixed with NFKC. This means characters that have a compatibility
>> equivalent are valid in the sense that they are FREE_PVAL, but are
>> invalid in the normalization form. It is unclear to me whether a string
>> containing characters with a compatibility equivalent would be contained
>> in the FreeformClass, or more precisely, this specialization thereof.
>>
>> Similar considerations are true for e.g. mixing case mapping with
>> IdentifierClass. Uppercase characters are PVALID/ID_PVAL, but shouldn't
>> be present after mapping.
>>
>> I would prefer it if we specified classes solely in terms of valid and
>> disallowed codepoints and directionality requirements.
> When you suggest that we specify a class in terms of codepoints, are you
> suggesting that we go back to something like the stringprep model, in which
> a class or profile defines a lookup table?
Well, yes and no. We certainly want the rule/category based algorithm in
order to have Unicode version agility, and I'm not suggesting we get rid
of it. I'm also not suggesting we drop the rules about having some
codepoints only valid in a certain context.
I do however think it may be more sensible to say a string is within a
PRECIS class iff all its characters are PVALID, CONTEXTO, or CONTEXTJ
for this class, and a contextual rule is fulfilled, if required.
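
To make that definition concrete, here is a rough Python sketch. It is
not the draft's algorithm: derived_property() and context_rule_ok() are
hypothetical stand-ins I made up for illustration, and the category list
is a toy approximation of the real derived property calculation.

```python
import unicodedata

# Hypothetical stand-in for the derived property lookup. The real
# calculation in the framework draft is far more complete than this.
def derived_property(cp: str) -> str:
    if cp in ("\u200c", "\u200d"):   # ZWNJ / ZWJ
        return "CONTEXTJ"
    if cp == "\u00b7":               # MIDDLE DOT
        return "CONTEXTO"
    if unicodedata.category(cp) in ("Ll", "Lu", "Lo", "Lm", "Mn", "Mc", "Nd"):
        return "PVALID"
    return "DISALLOWED"

# Hypothetical placeholder: a real implementation would evaluate the
# contextual rule registered for the codepoint at position i.
def context_rule_ok(s: str, i: int) -> bool:
    return False

def in_class(s: str) -> bool:
    """True iff every character is PVALID, or is CONTEXTJ/CONTEXTO
    and its contextual rule is fulfilled."""
    for i, cp in enumerate(s):
        prop = derived_property(cp)
        if prop == "PVALID":
            continue
        if prop in ("CONTEXTJ", "CONTEXTO") and context_rule_ok(s, i):
            continue
        return False
    return True
```

Note that nothing in in_class() looks at mappings or normalization;
membership is decided purely per codepoint plus context.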

This may even already be the intent, but as I said, a profile can easily
be defined such that a string matches these criteria but can never be
produced after the specified normalization and all mappings have been
applied.
At any rate I think we need clearer text about the intention here,
answering the question: "When is a string allowed by a profile?". I
personally cannot really tell from the draft right now.
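
A small Python illustration of the NFKC case, using only the standard
unicodedata module. The helper name is_stable_under() is my own, and I
am assuming the ligature below counts as FREE_PVAL, as characters with a
compatibility equivalent do in my FreeformClass example:

```python
import unicodedata

def is_stable_under(form: str, s: str) -> bool:
    """True iff normalizing s to the given form leaves it unchanged."""
    return unicodedata.normalize(form, s) == s

# U+FB01 LATIN SMALL LIGATURE FI has a compatibility decomposition.
candidate = "\ufb01le"
normalized = unicodedata.normalize("NFKC", candidate)

# The candidate consists of per-codepoint valid characters, yet it can
# never be the output of NFKC, because NFKC rewrites it to "file".
print(is_stable_under("NFKC", candidate))   # False
print(is_stable_under("NFKC", normalized))  # True
```

So is the candidate string "allowed by the profile" or not? That is the
question I would like the draft to answer explicitly.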

>> We would then have separate text saying that an application protocol
>> MUST also specify which mappings and normalization to apply, what entity
>> needs to apply them (e.g. only the server), and when they need to be
>> applied (e.g. when comparing strings, before storing them, before
>> display to a user). Both StringPrep-bis and 6122bis already have text to
>> this effect. It seems sensible to me to generally require application
>> protocols to specify the "who", and "when" beyond the "what". E.g. it is
>> often sensible to display identifiers with their case as entered, but
>> compare them after case folding. The current text might suggest that
>> mappings have to be applied to user input immediately.
> I agree that all good application protocols that use PRECIS need to
> specify the enforcement rules, as we already do for SASL and XMPP. I am
> less sure that the PRECIS framework needs to legislate that.
I think not legislating this only gives people a great way to shoot
themselves in the foot. I could be convinced otherwise though.
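
As a sketch of why the "when" matters, assume an application that keeps
the identifier as entered for display and applies the mapping only when
comparing. The function names are hypothetical, and Python's
str.casefold() is just a stand-in for whatever case mapping a profile
would actually specify:

```python
def comparison_key(identifier: str) -> str:
    # Stand-in mapping (an assumption): Unicode case folding as
    # implemented by Python, applied only for comparison.
    return identifier.casefold()

def same_identifier(a: str, b: str) -> bool:
    return comparison_key(a) == comparison_key(b)

entered = "Florian"
displayed = entered  # shown to the user with its case as entered

print(displayed)                            # Florian
print(same_identifier(entered, "FLORIAN"))  # True
```

If the mapping were applied to user input immediately, the displayed
form would already be the folded one.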

>> [...]
I think we have agreement (or separate threads) about all other points.

Florian