Re: [precis] WGLC: draft-ietf-precis-framework-09.txt

Peter Saint-Andre <stpeter@stpeter.im> Sat, 12 October 2013 02:33 UTC

Message-ID: <5258B4F8.4030601@stpeter.im>
Date: Fri, 11 Oct 2013 20:33:28 -0600
From: Peter Saint-Andre <stpeter@stpeter.im>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.0
MIME-Version: 1.0
To: Florian Zeitz <florob@babelmonkeys.de>, precis@ietf.org
References: <20130828154603.a94201dea74f29229b4767b2@jprs.co.jp> <522FD033.3070001@babelmonkeys.de> <5254CB86.40706@stpeter.im> <5258A86C.7080708@babelmonkeys.de>
In-Reply-To: <5258A86C.7080708@babelmonkeys.de>
X-Enigmail-Version: 1.6
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
Subject: Re: [precis] WGLC: draft-ietf-precis-framework-09.txt

Heh, I was just now addressing your feedback in my working copy of the
spec. :-)

On 10/11/2013 07:39 PM, Florian Zeitz wrote:
> On 09.10.2013 05:20, Peter Saint-Andre wrote:
>> Hi Florian, thanks for the review! Comments inline.
>>
>> On 09/10/2013 08:06 PM, Florian Zeitz wrote:
>>> The major thing that bothers me about this draft is that string classes
>>> IMHO conflate two separate concepts. On the one hand they specify valid
>>> and disallowed codepoints. On the other hand they specify (or rather,
>>> let the application protocol specify) mappings and normalization.
>>> The problem I have with this is that it makes it unclear which strings
>>> are valid in a certain class.
>>
>> You are correct. Validity really applies at the level of a profile, not
>> a class.
>>>
>>> E.g. consider an application protocol that specifies FreeformClass
>>> mixed with NFKC. This means characters that have a compatibility
>>> equivalent are valid in the sense that they are FREE_PVAL, but are
>>> invalid in the normalization form. It is unclear to me whether a string
>>> containing characters with a compatibility equivalent would be contained
>>> in the FreeformClass, or more precisely, this specialization thereof.
>>>
>>> Similar considerations are true for e.g. mixing case mapping with
>>> IdentifierClass. Uppercase characters are PVALID/ID_PVAL, but shouldn't
>>> be present after mapping.
>>>
>>> I would prefer it if we specified classes solely in terms of valid and
>>> disallowed codepoints and directionality requirements.
>> When you suggest that we specify a class in terms of codepoints, are you
>> suggesting that we go back to something like the stringprep model, in which
>> a class or profile defines a lookup table?
> Well, yes and no. We certainly want the rule/category based algorithm in
> order to have Unicode version agility, and I'm not suggesting we get rid
> of it. I'm also not suggesting we drop the rules about having some
> codepoints only valid in a certain context.
> I do however think it may be more sensible to say a string is within a
> PRECIS class iff all its characters are PVALID, CONTEXTO, or CONTEXTJ
> for this class, and a contextual rule is fulfilled, if required.

The way I see it, it doesn't make much sense to talk about a string
matching a class. In practice within an application protocol, a string
will be checked against the full set of rules as defined by a profile. A
string class provides a kind of "substrate", if you will, but it doesn't
define things in enough detail to perform string matching.

> This may even already be the intent, but as I said a profile can easily
> be defined such that a string matches these criteria, but can never be
> produced after the specified normalization and all mappings were applied.
> At any rate I think we need clearer text about the intention here,
> answering the question: "When is a string allowed by a profile?". I
> personally cannot really tell from the draft right now.

In part, I don't think it is the responsibility of this specification to
answer that question, other than to make it clear that you need to check
a string against the full set of rules defined by a profile. I do think
it would be helpful to provide some examples, although I think they
probably belong in the various specs that define the profiles (so far
that would be nickname, saslprepbis, and 6122bis).
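
As a rough illustration of the difference between passing a class-level
check and surviving a profile's full rule set, here is a minimal Python
sketch. Everything in it is an assumption made for illustration: the
function names are hypothetical, the "class" is a toy that accepts only
letters and digits (not the real PRECIS derivation algorithm), and the
"profile" is simply lowercase mapping followed by NFKC.

```python
import unicodedata


def class_allows(s: str) -> bool:
    # Toy stand-in for a class-level code point check (assumption): accept
    # only letters and digits. Real PRECIS classes are derived from Unicode
    # properties and also involve contextual rules.
    return all(unicodedata.category(ch)[0] in ("L", "N") for ch in s)


def apply_profile_rules(s: str) -> str:
    # Hypothetical profile: lowercase mapping, then NFKC normalization.
    return unicodedata.normalize("NFKC", s.lower())


def profile_accepts_as_is(s: str) -> bool:
    # The string passes the class check AND is unchanged by the profile's
    # mapping and normalization steps.
    return class_allows(s) and apply_profile_rules(s) == s


# "\ufb01" is LATIN SMALL LIGATURE FI, which has a compatibility equivalent.
for candidate in ("alice", "Alice", "\ufb01sh"):
    print(repr(candidate), class_allows(candidate), profile_accepts_as_is(candidate))
```

In this sketch all three strings pass the toy class check, but only
"alice" is unchanged by the profile rules; "Alice" is altered by case
mapping and the ligature string is altered by NFKC.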

>>> We would then have separate text saying that an application protocol
>>> MUST also specify which mappings and normalization to apply, what entity
>>> needs to apply them (e.g. only the server), and when they need to be
>>> applied (e.g. when comparing strings, before storing them, before
>>> display to a user). Both StringPrep-bis and 6122bis already have text to
>>> this effect. It seems sensible to me to generally require application
>>> protocols to specify the "who", and "when" beyond the "what". E.g. it is
>>> often sensible to display identifiers with their case as entered, but
>>> compare them after case folding. The current text might suggest that
>>> mappings have to be applied to user input immediately.
>> I agree that all good application protocols that use PRECIS need to
>> specify the enforcement rules, as we already do for SASL and XMPP. I am
>> less sure that the PRECIS framework needs to legislate that.
> I think not legislating this only gives people a great way to shoot
> themselves in the foot. I could be convinced otherwise though.

Yes, we are trying to prevent such "foot guns". I don't think we can get
very specific (e.g., some technologies that use PRECIS might not have a
client-server architecture). I'll see about proposing some text here...
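
To make the "who" and "when" question concrete, here is a small Python
sketch of one possible enforcement choice: keep an identifier exactly as
entered for display, and apply mappings only when comparing. The helper
names and the specific rule (case folding followed by NFC) are
assumptions for illustration, not something the framework or any
existing profile prescribes.

```python
import unicodedata


def comparison_key(identifier: str) -> str:
    # Assumed comparison rule: case-fold, then normalize to NFC.
    # Applied only when comparing, never to the stored display form.
    return unicodedata.normalize("NFC", identifier.casefold())


def same_identifier(a: str, b: str) -> bool:
    # Two identifiers match if their comparison keys are equal.
    return comparison_key(a) == comparison_key(b)


entered = "Juliet"
display_form = entered  # shown to users with its case as entered
print(display_form, same_identifier(entered, "juliet"))
```

A protocol that instead applied the mapping immediately on input would
lose the display form, which is the kind of difference an application
protocol would need to spell out.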

>>> [...]
> I think we have agreement (or separate threads) about all other points.

Great, thanks.

Peter