Re: [idn] Re: stringprep: PRI #29

"Adam M. Costello" <idn.amc+0@nicemice.net.RemoveThisWord.cnri.reston.va.us> Sun, 27 March 2005 21:36 UTC

Date: Sun, 27 Mar 2005 21:29:49 +0000
From: "Adam M. Costello" <idn.amc+0@nicemice.net.RemoveThisWord.cnri.reston.va.us>
To: idn@ops.ietf.org
Subject: Re: [idn] Re: stringprep: PRI #29
Message-ID: <20050327212949.GA15994~@nicemice.net>
Reply-To: IETF idn working group <idn@ops.ietf.org>
References: <423754F3.50405@vanderpoel.org> <ilumzt47ezc.fsf@latte.josefsson.org> <20050316091126.GA24254~@nicemice.net> <iluzmx36h6t.fsf@latte.josefsson.org> <423CD9DC.5080401@vanderpoel.org> <ilur7iaycuj.fsf@latte.josefsson.org> <423DB883.1020408@vanderpoel.org> <iluis3mxbqq.fsf@latte.josefsson.org> <423F12A9.60009@vanderpoel.org> <ilupsxs7skt.fsf@latte.josefsson.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <ilupsxs7skt.fsf@latte.josefsson.org>
User-Agent: Mutt/1.5.6+20040907i
Sender: owner-idn@ops.ietf.org
Precedence: bulk

Simon Josefsson <jas@extundo.com> wrote:

> Having a draft outlining the alternatives would be a useful
> contribution.

For an implementation, I see the following alternatives:

1) Reject the problem sequences (and no others) before normalizing (add
   a step 0 to the stringprep algorithm).

2) Reject some superset of the problem sequences that is easier to
   compute and still rejects only implausible strings.

3) Reject nothing, but make sure the normalization algorithm follows the
   proposed fixed normalization spec (and the original sample code).

4) Reject nothing, but make sure the normalization algorithm follows the
   current broken normalization spec (not the original sample code).

5) Reject nothing, and follow whichever normalization spec is already
   implemented (that is, ignore the problem).

I have listed these in order of decreasing quality (though the order of
2 and 3 is debatable), and in order of decreasing complexity (except
that 3 and 4 are equally complex).

I don't expect anyone to defend 5.

Given that there already exist implementations of both kinds of
normalization, and neither is more right or more wrong than the other
(in light of the internally inconsistent spec), it would be silly for
stringprep to standardize on the broken one, so we can discard 4.

I can imagine an implementor making a case for any of 1-3.  The next
question is how much latitude the stringprep spec should give the
implementor.

We might consider allowing implementations to choose from among more
than one approved alternative.  I lean against that idea, because
stringprep is used to define valid strings in protocols, and it's hard
to analyze what can happen when implementations disagree about the
validity of a string.  (There is already one way this can happen in
stringprep: unassigned code points become assigned.  The analysis was
very tricky, and I'd prefer to keep that sort of thing to a minimum.)

If stringprep were to pick alternative 3 as the one and only allowed
behavior, some implementations would consider that an unacceptable
security risk and implement 1 or 2 anyway, so in practice we'd still
have implementations choosing from multiple alternatives.

This argues for the stringprep spec to require the rejection of either
(1) the problem sequences or (2) some particular superset of the problem
sequences that is easier to compute.  But the implementation simplicity
of (3) is very appealing, and part of me hopes to be persuaded to go
with that in spite of the analytical complexity.

If we choose 1 or 2, we still need to decide how to define the sequences
that must be rejected.  For example, if we choose 1, I suggest something
like this:

    Definition:  A string must be rejected if and only if oldNormalize
    and newNormalize return different results.

    Theorem:  The rejected strings are exactly the ones containing
    code points with such-and-such properties in such-and-such
    combinations...

    Theorem:  In Unicode version x.y, the previous theorem amounts to
    the following table of specific code point combinations...

Notice that if an implementation has access to both oldNormalize and
newNormalize, it can implement the definition directly without the
complexity of applying either theorem.  It could also do a simple check
for the presence of any code points from the table, and optimize away
the second normalization if none are present.

AMC

[idn] related work Erik van der Poel
[idn] Unicode categories Erik van der Poel
Re: [idn] nameprep2 and the slash homograph issue JFC (Jefsey) Morfin
Re: [idn] nameprep2 and the slash homograph issue Erik van der Poel
Re: [idn] nameprep2 and the slash homograph issue John C Klensin
Re: [idn] nameprep2 and the slash homograph issue Adam M. Costello
Re: [idn] something a little lighter for the week… Doug Ewell
Re: [idn] stability Erik van der Poel
Re: [idn] Re: character tables Erik van der Poel
Re: [idn] Re: process Adam M. Costello
Re: [idn] punctuation John C Klensin
Re: [idn] Re: stability JFC (Jefsey) Morfin
Re: [idn] Re: character tables Gervase Markham
Re: [idn] stringprep: PRI #29 Erik van der Poel
Re: [idn] nameprep2 and the slash homograph issue Gervase Markham
Re: [idn] Re: stability Erik van der Poel
Re: [idn] process Paul Hoffman
Re: [idn] Re: character tables YAO Jiankang
Re: [idn] nameprep2 and the slash homograph issue JFC (Jefsey) Morfin
Re: [idn] nameprep2 and the slash homograph issue Adam M. Costello
Re: [idn] punctuation John C Klensin
Re: [idn] punctuation tedd
Re: [idn] Re: character tables JFC (Jefsey) Morfin
Re: [idn] punctuation Erik van der Poel
Re: [idn] nameprep2 and the slash homograph issue Erik van der Poel
Re: [idn] nameprep2 and the slash homograph issue Gervase Markham
Re: [idn] Re: stability Erik van der Poel
Re: [idn] Re: character tables Adam M. Costello
[idn] Re: character tables John C Klensin
Re: [idn] Re: character tables Erik van der Poel
Re: [idn] Re: stability JFC (Jefsey) Morfin
Re: [idn] Re: character tables Paul Hoffman
Re: [idn] Re: stability Martin v. Löwis
Re: [idn] Re: character tables Erik van der Poel
Re: [idn] Re: stability John C Klensin
[idn] Re: Unicode categories John C Klensin
Re: [idn] nameprep2 and the slash homograph issue Erik van der Poel
[idn] character tables Erik van der Poel
Re: [idn] Re: character tables John C Klensin
Re: [idn] Re: stability Mark Davis
Re: [idn] Re: stringprep: PRI #29 Erik van der Poel
[idn] stability Erik van der Poel
Re: [idn] Re: character tables Erik van der Poel
Re: [idn] Re: dichotomies JFC (Jefsey) Morfin
Re: [idn] process Adam M. Costello
Re: [idn] Re: character tables William Tan
Re: [idn] Re: process James Seng
[idn] Re: stability Simon Josefsson
Re: [idn] stability Erik van der Poel
[idn] Re: stability Martin v. Löwis
Re: [idn] Re: process Jaap Akkerhuis
Re: [idn] Re: stringprep: PRI #29 Adam M. Costello
Re: [idn] punctuation tedd
[idn] Re: dichotomies Erik van der Poel
Re: [idn] Re: stability Martin v. Löwis
Re: [idn] punctuation Erik van der Poel
Re: [idn] nameprep2 and the slash homograph issue Erik van der Poel
Re: [idn] process JFC (Jefsey) Morfin
[idn] Re: stability Simon Josefsson
Re: [idn] nameprep2 and the slash homograph issue JFC (Jefsey) Morfin
[idn] Re: stringprep: PRI #29 Erik van der Poel
Re: [idn] nameprep2 and the slash homograph issue Adam M. Costello
Re: [idn] process John C Klensin
Re: [idn] Re: Unicode categories Mark Davis
Re: [idn] process Doug Ewell
Re: [idn] Re: stability Adam M. Costello
Re: [idn] process Erik van der Poel
[idn] nameprep2 and the slash homograph issue Erik van der Poel
Re: [idn] punctuation tedd
[idn] punctuation Erik van der Poel
Re: [idn] Re: stability James Seng
[idn] Re: stability Simon Josefsson
[idn] something a little lighter for the weekend Erik van der Poel
Re: [idn] nameprep2 and the slash homograph issue Erik van der Poel
Re: [idn] something a little lighter for the week… Adam M. Costello
Re: [idn] process Gervase Markham
[idn] Re: character tables Cary Karp
[idn] Mozilla? JFC (Jefsey) Morfin
Re: [idn] nameprep2 and the slash homograph issue Erik van der Poel
Re: [idn] punctuation Erik van der Poel
[idn] Re: Unicode categories Erik van der Poel
[idn] Re: stability Simon Josefsson
Re: [idn] Re: character tables JFC (Jefsey) Morfin
[idn] Re: process Stephane Bortzmeyer
Re: [idn] process Erik van der Poel
Re: [idn] punctuation Jaap Akkerhuis
Re: [idn] Re: character tables Gervase Markham
Re: [idn] Re: process Jaap Akkerhuis
Re: [idn] nameprep2 and the slash homograph issue Erik van der Poel
Re: [idn] Re: process James Seng
[idn] stringprep mailing list Erik van der Poel
Re: [idn] Re: dichotomies Erik van der Poel
Re: [idn] nameprep2 and the slash homograph issue Erik van der Poel
Re: [idn] Re: stability Erik van der Poel
Re: [idn] Re: character tables Erik van der Poel
Re: [idn] Re: stability JFC (Jefsey) Morfin
Re: [idn] Re: process Erik van der Poel
[idn] Re: stringprep: PRI #29 Simon Josefsson
Re: [idn] punctuation Erik van der Poel
Re: [idn] stability Martin v. Löwis
[idn] stringprep: PRI #29 Erik van der Poel
Re: [idn] Re: character tables Paul Hoffman
Re: [idn] nameprep2 and the slash homograph issue Erik van der Poel
[idn] Re: stability Simon Josefsson
[idn] process Erik van der Poel
[idn] stringprep: existing profiles and string pr… Erik van der Poel
Re: [idn] Re: stability Erik van der Poel
[idn] dichotomies Erik van der Poel
Re: [idn] stability JFC (Jefsey) Morfin
[idn] Re: character tables Cary Karp
Re: [idn] Re: process Erik van der Poel
[idn] Re: stringprep mailing list Simon Josefsson
Re: [idn] Re: Unicode categories Martin v. Löwis
Re: [idn] Re: stability JFC (Jefsey) Morfin
Re: [idn] something a little lighter for the week… John C Klensin
Re: [idn] something a little lighter for the week… Adam M. Costello
Re: [idn] Re: dichotomies JFC (Jefsey) Morfin
Re: [idn] Re: stability Erik van der Poel
Re: [idn] Re: stability Erik van der Poel
[idn] Re: stringprep: PRI #29 Simon Josefsson
Re: [idn] stability Erik van der Poel
[idn] Re: stringprep: PRI #29 Simon Josefsson