Re: "Difficult Characters" draft

"Martin J. Duerst" <mduerst@ifi.unizh.ch> Fri, 02 May 1997 16:20 UTC

Date: Fri, 02 May 1997 17:58:31 +0200
From: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
To: Larry Masinter <masinter@parc.xerox.com>
cc: URI mailing list <uri@bunyip.com>
Subject: Re: "Difficult Characters" draft
In-Reply-To: <3369AC9E.281F@parc.xerox.com>
Message-ID: <Pine.SUN.3.96.970502175231.245j-100000@enoshima>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Sender: owner-uri@bunyip.com
Precedence: bulk

On Fri, 2 May 1997, Larry Masinter wrote:

> Other issues:
> The bidi issues for RTL languages in conjunction with
> normal punctuation used in and around identifiers. (Will
> the identifiers present themselves 'correctly' without
> these characters in all cases?)

That is an important problem, but it should go into a separate draft,
because it is basically about display, not about input.

> Using UCS in identifiers that are normally "case insensitive"
> in ASCII, and the issues, e.g., similar upper-case forms,
> the role of accents and equivalence.

With "the role of accents", do you mean the French case, where
accents may be removed on uppercasing?
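
A minimal sketch of the uppercasing issue raised here, assuming accents
are dropped by decomposing the uppercased string and discarding the
combining marks. The function name and the sample words are illustrative
only and are not taken from the draft.

```python
import unicodedata

def upper_without_accents(text: str) -> str:
    """Uppercase text, then drop combining marks (hypothetical helper)."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text.upper())
    # unicodedata.combining() returns 0 for base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Three distinct lowercase spellings.
words = ["cote", "c\u00f4te", "cot\u00e9"]
folded = {upper_without_accents(w) for w in words}
```

Under this folding the three spellings collapse to a single uppercase
form, which is why a case-insensitive comparison has to state how it
treats accents.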

> I think "white space" or spacing characters in general
> need to be addressed.

Yes, definitely. They all need to be prohibited.
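
A minimal sketch of such a prohibition, assuming "spacing characters"
means the Unicode separator categories plus the ASCII whitespace
controls. That reading and the function name are assumptions, not the
draft's definition.

```python
import unicodedata

# ASCII whitespace controls that carry no separator category of their own.
ASCII_SPACE_CONTROLS = "\t\n\r\x0b\x0c"

def contains_spacing_character(identifier: str) -> bool:
    """Return True if identifier holds any spacing character (hypothetical check)."""
    for ch in identifier:
        # Zs = space separator, Zl = line separator, Zp = paragraph separator.
        if unicodedata.category(ch) in ("Zs", "Zl", "Zp"):
            return True
        if ch in ASCII_SPACE_CONTROLS:
            return True
    return False
```

A check like this rejects the no-break space (U+00A0) and the
ideographic space (U+3000) as well as the ordinary ASCII space.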

> You need to decide whether you're doing canonicalization/normalization
> or just equivalence.

I have already decided on normalization; that is what the algorithms in
the draft are for. But I guess I need to state it more clearly.

> Equivalence is probably easier to define,
> and less politically sensitive, even though not as useful.

I think equivalence is not useful, because it puts the burden on
software that otherwise doesn't have any clue (and doesn't have
to have a clue) about internationalization.
Normalization is politically sensitive, but we either get
something working or something useless.
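
A minimal sketch of the difference, using Unicode NFC as a stand-in for
the draft's normalization algorithms, which are not reproduced in this
message. The function names are hypothetical.

```python
import unicodedata

def normalize_identifier(identifier: str) -> str:
    """Normalize once, where the identifier is created (NFC is an assumed stand-in)."""
    return unicodedata.normalize("NFC", identifier)

def equivalent(a: str, b: str) -> bool:
    """Equivalence: every comparing party must know the rules."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

composed = "caf\u00e9"     # precomposed e-acute
decomposed = "cafe\u0301"  # e followed by a combining acute accent

# Software with no knowledge of internationalization compares bytes.
naive_match = composed.encode("utf-8") == decomposed.encode("utf-8")

# After normalization at the source, the same byte comparison succeeds.
normalized_match = (
    normalize_identifier(composed).encode("utf-8")
    == normalize_identifier(decomposed).encode("utf-8")
)
```

With equivalence, every piece of software that compares identifiers has
to call something like `equivalent`. With normalization, only the
producer calls `normalize_identifier`, and everything downstream can
keep comparing bytes.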

Regards,	Martin.
