Re: Using UTF-8 for non-ASCII Characters in URLs

Larry Masinter <masinter@parc.xerox.com> Fri, 02 May 1997 09:45 UTC

Message-ID: <3369AC9E.281F@parc.xerox.com>
Date: Fri, 02 May 1997 01:58:06 -0700
From: Larry Masinter <masinter@parc.xerox.com>
Organization: Xerox PARC
MIME-Version: 1.0
To: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
CC: URI mailing list <uri@bunyip.com>
Subject: Re: Using UTF-8 for non-ASCII Characters in URLs
References: <Pine.SUN.3.96.970501211303.245P-100000@enoshima>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: owner-uri@bunyip.com
Precedence: bulk

This is a great start at dealing with the issues that would
otherwise cause great confusion.

Other issues:
The bidi issues for RTL languages in conjunction with
normal punctuation used in and around identifiers. (Will
the identifiers present themselves 'correctly' without
these characters in all cases?)

Using UCS in identifiers that are normally "case insensitive"
in ASCII, and the issues that raises, e.g., similar upper-case
forms, and the role of accents in equivalence.
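
To make the case and accent issue concrete, here is a minimal
sketch. It assumes Python 3 and its standard unicodedata module,
neither of which is part of the proposal under discussion; the
helper strip_accents is hypothetical and only illustrates one
possible policy.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition (one possible policy)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

word = "stra\u00dfe"  # German "strasse" spelled with sharp s (U+00DF)

# Upper-casing changes the length: the sharp s becomes "SS".
upper = word.upper()

# Simple lower-casing does not match the "ss" spelling, full case folding does.
lower_matches = word.lower() == "strasse"
fold_matches = word.casefold() == "strasse"

# Whether accented and unaccented forms are "the same" is a policy choice.
unaccented = strip_accents("r\u00e9sum\u00e9")

print(upper, lower_matches, fold_matches, unaccented)
```

Which of these comparisons an identifier scheme should use is
exactly the open question; the sketch only shows that they give
different answers.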

I think "white space" or spacing characters in general
need to be addressed.

You need to decide whether you're doing canonicalization/normalization
or just equivalence. Equivalence is probably easier to define,
and less politically sensitive, even though not as useful.
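
The difference between the two choices can be sketched as
follows. This assumes Python 3 and uses Unicode normalization
form NFC as a stand-in for whatever rule would be chosen; the
function names canonicalize and equivalent are hypothetical.

```python
import unicodedata

def canonicalize(identifier: str) -> str:
    """Canonicalization: rewrite the identifier into one chosen form (NFC here)."""
    return unicodedata.normalize("NFC", identifier)

def equivalent(a: str, b: str) -> bool:
    """Equivalence: only say whether two identifiers match; neither is rewritten."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

composed = "caf\u00e9"     # e-acute as a single code point
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent

# The raw strings differ, so their UTF-8 bytes differ too.
raw_equal = composed == decomposed

# An equivalence rule treats them as the same identifier.
same_identifier = equivalent(composed, decomposed)

# A canonicalization rule also fixes which byte sequence goes on the wire.
wire_bytes = canonicalize(decomposed).encode("utf-8")

print(raw_equal, same_identifier, wire_bytes)
```

With equivalence alone, both spellings remain in circulation and
every comparer has to apply the rule; with canonicalization,
producers are told which single form to emit.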
