Re: Using UTF-8 for non-ASCII Characters in URLs
Dan Oscarsson <Dan.Oscarsson@trab.se> Wed, 30 April 1997 07:02 UTC
Date: Wed, 30 Apr 1997 08:52:17 +0200
From: Dan Oscarsson <Dan.Oscarsson@trab.se>
Message-Id: <199704300652.IAA09984@valinor.malmo.trab.se>
To: uri@bunyip.com, masinter@parc.xerox.com
Subject: Re: Using UTF-8 for non-ASCII Characters in URLs

> Since no one else has, here's a rough draft of a UTF-8 URL
> internet-draft, which I intend to submit in a few days time,
> after taking another pass on it.
>
>
> -----
> INTERNET-DRAFT                 Larry Masinter, Xerox Corporation
> draft-masinter-url-i18n-00xx   April 27, 1997
> Expires: October 27, 1997
> 3.2 Requirements for URL generation and interpretation
>
> Systems that are offering resources through the internet
> where those resources have logical names sometimes offer
> the ability to generate URLs for the resources they offer.
> For example, some HTTP servers offer the ability to
> generate a 'directory listing' for file directories
> under their purview, and then to respond to the generated
> URLs with the files. If the names of the files consist
> solely of US-ASCII characters, the transcription is
> simple, but other file systems offer a wider variety
> of characters. It is recommended that the generation
> of directories result in hex-encoded UTF-8 for non-USASCII
> characters in the listing, and that the interpretation
> of URLs accept both the raw UTF-8 or the hex-encoded version.
>

This is not right. A directory listing service generates an HTML document
that is sent back to the web browser. All URLs within an HTML document
should use the same character set as the document uses. That is, if the
document uses ISO 8859-1, the URLs will be in ISO 8859-1, and if the
document is in UTF-8, the URLs will be in UTF-8. If the browser knows how
to handle the character set of the HTML document, it should also know how
to translate the embedded URLs into UTF-8 when the user follows a link.

In general, URLs used without a context that defines the characters used
should be encoded using UTF-8. URLs used within a context where the
meaning of the characters is defined should use the character encoding
of the context.

   Dan
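
To make the translation step concrete, here is a minimal sketch in Python of what a browser might do when the user follows a link: it reads the URL bytes in the character set of the enclosing document and produces hex-encoded UTF-8 for the request. The function name `to_utf8_url` and the example path are assumptions made for illustration only; they come from neither the draft nor any particular browser.

```python
from urllib.parse import quote, unquote


def to_utf8_url(raw_url: bytes, document_charset: str) -> str:
    """Translate a URL embedded in a document into hex-encoded UTF-8.

    raw_url is the URL exactly as its bytes appear in the document;
    document_charset is the character set declared for that document.
    (Hypothetical helper, named here for illustration only.)
    """
    # Step 1: the context (the document) defines what the bytes mean.
    characters = raw_url.decode(document_charset)
    # Step 2: on the wire, non-ASCII characters become %XX-escaped UTF-8.
    return quote(characters, safe="/", encoding="utf-8")


# Hypothetical file name containing non-ASCII characters.
path = "/filer/smörgås.html"

# The same link as it would appear in an ISO 8859-1 page and in a UTF-8 page.
from_latin1_page = to_utf8_url(path.encode("iso-8859-1"), "iso-8859-1")
from_utf8_page = to_utf8_url(path.encode("utf-8"), "utf-8")

print(from_latin1_page)  # /filer/sm%C3%B6rg%C3%A5s.html
print(from_utf8_page)    # /filer/sm%C3%B6rg%C3%A5s.html

# A server that interprets the escapes as UTF-8 recovers the original name.
print(unquote(from_latin1_page) == path)  # True
```

Both pages yield the same URL on the wire, so the server only ever has to interpret UTF-8, while each document keeps its URLs in its own character set.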