Re: URL internationalization!
"Martin J. Duerst" <mduerst@ifi.unizh.ch> Wed, 26 February 1997 14:56 UTC
Received: from cnri by ietf.org id aa18075; 26 Feb 97 9:56 EST
Received: from services.Bunyip.Com by CNRI.Reston.VA.US id aa12457; 26 Feb 97 9:56 EST
Received: (from daemon@localhost) by services.bunyip.com (8.8.5/8.8.5) id JAA28823 for uri-out; Wed, 26 Feb 1997 09:04:25 -0500 (EST)
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1]) by services.bunyip.com (8.8.5/8.8.5) with SMTP id JAA28818 for <uri@services.bunyip.com>; Wed, 26 Feb 1997 09:04:22 -0500 (EST)
Received: from josef.ifi.unizh.ch by mocha.bunyip.com with SMTP (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA09580 (mail destined for uri@services.bunyip.com); Wed, 26 Feb 97 09:04:19 -0500
Received: from enoshima.ifi.unizh.ch by josef.ifi.unizh.ch with SMTP (PP) id <02236-0@josef.ifi.unizh.ch>; Wed, 26 Feb 1997 15:04:50 +0100
Date: Wed, 26 Feb 1997 15:04:48 +0100
From: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
To: Jonathan Rosenne <Jonathan_Rosenne@compuserve.com>
Cc: URI List <uri@bunyip.com>
Subject: Re: URL internationalization!
In-Reply-To: <199702251306_MC2-11B1-87E5@compuserve.com>
Message-Id: <Pine.SUN.3.95q.970226145545.245G-100000@enoshima>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Sender: owner-uri@bunyip.com
Precedence: bulk

On Tue, 25 Feb 1997, Jonathan Rosenne wrote:

> > As an example,
> > let's take a resource name with a G with breve (U+011E). Let's
> > assume that on the server, resource names are encoded in iso-8859-3.
> > Then the G with breve appears as %AB in a well-formed
> > URL. Now suppose somebody put that URL into an HTML document
> > that is encoded in iso-8859-3, in 8-bit form (i.e. the URL contains
> > the octet 0xAB for the G with breve character), and that that
> > document is correctly tagged as iso-8859-3.
> >
> > Now assume a browser sends a request with
> >     Accept-Charset: iso-8859-5
> > The server (or a proxy) translates the whole document from
> > iso-8859-3 to iso-8859-5 to honor the request of the browser.
> > The G with breve gets changed to 0xD0. The client receives
> > the 0xD0. If it "behaves the same as if it had received the
> > corresponding %XX", i.e. %D0, the URL will not work at all.
>
> I don't understand. What if the user uses 8859-8, which has no G-breve? I
> mean, what if it says Accept-Charset: iso-8859-8?

Then it depends on the sophistication of the transcoding
server/proxy. For (i18n) HTML, the obvious solution is to
replace the G-breve with the numeric character reference &#286;
(286 being the decimal value of U+011E).

For formats other than HTML, we might be out of luck. The server/proxy
may convert the character to a sequence %HH%HH corresponding to G-breve
in UTF-8, but only if it is sure that the G-breve is in a URL. And it is
much more difficult to decide what could be a URL in an arbitrary format
than to replace all unrepresentable characters by numeric character
references in HTML (which can be done irrespective of whether the
character is in a URL or in something else).

This is an additional reason why we should be careful with the
introduction of natively encoded URLs, and why I am refraining for
the moment from fully proposing it.

Regards,	Martin.
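
The two fallbacks described above (a numeric character reference in
HTML, or %HH%HH for the UTF-8 octets once the character is known to be
in a URL) can be sketched in Python. This is only an illustration, not
what any particular server or proxy does: the function names are
hypothetical, and it assumes Python's standard codecs for iso-8859-3
and iso-8859-8.

```python
from urllib.parse import quote

G_BREVE = "\u011e"  # LATIN CAPITAL LETTER G WITH BREVE


def transcode_html(text: str, target_charset: str) -> bytes:
    """Hypothetical helper: transcode HTML text to the target charset,
    writing characters the charset cannot represent as decimal numeric
    character references (&#NNN;)."""
    return text.encode(target_charset, errors="xmlcharrefreplace")


def percent_encode_utf8(text: str) -> str:
    """Hypothetical helper: percent-encode the UTF-8 octets of text,
    which is only safe when the text is known to be part of a URL."""
    return quote(text, safe="", encoding="utf-8")


# On the server, G-breve is the single octet 0xAB in iso-8859-3.
print(G_BREVE.encode("iso8859_3"))

# iso-8859-8 has no G-breve, so a plain transcoding fails ...
try:
    G_BREVE.encode("iso8859_8")
except UnicodeEncodeError:
    print("not representable in iso-8859-8")

# ... but in HTML it can be replaced by a numeric character reference.
print(transcode_html(G_BREVE, "iso8859_8"))

# Inside a URL, the UTF-8 octets can be written as %HH%HH instead.
print(percent_encode_utf8(G_BREVE))
```

The first fallback needs no knowledge of where URLs are in the
document; the second does, which is the difficulty the message points
out for formats other than HTML.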