Re: I18N Concensus - Generic Syntax Document
"Roy T. Fielding" <fielding@kiwi.ics.uci.edu> Fri, 07 March 1997 10:06 UTC
To: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
Cc: URI List <uri@bunyip.com>
Subject: Re: I18N Concensus - Generic Syntax Document
In-Reply-To: Your message of "Thu, 06 Mar 1997 20:40:08 +0100." <Pine.SUN.3.95q.970306203216.245a-100000@enoshima>
Date: Fri, 07 Mar 1997 01:37:25 -0800
From: "Roy T. Fielding" <fielding@kiwi.ics.uci.edu>
Message-Id: <9703070137.aa29868@paris.ics.uci.edu>
Sender: owner-uri@bunyip.com
Precedence: bulk

>+ It is recommended that UTF-8 [RFC 2044] be used to represent characters
>+ with octets in URLs, wherever possible.
>
>+ For schemes where no single character->octet encoding is specified,
>+ a gradual transition to UTF-8 can be made by servers make resources
>+ available with UTF-8 names on their own, on a per-server or a
>+ per-resource basis. Schemes and mechanisms that use a well-
>+ defined character->octet encoding which is however not UTF-8 should
>+ define the mapping between this encoding and UTF-8, because generic
>+ URL software is unlikely to be aware of and to be able to handle
>+ such specific conventions.

Here is where you lose me. I have no desire to add a UTF-8 character
mapping table to our server. An HTTP server doesn't need one -- its
URLs are either composed by computation (in which case knowing the
charset is not possible) or by derivation from the filesystem (in which
case it will use whatever charset the filesystem uses, and in any case
has no way of determining whether or not that charset is UTF-8). The
server doesn't care and should not care. It is therefore inappropriate
to suggest that it should add such a table when doing so would only
bloat the server and slow-down the URL<->resource mapping process.

>> Data corresponding to excluded characters must be escaped in order
>> to be properly represented within a URL. However, there do exist
>> some systems that allow characters from the "unwise" and "national"
>> sets to be used in URL references (section 3); a robust
>> implementation should be prepared to handle those characters when
>> it is possible to do so.
>
>Change to:
>
>There exist some systems that allow characters/octets from the
>"unwise" and "others" sets to be used in URL references (section 3).
>Until a uniform representation for characters within URLs is firmly
>established, such practice is not stable with respect to transcoding
>and therefore should be avoided.
>However, robust implementations should be prepared to handle those
>octet values when it is possible to do so.

No thanks -- the existing paragraph is far better. Transcoding is not
an issue unless they are already violating the specification, in which
case they are prepared to suffer the consequences. The purpose of the
paragraph is to prevent an implementer from interpreting the spec too
literally and crashing on a non-urlc character.

.....Roy
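
The charset-agnostic view argued above can be illustrated with a minimal Python sketch. It assumes a server that treats a URL path purely as octets: escapes are undone to raw bytes, and bytes taken from a filesystem are escaped without knowing their charset. The helper names `path_to_octets` and `octets_to_path` and the sample file name are hypothetical; they do not come from the thread or from any particular server.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def path_to_octets(url_path: str) -> bytes:
    """Undo %XX escapes and return raw octets; no charset is assumed.

    Unescaped characters outside the URL character set (for example
    '{' or '}') are passed through rather than rejected.
    """
    return unquote_to_bytes(url_path)


def octets_to_path(raw: bytes) -> str:
    """Escape every octet that is not an unreserved character or '/'."""
    return quote_from_bytes(raw, safe="/")


# Hypothetical example: the same name as stored by two filesystems
# that happen to use different charsets.
latin1_name = "/caf\u00e9".encode("latin-1")
utf8_name = "/caf\u00e9".encode("utf-8")

print(octets_to_path(latin1_name))  # /caf%E9
print(octets_to_path(utf8_name))    # /caf%C3%A9
```

Neither helper consults a character mapping table, so the two URLs differ only because the underlying octets differ. One caveat of this sketch: `unquote_to_bytes` encodes any raw non-ASCII character in a `str` argument as UTF-8, which is a choice made by the Python library and not something the draft under discussion requires.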