Re: I18N Concensus - Generic Syntax Document

"Martin J. Duerst" <mduerst@ifi.unizh.ch> Fri, 07 March 1997 17:05 UTC

Date: Fri, 07 Mar 1997 16:53:06 +0100
From: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
To: "Roy T. Fielding" <fielding@kiwi.ics.uci.edu>
Cc: URI List <uri@bunyip.com>
Subject: Re: I18N Concensus - Generic Syntax Document
In-Reply-To: <9703070729.aa02583@paris.ics.uci.edu>
Message-Id: <Pine.SUN.3.95q.970307164951.245J-100000@enoshima>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Sender: owner-uri@bunyip.com
Precedence: bulk

On Fri, 7 Mar 1997, Roy T. Fielding wrote:

> >> >+ It is recommended that UTF-8 [RFC 2044] be used to represent characters
> >> >+ with octets in URLs, wherever possible.
> >> >
> >> >+ For schemes where no single character->octet encoding is specified,
> >> >+ a gradual transition to UTF-8 can be made by servers making resources
> >> >+ available with UTF-8 names on their own, on a per-server or a
> >> >+ per-resource basis. Schemes and mechanisms that use a well-
> >> >+ defined character->octet encoding which is however not UTF-8 should
> >> >+ define the mapping between this encoding and UTF-8, because generic
> >> >+ URL software is unlikely to be aware of and to be able to handle
> >> >+ such specific conventions.

> >> I have no desire to add a UTF-8 character
> >> mapping table to our server.
> >
> >There is no need to do so. The above is only a *recommendation*.
> 
> Sorry, I misread the paragraph.  It would be clearer to say
> 
>    URL creation mechanisms that generate the URL from a source which
>    is not restricted to a single character->octet encoding are
>    encouraged, but not required, to transition resource names toward
>    using UTF-8 exclusively.
> 
>    URL creation mechanisms that generate the URL from a source which 
>    is restricted to a single character->octet encoding should use UTF-8
>    exclusively.  If the source encoding is not UTF-8, then a mapping
>    between the source encoding and UTF-8 should be used.

This is excellent! Let's go with it!
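
For illustration, the second paragraph of the proposed wording (a source
restricted to a single character->octet encoding that is not UTF-8) could
be sketched roughly as follows in Python. The function name
to_utf8_url_path and the choice of ISO-8859-1 as the source encoding are
assumptions made only for this example; nothing in the draft prescribes
them.

```python
from urllib.parse import quote, unquote_to_bytes


def to_utf8_url_path(raw: bytes, source_encoding: str) -> str:
    """Map octets in a known source encoding to a UTF-8 based URL path.

    Hypothetical helper: decode the octets with the source encoding,
    re-encode the characters as UTF-8, then %-escape every octet that
    is not an unreserved ASCII character ("/" is left as-is).
    """
    text = raw.decode(source_encoding)          # source octets -> characters
    utf8_octets = text.encode("utf-8")          # characters -> UTF-8 octets
    return quote(utf8_octets, safe="/")         # UTF-8 octets -> %XX escapes


# Example: a file name stored in ISO-8859-1 (assumed source encoding).
latin1_name = b"/r\xe9sum\xe9"
url_path = to_utf8_url_path(latin1_name, "iso-8859-1")

# A UTF-8 aware recipient can recover the characters from the escapes.
recovered = unquote_to_bytes(url_path).decode("utf-8")
```

Here the single ISO-8859-1 octet E9 becomes the two UTF-8 octets C3 A9,
each of which is escaped separately in the URL.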

Regards,	Martin.