Re: revised "generic syntax" internet draft

Edward Cherlin <> Fri, 25 April 1997 07:47 UTC

Message-Id: <v03007834af85fb0914ed@[]>
In-Reply-To: <>
References: <v0300780baf8466ef2424@[]> (message from Edward Cherlin on Thu, 24 Apr 1997 00:09:11 -0700)
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 24 Apr 1997 23:25:10 -0700
From: Edward Cherlin <>
Subject: Re: revised "generic syntax" internet draft
Precedence: bulk

"Karen R. Sollins" <> wrote:


>   I have said at least ten times in this discussion, with no acknowledgement
>   from anyone, that we are to assume that people will not publish Unicode
>   URLs without knowing that their servers support them.
>   If I am going to create an ftp: site, and I don't check what version of
>   what ftp server I'm using, I'm a fool, and likewise for gopher: and telnet:
>   and the others. If I put out an https: URL and I don't have a secure server
>   to receive it, I'm a fool. If I intend to accept encoded UTF-8, I need to
>   find out how my server can deal with it. If I don't intend to accept it, I
>   can regard encoded UTF-8 in URLs as plain ASCII, without breaking any
>   process that is not already broken.
>   --
>   Edward Cherlin

>Edward and everyone,
>I have tried VERY hard to stay out of this discussion, but I now have
>to ask a question as suggested by the extraction above.  Must one
>conclude from a position of supporting encoding of character sets in
>UTF-8 that the server at the site of the resource MUST be of a certain
>flavor supporting that character set, and furthermore that perhaps the
>general practice will be that each server will only support one or a
>small number?

This is similar to the need for a server that can handle CGI scripting
before publishing pages with forms and CGI scripts. We must implement the
proposal in software, and deploy the software.

So those publishing UTF-8 URLs will have to use servers capable of serving
them. Such a server does one of two things. Either it translates the %HH
escapes to UTF-8 octets (and presumably on to Unicode characters) and then
looks up the requested resource, or it accepts the %HH-encoded string as
plain ASCII and looks that up unchanged. Every server is capable of one or
the other strategy, since the second strategy is what we do today in ASCII.
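
To make the two strategies concrete, here is a minimal sketch in Python. It
is only an illustration under stated assumptions: the resource tables, the
example path, and the function names lookup_decoded and lookup_opaque are
invented for this message and do not describe any particular server. The
only library call used is urllib.parse.unquote_to_bytes from the Python
standard library.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical resource tables, invented for illustration only.
# A UTF-8-aware server stores names as Unicode strings.
DECODED_RESOURCES = {"/caf\u00e9/menu.txt": "menu from a UTF-8-aware server"}
# An ASCII-only server stores the %HH-encoded name exactly as published.
OPAQUE_RESOURCES = {"/caf%C3%A9/menu.txt": "menu from an ASCII-only server"}


def lookup_decoded(path):
    """Strategy 1: %HH escapes -> octets -> UTF-8 -> Unicode, then look up."""
    octets = unquote_to_bytes(path)
    try:
        name = octets.decode("utf-8")
    except UnicodeDecodeError:
        return None  # the escapes did not form valid UTF-8
    return DECODED_RESOURCES.get(name)


def lookup_opaque(path):
    """Strategy 2: treat the %HH-encoded path as plain ASCII, look it up as-is."""
    return OPAQUE_RESOURCES.get(path)


if __name__ == "__main__":
    url_path = "/caf%C3%A9/menu.txt"
    print(lookup_decoded(url_path))
    print(lookup_opaque(url_path))
```

Note that the second strategy compares the escaped string byte for byte, so
in this sketch a request spelled with %c3%a9 in lower case would not match
the table entry. Whether a real server normalizes such differences is a
design choice the sketch leaves open.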

>With no general solution implemented globally, those
>with less popular character sets (this often goes hand in hand with
>less technology and less economic strength) are much more likely to be
>left out in the cold.  So much for general internationalization,
>unless this means only internationalization for the larger, richer

Those with less popular character sets are out in the cold today. Unicode
will bring them in from the cold, since it is a general solution that has
fairly wide implementation (in Windows NT, Macintosh OS, several flavors of
UNIX, Java, and so on, and in applications such as Microsoft Office 97 and
Alis Tango Web browser).

There will be ample provision of fonts for every supported script.
Bitstream has done us the very great service of creating a *free* font
supporting all scripts presently included in Unicode. High quality fonts
exist today for every commercially important script, and many that are not
commercially important, and they can easily have Unicode code points added.

There is no hope of getting every legacy character encoding incorporated
into modern software by any means other than Unicode.

>			Karen Sollins
>Karen R. Sollins
>Research Scientist				Phone: 617/253-6006
>M.I.T. Laboratory for Computer Science		Fax:   617/253-2673
>545 Technology Square
>Cambridge, MA 02139

Edward Cherlin
Vice President
NewbieNet, Inc.
Ask. Someone knows.

Everything should be made as simple as possible, __but no simpler__.
Attributed to Albert Einstein