Re: revised "generic syntax" internet draft

Francois Yergeau <> Mon, 14 April 1997 04:18 UTC

Date: Sun, 13 Apr 1997 23:54:47 -0400
To: "Roy T. Fielding" <>
From: Francois Yergeau <>
Subject: Re: revised "generic syntax" internet draft

At 14:52 11-04-97 -0700, Roy T. Fielding wrote:
>The only question that matters is whether or not the draft as it
>currently exists is a valid representation of what the existing
>practice is

The current spec doesn't do that.  Non-ASCII characters are routinely
rolled into URLs, yet the spec doesn't define the mapping.  IMHO, the spec
is not worthy of becoming a Draft Standard; in fact, it doesn't even meet
one of the requirements for Proposed Standard (from RFC 2026):

   A Proposed Standard should have no known technical omissions
   with respect to the requirements placed upon it.
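
To make the omission concrete, here is a minimal sketch in Python (the
helper name percent_encode and the sample path are assumptions chosen for
illustration, not anything taken from the draft).  It shows the same
non-ASCII character producing two different escaped URLs depending on
which charset is assumed, which is exactly the mapping the spec leaves
undefined:

```python
from urllib.parse import quote

def percent_encode(text, charset):
    """Escape text for use in a URL path, encoding non-ASCII via `charset`.

    Hypothetical helper for illustration only.
    """
    return quote(text, safe="/", encoding=charset)

# The same path, escaped under two different charset assumptions.
latin1_form = percent_encode("/café", "iso-8859-1")
utf8_form = percent_encode("/café", "utf-8")

print(latin1_form)  # /caf%E9
print(utf8_form)    # /caf%C3%A9
```

Without a defined mapping, a receiver seeing %E9 or %C3%A9 cannot know
which character was meant.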

> and what the vendor community agrees is needed in the
>future to support interoperability.

I'm not aware that the Internet standards process excludes non-vendors.

>Since it is my opinion that it is NEVER desirable
>to show a URL in the unencoded form given in Francois' examples,
>you cannot claim to hold anything even remotely like consensus. 

A bit preposterous, isn't it?  *Your* opinion alone is enough to break any
consensus?

I also happen to disagree with this particular opinion.  ASCII characters
are not the only ones worth displaying.  User-friendliness should not be
the exclusive preserve of ASCII users.

>IF you can persuade the creators of URLs to always use UTF-8, which
>is definitely not the case today (Apache, NCSA, and CERN servers all
>use whatever charset is used by the underlying filesystem, which on
>most Unix-based systems is iso-8859-1 or iso-2022-*), ...

It is interesting that you should use this argument.  Yes, Apache, NCSA and
CERN all use the platform's charset for mapping filenames to URLs (which
can be remedied by a simple script, BTW).
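
As a rough idea of what such a script could look like, here is a sketch in
Python.  Both function names and the iso-8859-1 default are assumptions
made for the example; none of the servers mentioned ships this code.  One
function turns a filename stored in the platform's charset into a UTF-8
escaped URL path, and the other maps such a path back to the bytes the
filesystem expects:

```python
from urllib.parse import quote, unquote

def filename_to_utf8_url(name_bytes, fs_charset="iso-8859-1"):
    """Map a filesystem name (raw bytes in fs_charset) to a UTF-8 escaped URL path."""
    text = name_bytes.decode(fs_charset)
    return quote(text, safe="/", encoding="utf-8")

def utf8_url_to_filename(url_path, fs_charset="iso-8859-1"):
    """Map a UTF-8 escaped URL path back to filesystem bytes in fs_charset."""
    text = unquote(url_path, encoding="utf-8", errors="strict")
    return text.encode(fs_charset)

# Round trip for a Latin-1 filename containing two e-acute characters.
original = b"/docs/r\xe9sum\xe9.html"
url = filename_to_utf8_url(original)
print(url)  # /docs/r%C3%A9sum%C3%A9.html
restored = utf8_url_to_filename(url)
```

The errors="strict" setting makes a path that is not valid UTF-8 fail
loudly instead of being silently mangled.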

But these three also transmit documents in the charset that is found in the
document (transparency, no transcoding), yet you claimed loudly in the HTTP
WG that they somehow defaulted to ISO 8859-1, and insisted strongly that
this fictitious default charset remain in the HTTP/1.1 spec.

In both cases the major servers behave transparently with respect to
character encoding, in one case to filenames, in the other to document
contents.  But
we have two different conclusions: the servers do not support UTF-8 URLs,
but they somehow manage to uphold the official ISO 8859-1 default document
charset. Go figure!

François Yergeau <>
Alis Technologies Inc., Montréal
Tel: +1 (514) 747-2547
Fax : +1 (514) 747-2561