Re: revised "generic syntax" internet draft

"Roy T. Fielding" <> Fri, 18 April 1997 16:10 UTC

To: "Martin J. Duerst" <>
Subject: Re: revised "generic syntax" internet draft
In-Reply-To: Your message of "Thu, 17 Apr 1997 12:55:34 +0200." <Pine.SUN.3.96.970417124447.708G-100000@enoshima>
Date: Fri, 18 Apr 1997 08:19:05 -0700
From: "Roy T. Fielding" <>
Message-Id: <>
Precedence: bulk

>To Roy Fielding, I would suggest that he (re)reads Francois'
>web page on international URLs and my original proposal that
>started this discussion (both available via

Martin, I haven't forgotten about your very detailed problem statement
at <>.  My question was
whether all the other people advocating non-ASCII URLs agree with that
problem statement, and in particular with the course of action for the
current draft revision.  The problem I am having is that every time
I explain why one solution won't work, people defend it by describing
the merits of some other solution (or even some other problem).  It seems
to me that if you can't agree on what problem is being solved, then
arguing about a solution is pointless.

>and looks into the way configuration information can be
>setup for Apache to inform it about special needs of scripts
>and stuff, before he again claims things to be impossible.

It is impossible for Apache to correctly transcode incoming URLs for the
same reason that it is impossible for current browsers to decode and display
the encoded octets of received URLs -- a program cannot transcode bytes to
a different charset unless it knows how the bytes are currently encoded.
There is nothing you can do in the Apache configuration to change that
fact, since it is a property of how the URL is generated (either by some
other part of the server or some part of the user agent or some author
of any page in the Web).
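
To make the point concrete, here is a minimal sketch in Python (not Apache
code; the helper name `interpretations` and the sample string "caf%E9" are
made up for illustration).  It decodes the same escaped octets under several
candidate charsets and shows that nothing in the bytes themselves says which
reading the originator intended.

```python
from urllib.parse import unquote_to_bytes


def interpretations(encoded_part, charsets):
    """Decode the octets of a %-escaped URL part under each candidate charset.

    Hypothetical helper for illustration only. Returns the raw octets and a
    dict mapping charset name to the decoded text, or None when the octets
    are not valid in that charset.
    """
    raw = unquote_to_bytes(encoded_part)
    readings = {}
    for name in charsets:
        try:
            readings[name] = raw.decode(name)
        except UnicodeDecodeError:
            readings[name] = None
    return raw, readings


# The receiver sees only the octets; the charset label is not in the URL.
raw, readings = interpretations("caf%E9", ["iso-8859-1", "koi8-r", "utf-8"])

for name, text in readings.items():
    print(name, repr(text))
```

The ISO-8859-1 and KOI8-R readings are both well-formed but differ, and the
UTF-8 reading fails outright.  A receiver could only pick the right one if
it were told the charset out of band, and the URL carries no such label.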

I say that not as an average user of Apache, but as a co-founder and core
member of the Apache Group who has been working on that server code
for over two years.  If you think I am wrong, you should at least be
able to justify your remarks.

I think there is a way to define UTF-8 preference for URL encoding
such that it won't break existing services, by forbidding transcoding
of already-encoded octets.  However, I won't bother to explain that
until there is broad agreement on what needs to be solved.
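
For illustration only, one possible reading of "forbidding transcoding of
already-encoded octets" can be sketched in Python as below.  This is an
assumption about what such a rule might look like, not the scheme alluded
to above; the function name `escape_preferring_utf8` is hypothetical.

```python
def escape_preferring_utf8(url_text):
    """Escape raw non-ASCII characters as UTF-8 octets.

    Assumed rule, for illustration: any ASCII character, including an
    existing %XX escape, is passed through unchanged, so octets that
    were already encoded are never reinterpreted or transcoded.
    """
    out = []
    for ch in url_text:
        if ord(ch) < 0x80:
            # ASCII (and therefore every existing %XX triplet) is left alone.
            out.append(ch)
        else:
            # Only characters not yet escaped get the UTF-8 preference.
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)


# "%E9" was escaped by someone else in an unknown charset and is kept as-is;
# the raw "\u00ef" is new and is escaped as its UTF-8 octets.
print(escape_preferring_utf8("caf%E9/na\u00efve"))
```

Under this assumed rule an existing service that expects "%E9" still
receives exactly that, while newly generated escapes use UTF-8.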

 ...Roy T. Fielding
    Department of Information & Computer Science    (
    University of California, Irvine, CA 92697-3425    fax:+1(714)824-1715