Re: revised "generic syntax" internet draft

"Roy T. Fielding" <> Wed, 16 April 1997 10:43 UTC

Subject: Re: revised "generic syntax" internet draft
In-Reply-To: Your message of "Tue, 15 Apr 1997 12:55:28 PDT." <v03007830af796f9a9a4a@[]>
Date: Wed, 16 Apr 1997 03:01:11 -0700
From: "Roy T. Fielding" <>
Message-Id: <>
Precedence: bulk

Okay, I give up.  Can the people who are advocating change(s) to the
existing draft please communicate with each other and develop
at least one problem statement that covers what it is you want the
editors to fix, how you want it fixed, and an estimation of what
it will take to deploy the change?  In the past 24 hours I have been
told five different and conflicting goals:

   1. natural language URLs (i.e., non-ASCII, non-encoded URL strings)

   2. URLs that must always be UTF-8 in order to pass form data around
      in a deprecated manner that is somehow different than the charset
      of the form.

   3. URLs that are not natural language and always represented as
      ASCII, but are restricted to UTF-8 in order to avoid future
      transcoding problems [which itself is odd, since there are
      no transcoding problems if the URL is always represented as ASCII
      and never converted by the recipient].

   4. URLs that are always transmitted as ASCII, but must be encoded
      as UTF-8 so that browsers can display the URL in decoded form
      [this is the URN syntax compromise].

   5. URLs that are recommended to be UTF-8 but only if that's okay
      with the server, apparently so that more people will use UTF-8
      instead of some other charset.  [this is Martin's proposal]

I thought the goal was to solve 1 and 2 (and that 2 is already solved).
Number 3 appears to be a solution without a goal.  Number 4 is a solution
if you don't mind invalidating existing use of %xx encoding.
Number 5 appears to be a political statement, since it doesn't solve any
problem (at least with existing systems).
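
To illustrate why the charset behind %xx escapes matters for goals 3 and 4,
here is a minimal sketch, assuming Python's standard urllib.parse module and
a hypothetical path segment "café" (neither appears in the discussion itself).
It only shows that the same characters produce different escapes under
different charsets, and that a recipient who guesses the wrong charset
recovers the wrong text.

```python
from urllib.parse import quote, unquote

# Hypothetical non-ASCII path segment, chosen only for illustration.
name = "caf\u00e9"  # "café"

# The same characters percent-encode differently per charset.
utf8_url = quote(name, encoding="utf-8")      # bytes C3 A9 for the e-acute
latin1_url = quote(name, encoding="latin-1")  # byte E9 for the e-acute

# A recipient that assumes Latin-1 misreads the UTF-8 escapes as two characters.
misread = unquote(utf8_url, encoding="latin-1")

# A recipient that assumes UTF-8 cannot decode the Latin-1 escape at all;
# with errors="replace" the byte becomes U+FFFD, the replacement character.
replaced = unquote(latin1_url, encoding="utf-8", errors="replace")

print(utf8_url, latin1_url, repr(misread), repr(replaced))
```

Both encoded forms are plain ASCII on the wire, so nothing in the URL itself
tells the recipient which charset the escapes came from.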

It is pointless to argue about what will or will not solve the problem
when we are all talking about different problems.  Since you apparently
don't agree with my problem statements, please find one that you do
agree with so we can have some hope in hell of making progress.
I don't want to see eight different problem statements vaguely related
to UTF-8, just one that defines an actual protocol problem that has
to be fixed right now.  If possible, include a range of solutions so
that we can at least see that the alternatives have been considered.