Re: revised "generic syntax" internet draft

Chris Newman <Chris.Newman@innosoft.com> Wed, 16 April 1997 00:23 UTC

Date: Tue, 15 Apr 1997 17:12:09 -0700
From: Chris Newman <Chris.Newman@innosoft.com>
Subject: Re: revised "generic syntax" internet draft
In-Reply-To: <9704151612.aa22167@paris.ics.uci.edu>
To: "Roy T. Fielding" <fielding@kiwi.ics.uci.edu>
Cc: IETF URI list <uri@bunyip.com>
Message-Id: <Pine.SOL.3.95.970415164238.22015V-100000@eleanor.innosoft.com>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Sender: owner-uri@bunyip.com
Precedence: bulk

On Tue, 15 Apr 1997, Roy T. Fielding wrote:
> >(3) whatever localized character set is in use
> >
> >(3) Never works, because it doesn't interoperate.  It results in a bunch
> >of islands which can't communicate, except via US-ASCII.
> 
> But that is what Martin said he wanted -- the ability of an author to
> decide what readership is most important.  Why is it that it is okay
> to localize the address, but not to localize the charset?

I can't speak for Martin.  But if I understand what you're
saying, my response is that people want to use their own language in URLs
and will do so whatever the standard says.  If we define a standard way
for them to include their national characters in such a way that those
characters won't be misinterpreted by the recipient, then we've achieved
interoperability.  That's the goal of protocol design.

> >(5) Works fine, and has potential to be easier to support than (4).
> 
> Excuse me, but it doesn't work at all unless all systems use the same
> charset for encoding URLs.  Since that is not the case today, we would
> have to scrap all existing servers and browsers in order for (5) to work.
> In other words, it is not an acceptable solution to those of use who
> have to implement the specified protocol.

I don't think any of the programs which display URLs try to interpret hex
encoded %80 - %FF.  So no URL display programs will break.  Now if there's
a URL entry program which permits non-ASCII characters and maps them to
%80 - %FF using local conventions, that program will break.  But that
program is also already in violation of the current specification (which
restricts URLs to US-ASCII).  Therefore the only software which is forced
to upgrade by this change is software which already violates the standard.
If anything, that's an argument to make this change.

So the transition plan is simple:

(A) URL entry programs (which currently are restricted to US-ASCII by the
specification) are upgraded so they map non-ASCII characters to hex
encoded UTF-8.

(B) URL display programs are upgraded so they map hex encoded UTF-8 to the
correct display characters.

(C) URL display programs which aren't upgraded just show hex encoded
UTF-8, as they do today.
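
Steps (A) through (C) can be sketched roughly as follows.  This is only
an illustration, assuming Python's standard urllib.parse module; the
function names entry_encode and display_decode are hypothetical and do
not come from any specification.

```python
from urllib.parse import quote, unquote


def entry_encode(text: str) -> str:
    """(A) URL entry: map non-ASCII characters to hex-encoded UTF-8."""
    # ASCII letters and digits pass through; "/" and ":" are kept
    # literal here as an assumption, only to keep the example readable.
    return quote(text, safe="/:", encoding="utf-8")


def display_decode(url: str) -> str:
    """(B) URL display: map hex-encoded UTF-8 back to characters."""
    try:
        return unquote(url, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        # Octets that are not valid UTF-8 are left as hex, which is
        # also what an un-upgraded display program (C) shows.
        return url


encoded = entry_encode("/caf\u00e9")   # "/caf%C3%A9"
shown = display_decode(encoded)        # "/caf" followed by U+00E9
legacy = display_decode("/caf%E9")     # not UTF-8, stays "/caf%E9"
```

A program that has not been upgraded simply never calls display_decode
and keeps showing the hex form, so nothing that works today stops
working.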

> (3) does move toward (5).  It even becomes (5) when people are using UTF-8.

(4) can move towards (5), but (3) can't.  With unlabelled character sets
you just get interoperability problems.  Look at it this way: if fred and
sam are using localized character set thingbats, and fred tries to
transition to UTF-8, all of a sudden fred and sam are completely unable to
communicate and see garbage at the other end.  A transition is only
achievable if the character set is labelled.
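
The fred and sam case can be reproduced in a few lines.  This is a
sketch under stated assumptions: ISO-8859-1 ("latin-1") stands in for
the made-up local character set, the label is modelled as a plain
charset name passed alongside the URL, and the helper names are
hypothetical.

```python
from urllib.parse import quote, unquote

LOCAL_CHARSET = "latin-1"  # assumed stand-in for the local charset


def fred_sends(text: str) -> str:
    # fred has already moved to UTF-8.
    return quote(text, encoding="utf-8")


def sam_reads_unlabelled(url: str) -> str:
    # sam has no label, so he assumes his own local charset.
    return unquote(url, encoding=LOCAL_CHARSET)


def sam_reads_labelled(url: str, charset: str) -> str:
    # With a label, sam decodes with the charset the sender used.
    return unquote(url, encoding=charset)


url = fred_sends("caf\u00e9")                   # "caf%C3%A9"
garbage = sam_reads_unlabelled(url)             # "caf" + U+00C3 + U+00A9
correct = sam_reads_labelled(url, "utf-8")      # "caf" + U+00E9
```

The octets on the wire are the same in both cases; only the label
tells sam which interpretation is the right one.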

Any time a spec either implicitly or explicitly says X is implementation
defined, it is promoting a non-interoperable solution.  The URL spec
currently leaves the interpretation of %80 - %FF as implementation
defined.
