Re: revised "generic syntax" internet draft

Larry Masinter <> Tue, 15 April 1997 18:33 UTC

Date: Tue, 15 Apr 1997 10:30:53 -0700
From: Larry Masinter <>
Organization: Xerox PARC
X-Mailer: Mozilla 3.01Gold (Win95; I)
Mime-Version: 1.0
To: Gary Adams - Sun Microsystems Labs BOS <>
Subject: Re: revised "generic syntax" internet draft
References: <libSDtMail.9704151153.13046.gra@zeppo>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Precedence: bulk

> Are there any "facts" still in need of investigation 
> or are the only unresolved issues questions of "opinion"? (My opinion
> is that the current system is already broken, if this could be 
> substantiated would that invalidate the "status quo" as a viable
> alternative?)

At this point, I think we need not just "facts" but some
actual "design". Exactly how does this all work in a way that
actually solves the problem?

Let's suppose someone wants to publish information
about their product and put up a URL in a magazine.

a) what URLs do they support in their server?
b) what gets printed in the magazine?
c) what does the user type into the browser?
d) what does the browser do with what the user typed
   in order to turn it into the URL that was generated in (a)?

How does this work for:
  1) Japanese (16-bit characters)
  2) Hebrew (right to left)

What happens with "/" and the path components? How does
directionality get represented? What are the considerations
for ambiguity beyond the familiar 0O0O0O1l1l1l for ASCII?

When the details of this are worked out, and we actually
have something that works to allow non-ASCII URLs, then
we can look and see if %xx-hex encoded UTF-8 encoded Unicode
actually forms part of the solution. But it doesn't seem
"trivial" to me, or at all certain that the current proposal
is actually part of the solution.
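
For concreteness, here is a minimal sketch of only the mechanical step
under discussion: taking Unicode path text, encoding it as UTF-8, and
writing each non-ASCII byte as %xx hex. It uses Python's standard
urllib.parse module. The function names and the sample strings are
illustrative assumptions, not anything taken from the draft, and the
sketch says nothing about questions (a) through (d) above or about
directionality.

```python
from urllib.parse import quote, unquote


def to_percent_encoded_utf8(path: str) -> str:
    """Encode a path as UTF-8 bytes, then write non-ASCII bytes as %XX.

    "/" is left alone so the path components stay visible.
    (Hypothetical helper name, for illustration only.)
    """
    return quote(path, safe="/", encoding="utf-8")


def from_percent_encoded_utf8(encoded: str) -> str:
    """Reverse the step above: %XX -> bytes -> UTF-8 decoded text."""
    return unquote(encoded, encoding="utf-8")


# Sample inputs (assumptions): two Japanese characters and a Hebrew word,
# written as escapes so the source itself stays ASCII.
japanese_path = "/\u65e5\u672c"
hebrew_path = "/\u05e9\u05dc\u05d5\u05dd"

japanese_url_path = to_percent_encoded_utf8(japanese_path)
hebrew_url_path = to_percent_encoded_utf8(hebrew_path)

print(japanese_url_path)  # /%E6%97%A5%E6%9C%AC
print(hebrew_url_path)    # /%D7%A9%D7%9C%D7%95%D7%9D
```

Note that the encoded Hebrew path lists bytes in logical (storage)
order; how that relates to what a reader sees printed right to left is
exactly the kind of detail the encoding step alone does not settle.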