Re: draft-fielding-url-syntax-05.txt

Larry Masinter <masinter@parc.xerox.com> Fri, 02 May 1997 19:35 UTC

Message-ID: <336A3D11.73A8@parc.xerox.com>
Date: Fri, 2 May 1997 12:14:25 PDT
From: Larry Masinter <masinter@parc.xerox.com>
Organization: Xerox PARC
X-Mailer: Mozilla 3.01Gold (Win95; I)
MIME-Version: 1.0
To: Chris Newman <Chris.Newman@innosoft.com>
CC: IETF URI list <uri@bunyip.com>
Subject: Re: draft-fielding-url-syntax-05.txt
References: <Pine.SOL.3.95.970502110058.18890J-100000@eleanor.innosoft.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-uri@bunyip.com
Precedence: bulk

# I find this much too wishy-washy.

Not every section of a document can explicitly forbid
everything that is forbidden. In general, standards
documents work best when they say "how do I use this"
rather than listing lots of rules.

# I think we should explicitly forbid the use of 8-bit characters

I think this document is clear that URLs are currently
written with a limited repertoire of characters that
is a subset of US-ASCII. That subset does not include
"8-bit characters" or "9-bit characters" or "38-bit characters".
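
To illustrate (a rough sketch, not text from the draft): the check
below only tests the outer bound, US-ASCII, which is coarser than the
repertoire the draft actually defines. The function name and the
example URLs are hypothetical.

```python
def outside_us_ascii(url: str) -> list[str]:
    """Return the characters in `url` that fall outside US-ASCII.

    Assumption: this tests only the 7-bit bound; the draft's URL
    repertoire is a smaller subset of US-ASCII than this accepts.
    """
    return [ch for ch in url if ord(ch) > 0x7F]


# A raw 8-bit character is outside the repertoire...
print(outside_us_ascii("http://example.com/caf\u00e9"))
# ...while its hex-encoded form uses only US-ASCII characters.
print(outside_us_ascii("http://example.com/caf%E9"))
```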

# and hex-encoded 8-bit characters

I think it would be incorrect to disallow hex-encoded
8-bit octets that didn't actually correspond to characters,
e.g., in guid schemes, in FTP URLs for FTP servers that
*don't* implement UTF-8, etc. So, no.
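
A small sketch of the kind of case I mean, using Python's standard
urllib.parse. The octet values and the "guid:" prefix are made up for
illustration; they are not taken from any scheme definition.

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical binary identifier: four octets that are not characters
# in any charset (0xFF can never appear in well-formed UTF-8).
raw = bytes([0xFF, 0xFE, 0x00, 0x80])

# Hex-encode every octet; safe="" leaves nothing unescaped.
encoded = quote(raw, safe="")
url = "guid:" + encoded  # illustrative prefix only

# The written URL stays within US-ASCII even though the octets do not.
print(url, url.isascii())

# The original octets are recoverable without assuming any charset.
print(unquote_to_bytes(encoded) == raw)

# Treating the same octets as UTF-8 text fails.
try:
    raw.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False
print(decodes_as_utf8)
```

Forbidding hex-encoded 8-bit octets outright would rule out URLs like
this one, which carry data rather than text.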

# except as defined by the future I18N URL standard.

The future standard will set the standard for the future.
All this document says is that it doesn't set that standard.

# We need to make it very clear that programs sending 8-bit URLs over
# the wire are broken (unless they use UTF8 according to the future
# standard).

The purpose of this document is to define the standard for
how URLs work, and not to 'send a message' about a future
standard. Martin and I have actually started work on the
'message', and if you want to help 'send a message' about UTF8,
I invite you to actually help craft the 'message'.

When we have a standard for UTF8 URLs, we'll have a standard
for UTF8 URLs. But that's the only message that you can
send that will have any meaning.

Larry