Re: [URN] Agenda for Washington Meeting -- Questions on URN

"Sam X. Sun" <ssun@CNRI.Reston.VA.US> Wed, 03 December 1997 08:24 UTC

Message-Id: <199712030824.DAA01249@newcnri.CNRI.Reston.Va.US>
From: "Sam X. Sun" <ssun@CNRI.Reston.VA.US>
To: Leslie Daigle <leslie@bunyip.com>
Cc: urn-ietf@bunyip.com
Subject: Re: [URN] Agenda for Washington Meeting -- Questions on URN
Date: Wed, 03 Dec 1997 03:23:49 -0500

In light of the Handle System draft (draft-sun-handle-system-00.txt), and
continuing on the earlier discussions with Roy, Martin, and Keith, I would
like to raise the following questions before we wrap up the working group:

1.	What is the definition of URN that the working group is addressing? Are
we defining it as a particular implementation limited to the "urn:" URL
scheme, or as a specification that should embrace all persistent,
location-independent global naming schemes?

	The current URN specification seems to limit URN to a particular URL
scheme, i.e. "urn:", which is really just one way of implementing
persistent, location-independent, universal resource names. URN in the
broader sense should not have its specification confused with any
particular implementation.

	For example, URL, as a specification for locating WWW resources,
embraces various implementations defined under their corresponding URL
schemes, including, but not limited to, "http:", "ftp:", and even "urn:".
The Handle System is another implementation of a persistent,
location-independent global naming scheme. It uses the URL scheme "hdl:" in
the WWW context, and is considered an implementation of URN in the broader
sense.
 
	It seems that URI is the URN in the broader sense. But the recent URI
draft (draft-fielding-uri-syntax-01.txt) defines URN as "the subset of URI
that are intended to remain globally unique and persistent even when the
resource ceases to exist or becomes unavailable". This makes it more
natural to think that URN, as addressed by the working group, is by no
means limited to a particular implementation.

2.	Most of the reserved/excluded characters of the current URN are really
restrictions inherited from some current URL schemes, mostly from the
"http:" URL. For URN name spaces defined independently of URL, like the
Handle System, they are not required, and should not be put into the URN
specification in the broader sense. For example, RFC 1738 (URL syntax) and
the new URI draft (draft-fielding-uri-syntax-01.txt) allow reserved/excluded
character sets to be defined on a per-scheme basis. Again, we are talking
about URN in the broader sense here. How it is to be done under the "urn:"
implementation is another issue.

3.	Some URN specifications specifically exclude support for user-friendly
names. While user-friendly names may not be appropriate in some cases,
there are situations where friendly names are desired or even required. The
underlying technology should not limit the use of user-friendly names.

	On top of this, for any global naming scheme to support friendly names,
it should not be limited to ASCII only, but should allow any native
characters to be used directly without hex encoding. Otherwise, it can only
support friendly names in English, but not in other languages like Greek,
Russian, Chinese, Japanese, Korean, etc. For example, how could anyone tell
that %C2%B7 is a friendly name in Greek? (Note: %C2%B7 is the hex-encoded
UTF-8 encoding of a Greek symbol, code point B7 defined in ISO-8859-7.)
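
	The hex encoding in the note above can be reproduced with a minimal
sketch, assuming Python 3 and its standard urllib.parse module (neither is
part of the original discussion). The Greek word used as a friendly name is
a hypothetical example chosen only for illustration.

```python
from urllib.parse import quote, unquote

# Byte 0xB7 interpreted as ISO-8859-7, the code point cited in the note above.
symbol = bytes([0xB7]).decode("iso8859_7")

# quote() percent-encodes the UTF-8 bytes of the character by default.
encoded = quote(symbol, safe="")
print(encoded)  # %C2%B7

# Software can reverse the encoding, but a human reader cannot do so at a glance.
assert unquote(encoded) == symbol

# Hypothetical friendly name in Greek (an assumption, not from the drafts).
friendly = "βιβλίο"
friendly_encoded = quote(friendly, safe="")
print(friendly, "->", friendly_encoded)
```

The second print shows how a short native-language name turns into a long
run of %XX triplets once it is restricted to ASCII.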

Most of these issues seem to have already been discussed on the mailing
list. It would be more helpful if they could be addressed in a formal
document, INCLUDING the reasoning behind their respective conclusions.
Digging into the huge mailing list archives is good for background checking,
but makes it hard to come up with a conclusive answer (by the way: could the
archive be threaded by topic, as in a newsgroup?).

Regards.
Sam
ssun@cnri.reston.va.us