Re: draft-seantek-rdf-urn-00 and draft-seantek-xmlns-urn-00

Sean Leonard <dev+ietf@seantek.com> Fri, 19 December 2014 11:12 UTC

Subject: Re: draft-seantek-rdf-urn-00 and draft-seantek-xmlns-urn-00
From: Sean Leonard <dev+ietf@seantek.com>
In-Reply-To: <87fvcep0t1.fsf@hobgoblin.ariadne.com>
Date: Fri, 19 Dec 2014 03:12:26 -0800
Content-Transfer-Encoding: quoted-printable
Message-Id: <69F5E07A-C1FC-4437-B099-064008059D59@seantek.com>
References: <87fvcep0t1.fsf@hobgoblin.ariadne.com>
To: "Dale R. Worley" <worley@ariadne.com>
X-Mailer: Apple Mail (2.1878.6)
Archived-At: http://mailarchive.ietf.org/arch/msg/urn-nid/4ZOJAreJfLEbgp9YguXiFEnd-qY
Cc: urn-nid@ietf.org

On Dec 17, 2014, at 10:51 AM, Dale R. Worley <worley@ariadne.com> wrote:

> Sean Leonard <dev+ietf@seantek.com> writes:
>> Thought about it. But urn:xmlns:foobar and urn:rdf:foobar are
>> preferable to urn:SOMENID:xmlns:foobar and urn:SOMENID:rdf:foobar. The
>> shorter the better--remember that we are competing with
>> http://example.com/foobar.
> 
> I was considering not syntactically distinguishing XML namespaces and
> RDF references, so both would be of the form urn:id:foobar.  ("id" is
> available as a NID!)  That would be even shorter.

It would be shorter. But my observation is that namespaces make the most sense when they uniformly identify a certain class of things, because the common semantics of the things can be assumed, catalogued, and retrieved in a systematic (automated) way.

Consider a category NID like “oid”. In that case, urn:oid:* stands for all sorts of things but each and every thing has an OID and associated OID semantics: a number, optional secondary identifiers, (since ~2005) Unicode identifiers, and other information that is stored in databases such as www.oid-info.com (Description, Information, registration data).

A converse example might be an organization-delegated NID like “ietf”. In that case, urn:ietf:* might stand for all sorts of “things” (XML namespaces, parameter values, documents, etc.), but we know that all of those things are “IETF things” and therefore there is one place to get the definitions: from the IETF.
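
As a rough illustration of the point (a sketch only; the function names and the AUTHORITY table are made up for this example and are not part of either draft), the NID alone is enough to tell a tool where the definition of a URN should be looked up:

```python
def split_urn(urn):
    """Split 'urn:<NID>:<NSS>' into (nid, nss). The NID is case-insensitive."""
    # Fewer than three colon-separated parts raises ValueError on unpacking.
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn" or not nid or not nss:
        raise ValueError("not a URN: %r" % urn)
    return nid.lower(), nss


# Hypothetical table: where a reader would go for definitions, keyed by NID.
AUTHORITY = {
    "oid": "an OID repository such as www.oid-info.com",
    "ietf": "the IETF",
}


def where_to_look(urn):
    """Return the place to ask about this URN, based only on its NID."""
    nid, _ = split_urn(urn)
    return AUTHORITY.get(nid, "unknown")
```

With a catch-all NID such as "id", no such table could be built, because the NID would say nothing about the class of thing or its authority.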

> 
> In regard to "remember that we are competing with
> http://example.com/foobar", I know of no such "competition".  Can you
> update the draft to explain why this is important?

Ok.

> 
>> Furthermore, urn:xmlns and urn:rdf represent distinct namespaces. The
>> (abstract) resources accessible via and relevant to urn:xmlns are
>> different from urn:rdf, and additional utility is gained by having
>> particular classes of resources made permanently accessible in those
>> distinct namespaces.
> 
> I don't see any value for syntactically distinguishing which URNs are
> XML namespace identifiers and which are RDF references, since the
> contexts in which the two would be used are disjoint.  Can you update
> the draft to explain why this is important?

Ok. (See above; I will elaborate in the next draft.)

> 
> For that matter, I don't see any value for ensuring the URNs are brief.
> To my knowledge, humans rarely input or output URNs.  The draft says
> "several URN namespace registrations have been proposed over the years,
> where the primary (yet only occasionally stated) purpose is to create
> short URIs for the registrant's XML namespaces".  But it seems to me
> that generally NIDs are created to delegate the ability to create URNs
> to some industry consortium, not to ensure that the consortium can
> create brief URNs.  Can you update the draft to explain why this is
> important?

Ok. (See above. To continue the examples, the IETF has several OIDs at its disposal, so it could easily mint unique URNs in urn:oid:1.3.6.1.* or elsewhere. But whether it’s vanity or something else, the IETF wanted urn:ietf. Note that there is also a urn:iso [RFC5141] even though ISO also has urn:oid:1.* which is way up at the top of the tree. Vanity or something else? You be the judge!)

> 
>>> In regard to the registration of the URIs associated with URNs and the
>>> resolution of the URNs into the associated URIs, let me propose that
>>> this data be retrievable from some suitable zone of the DNS along
>>> the following lines:
>> 
>> Thanks, it's a good idea.
>> 
>> However, I was thinking that the URIs be accessible at the resource(s)
>> at something starting with:
>> http://www.iana.org/assignments/urn-xmlns-names
>> http://www.iana.org/assignments/urn-rdf-names
> 
> Whatever the resolution mechanism is, it's clear that the method of
> accessing the resolution database is intrinsic to your proposal.  But
> your proposal gives no details as to how to access the resolution
> database, or how IANA would implement it.  The former is an intrinsic
> part of the definition of the namespace.  The latter is a critical piece
> of IANA infrastructure.  So you need to discuss these and ensure that
> there is a robust conversation about these questions.

Actually I would say that the resolution database is not intrinsic to the proposal. Such a database is extremely helpful (as www.oid-info.com is for OIDs) for implementers and designers, but it is not required. Both XML namespaces and RDF nodes have definitions that do not require a URI to be dereferenceable to particular resources: the only requirement is that the URI is unique. The fact that some URIs are retrievable helps when someone (typically an engineer, designer, or protocol analyst) is interrogating some unknown data item.
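
A small sketch of this, using Python's standard xml.etree.ElementTree: the parser treats the namespace name as an opaque string and never tries to fetch it. The "urn:xmlns:foobar" name follows the form proposed in the draft and is used here only as an example; it is not a registered URN.

```python
import xml.etree.ElementTree as ET

# The namespace name is just a string; nothing is resolved or downloaded.
doc = '<f:item xmlns:f="urn:xmlns:foobar">hello</f:item>'
root = ET.fromstring(doc)

# ElementTree exposes the qualified name as '{namespace-name}local-name'.
namespace, local_name = root.tag[1:].split("}", 1)
```

Two elements are in the same namespace exactly when those strings are equal, which is why uniqueness is the only hard requirement.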

> 
> In regard to comparison with using "http://example.com/foobar", the
> latter has the advantage that *IANA* doesn't have to maintain resolution
> information, the URL itself can be used to fetch resolution information,
> and the server for the information can be maintained by someone else.

Or it could be maintained by IANA. Domains come and go, but IANA, like diamonds, is forever. :)

As pointed out above, it’s not really expected that the database is going to get billions of hits per hour. We are not talking about the DNS root servers here. We’re talking about a convenient way for analysts (i.e., software engineers) to look up data to figure out what urn:rdf:elephant means. Basically it’s on the order of the Private Enterprise Numbers “database”, which is just a flat text file delimited with newlines. <http://www.iana.org/assignments/enterprise-numbers/enterprise-numbers>
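
To give a sense of how little machinery that needs, here is a minimal sketch of a lookup against a newline-delimited flat file. The file layout (one "name<TAB>description" entry per line), the sample entries, and the function names are all assumptions made up for illustration; neither the draft nor IANA defines such a format.

```python
# Hypothetical registry contents: one "name<TAB>description" entry per line.
REGISTRY_TEXT = """\
elephant\tExample entry for illustration only
giraffe\tAnother made-up entry
"""


def load_registry(text):
    """Parse the flat file into a dict of name -> description."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, description = line.partition("\t")
        entries[name.strip()] = description.strip()
    return entries


def describe(urn, registry, nid="rdf"):
    """Return the description for e.g. 'urn:rdf:elephant', or None if absent."""
    prefix = "urn:%s:" % nid
    if not urn.lower().startswith(prefix):
        return None
    return registry.get(urn[len(prefix):])
```

An analyst's tool would fetch the file once, cache it, and answer lookups locally, so the load on the server stays small.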

Sean