[urn] Terminology issue: 'URL'

ht@inf.ed.ac.uk (Henry S. Thompson) Thu, 17 November 2016 15:31 UTC

To: Peter Saint-Andre <stpeter@stpeter.im>
References: <3bee74ce-c5ce-3f28-5cc9-437f34b778e9@stpeter.im> <f5bvavmuu5w.fsf@troutbeck.inf.ed.ac.uk>
From: ht@inf.ed.ac.uk
Date: Thu, 17 Nov 2016 15:31:27 +0000
In-Reply-To: <f5bvavmuu5w.fsf@troutbeck.inf.ed.ac.uk> (Henry S. Thompson's message of "Thu\, 17 Nov 2016 13\:12\:43 +0000")
Message-ID: <f5b60nmunqo.fsf_-_@troutbeck.inf.ed.ac.uk>
Archived-At: <https://mailarchive.ietf.org/arch/msg/urn/jw4jgZvljo1cOpJMjqYaubhbnyI>
Cc: urn@ietf.org
Subject: [urn] Terminology issue: 'URL'

TL;DR: Replace 'resolve to a URL' with 'resolve to a Locator'
throughout

The current draft (correctly, IMO) defines 'URN' as "A name that is a
member of a URN namespace", that is, a name of the form urn:nid:...

It doesn't list 'URL' as a term, but does say, towards the end of
section 1.2 *Terminology* 

  this document uses the term Uniform Resource Locator (URL), rather
  than the generic term Uniform Resource Identifier (URI), to refer to
  locators; see also Section 1.1.3 of [RFC3986].

I think this either partakes of, or at least encourages, a confusion
between properties of URI schemes and properties of individual URIs,
thereby missing the key point of the referenced section of 3986:

  An individual scheme does not have to be classified as being just
  one of "name" or "locator".  Instances of URIs from any given scheme
  may have the characteristics of names or locators or both... [1]

That is, although many http: URIs are locators, some are names (and
some are both), and although all urn: URIs are names, some may be
locators as well.

A URI is a locator if it can be dereferenced, if, for example "[there
are] services related to the identified resource, such as
... delivering a document object from a convenient source" [2].
That's not a property of the URI as such, but of its context of use
and the services available therefrom.  As that quote indicates, some
URNs may have that property in some circumstances.

On the other hand, just because a resolution service provides you with
an http: URI when invoked on a URN does not mean you are now on your
way to a document object, because that http: URI may itself be a name
and not a locator.
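
To make the scheme-versus-instance distinction concrete, here is a
minimal Python sketch.  It is only an illustration under stated
assumptions: the Context class, the is_locator and is_name functions,
and the example.org and urn:example URIs are all made up for this
message, and nothing like them appears in the draft or in RFC 3986.
The point it tries to show is that whether a URI is a locator is
decided by the context of use, not by its scheme.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Context:
    """Hypothetical context of use: the individual URIs for which
    some retrieval service happens to be available here."""
    retrievable: frozenset = frozenset()


def is_name(uri: str) -> bool:
    # Assumption for this sketch: every urn: URI is a name.  For any
    # other scheme the scheme alone tells us nothing, so we return
    # False here rather than guess.
    return urlsplit(uri).scheme.lower() == "urn"


def is_locator(uri: str, context: Context) -> bool:
    # Locator-hood is a property of the individual URI in a given
    # context, so the scheme is deliberately not consulted.
    return uri in context.retrievable


ctx = Context(retrievable=frozenset({
    "urn:example:doc-42",             # a URN that can be dereferenced here
    "http://example.org/report.pdf",  # an ordinary retrievable http: URI
}))

# An http: URI used purely as a name (hypothetical namespace name):
http_name = "http://example.org/ns/widget"
```

With that context, is_locator("urn:example:doc-42", ctx) is true even
though the URI is also a name, while is_locator(http_name, ctx) is
false even though the URI is in the http: scheme.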

So, I'd suggest that you add a new term to the term list in section
1.2, as follows:

   Locator: A retrieval-enabled URI

and use that term in most of the places where 'URL' is currently used.

The penultimate paragraph of section 1.2 would then need to read
something along these lines:

   This document uses the terms "resolution" and "resolver" in roughly
   the sense in which they were used in the original discussion of
   architectural principles for URNs [RFC2276], i.e., "resolution" is
   the act of supplying services related to the identified resource,
   such as translating the persistent URN into one or more current
   Locators for the resource, delivering metadata about the resource
   in an appropriate format, or even delivering a document object from
   a convenient source without requiring further intermediaries.  At
   the time of this writing, resolution services are described in
   [RFC2483].

   Section 1.1.3 of [RFC3986] helpfully observes that "A URI can be
   ... classified as a locator, a name, or both" and goes on to define
   locators as URIs which "provide a means of locating the resource
   [which they identify] by describing its primary access mechanism
   (e.g., its network 'location')".  This is what we mean by
   'retrieval-enabled' above, and we use the term 'Locator' to refer
   to the class of individual URIs which have this property.

A few places where 'URL' is effectively used as shorthand for
'non-urn-scheme URI' will require some more explicit rewriting, e.g.

  The URN r-component has no syntactic counterpart in URLs

 -->

  The URN r-component has no syntactic counterpart in any other URI
  scheme that we are aware of

The overall point here is that RFCs are meant for the entire
community, and it's at best misleading for an RFC to redefine a widely
used term to have a meaning other than the one already in wide (and
documented) use.  I do of course understand that there exists a
sub-community within which this distinct meaning is understood and
intended, but this spec can and should reach out more widely.  So note
that I'm _not_ proposing that the spec should use 'URL' as defined in
3986, but rather that it should use 'URN' to mean urn:nid:..., as it
does already, use 'URI' as the generic, and _not use 'URL' at all_.

ht

[1] https://tools.ietf.org/html/rfc3986#section-1.1.3
[2] https://tools.ietf.org/html/draft-ietf-urnbis-rfc2141bis-urn-18#section-1.2
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]