Re: [urn] Tuning the "URNs are not URIs" spec

worley@alum.mit.edu (Dale R. Worley) Tue, 17 June 2014 20:15 UTC

> From: John C Klensin <john-ietf@jck.com>

>    Like most controversies in which one group does not accept
>    the definitions, facts, or logic of another, the
>    differences are unlikely to be resolved by further
>    discussion, no matter how sensible and patient.

I'd phrase it "... in which EACH group does not accept ... of THE
OTHER".

>    Instead, the question is whether the IETF is willing to
>    evolve and adapt the URN definition to accommodate those
>    perceived needs or whether it prefers to have that work
>    done elsewhere, either by adoption in the broader
>    community and marketplace of a different approach or,
>    potentially, even a competing URN standard.

As far as I can tell, the key question is "Is the syntax of RFC 2141
("URN syntax") inadequate for the needs of important groups of users?"
If the syntax is adequate, then various groups of users can define
namespaces that have the properties that they desire without causing
any upset in running code; at most, revisions would be needed to RFC
3406 ("URN namespace definition mechanisms").  (Unlike IPv4 addresses,
we aren't in danger of running out of namespace identifiers.)

(And for example, this flexibility allows multiple sets of identifiers
which handle name/locator differences in distinctly different ways.)

What seems to be constantly hinted at in this discussion -- but never
explicitly stated -- is that important groups of users have needs that
CANNOT be satisfied within the syntax of RFC 2141.  Not just that the
particular definitions of "fragment" and "query" in RFC 3986 are
inadequate for their needs, but that those needs cannot be satisfied
by *any alternative means* that can be represented within the syntax
of RFC 2141.

If that actually is so, it is a fairly concrete fact, and it shouldn't
be very difficult to make the case in a convincing way.  And once the
case is made, it would be a sound reason to move ahead with
significant changes to the status quo.

One objection I can see to RFC 2141 is how restrictive it is:  The
namespace-specific string can contain only letters (two cases),
digits, and 15 of the 32 ASCII special characters.  (That's not
counting %, since it has a special purpose.)  I can see how a group
may want to have much more syntactic flexibility in designing an
identifier, and they wouldn't enjoy %-encoding a large fraction of the
characters in their identifiers.
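
To make the count concrete, here is a rough sketch in Python that
tallies which ASCII special characters are usable unescaped.  The
character list is my transcription of the <other> production in RFC
2141 section 2.2, and the names RFC2141_OTHER, allowed and
needs_escaping are only illustrative.

```python
import string

# <other> characters that RFC 2141 section 2.2 allows unescaped in the
# namespace-specific string, in addition to letters and digits
# (transcribed by hand from the RFC, so treat it as an assumption).
RFC2141_OTHER = "()+,-.:=@;$_!*'"

# The 32 printable ASCII punctuation characters (space not included).
ascii_specials = set(string.punctuation)

allowed = set(RFC2141_OTHER)

# Everything else must be %-encoded; "%" is left out of the tally
# because it introduces the escapes themselves.
needs_escaping = sorted(ascii_specials - allowed - {"%"})

print(len(allowed), "of", len(ascii_specials), "usable unescaped")
print("must be %-encoded:", "".join(needs_escaping))
```

Note that "/", "?" and "#" all land in the must-escape group, which is
where the tension with RFC 3986 shows up.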

OTOH, there's no reason that a user group would have to *see* what
their URNs look like very often.  Most user interfaces that would
refer to sophisticated identifiers don't have to reveal the URNs that
are used on the wire, any more than we have to see the bits that
encode the characters of our e-mails.

For instance, there's no reason that DOIs can't be encoded much the
same way that EIDRs are encoded (draft-pal-eidr-urn-03):

    The DOI "10.1051/0004-6361:20054201"
    becomes the URN "urn:doi:10.1051:0004-6361%3a20054201".

(Any sequence of Unicode characters can be represented as a sequence
of %-escapes per RFC 2141 section 2.2.)
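
As a minimal sketch of that mapping, assuming a hypothetical "doi"
namespace (none is registered as far as this message is concerned) and
the convention used in the example above, where the first "/" becomes
":" and any literal ":" is therefore escaped, the encoding could look
like this in Python.  The function names escape_component and
doi_to_urn are mine, not from any spec.

```python
import string

# Characters RFC 2141 section 2.2 allows unescaped in an NSS
# (letters, digits, and the <other> set; transcribed by hand).
RFC2141_UNESCAPED = set(string.ascii_letters + string.digits + "()+,-.:=@;$_!*'")


def escape_component(text, also_escape=":"):
    """%-encode text as UTF-8 octets, keeping RFC 2141 characters as-is.

    Characters in also_escape are encoded even though RFC 2141 allows
    them, because this sketch reserves ":" as the component separator.
    """
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x80 and ch in RFC2141_UNESCAPED and ch not in also_escape:
            out.append(ch)
        else:
            out.append("%{:02x}".format(byte))
    return "".join(out)


def doi_to_urn(doi):
    """Map 'prefix/suffix' to 'urn:doi:prefix:suffix' (hypothetical namespace)."""
    prefix, slash, suffix = doi.partition("/")
    if not slash:
        raise ValueError("expected a DOI of the form prefix/suffix")
    return "urn:doi:" + escape_component(prefix) + ":" + escape_component(suffix)


print(doi_to_urn("10.1051/0004-6361:20054201"))
```

Non-ASCII characters go through the same path, since each UTF-8 octet
is escaped individually.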

The great benefit of this approach is that generic operations on these
URNs can be done by the great bulk of currently running code.

Dale