Re: [urn] I want URNs for hashes and large random numbers

John C Klensin <john-ietf@jck.com> Fri, 12 September 2014 19:33 UTC

Date: Fri, 12 Sep 2014 15:33:47 -0400
From: John C Klensin <john-ietf@jck.com>
To: Sean Leonard <dev+ietf@seantek.com>
Message-ID: <07834ADA3935CDB73BC75D99@JcK-HP8200.jck.com>
In-Reply-To: <54133919.5010103@seantek.com>
References: <54129263.7080109@seantek.com> <541293C5.9030205@gmx.de> <54129F90.3090201@seantek.com> <725D9113FA12205449854DF4@JcK-HP8200.jck.com> <54133919.5010103@seantek.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/urn/h2fV2x3GNhsxE6XTjvYJA_B4ecA
Cc: Julian Reschke <julian.reschke@gmx.de>, urn@ietf.org
Subject: Re: [urn] I want URNs for hashes and large random numbers


--On Friday, September 12, 2014 11:19 -0700 Sean Leonard
<dev+ietf@seantek.com> wrote:

> On 9/12/2014 2:18 AM, John C Klensin wrote:
>> Sean,
>> 
>> Just to help me understand, two questions...
>> 
> I think there were more than two questions. :)
> 
> [SNIP]
>> 
>>> Specifically I want the definition of URN to be able to
>>> accommodate these kinds of naming schemes.
>>> 
>>> It's not a challenge to the ni: URI. It's just being
>>> realistic: people are using mathematically deterministic
>>> processes to uniquely and persistently identify (i.e., name)
>>> things in real life already. So if URNs uniquely and
>>> persistently identify things, don't we have a match?
>> Again, what do you believe bars that use?   Is it in 2141,
>> 3986, or one of the documents the WG is now working on?  Or
>> are you just trying to warn against our making some change
>> that would make such a URN namespace invalid?
> 
> Over the years I have made a couple of URN proposals. One
> source of pushback has been that when the URN uses some large
> random number or cryptographic hash, the assignment process
> does not "guarantee" enough uniqueness. Some respondents said
> that you need to have an organization or an IANA registry
> doling out identifiers. (There were/are other issues, but they
> are out-of-scope for this discussion.)
> 
> If you compare RFC 3406 and
> draft-ietf-urnbis-rfc3406bis-urn-ns-reg-09, it seems clear
> that non-human processes are permitted, so long as they
> provide "consistent assignment"--organizations are no longer
> required. Compare, in particular, Page 5 of RFC 3406 with Page
> 4 of the urnbis draft.

Speaking personally, this, in combination with 

	-- your earlier observation that, if one delegates a
	namespace to a particular organization or process, one
	effectively loses the ability to second-guess particular
	entries in that namespace and 
	
	-- the more general realization that registration to
	prevent the same label (in this case NID) from being
	accidentally used for two different things is a good
	thing even if the definitions are not of the quality one
	might like

are the reasons some of us have been pushing for a registration
process in which the IETF community could offer advice on
documentation and encouragement to do things in other ways but
would not have to "approve" a particular namespace, its
management, or its definitions or restrictions.

>> Going back to your original note in this thread:
>> 
>> --On Thursday, September 11, 2014 23:27 -0700 Sean Leonard
>> <dev+ietf@seantek.com> wrote:
>> 
>>> and I want identifiers that are valid in their respective
>>> namespace, that represent (unbroken) cryptographic hashes or
>>> large random numbers, to be valid URNs as well, without any
>>> complaints:
>>> 
>>> urn:oid:2.25.324969006592305634633390616021200786553 ***
>> It is one of several substantive points that have gotten lost
>> in the 3986 debate, one that probably should have been on my
>> "decisions to make" list as "do we still believe this?", but
>> one of the reasons 3406bis is moving toward a registration
>> model (rather than the IETF Consensus one called for in 3406)
>> is precisely to allow easy registration and use of NIDs for
>> externally-defined namespaces.
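
For illustration, the urn:oid:2.25.* example quoted above appears to
follow the ITU-T X.667 convention of registering a UUID as an OID
under the joint-iso-itu-t(2) uuid(25) arc, with the 128-bit UUID
written as a single decimal integer.  A minimal sketch of that mapping
follows; the function names are hypothetical and the code is not taken
from any of the drafts under discussion.

```python
import uuid

# Assumption: "2.25" is the joint-iso-itu-t(2) uuid(25) arc from ITU-T X.667.
UUID_ARC = "2.25"


def uuid_to_oid_urn(u: uuid.UUID) -> str:
    """Render a UUID as a urn:oid: name under the 2.25 arc."""
    # The whole 128-bit value becomes one decimal arc.
    return f"urn:oid:{UUID_ARC}.{u.int}"


def oid_urn_to_uuid(urn: str) -> uuid.UUID:
    """Recover the UUID from a urn:oid:2.25.<decimal> name."""
    prefix = f"urn:oid:{UUID_ARC}."
    if not urn.startswith(prefix):
        raise ValueError("not a UUID-based OID URN")
    digits = urn[len(prefix):]
    # Accept plain ASCII decimal only, with no leading zeros.
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("final arc is not a decimal integer")
    if len(digits) > 1 and digits[0] == "0":
        raise ValueError("final arc has a leading zero")
    # uuid.UUID raises ValueError if the value does not fit in 128 bits.
    return uuid.UUID(int=int(digits))


if __name__ == "__main__":
    name = "urn:oid:2.25.324969006592305634633390616021200786553"
    u = oid_urn_to_uuid(name)
    print(u, uuid_to_oid_urn(u) == name)
```

The mapping is purely mechanical in both directions, which is the sense
in which no registry or organization needs to hand out each identifier.
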
> [SNIP]
> (some of the premise of the text was incorrect--hopefully my
> description of UUIDs clarifies that issue)
> 
> All I'll say is that we need some standards. URNs shouldn't be
> a free-for-all.

It does clarify the issue.  As to free-for-alls, I agree, but
setting up the IETF community as arbiters of taste, especially
when the community may have a shortage of knowledge about the
particular issues associated with the namespace but a surplus of
opinions on the subject, is not a solution either.  My own hope,
which I think is more or less consistent with 3406bis, is that
we will move toward a review committee procedure that can
supply advice and some persuasion about better and worse ways to
handle things, but that cannot block a registration.   Of
course, blocking some community from inventing an NID and using
it in a URN is close to impossible (and the reason I make cracks
about the powerful protocol police).

>...
> [SNIP]
>>> Should 3406bis be written to simply accept that,
>>> i.e., to allow pointing to an external specification and,
>>> if what it specifies is not unique, assume it is Someone
>>> Else's Problem.   Or should there be a requirement for an
>>> explanation of why the string is [sufficiently] unique and
>>> under what conditions (e.g., the likelihood of collisions
>>> with a cryptographic hash is not zero but is calculable) and
>>> some serious attempt to review that explanation?
> 
> Per the above, the metric that I advocate for is "engineering
> reasonableness". Persistence and location-independence
> (specifically, independent of Internet topology) are clear and
> desirable goals for URNs, that URLs like http, ldap, gopher,
> ws, snmp, dns, file, and others lack. (The commonality with
> the URLs mentioned is that they use // syntax with an
> authority part; thus, they are intended to identify resources
> accessible via the Internet using IP or DNS.)

Agreed, modulo my being concerned about setting up a "URN tsar",
aka a single appointed expert whose authority goes to his head,
rather than an advisory and consultative process.

> Anyway though, to get that persistence, you have to delegate
> the allocation of names using a process. All I'm saying is
> that processes based on natural phenomena, i.e., bounded and
> describable by physics, mathematics, or other sciences, are
> just as reliable as commitments by human organizations.
> Organizations (being made up of humans) fail; organizations
> make mistakes; organizations change their minds. Natural
> phenomena don't change like that--they have other problems.
> Sometimes natural phenomena (i.e., MD5) fail spectacularly.
> But the commonality between organizations and natural
> processes is that for engineering purposes, they're good
> enough.

I hope we have learned enough by now that no one disagrees with
you.  Because people forget, it needs to be stated explicitly
in the document.  My notion of a consultative process during
namespace registration would include some "are you sure you
really want to do that?" questions if someone came along today
with a plan about a namespace based on MD5, but, until someone
discovers perfect foresight, it probably shouldn't go much
further than that, e.g., to reject a contemporary hash on the
grounds that it might not be good enough to guarantee uniqueness
for all eternity.
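
As a rough illustration of why accidental collisions are "calculable",
the usual birthday approximation for n identifiers drawn uniformly from
a b-bit space is p ~= 1 - exp(-n(n-1) / 2^(b+1)).  The sketch below
evaluates it; the function name and the sample sizes are hypothetical.
It assumes uniformly random values (or an ideal hash) and says nothing
about deliberately constructed collisions of the kind known for MD5.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Birthday approximation: chance that n uniform b-bit values collide."""
    if n < 0 or bits <= 0:
        raise ValueError("n must be >= 0 and bits must be > 0")
    if n < 2:
        return 0.0
    # Expected number of colliding pairs, n(n-1)/2 divided by 2^bits.
    x = n * (n - 1) / 2 ** (bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-x)


if __name__ == "__main__":
    # Sample sizes chosen only for illustration.
    for bits in (128, 160, 256):
        for n in (10**9, 10**12, 10**15):
            print(f"bits={bits} n={n:.0e} p={collision_probability(n, bits):.3e}")
```
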

>...
> Honestly, engineering reasonableness is achievable by using
> the Specification Required standard, as long as the people
> ratifying the standard are reasonable engineers. :-) I don't
> think that rfc3406bis-urn-ns-reg needs to lower the bar. We
> just need to accept that there is more than one way to
> generate identifiers that are unique within the bounds that
> matter for engineered systems.

First of all, we aren't at "Specification Required" today, we
are at "IETF Consensus".  IMO, _that_ bar needs to be lowered
for all sorts of reasons.  The difficulty with a "reasonable
engineers" standard is that many reasonable engineers (and
reasonable other people) tend to get narrowly focused on the
particular problems they are trying to solve and lose sight of
other issues.   That is precisely the reason the IETF tries to
insist on cross-area review for standards-track documents.

My preference would be that we have two procedures:

(1) For something that is merely a URN encapsulation of a
standard from some body whose review and approval procedures the
IETF considers competent (we use the term "recognized standards
developer" or something like it for Media Types), there should
be a _really_ fast track.   If what they want from a URN is
anything beyond "embed our established identifier as the NSS"
(i.e., "fragments", "queries", or "path segments"), then they
have responsibility for defining those in a stable way, but I
don't think we need to care what they pick, whether it is part
of their base standard that establishes the name space, part of
a technical report or supplemental standard, or something
published somewhere else, i.e., in the RFC Series.  I'd strongly
prefer that such encapsulated namespace URNs conform to the
syntax restrictions of 3986 and 2141bis and think we should be
willing to offer advice to help that happen, but we need to
recognize that we have no power to stop such a body from doing
anything it decides is needed.

(2) For anything else, we establish an "expert review committee"
that consists of volunteer experts from the URN-experience
community, ideally representing multiple perspectives.  Their
role would be to review applications for registration and to
work with applicants to improve the quality of those
registration applications and, where appropriate, to raise
questions and offer advice about how that particular URN
namespace might be defined and used.  Part of that is close to
"Specification Required", but the focus should be on getting to
the best specs and the best way of defining the URN that is possible
under the circumstances rather than on "approval".  The
committee might try to talk someone out of doing something in a
particular way, but should not have (or exercise except under
the most unusual and egregious of circumstances) the ability to
say "no, registration rejected".

I think that model would meet your needs.  Plus or minus fine
tuning, I think it is consistent with 3406bis.  It does go a
little further than we have been going because it recognizes
that, should some namespace-owner be determined to do something
stupid (e.g., in sufficient violation of 3986 syntax to cause
generic parsers problems), we can't stop them -- our only choice
is ultimately whether to register what they are doing (perhaps
preventing inadvertent conflicts in NID strings) or refuse to do
that on the theory that our lack of implicit endorsement might
accomplish something.

    john