Re: Application for a formal URN NID ("EIDR")

worley@ariadne.com (Dale R. Worley) Thu, 06 February 2014 19:21 UTC

Date: Thu, 06 Feb 2014 14:16:00 -0500
Message-Id: <201402061916.s16JG0J64392388@shell01.TheWorld.com>
From: worley@ariadne.com
Sender: worley@ariadne.com
To: Pierre-Anthony Lemieux <pal@sandflow.com>
In-reply-to: <CAF_7JxASOJEKwZ_XAohHwqVjaDz5zqqbG349dCNiQHRh+nnHxw@mail.gmail.com> (pal@sandflow.com)
Subject: Re: Application for a formal URN NID ("EIDR")
References: <CAF_7JxASOJEKwZ_XAohHwqVjaDz5zqqbG349dCNiQHRh+nnHxw@mail.gmail.com>
Cc: urn-nid@apps.ietf.org

If there's a difference between EIDR and DOI, you ought to make that
clear.  The two terms are used throughout the document and I vaguely
assumed that they are the same.  Actually, the underlying problem is
that you write the document assuming that the reader is thoroughly
familiar with EIDRs, DOIs, and the like.  Start with enough tutorial
material that someone who has never heard of either before (me) will
have a clear understanding of what is going on.

> From: Ted Hardie <ted.ietf@gmail.com>
> 
> EIDR-NSS = DOI-PREFIX ":" DOI-SUFFIX
> 
> This would seem to imply that any DOI prefix may be
> encountered and that this NID could be used with any
> registered DOI.  If that were the intent, I would
> suggest registering the namespace "DOI" instead.  I
> understand that the DOI folks have consciously chosen not
> to do that (cf. http://www.doi.org/factsheets/DOIIdentifierSpecs.html),
> though, so I suspect your intent is to limit this to a subset
> of DOIs.  Is that correct?  Is it essentially limited to 10.5240?
> If not, how will the appropriate subset be identified?

If the NID is to provide an encoding for all DOI names, then it should
be named "doi".  But if the DOI people have decided that it is not a
good thing to provide a NID for all DOI names, then:

- the NID should be "eidr"

- the syntax should not give a DOI-PREFIX, because that will *always*
  be "10.5240", and there's no point including a long invariant string
  in the syntax
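To illustrate the second point, here is a minimal Python sketch of the
mapping if the NSS carried only the suffix.  The "urn:eidr:<suffix>"
form, the function names, and the sample suffix are my own assumptions
for illustration; none of them come from the application.

```python
# Hypothetical mapping between "urn:eidr:<suffix>" and a DOI name.
# The invariant prefix is supplied by the code, not by the URN.
EIDR_DOI_PREFIX = "10.5240"


def eidr_urn_to_doi(urn: str) -> str:
    """Turn 'urn:eidr:<suffix>' into the DOI name '10.5240/<suffix>'."""
    # Raises ValueError if there are fewer than three ":"-separated parts.
    scheme, nid, nss = urn.split(":", 2)
    # "urn" and the NID are compared case-insensitively.
    if scheme.lower() != "urn" or nid.lower() != "eidr":
        raise ValueError(f"not an EIDR URN: {urn!r}")
    return f"{EIDR_DOI_PREFIX}/{nss}"


def doi_to_eidr_urn(doi: str) -> str:
    """Turn '10.5240/<suffix>' into 'urn:eidr:<suffix>'."""
    prefix, _, suffix = doi.partition("/")
    if prefix != EIDR_DOI_PREFIX or not suffix:
        raise ValueError(f"not an EIDR DOI name: {doi!r}")
    return f"urn:eidr:{suffix}"


# Made-up suffix, used only to show the round trip.
print(eidr_urn_to_doi("urn:eidr:0123-4567-89AB-CDEF-0123-Z"))
```

Nothing is lost by dropping the prefix from the URN, since it can
always be put back mechanically.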

Fundamentally, you need to determine whether the DOI people are
against mapping DOI names into URNs as a matter of principle, or
whether they just don't want to put in the work (and you are actually
doing the job for them).

Also, if you are thinking of providing a NID that can encompass all
DOIs, you have to worry about character sets.  According to Wikipedia,
"Most legal Unicode characters are allowed in these strings", whereas
the %-encoding system can only represent ASCII characters.

>       where DOI-PREFIX and DOI-SUFFIX are DOI Name prefix and suffix,
>       respectively, translated into canonical NSS format according to
>       [RFC2141].  DOI Name syntax is specified in [ISO26324].

What you mean to say is something like:  A DOI name consists of a
prefix and a suffix, which are character strings [a fact the reader
didn't know before].  They are translated into the DOI-PREFIX and
DOI-SUFFIX by replacing all characters which are not XXX with
corresponding %-escapes.  (See RFC 2141 section 2.2.)

Exactly what the set XXX is needs to be specified with some care.
Section 2.2 specifies that all characters that may not appear in URNs
*at all* must be escaped.  But of course, ":" may appear in URNs and
by that specification need not be escaped.  OTOH, if ":" appears in a
prefix or suffix, you certainly want it escaped.  I'm pretty sure that
you want XXX to be close to <pchar> as defined in RFC 3986 (the
infamous "Uniform Resource Identifier (URI): Generic Syntax"), minus
":", which <pchar> itself allows.

Also, you are depending on the fact that EIDR suffixes consist of
ASCII characters, which can be represented by %-escapes (whereas DOI
suffixes can contain Unicode characters, which can't).
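The escaping rule described above could be sketched in Python roughly
as follows.  The SAFE set is my own placeholder for XXX (letters,
digits, and a few marks, with ":" and "%" deliberately left out), and
the function names are hypothetical; the real set has to be fixed by
the specification.  In line with the ASCII assumption, the sketch
simply refuses non-ASCII input.

```python
import string

# Placeholder for the set XXX: an assumption, not taken from any spec.
# ":" is excluded so that it is always escaped inside a prefix or suffix,
# and "%" is excluded so that a literal "%" cannot be mistaken for an escape.
SAFE = set(string.ascii_letters + string.digits + "-._")


def escape_component(text: str) -> str:
    """Replace every character outside SAFE with a %-escape."""
    out = []
    for ch in text:
        if ord(ch) > 0x7F:
            # Out of scope here: this sketch assumes ASCII-only input.
            raise ValueError(f"non-ASCII character {ch!r} not handled")
        out.append(ch if ch in SAFE else f"%{ord(ch):02X}")
    return "".join(out)


def make_nss(prefix: str, suffix: str) -> str:
    """Build DOI-PREFIX ":" DOI-SUFFIX from unescaped strings."""
    return escape_component(prefix) + ":" + escape_component(suffix)


print(make_nss("10.5240", "AB:CD"))
```

Because ":" inside a component is always escaped, the one unescaped
":" in the result is unambiguously the separator.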

NIDs are case-insensitive (RFC 2141 section 5), but usually are
presented in lower case.

>           EIDR-SUFFIX  = 5*5(4*4HEXDIG "-") CHECK

You can just say

           EIDR-SUFFIX  = 5(4HEXDIG "-") CHECK
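For readers who think in regular expressions, the rule can be checked
with a short Python sketch.  The quoted rule does not define CHECK, so
the single character from 0-9 or A-Z used here is my assumption, and
the function name is hypothetical.

```python
import re

# 5(4HEXDIG "-") CHECK: five groups of four hex digits, each followed
# by "-", then the check character.
# Assumption: CHECK is one character from 0-9 or A-Z.
EIDR_SUFFIX_RE = re.compile(r"(?:[0-9A-Fa-f]{4}-){5}[0-9A-Z]")


def is_eidr_suffix(text: str) -> bool:
    """Return True if text matches the suffix grammar as a whole."""
    return EIDR_SUFFIX_RE.fullmatch(text) is not None


# Made-up value; this checks the shape only, not the check character.
print(is_eidr_suffix("0123-4567-89AB-CDEF-0123-Z"))
```

Both spellings of the ABNF rule describe the same set of strings, so
the same pattern serves for either.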

>    Identifier persistence considerations:
> 
>       As a DOI Name, the persistence of EIDR-NSS is guaranteed by the
>       ISO 26324 Registration Authority.  A DOI Name remains valid
>       indefinitely.

It would be clearer if you said something like

       The ISO 26324 Registration Authority assigns DOI Names to
       works(?).  A DOI Name remains valid indefinitely.  As a
       consequence, the URN derived from a DOI Name remains valid
       indefinitely.

Similar editing of the other items in this section would be helpful.

I can't quite put my finger on what seems to be the problem with the
writing.  I *think* the problem is that the text is written from the
point of view of someone who is thoroughly familiar with DOIs/EIDRs,
to the point where it never really has to be said what they are *for*
or how they work, whereas the correct way to write these sections is
from the point of view of someone who is familiar with URNs but has
never heard of a DOI before.  "We are talking about URNs that look
like this: ...  These URNs are used to specify DOIs, which are used in
XXX industry to designate YYYs.  DOIs are assigned to YYYs by ZZZ."

In the quoted paragraph above, the sentence starts with "As a DOI
Name...", which tries to leverage the reader's *existing
understanding* of how DOI Names work.

>       As a DOI Name, the resolution of EIDR-NSS is handled by the ISO
>       26324 Registration Authority.
> 
>       The ISO 26324 Registration Authority operates a web service that
>       allows an EIDR-NSS to be resolved by issuing an HTTP GET request
>       to the following URI:
> 
>                "http://doi.org/" DOI-PREFIX "/" DOI-SUFFIX

As written, this doesn't specify anything, because you can apply that
process to any alleged EIDR.  In order to make this meaningful, you
have to specify what the format of the HTTP *response* is and the
significance of the elements of the response.  (Presumably there is an
ISO standard you can reference here.)
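A small Python sketch makes the gap concrete.  It only builds the URI
from the quoted rule and makes no network request; the function name
and both sample identifiers are made up.

```python
def resolution_uri(doi_prefix: str, doi_suffix: str) -> str:
    """Build the URI per: "http://doi.org/" DOI-PREFIX "/" DOI-SUFFIX."""
    return "http://doi.org/" + doi_prefix + "/" + doi_suffix


# The rule produces a URI for a plausible identifier ...
print(resolution_uri("10.5240", "0123-4567-89AB-CDEF-0123-Z"))
# ... and equally for one that was never assigned to anything.
print(resolution_uri("10.5240", "no-such-thing"))
```

Both calls succeed, so the construction alone says nothing about
whether an identifier resolves; that information can only come from a
specified response.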

Dale