Re: Application for a formal URN NID ("EIDR")

Pierre-Anthony Lemieux <pal@sandflow.com> Mon, 03 March 2014 19:26 UTC

From: Pierre-Anthony Lemieux <pal@sandflow.com>
Date: Mon, 3 Mar 2014 11:25:47 -0800
Message-ID: <CAF_7JxDoCws8JCLSwikFgoSGp-c+fJQPXULEXzdPPhr1uN6-kg@mail.gmail.com>
Subject: Re: Application for a formal URN NID ("EIDR")
To: "urn-nid@apps.ietf.org" <urn-nid@apps.ietf.org>
Content-Type: text/plain; charset=ISO-8859-1
Archived-At: http://mailarchive.ietf.org/arch/msg/urn-nid/LVg8x5iNn6DR0D_vU_vfrwnJSJM

> I plan to update the I-D when the submission window reopens.

Just an FYI: the revised I-D has been posted. See [1].

[1] http://datatracker.ietf.org/doc/draft-pal-eidr-urn/

Best,

-- Pierre

On Thu, Feb 20, 2014 at 5:09 PM, Pierre-Anthony Lemieux
<pal@sandflow.com> wrote:
> Good morning/evening,
>
> Please find attached a revised application for the formal "EIDR" URN
> NID. The revised application is intended to address the comments
> received on the list.
>
> Some highlights:
>
> - The introduction was expanded to provide more details on DOI Names,
> EIDR Identifiers and their relationship.
> - The EIDR-URN syntax was refactored to focus on EIDR Identifiers, and
> not DOI Names in general.
> - The EIDR Identifier prefix and suffix character sets are now
> explicitly specified to remove potential ambiguities in mapping them
> to the URN character set.
> - The HTTP response to a resolution request to the ISO 26324
> Registration Authority is now specified.
>
> I plan to update the I-D when the submission window reopens.
>
> Thanks again to the commenters.
>
> Best,
>
> -- Pierre
>
> On Thu, Feb 6, 2014 at 6:07 PM, Pierre-Anthony Lemieux <pal@sandflow.com> wrote:
>> Hi Ted and Dale,
>>
>> Thanks for the detailed comments, which I am in the process of
>> addressing. In the meantime, some additional background and
>> clarification on the request.
>>
>> The requested EIDR NID is not intended to accommodate any and all DOI
>> Names, but specifically DOI Names allocated by the EIDR organization,
>> i.e. DOI Names with a prefix assigned to the EIDR organization. This
>> means that (a) the EIDR organization essentially controls the syntax of
>> EIDR Identifiers, e.g. it can add check codes and constrain them to the
>> ASCII set, and (b) the NID is specifically intended for audiovisual
>> works. The fact that EIDR Identifiers are valid DOI Names ensures
>> persistence, uniqueness and an open resolution infrastructure.
>>
>> Currently, EIDR Identifiers use the 10.5240 prefix for audiovisual
>> works. The idea is to leave the door open for additional prefixes (and
>> corresponding suffixes) to be defined in the future (with the EIDR NID
>> specification being updated accordingly). In all cases, these prefixes
>> and suffixes would be defined and controlled by the EIDR organization.
>>
>> So, if an implementation receives an EIDR-NSS with an unknown prefix,
>> it can still accept it, treating it as an opaque (case-insensitive)
>> identifier (with the option of resolving it as a generic DOI Name). If
>> the prefix is known, additional processing can occur, e.g. error
>> detection in the case of the 10.5240 prefix.
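
The prefix handling described above could be sketched roughly as follows. This is only an illustration, not text from the application: the function name split_eidr_nss and the KNOWN_PREFIXES set are hypothetical, and folding to upper case is just one way to get a case-insensitive comparison.

```python
from typing import Tuple

# Hypothetical registry of prefixes an implementation understands.
# 10.5240 is the only prefix named in this thread.
KNOWN_PREFIXES = {"10.5240"}


def split_eidr_nss(nss: str) -> Tuple[str, str, bool]:
    """Split an EIDR-NSS of the form PREFIX ":" SUFFIX.

    Returns (prefix, suffix, prefix_is_known). An unknown prefix is still
    accepted; the identifier is then treated as opaque.
    """
    prefix, sep, suffix = nss.partition(":")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not of the form PREFIX:SUFFIX: {nss!r}")
    # The identifier is case-insensitive, so fold to a single case
    # before comparing or storing it.
    prefix, suffix = prefix.upper(), suffix.upper()
    return prefix, suffix, prefix in KNOWN_PREFIXES
```

A caller would run prefix-specific checks (e.g. error detection) only when the third element is True, and otherwise pass the identifier through unchanged.
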
>>
>> Does this make sense? I assume it is acceptable to reply cc'ing the
>> reflector. If not, I am happy to chat offline.
>>
>> Best,
>>
>> -- Pierre
>>
>> On Thu, Feb 6, 2014 at 11:16 AM, Dale R. Worley <worley@ariadne.com> wrote:
>>> If there's a difference between EIDR and DOI, you ought to make that
>>> clear.  The two terms are used throughout the document and I vaguely
>>> assumed that they are the same.  Actually, the underlying problem is
>>> that you write the document assuming that the reader is thoroughly
>>> familiar with EIDRs, DOIs, and the like.  Start with enough tutorial
>>> so that someone who has never heard of either before (me) will have a
>>> clear understanding of what is going on.
>>>
>>>> From: Ted Hardie <ted.ietf@gmail.com>
>>>>
>>>> EIDR-NSS = DOI-PREFIX ":" DOI-SUFFIX
>>>>
>>>> This would seem to imply that any DOI prefix may be
>>>> encountered and that this NID could be used with any
>>>> registered DOI.  If that were the intent, I would
>>>> suggest registering the namespace "DOI" instead.  I
>>>> understand that the DOI folks have consciously chosen not
>>>> to do that (cf. http://www.doi.org/factsheets/DOIIdentifierSpecs.html),
>>>> though, so I suspect your intent is to limit this to a subset
>>>> of DOIs.  Is that correct?  Is it essentially limited to 10.5240?
>>>> If not, how will the appropriate subset be identified?
>>>
>>> If the NID is to provide an encoding for all DOI names, then it should
>>> be named "doi".  But if the DOI people have decided that it is not a
>>> good thing to provide a NID for all DOI names, then:
>>>
>>> - the NID should be "eidr"
>>>
>>> - the syntax should not give a DOI-PREFIX, because that will *always*
>>>   be "10.5240", and there's no point including a long invariant string
>>>   in the syntax
>>>
>>> You need to determine whether the DOI people are fundamentally
>>> against mapping DOI names into URNs, or whether they just don't want
>>> to put in the work (in which case you are actually doing the job for
>>> them).
>>>
>>> Also, if you are thinking of providing a NID that can encompass all
>>> DOIs, you have to worry about character sets.  According to Wikipedia,
>>> "Most legal Unicode characters are allowed in these strings", whereas
>>> the %-encoding system can only represent ASCII characters.
>>>
>>>>       where DOI-PREFIX and DOI-SUFFIX are DOI Name prefix and suffix,
>>>>       respectively, translated into canonical NSS format according to
>>>>       [RFC2141].  DOI Name syntax is specified in [ISO26324].
>>>
>>> What you mean to say is something like:  A DOI name consists of a
>>> prefix and a suffix, which are character strings [a fact the reader
>>> didn't know before].  They are translated into the DOI-PREFIX and
>>> DOI-SUFFIX by replacing all characters which are not XXX with
>>> corresponding %-escapes.  (See RFC 2141 section 2.2.)
>>>
>>> Exactly what the set XXX is needs to be specified with some care.
>>> Section 2.2 specifies that all characters that may not appear in URNs
>>> *at all* must be escaped.  But of course, ":" may appear in URNs and
>>> by that specification need not be escaped.  OTOH, if ":" appears in a
>>> prefix or suffix, you very well want it escaped.  I'm pretty sure that
>>> you want XXX to be <pchar> as defined in RFC 3986 (the infamous
>>> "Uniform Resource Identifier (URI): Generic Syntax").
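
As a rough sketch of the escaping step discussed above: the set XXX is still open in this thread, so the code below assumes the most conservative choice, keeping only the RFC 3986 "unreserved" characters and escaping everything else, including ":" and "/". That is stricter than <pchar>. The function name to_nss_component is hypothetical, and rejecting non-ASCII input reflects the ASCII-only assumption made for EIDR, not a rule for DOI Names in general.

```python
from urllib.parse import quote


def to_nss_component(text: str) -> str:
    """Percent-escape a prefix or suffix for use inside an NSS.

    Assumption: only letters, digits and "-", ".", "_", "~" are left
    unescaped, so ":" and "/" inside a component are always escaped.
    """
    # Reject non-ASCII input; raises UnicodeEncodeError (a ValueError).
    text.encode("ascii")
    # safe="" removes the default "/" from the set of unescaped characters.
    return quote(text, safe="")
```

With this choice a ":" inside a prefix or suffix can never be confused with the ":" that separates the two.
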
>>>
>>> Also, you are depending on the fact that EIDR suffixes consist of
>>> ASCII characters, which can be represented by %-escapes (whereas DOI
>>> suffixes can contain Unicode characters, which can't).
>>>
>>> NIDs are case-insensitive (RFC 2141 section 5), but usually are
>>> presented in lower case.
>>>
>>>>           EIDR-SUFFIX  = 5*5(4*4HEXDIG "-") CHECK
>>>
>>> You can just say
>>>
>>>            EIDR-SUFFIX  = 5(4HEXDIG "-") CHECK
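
To illustrate that the two ABNF spellings above accept the same strings, here is a small sketch that transcribes each into a regular expression. CHECK is not defined in the quoted text, so it is assumed here to be a single alphanumeric character; the names LONG_FORM, SHORT_FORM and matches_both are hypothetical.

```python
import re

HEXDIG = "[0-9A-Fa-f]"   # ABNF string matching is case-insensitive
CHECK = "[0-9A-Za-z]"    # assumption: one alphanumeric check character

# 5*5(4*4HEXDIG "-") CHECK, transcribed literally with explicit bounds
LONG_FORM = re.compile(rf"(?:{HEXDIG}{{4,4}}-){{5,5}}{CHECK}")
# 5(4HEXDIG "-") CHECK, the shorter equivalent spelling
SHORT_FORM = re.compile(rf"(?:{HEXDIG}{{4}}-){{5}}{CHECK}")


def matches_both(candidate: str):
    """Return (long_form_matches, short_form_matches) for a suffix."""
    return (
        bool(LONG_FORM.fullmatch(candidate)),
        bool(SHORT_FORM.fullmatch(candidate)),
    )
```

Both patterns check syntax only; neither verifies that the check character is correct for the preceding digits.
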
>>>
>>>>    Identifier persistence considerations:
>>>>
>>>>       As a DOI Name, the persistence of EIDR-NSS is guaranteed by the
>>>>       ISO 26324 Registration Authority.  A DOI Name remains valid
>>>>       indefinitely.
>>>
>>> It would be clearer if you said something like
>>>
>>>        The ISO 26324 Registration Authority assigns DOI Names to
>>>        works(?).  A DOI Name remains valid indefinitely.  As a
>>>        consequence, the URN derived from a DOI Name remains valid
>>>        indefinitely.
>>>
>>> Similar editing of the other items in this section would be helpful.
>>>
>>> I can't quite put my finger on what seems to be the problem with the
>>> writing.  I *think* the problem is that the text is written from the
>>> point of view of someone who is thoroughly familiar with DOIs/EIDRs,
>>> to the point where it never really has to be said what they are *for*
>>> or how they work, whereas the correct way to write these sections is
>>> from the point of view of someone who is familiar with URNs but has
>>> never heard of a DOI before.  "We are talking about URNs that look
>>> like this: ...  These URNs are used to specify DOIs, which are used in
>>> XXX industry to designate YYYs.  DOIs are assigned to YYYs by ZZZ."
>>>
>>> In the above paragraph, the sentence starts with "As a DOI Name...",
>>> which is actually trying to leverage that the reader *already
>>> understands* how DOI Names work.
>>>
>>>>       As a DOI Name, the resolution of EIDR-NSS is handled by the ISO
>>>>       26324 Registration Authority.
>>>>
>>>>       The ISO 26324 Registration Authority operates a web service that
>>>>       allows an EIDR-NSS to be resolved by issuing an HTTP GET request
>>>>       to the following URI:
>>>>
>>>>                "http://doi.org/" DOI-PREFIX "/" DOI-SUFFIX
>>>
>>> As written, this doesn't specify anything, because you can apply that
>>> process to any alleged EIDR.  In order to make this meaningful, you
>>> have to specify what the format of the HTTP *response* is and the
>>> significance of the elements of the response.  (Presumably there is an
>>> ISO standard you can reference here.)
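
For reference, the URI construction in the quoted text can be sketched as below. It only builds the request URI from a URN of the assumed form "urn:eidr:" PREFIX ":" SUFFIX; it does not issue the HTTP request, since the response format is exactly what remains unspecified. The function name resolution_uri is hypothetical, and percent-escapes in the NSS are passed through without decoding.

```python
def resolution_uri(urn: str) -> str:
    """Build "http://doi.org/" PREFIX "/" SUFFIX from an EIDR URN."""
    # Unpacking raises ValueError if there are fewer than three parts.
    scheme, nid, nss = urn.split(":", 2)
    # The "urn" scheme and the NID are compared case-insensitively.
    if scheme.lower() != "urn" or nid.lower() != "eidr":
        raise ValueError(f"not an EIDR URN: {urn!r}")
    prefix, sep, suffix = nss.partition(":")
    if not sep or not prefix or not suffix:
        raise ValueError(f"NSS is not of the form PREFIX:SUFFIX: {nss!r}")
    return f"http://doi.org/{prefix}/{suffix}"
```
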
>>>
>>> Dale