Re: [urn] draft-ietf-urnbis-rfc2141bis-urn-19

Julian Reschke <julian.reschke@gmx.de> Sat, 07 January 2017 21:01 UTC

To: Barry Leiba <barryleiba@computer.org>, "urn@ietf.org" <urn@ietf.org>
References: <C39D7B0C7841906AF86E94AD@PSB> <CAC4RtVDJUgFwH4mPCAVV6YKecRLSgGz3NiBYMfr=_MjMQQDs1g@mail.gmail.com>
From: Julian Reschke <julian.reschke@gmx.de>
Message-ID: <956b23b0-9f16-7af9-9d41-d664cb8c60c0@gmx.de>
Date: Sat, 07 Jan 2017 22:01:46 +0100
In-Reply-To: <CAC4RtVDJUgFwH4mPCAVV6YKecRLSgGz3NiBYMfr=_MjMQQDs1g@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/urn/KlcbJcr0HiVSI5fDrr5pNPS6Kjk>

...ok, here's my feedback...:

Major:

Abstract

    A Uniform Resource Name (URN) is a Uniform Resource Identifier (URI)
    that is assigned under the "urn" URI scheme and a particular URN
    namespace, with the intent that the URN will be a persistent,
    location-independent resource identifier.  With regard to URN syntax,
    this document defines the canonical syntax for URNs (in a way that is
    consistent with URI syntax), specifies methods for determining URN-
    equivalence, and discusses URI conformance.  With regard to URN
    namespaces, this document specifies a method for defining a URN
    namespace and associating it with a namespace identifier, and
    describes procedures for registering namespace identifiers with the
    Internet Assigned Numbers Authority (IANA).  This document obsoletes
    both RFC 2141 and RFC 3406.

Observation: this definition of "URN" actually is inconsistent with the 
one in RFC 3986 (which includes names that use URI schemes other than 
"urn"). I understand it's on purpose and called out later, but maybe 
there's a way to avoid this confusion?

(also, maybe insert a few paragraph breaks...)

1.  Introduction

    ...

    This document rests on two key assumptions:

    1.  Assignment of a URN is a managed process.

Any reader familiar with urn:uuid will ask him/herself: so is this 
namespace ok?

    ...

    The foregoing considerations, together with various differences
    between URNs and URIs that are locators (specifically URLs) as well

According to RFC 3986, a URI that is a locator by definition *is* a URL, 
no?

    as the greater focus on URLs in RFC 3986 as the ultimate successor to
    [RFC1738] and [RFC1808], may lead to some interpretations of RFC 3986
    and this specification that appear (or perhaps actually are) not
    completely consistent, especially with regard to actions or semantics
    other than the basic syntax itself.  If such situations arise,
    discussions of URNs and URN namespaces should be interpreted
    according to this document and not by extrapolation from RFC 3986.

I really believe that this is going to cause more confusion than it 
possibly can avoid. I think it would be an improvement to remove this 
paragraph altogether.


1.1.  Terminology

    The following terms are distinguished from each other as described
    below:

    URN:  A URI (as defined in RFC 3986) using the "urn" scheme and with
       the properties of a "name" as described in that document as well
       as the properties described in this one.  The term applies to the
       entire URI including its optional components.  Note to the reader:
       the term "URN" has been used in other contexts to refer to a URN
       namespace, the namespace identifier (NID), the Assigned-name, and
       to URIs that do not use the "urn" scheme.  All but the last of
       these is described using more specific terminology elsewhere in
       this document, but, because of those other uses, the term should
       be used and interpreted with care.

Please separate the "note to the reader" from the terminology definition.

       ...

    The term "name" is deliberately not defined here and should be (and
    in practice, is) used only very informally.  RFC 3986 uses the term
    as a category of URI distinguished from "locator" (Section 1.1.3) but
    also uses it in other contexts.  If those uses are treated as
    definitions, they conflict with, e.g., the idea of the name of a URN
    namespace, i.e., a NID or terms associated with non-URN identifier
    systems.

Again, this just creates confusion. RFC 3986 indeed uses the term "name" 
in several places to talk about ...names. The same is probably true for 
the majority of all RFCs. I don't think this is a problem.

1.2.1.  Resolution

    With traditional Uniform Resource Locators (URLs), i.e., with most
    URIs that are locators, resolution is relatively straightforward
    because it is used to determine an access mechanism which in turn is
    used to dereference the locator by (typically) retrieving a
    representation of the associated resource, such as a document (see
    Section 1.2.2 of [RFC3986]).

/most URIs that are locators/all URIs that are locators/

But with that, this could be further simplified...


2.  URN Syntax

    As discussed above, the syntax for URNs in this specification allows
    significantly more functionality than was the case in the earlier
    specifications, most recently [RFC2141].  It is also harmonized with
    the general URI syntax [RFC3986] (which, it must be noted, was
    completed after the earlier URN specifications).

Exactly how is the last sentence relevant?

2.2.  Namespace Specific String (NSS)

    In order to make URNs as stable and persistent as possible when
    protocols evolve and the environment around them changes, URN
    namespaces SHOULD NOT allow characters outside the basic Latin
    repertoire [RFC20] unless the nature of the particular URN namespace
    makes such characters necessary.

RFC 20 does not seem to define "basic Latin repertoire".

2.3.2.  q-component

    For the sake of consistency with RFC 3986, the general syntax and the
    semantics of q-components are not defined by, or dependent on, the
    URN namespace of the URN.  In parallel with RFC 3896, specifics of
    syntax and semantics, e.g., which keywords or terms are meaningful,
    of course may depend on a particular URN namespace or even a
    particular resource.

I have no idea what the text about RFC 3986 is for. RFC 3986 does not 
care about the structure of a query component.

2.3.3.  f-component

    The f-component is intended to be interpreted by the client as a
    specification for a location within, or region of, the named
    resource.  It distinguishes the constituent parts of a resource named
    by a URN.  For a URN that resolves to one or more locators which can
    be dereferenced to a representation, or where the URN resolver
    directly returns a representation of the resource, the semantics of
    an f-component are defined by the media type of the representation.

    ...

I understand that it took a lot of time to come up with this, but I see 
absolutely no gain over just using the term fragment identifier. After 
all, it *is* the fragment identifier, as a URN is a URI, right?
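
To illustrate the point, here is a minimal sketch that assumes the 
non-validating regular expression from Appendix B of RFC 3986. The 
function name split_uri and the example URNs are made up for 
illustration. A scheme-agnostic parser already reports whatever follows 
"#" as the fragment, without any URN-specific knowledge:

```python
import re

# Non-validating URI-reference regex from RFC 3986, Appendix B.
_URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri_reference):
    """Split a URI reference into its five generic components.

    Hypothetical helper: it knows nothing about the "urn" scheme.
    Components that are absent are returned as None.
    """
    m = _URI_RE.match(uri_reference)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


parts = split_uri("urn:example:foo-bar-baz-qux#somepart")
print(parts["scheme"], parts["path"], parts["fragment"])
```

Under these assumptions the generic parse yields "urn" as the scheme, 
"example:foo-bar-baz-qux" as the path and "somepart" as the fragment, 
which is the same string the draft calls the f-component.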

4.2.  Parsing

    In part because of the separation of URN semantics from more general
    URI syntax [I-D.ietf-urnbis-semantics-clarif], generic URI processors
    need to pay special attention to the parsing and analysis rules of
    RFC 3986 and, in particular, must treat the URI as opaque unless the
    scheme and its requirements are recognized.  In the latter case, such
    processors may be in a position to invoke scheme-appropriate
    processing, e.g., by a URN resolver.  A URN resolver can either be an
    external resolver that the URI resolver knows of, or it can be
    functionality built into the URI resolver.  Note that this
    requirement might impose constraints on the contexts in which URNs
    are appropriately used; see Section 4.1.

Can we state the same without referring to a document we decided not to 
publish?


4.3.  URNs and Relative References

    Section 5.2 of [RFC3986] describes an algorithm for converting a URI
    reference that might be relative to a given base URI into "parsed

s/URI reference/relative reference/

    components" of the target of that reference, which can then be
    recomposed per RFC 3986 Section 5.3 into a target URI.  This

s/RFC 3986//

    algorithm is problematic for URNs because their syntax does not
    support the necessary path components.  However, if the algorithm is
    applied independent of a particular scheme, it should work
    predictably for URNs as well, with the following understandings
    (syntax production terminology taken from RFC 3986):

It *does* work predictably, right?

    1.  A system that encounters a <URI-reference> that obeys the syntax
        for <relative-ref>, whether it explicitly has the scheme "urn" or
        not, will convert it into a target URI as specified in RFC 3986.

A <relative-ref> by definition does not contain a scheme name, so the 
mention of "urn" is misleading here.

    2.  Because of the persistence and stability expectations of URNs,
        authors of documents, etc., that utilize URNs should generally
        avoid the use of the "urn" scheme in any <URI-reference> that is
        not strictly a <URI> as specified in RFC 3986, specifically
        including those that would require processing of <relative-ref>.

Given my previous comment, maybe most of this can be removed?
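
For what it's worth, the scheme-independent behaviour can be checked 
with a small sketch of the RFC 3986 Section 5.2.2 / 5.3 algorithm. This 
is my own illustration, not text from the draft: the names resolve, 
_split and _remove_dot_segments and the example URNs are hypothetical, 
and parsing again assumes the Appendix B regular expression.

```python
import re

# Non-validating URI-reference regex from RFC 3986, Appendix B.
_URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def _split(ref):
    """Return (scheme, authority, path, query, fragment); absent -> None."""
    m = _URI_RE.match(ref)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def _remove_dot_segments(path):
    """RFC 3986, Section 5.2.4."""
    out, inp = [], path
    while inp:
        if inp.startswith("../"):
            inp = inp[3:]
        elif inp.startswith("./"):
            inp = inp[2:]
        elif inp.startswith("/./"):
            inp = inp[2:]
        elif inp == "/.":
            inp = "/"
        elif inp.startswith("/../"):
            inp = inp[3:]
            if out:
                out.pop()
        elif inp == "/..":
            inp = "/"
            if out:
                out.pop()
        elif inp in (".", ".."):
            inp = ""
        else:
            # Move the first segment (with its leading "/", if any) to out.
            i = inp.find("/", 1)
            if i == -1:
                i = len(inp)
            out.append(inp[:i])
            inp = inp[i:]
    return "".join(out)


def resolve(base, ref):
    """Resolve ref against base per RFC 3986, Sections 5.2.2 and 5.3."""
    bs, ba, bp, bq, _ = _split(base)
    rs, ra, rp, rq, rf = _split(ref)
    if rs is not None:
        # The reference carries its own scheme, so the base is not used.
        s, a, p, q = rs, ra, _remove_dot_segments(rp), rq
    elif ra is not None:
        s, a, p, q = bs, ra, _remove_dot_segments(rp), rq
    elif rp == "":
        s, a, p = bs, ba, bp
        q = rq if rq is not None else bq
    else:
        s, a, q = bs, ba, rq
        if rp.startswith("/"):
            p = _remove_dot_segments(rp)
        elif ba is not None and bp == "":
            p = _remove_dot_segments("/" + rp)
        else:
            # Merge: keep the base path up to and including its last "/".
            p = _remove_dot_segments(bp[: bp.rfind("/") + 1] + rp)
    out = "" if s is None else s + ":"
    if a is not None:
        out += "//" + a
    out += p
    if q is not None:
        out += "?" + q
    if rf is not None:
        out += "#" + rf
    return out


print(resolve("urn:example:a/b/c", "d"))
print(resolve("urn:example:abc", "#part"))
print(resolve("urn:example:abc", "d"))
```

With these inputs the sketch prints "urn:example:a/b/d", 
"urn:example:abc#part" and "urn:d". The last result is rarely what an 
author wants, but it is fully determined by the algorithm, and a 
reference that starts with "urn:" never goes through the relative 
branch at all.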


4.4.  Transport and Display

    When URNs are transported and exchanged, they MUST be represented in
    the format defined herein.  Further, all URN-aware applications MUST

That's a tautology, no? Otherwise they wouldn't be URNs...

    offer the option of displaying URNs in this canonical form to allow
    for direct transcription (for example by copy-and-paste techniques).

That makes it sound as if there was a form other than the canonical one 
which is still a URN, which I believe is not true.



Editorial:

1.  Introduction

    A Uniform Resource Name (URN) is a Uniform Resource Identifier (URI)
    [RFC3986] that is assigned under the "urn" URI scheme and a
    particular URN namespace, with the intent that the URN will be a
    persistent, location-independent resource identifier.  A URN
    namespace is a collection of such URNs, each of which is (1) unique,
    (2) assigned in a consistent and managed way, and (3) assigned
    according to a common definition.  (Some URN namespaces create names
    that exist only as URNs, whereas others assign URNs based on names
    that were already created in non-URN identifier systems, such as
    ISBNs [RFC3187], ISSNs [RFC3044], or RFCs [RFC2648].)

Para break before "A URN namespace...".

3.1.  Procedure

    If an r-component, q-component, or f-component (or any combination
    thereof) is included in a URN, it MUST be ignored for purposes of
    determining URN-equivalence.

remove "(or any combination thereof)" -- unless I'm missing something, 
this is entirely redundant.
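
A rough sketch of why the parenthetical adds nothing: once everything 
from the first component delimiter onwards is dropped, it no longer 
matters which components were present. The sketch rests on assumptions 
that are not in the quoted paragraph: that r- and q-components are 
introduced by "?+" and "?=", that the f-component follows "#", and that 
the comparison is case-insensitive only for the scheme, the NID and the 
hex digits of percent-encodings. The function names are hypothetical 
and the input is assumed to be syntactically valid.

```python
import re


def assigned_name(urn):
    """Strip r-/q-/f-components and apply case normalization.

    Hypothetical helper; assumes a well-formed "urn:<NID>:<NSS>" input.
    """
    # Drop the f-component (everything from the first "#").
    head = urn.split("#", 1)[0]
    # Drop r- and q-components (everything from the first "?+" or "?=").
    m = re.search(r"\?[+=]", head)
    if m:
        head = head[: m.start()]
    scheme, nid, nss = head.split(":", 2)
    # Percent-encoded octets compare case-insensitively; the rest of the
    # NSS is compared as-is.
    nss = re.sub(r"%[0-9A-Fa-f]{2}", lambda pct: pct.group(0).upper(), nss)
    return scheme.lower() + ":" + nid.lower() + ":" + nss


def urn_equivalent(a, b):
    """True if the two URNs have the same normalized assigned-name."""
    return assigned_name(a) == assigned_name(b)


print(urn_equivalent("urn:example:a123,z456",
                     "URN:EXAMPLE:a123,z456?+abc?=xyz#789"))
```

Any mix of the three components on either side is handled by the same 
two stripping steps, so the rule needs no special case for combinations.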

Nits:

metadata: <area> should be "Applications and Real-Time" or "art"

9.2.  Informative References

    [DOI-URI]  Paskin, N., Neylon, E., Hammond, T., and S. Sun, "The
               "doi" URI Scheme for the Digital Object Identifier (DOI)",
               June 2003,
               <http://tools.ietf.org/id/draft-paskin-doi-uri-04.txt>.

This will likely require the phrase "work in progress".

    [I-D.ietf-urnbis-semantics-clarif]
               Klensin, J., "URN Semantics Clarification", draft-ietf-
               urnbis-semantics-clarif-04 (work in progress), June 2016.

We shouldn't need this reference.

(I also note that the references sometimes include the DOI and sometimes 
do not, but this can be fixed by the RFC Editor)