[urn] Feedback on draft-ietf-urnbis-semantics-clarif-03

Julian Reschke <julian.reschke@gmx.de> Wed, 27 April 2016 15:02 UTC

To: "urn@ietf.org" <urn@ietf.org>
From: Julian Reschke <julian.reschke@gmx.de>
Message-ID: <231532ef-e195-b73d-4a34-eb445bdd1900@gmx.de>
Date: Wed, 27 Apr 2016 17:02:25 +0200
Archived-At: <http://mailarchive.ietf.org/arch/msg/urn/_dvCZ9tPOUNV1jVz8ypXdeF8LBs>

Hi there,

I was asked off-list to review the current drafts, given my interest in 
RFC 3986 in the past.

I was at first confused about 2141bis seemingly making changes to 3986, but 
then realized that there's draft-ietf-urnbis-semantics-clarif-03 as 
well. I believe 2141bis needs to *normatively* reference this one.

That said, I appreciate that these two are separated, because it's 
easier to understand what's going on in relation to 3986. However, the 
separation may need more work; this document IMHO contains stuff that 
should be in 2141bis, while there's stuff in 2141bis which would need to 
be here.

Other than that, see my comments inline below (sorry for the format).

Best regards, Julian

-- snip --

Uniform Resource Names (urnbis)                               J. Klensin
Internet-Draft                                          February 5, 2016
Updates: 3986 (if approved)
Intended status: Standards Track
Expires: August 8, 2016


                       URN Semantics Clarification
                draft-ietf-urnbis-semantics-clarif-03.txt

Abstract

    Experience has shown that identifiers associated with persistent
    names have properties and requirements that may be somewhat different
    from identifiers associated with the locations of objects.  This is
    especially true when such names are expected to be stable for a very
    long time or when they identify large and complex entities.  In order
    to allow Uniform Resource Names (URNs) to evolve to meet the needs of
    the Library, Museum, Publisher, and Information Science communities
    and other users, this specification separates URNs from the semantic
    constraints that many people believe are part of the specification
    for Uniform Resource Identifiers (URIs) in RFC 3986, updating that

Lack of clarity: this is intended to be a Standards Track document 
updating a full Internet Standard. I don't think that what people 
believe about that spec is relevant; we should be able to agree upon 
what it says and what it doesn't say.

    document accordingly.  The syntax of URNs is still constrained to
    that of RFC 3986, so generic URI parsers are unaffected by this
    change.

...that depends on whether resolution of a relative reference against a 
base URI is part of what a URI parser does (do we have a definition of 
what a "URI parser" does?)
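
To make the distinction concrete, here is a minimal sketch of what 
"parsing" means if it is limited to splitting a reference into the five 
generic components, using the regular expression from RFC 3986, 
Appendix B. The function name split_uri and the sample URN are made up 
for illustration; nothing here resolves a relative reference against a 
base URI.

```python
import re

# Regular expression taken from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(ref):
    """Split a URI reference into its five generic components.

    Hypothetical helper for illustration; a component that is absent
    is reported as None, and the path may be the empty string.
    """
    m = URI_RE.match(ref)  # every part is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


# Made-up URN in the "example" namespace, used only as sample input.
parts = split_uri("urn:example:a123?x=1#sec2")
print(parts)
```

A parser in this narrow sense is indeed scheme-agnostic; whether the 
claim in the abstract holds depends on whether Section 5.2 processing 
is also counted as part of "parsing".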


1.  Introduction

    The Generic URI Syntax specification [RFC3986] covers both locators
    and names and mixtures of the two (See its Section 1.1.3) and
    describes Uniform Resource Locators (URLs) -- first documented in the
    IETF in RFC 1738 [RFC1738] -- as an embodiment of the locator concept
    and Uniform Resource Names (URNs), specifically those using the "urn"
    scheme [RFC2141], as an embodiment of the names that do not directly
    provide for resource location.  This specification is concerned only
    about URNs of the variety described in RFC 2141 [RFC2141] and its
    successors [RFC2141bis] (i.e., those that use the "urn" scheme).
    URLs, other types of names, and any URI types that may not fall into
    one of the above categories are out of its scope and unaffected by
    it.

I understand that historical context is useful, but do we really need 
half a paragraph to say: "this is about URIs using the 'urn' scheme"?

    Experience with URNs since the publication of RFC 3986 has identified
    several ways in which their inclusion under the 3986 scope has
    hampered understanding, adoption, and especially extension
    (specifically extensions of types that were anticipated, but not
    defined, in RFC 2141).  The need for extensions to the URN concept is
    now being felt in some communities, especially those that include
    libraries, museums, publishers, and other information scientists.

    In particular, the Generic URI Syntax specification goes beyond
    syntax to specify the meaning and interpretation of various fields,
    especially the "query" and "fragment" ones and the various syntax
    forms and interpretations it allows for <hier-part>.  This

Agreed for "fragment", not sure about "query". Could you elaborate a bit?

    specification excludes URNs from those definitions of meaning and
    interpretation so that RFC 3986 applies to their syntax only.  The
    meaning --and any more specific syntax rules-- for those fields for
    URNs are now defined in a URN-specific document [RFC2141bis].  URNs
    remain members of the URI family and parsers for generic URI syntax
    are not affected by this specification although parsers that make
    assumptions based on other URI schemes obviously might be.

I don't get the last sentence. If a parser makes assumptions on *other* 
schemes, how would it affect processing of "urn" URIs?

    Neither this specification nor the successor to RFC 2141 [RFC2141bis]
    discusses DDDS [RFC3401] resolution or conversion to (and
    interpretation of) URCs [RFC2483] or, with the exception of providing
    some syntax to cover some specific cases, URN "resolution" more
    generally.  Any of those topics that do need to be addressed should
    be covered in other documents.  The document also does not discuss
    alternatives to URNs, either those that might use a different scheme
    name within the RFC 3986 URI framework or those that might use a
    different framework entirely.  In particular, some externally-defined
    content or object identification systems could be represented either
    by a URN namespace or through separate URI schemes.  This
    specification does not offer advice on that choice other than to
    suggest that the two options not be confused (or both used in a way
    that would be confusing).

(matter of taste: I think that long text about what the spec does *not* 
do really distracts from what it is about; maybe it would be good to move 
things like this (and the text about URN's history) to an appendix?)

    This document updates RFC 3986 to make the distinction between syntax
    and semantics clear for URNs and to isolate URNs from presumed URI
    semantic requiremnts.  It is important to note that some readers of

s/requiremnts/requirements/

    RFC 3986 are convinced that the separation is clear in that
    specification and therefore that no changes to that document are
    needed.  For them, this specification is only a confirming
    clarification.

I understand that this is well-intended, but it is certainly very very 
confusing.

    In the long term, as the expanded syntax and uses of URNs become
    commonplace and RFC 3986 is updated, this specification is likely to
    become of historical interest only, providing an extended rationale
    for decisions made and adjustment of the boundary between URN
    specifications and generic URI ones.

...

3.  The role of queries and fragments in URNs

    Part of the concern that led to this document was a desire to
    accommodate URN components that would be analogous to the query and
    fragment components of generalized URNs but that might have different
    properties.  For many cases, the analogy cannot be exact.  For
    example, RFC 3986 ties the interpretation of fragments to media
    types.  Since media type is a function of specific content, URNs that
    are never resolved cannot have an associated media type, nor can URNs
    that resolve to, for example, other URIs that may then not be

FTR, RFC 3986 says: "If no such representation exists, then the 
semantics of the fragment are considered unknown and are effectively 
unconstrained. Fragment identifier semantics are independent of the URI 
scheme and thus cannot be redefined by scheme specifications."

    resolved further.  Similarly, while the RFC 3986 syntax for queries
    (and fragments) may be entirely appropriate for URN use, terminology
    like "Service Request" (see Appendix B of the predecessor "URNs are
    not..." draft [ServiceRequests] for additional discussion) may be
    more suitable to the URN context than "query" (if, indeed, the
    portion of the URN that is syntactically equivalent to a URI query is
    where those requests belong).

RFC 3986: "The query component contains non-hierarchical data that, 
along with data in the path component (Section 3.3), serves to identify 
a resource within the scope of the URI's scheme and naming authority (if 
any)." -- so what *exactly* does this specification try to change? Just 
the name of the component??

4.  Changes to RFC 3986

    This specification removes URN semantics from the scope of RFC 3896.

s/3896/3986/

    It makes no changes to the generic URI syntax.  That syntax still
    applies to URNs as well as to other URI types.  Even as regard to
    semantics, it has no practical effect for URNs defined in strict
    conformance to the prior URN specification [RFC2141] or the
    associated registration specification [RFC3406].

So what other effect *does* it have?

    In particular (but without altering RFC 3986 in any way), the generic
    URI syntax for "queries" (strings starting with "?" and continuing to
    the end of the URI or to a "#"), and for "fragments" (strings
    starting with "#" and continuing to the end of the URI) is unchanged.
    For URNs, additional syntax is introduced to divide the URI "query"
    into two parts, referred to as "q-components" and "r-components".

...I'd argue that this URN-specific syntax shouldn't be discussed in 
*this* specification.

    The syntax and general semantics of "fragments" (specified in RFC
    3986 as scheme-independent) are unchanged, but a somewhat liberal
    interpretation may be needed in the context of URNs, so a fragment is
    referred to as an "f-component" as a term of convenience to highlight
    that distinction.  [RFC2141bis].

From this paragraph it's entirely unclear what the actual change is.

5.  Actions Occurring in Parallel with this Specification

    The basic URN syntax specification [RFC2141] was published well
    before RFC 3986 and therefore does not depend on it.  The successor
    to that specification [RFC2141bis], fully spells out, or references
    documents that spell out, the semantics and any required within-field
    syntax of URNs.  It uses great care about generic or implicit
    reference to any URI specification and delegates further details to
    specific namespaces.

    [[CREF1: Note in Draft: Perhaps this section can be dropped
    entirely.]]

Actually, no. <draft-ietf-urnbis-rfc2141bis-urn-16#section-4.3> 
essentially updates <https://tools.ietf.org/html/rfc3986#section-5.2>. 
This specification needs to be clear about that, because that is 
actually the biggest change to RFC 3986, and it's not even mentioned here.

So if there *was* agreement on that change (which is a separate, 
complex, discussion), it really really would need to be mentioned here.
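
For reference, a sketch of the generic algorithm in RFC 3986, Sections 
5.2.2 to 5.3 (strict variant), to show what it does when the base URI 
happens to be a URN. The helper names and the sample URNs are made up 
for illustration, and this is the unmodified RFC 3986 behaviour only; 
it does not attempt to reproduce what Section 4.3 of 2141bis specifies.

```python
import re

# Regular expression taken from RFC 3986, Appendix B.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(ref):
    """Return (scheme, authority, path, query, fragment); absent parts are None."""
    m = URI_RE.match(ref)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def remove_dot_segments(path):
    """RFC 3986, Section 5.2.4."""
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]          # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]          # "/../" becomes "/"
            if out:
                out.pop()
        elif path == "/..":
            path = "/"
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            i = path.find("/", 1)
            if i == -1:
                out.append(path)
                path = ""
            else:
                out.append(path[:i])
                path = path[i:]
    return "".join(out)


def merge(base_authority, base_path, ref_path):
    """RFC 3986, Section 5.2.3."""
    if base_authority is not None and base_path == "":
        return "/" + ref_path
    # Drop everything after the right-most "/" (or the whole base path if none).
    return base_path[: base_path.rfind("/") + 1] + ref_path


def resolve(base, ref):
    """Resolve ref against base per RFC 3986, Sections 5.2.2 and 5.3."""
    bs, ba, bp, bq, _ = split_uri(base)
    rs, ra, rp, rq, rf = split_uri(ref)
    if rs is not None:
        ts, ta, tp, tq = rs, ra, remove_dot_segments(rp), rq
    elif ra is not None:
        ts, ta, tp, tq = bs, ra, remove_dot_segments(rp), rq
    elif rp == "":
        ts, ta, tp = bs, ba, bp
        tq = rq if rq is not None else bq
    else:
        if rp.startswith("/"):
            tp = remove_dot_segments(rp)
        else:
            tp = remove_dot_segments(merge(ba, bp, rp))
        ts, ta, tq = bs, ba, rq
    # Component recomposition (Section 5.3).
    out = ""
    if ts is not None:
        out += ts + ":"
    if ta is not None:
        out += "//" + ta
    out += tp
    if tq is not None:
        out += "?" + tq
    if rf is not None:
        out += "#" + rf
    return out


# Made-up URNs in the "example" namespace, used only as sample input.
same_doc = resolve("urn:example:a123?x=1", "#sec2")
relative = resolve("urn:example:a123", "b456")
print(same_doc)
print(relative)
```

With the generic rules, a fragment-only reference keeps the base path 
and query, while a relative-path reference replaces the entire path 
because the base path contains no "/". Any URN-specific deviation from 
that behaviour is a change to Section 5.2 and would need to be stated 
as such.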


....

8.  IANA Considerations

    [[CREF2: RFC Editor: Please remove the first paragraph below before
    publication.]]

    This memo is not believed to require any action on IANA's part.

    There is an existing (i.e. prior to the publication of this document)
    registry for "Uniform Resource Identifier (URI) Schemes" that already
    includes the "urn" scheme itself and a separate existing URN
    Namespace registry.  None of those registrations have any specific
    dependencies on generic URI specifications.

Actually, the second paragraph could go as well...

9.  Security Considerations

    This specification changes the semantics of URNs to make them self-
    contained (as specified in other documents), relying on the generic
    URI syntax specification for syntax only.  It should have no effect
    on Internet security unless the use of a definition, syntax, and
    semantics that are more clear reduces the potential for confusion and
    consequent vulnerabilities.

It seems that it's RFC 2141bis which defines these changes, no?

10.2.  Informative References

    [DeterministicURI]
               Mazahir, O., Thaler, D., and G. Montenegro, "Deterministic
               URI Encoding", February 2014, <http://www.ietf.org/id/
               draft-montenegro-httpbis-uri-encoding-00.txt>.

               This is an expired document, cited for historical context
               only.

This reference doesn't seem to be used.

...

    [URN-transition]
               Klensin, J. and J. Hakala, "Uniform Resource Name (URN)
               Namespace Registration Transition", Feburary 2016,
               <https://datatracker.ietf.org/doc/draft-ietf-urnbis-ns-
               reg-transition/>.

This reference doesn't seem to be used.


Appendix A.  Background on the URN - URI relationship

    The Internet community now has many years of experience with both
    name-type identifiers and location-based identifiers (or "references"
    for those who are sensitive to the term "identifier" such as many
    members of the library and information science communities..  The

s/.././

    primary examples of these two categories are Uniform Resource Names
    (URNs [RFC2141] [RFC2141bis]) and Uniform Resource Locators (URLs)
    [RFC1738]).  That experience leads to the conclusion that it is
    impractical to constrain URNs to the high-level semantics of URLs.
    The generic syntax for URIs [RFC3986] is adequately flexible to
    accommodate the perceived needs of URNs, but the specific semantics
    associated with the URI syntax definition -- what particular
    constructions "mean" and how and where they are interpreted -- appear
    to not be.  Generalization from URLs to generic Uniform Resource
    Identifiers (URIs) [RFC3986], especially to name-based, high-
    stability, long-persistence, identifiers such as many URNs, has
    failed because the assumed similarities do not adequately extend to
    all forms of URNs.  Ultimately, locators, which typically depend on
    particular accessing protocols and a specification relative to some
    physical space or network topology, are simply different creatures
    from long-persistence, location-independent, object identifiers.  The
    syntax and semantic constraints that are appropriate for locators are
    either irrelevant to or interfere with the needs of resource names as
    a class.  That was tolerable as long as the URN system didn't need
    additional capabilities (over those specified in RFC 2141) but
    experience since RFC 2141 was published has shown that they are, in
    fact, needed.

I believe this appendix is way too vague to be helpful.