Re: [urn] Re-examining p-components (and "/", hierarchy, and relative references)

"Leslie Daigle (ThinkingCat)" <ldaigle@thinkingcat.com> Fri, 17 April 2015 16:01 UTC

Message-ID: <55312E53.2010605@thinkingcat.com>
Date: Fri, 17 Apr 2015 12:01:23 -0400
From: "Leslie Daigle (ThinkingCat)" <ldaigle@thinkingcat.com>
To: John C Klensin <john-ietf@jck.com>, urn@ietf.org
References: <666705E5201C2C1889DF193A@JcK-HP8200.jck.com>
In-Reply-To: <666705E5201C2C1889DF193A@JcK-HP8200.jck.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/urn/92rokbpsHU61P7ibiYeh0zHHlHE>


I think this is a rational, if hypothetical, reason for p-components, 
and my requirement would be for the document to be clear about the use:

> For those namespaces that need additional qualification (beyond
> the traditional NSS) that still participates in equality
> checking, that has so far left p-components as the only logical
> option.  Perhaps there is some other syntax that would do the
> job -- a hypothetical X-component -- but, so far, no one has
> proposed one and the logic of 3986 is such that I think we can
> expect any such syntax proposal to produce cries of pain about
> generic URI parsers and related topics.


IMO, that can be achieved by being clear about what the NSS is,
asserting that relative URNs are not expected/permitted, and stating
that "/" should not be used in a URN simply because one is importing a
namespace that happens to use it and it is syntactically convenient not
to escape it with %-encoding.
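
To illustrate the last point, here is a minimal sketch (Python) of
escaping "/" when an identifier from another scheme is imported into an
NSS. The helper name to_urn, the "example" NID and the sample identifier
are assumptions chosen for illustration; nothing here comes from
2141bis itself.

```python
from urllib.parse import quote, unquote


def to_urn(nid: str, identifier: str) -> str:
    """Build a URN whose NSS is a %-encoded copy of an imported identifier."""
    # safe="" makes quote() escape "/" too; by default quote() leaves it alone.
    return f"urn:{nid}:{quote(identifier, safe='')}"


urn = to_urn("example", "10.1000/182")
print(urn)                              # urn:example:10.1000%2F182
print(unquote(urn.split(":", 2)[2]))    # 10.1000/182
```

The imported identifier survives a round trip, and no literal "/" appears
in the URN for anyone to read as hierarchy.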

Leslie.

On 4/3/15 3:49 PM, John C Klensin wrote:
> ABSTRACT / SUMMARY for this rather long note:
>
> p-components seem to be a focus of controversy, possibly blocking
> other work and general progress in the WG.  This note tries to
> summarize and analyze the controversy, asks if p-components are
> important enough that we need to resolve that controversy, and
> proposes an alternative if they are not.
>
> Under other circumstances, I might just have written this note
> and proposal as an I-D, but it is really just an exploration,
> not a proposal.
>
> </ABSTRACT>
>
>
> Hi.
>
> Several comments in the last couple of weeks have raised issues
> with what are called p-components in 2141bis.  Most, but not
> all, of those notes have focused particularly on interactions
> with hierarchy and relative resolution.
>
> In the interest of getting done, I want to see if the WG can
> reexamine p-components rather than trying to dissect 3986 and
> its implications in this area.  This note will do some
> examination of 3986 interactions, but please bear with me.
>
> Assumption: I am taking the applicability of 3986 syntax but
> not semantics (as discussed in Andy's consensus call and
> draft-ietf-urnbis-semantics-clarif, although that document is
> still open to textual improvements) and the need to expand
> allowable URN syntax beyond those of RFC 2141 (and 3406) as
> given.  If there is anyone following the URN list who sees the
> discussions about hierarchy, relative resolution, or
> applicability of their interpretation of 3986 rules (with or
> without draft-ietf-appsawg-uri-scheme-reg) as a way to kill
> either 2141bis or URNs generally by a death of a thousand cuts,
> this note will probably not be helpful to them.
>
>
> THE ISSUES:
>
> Those who have spoken up about issues involving p-components
> (or the use of "/" in URNs) should check this to be sure I
> haven't mischaracterized their positions (and correct me if I
> have), but I believe the main concerns, as presented by others
> and with which I generally do not agree, are:
>
> (Issue 1): Despite the "no 3986 semantics" rule, we really
> cannot have a path or any use of a "/" in any URI, URNs
> included, without making that URI hierarchical.  From that
> perspective, we can't say "URNs are not hierarchical", which
> 2141bis-11 does say [1], and allow "/".  Other than relative
> resolution issues, it is not clear whether this issue has any
> practical impact or is [just] an argument that the "not
> hierarchical" paragraph is self-contradictory and should be
> rewritten somehow.
>
> (Issue 2): The generic URI parsers [2] will apply relative
> resolution processing if a "/" is seen, or perhaps even if it
> is not.  So such processing is inevitable, cannot be opted out
> of by URNs, and will cause serious damage to URNs if we try.
>
> (Issue 3): Relative resolution is very important to some other
> URIs, current and permanent or planned.  If URNs impose
> restrictions on relative resolution operations, even
> restrictions that apply only to URNs, it will be damaging to
> those other URIs.
>
> (Issue 4): Because URNs are non-hierarchical and relative
> resolution would cause problems or meaningless results, URNs
> should not allow "/" unless the particular namespace actually
> is hierarchical and relative resolution is appropriate.
>
> (Issue 5): It is not possible to decide that a URI scheme does
> not support relative resolution.  All URIs have to support such
> resolution.
>
> (Issue 6): Because it is not possible for a scheme to disallow
> relative resolution, URNs cannot make support for them a
> per-namespace issue either.
>
>
> PRELIMINARY ANALYSIS:
>
> There are some contradictions in the list above that don't
> make it easier to form a complete picture, much less figure out
> which arguments are valid or not.  Issues 1, 5, 6, and probably
> 4 are really about semantics so, unless it really is impossible
> to separate the actual syntax restrictions in 3986 from its
> semantic ones [3], their relevancy is questionable.
>
> The overall issue appears to be further complicated by an
> apparent (but not explicitly written down anywhere that I've
> found) principle for URIs, which is that meaningless
> constructions are ok.  So, as discussed in earlier notes in the
> context of fragments, if a URI is associated with a resource
> and a fragment is specified that makes no sense for the
> resource, whatever processor is dealing with the URI and/or the
> resource is free to just discard it rather than letting it
> interfere with other operations.  For the particular case of
> "/" and despite the restriction in 2141, nothing prevents
> someone from writing "urn:some-NID:some-NSS/garbage" today and
> using it in some URI context.  It would presumably be
> meaningless and different namespace-specific action routines
> would deal with it in different ways, but AFAICT, no one has
> shown evidence of horrible damage, whether to generic URI
> parsers, the nature of URNs, or otherwise.  Similarly, 3986
> makes it quite clear that false negatives are ok and so on.
>
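
As one data point on what a generic parser does with such a string, the
sketch below shows how Python's standard urllib.parse.urlsplit divides
it. This is only one library's behavior, not a statement about generic
URI parsers in general; the wrapper name split_generic is made up for
the example.

```python
from urllib.parse import urlsplit


def split_generic(uri: str) -> dict:
    """Report the five generic components Python's splitter finds in a URI."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


result = split_generic("urn:some-NID:some-NSS/garbage")
print(result["scheme"])   # urn
print(result["path"])     # some-NID:some-NSS/garbage
```

Here everything after "urn:" lands in the path, "/" included, and the
library attaches no further meaning to it.
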
> At least for some people, that principle (and the absence of
> general damage when it is applied) does not appear to apply where
> URNs are involved.  When we tried to propose a note in 2141bis
> that would have said, effectively,
>
>     "Relative resolution as described in RFC 3986 is likely to
>     be meaningless for URNs.  Parsers and processors that are
>     aware they are dealing with URIs with the
>     'urn:' scheme SHOULD NOT attempt to process relative
>     references.  However, if that processing is performed,
>     perhaps by a generic system that is not aware of particular
>     schemes, and results in meaningless results, the relevant
>     system should be prepared to just move on."
>
> for the particular case of the "urn:" scheme, we get
> repetitions of one or more of the issues above.
>
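
To make "meaningless results" concrete, here is a rough sketch of what a
literal application of RFC 3986 Section 5.2 would do with a "urn:" base.
It is a simplified transcription, not a full resolver: it assumes a
non-empty, path-only reference (no scheme, authority, query or
fragment), and the function names and the "example" NID are mine.

```python
def remove_dot_segments(path: str) -> str:
    """RFC 3986 Section 5.2.4, steps A to E, on a path string."""
    out = []  # output buffer; each entry is one segment with its leading "/"
    while path:
        if path.startswith("../"):            # step A
            path = path[3:]
        elif path.startswith("./"):           # step A
            path = path[2:]
        elif path.startswith("/./"):          # step B: "/./" becomes "/"
            path = path[2:]
        elif path == "/.":                    # step B
            path = "/"
        elif path.startswith("/../"):         # step C: drop last output segment
            path = path[3:]
            if out:
                out.pop()
        elif path == "/..":                   # step C
            path = "/"
            if out:
                out.pop()
        elif path in (".", ".."):             # step D
            path = ""
        else:                                 # step E: move one segment over
            cut = path.find("/", 1)
            if cut == -1:
                out.append(path)
                path = ""
            else:
                out.append(path[:cut])
                path = path[cut:]
    return "".join(out)


def resolve_against_urn(base: str, ref: str) -> str:
    """Resolve a path-only relative reference against a URN base."""
    scheme, _, base_path = base.partition(":")
    if ref.startswith("/"):
        merged = ref
    else:
        # Section 5.2.3 merge: keep the base path up to its last "/".
        merged = base_path[: base_path.rfind("/") + 1] + ref
    return scheme + ":" + remove_dot_segments(merged)


print(resolve_against_urn("urn:example:a/b/c", "../d"))        # urn:example:a/d
print(resolve_against_urn("urn:example:a/b/c", "../../../d"))  # urn:/d
print(resolve_against_urn("urn:example:abc", "x"))             # urn:x
```

The last two results no longer carry the NID at all, which is the kind
of outcome a processor would have to "just move on" from.
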
> I don't know what to do with the other issues, especially
> because of the difficulties with claims about generic URI
> parsers (see note [2] below).  So let's come back to the
> issue.
>
>
> WHY WOULD WE WANT A p-COMPONENT?
>
> URNs are very much tied up with the principle of "persistence"
> (or even "permanence").  Many of us believe we know,
> intuitively, what persistence of an identifier means, but we
> have been unsuccessful in proposing, much less agreeing on, a
> definition that is sufficiently precise to discriminate among
> boundary or edge cases [4].  For URNs (perhaps more so than
> URIs generally) careful and accurate matching and being clear
> about what components of the string are considered stable/
> persistent has seemed to be especially important.  Both
> f-components (which those who want to think of them as 3986
> fragments believe are inherently bound to target objects and
> not the URI) and q-components (which, as Keith and others,
> including myself, have pointed out, raise questions of
> relationship between instructions to the URN processor and
> instructions to/about targets) are problematic for both
> persistence and comparison purposes.  2141bis has explicitly
> excluded both from equality comparisons for just those reasons.
>
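
A small sketch of an equality check along these lines: it drops the
f-component and q-component before comparing, then applies the RFC 2141
lexical-equivalence rules (scheme and NID case-insensitive, case of
%-escape hex digits ignored) to what remains, so a "/"-qualified part
still counts. The "#" and "?" delimiters and the function names are
assumptions for illustration, not the exact 2141bis grammar.

```python
import re


def urn_equality_key(urn: str) -> str:
    """Normalize a URN to the string that takes part in equality checks."""
    # Assumed delimiters: "#" starts the f-component, "?" the q-component.
    core = urn.split("#", 1)[0].split("?", 1)[0]
    scheme, nid, rest = core.split(":", 2)
    # Hex digits in %-escapes compare case-insensitively; uppercase them.
    rest = re.sub(r"%[0-9A-Fa-f]{2}", lambda m: m.group(0).upper(), rest)
    return f"{scheme.lower()}:{nid.lower()}:{rest}"


def urn_equivalent(a: str, b: str) -> bool:
    return urn_equality_key(a) == urn_equality_key(b)


print(urn_equivalent("URN:Example:a/b?x=1#frag", "urn:example:a/b"))  # True
print(urn_equivalent("urn:example:a/b", "urn:example:a/B"))           # False
print(urn_equivalent("urn:example:a%2Fb", "urn:example:a/b"))         # False
```

Note that an escaped "%2F" and a literal "/" are deliberately not
treated as equal here.
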
> For those namespaces that need additional qualification (beyond
> the traditional NSS) that still participates in equality
> checking, that has so far left p-components as the only logical
> option.  Perhaps there is some other syntax that would do the
> job -- a hypothetical X-component -- but, so far, no one has
> proposed one and the logic of 3986 is such that I think we can
> expect any such syntax proposal to produce cries of pain about
> generic URI parsers and related topics.
>
> If we have p-components anyway, there are also advantages to
> actual or potential namespaces that use the "/" character in
> their identifying strings.  Of course, we could simply declare
> the separation of NSS from p-components to be an artifact of
> 3986 semantics and solve the identifying string problem by
> allowing the "/" in the NSS, but doing so would not help with
> either the "generic URI parser" or relative resolution issue,
> nor would it help those who look at a "/" in something that
> appears to be a URI and see "hierarchy" (probably in large
> letters).
>
>
> A ROTTEN, BUT POSSIBLY DESIRABLE, CHOICE
>
> The theoretical argument for p-components is above.  I think we
> also need to remember that opening 2141 (and at least
> implicitly 3406) to allow additional syntax has been incredibly
> painful and that, after 2141bis is complete and approved,
> opening or revising it to allow yet more syntax is likely to be
> even more painful.  That may be a good argument for adding
> p-components now, even if doing so creates some pain, just to
> have it over with.
>
> However, the above theorizing aside, I have not seen any real
> demand for p-components on the mailing list.  If there is such
> demand, I hope that those who have current needs will come
> forward and explain them.  But, if there is not and
> p-components are likely to be a source of blocking
> controversies, another option would be to modify 2141bis to
> reserve the syntax but prohibit registration (and, to the
> extent we have the power, use) of URNs containing them until
> and unless future documents appear that address the issues.
>
> Because I'm very concerned about the "death of a thousand cuts"
> mentioned above, I would hope that the WG would insist that
> anyone arguing for getting p-components out of the current
> discussion by pushing them into the future would assure us that
> they would not turn around and raise other supposedly-blocking
> issues, but I recognize that there is ultimately no way to
> prevent that behavior other than by assurance of good faith.
>
> Were we to decide to push p-components aside, the WG should
> decide whether it would be desirable to have an informational
> appendix or separate informational document that records what
> we believe the issues are and why we reached the decisions we
> did as a starting point for future work.  This note might be a
> good starting point.  Had we had a supplemental document for
> 2141 that identified the issues and reasons additional syntax
> was deferred, it might have saved us a year or three in the
> current efforts (and/or had a restrictive impact on 3986).
> But, if we translated the controversies about the implications
> of p-components into difficulty getting consensus about such an
> explanation, perhaps it would just be better to bequeath the
> issues to the future.
>
> best,
>      john
>
>
>               -------------------------
>
> NOTES
>
> [1] draft-ietf-urnbis-rfc2141bis-urn-11, Section 5,
>    paragraph 2.
>
> [2] "Generic URI parsers" are problematic in another way that
>    should be kept in mind.  They have been invoked frequently in
>    these (URN) and other discussions about URIs, but it seems to
>    me that, each time someone tries to pin down exactly what
>    they do and how libraries compare, the answers are wildly
>    different, with some systems that are considered to be such
>    parsers by their authors or advocates doing little other than
>    separate the scheme name from the rest of the URI, others
>    trying very hard to interpret and implement every aspect of
>    3986 including its deep semantics (because 3986 includes many
>    options, two different good-faith implementations of that
>    variety may not behave the same way), and still others
>    essentially assuming that every URI behaves like an HTTP URL
>    and that everything else is either invalid or deviant.  Some
>    tests on web browsers appear to be consistent with the latter
>    point of view and are being documented as normative by
>    WHATWG.  To the extent to which the latter is the actual
>    working definition of "generic URI parser", they are never
>    going to work well with URNs.
>
> [3] I have heard a few people suggest that it actually is
>    impossible, but I haven't seen such comments on the list or
>    otherwise in public.  If it is not possible, it seems to me
>    that two or three years of list discussion suggests that we
>    are faced with a choice between abandoning all ideas of
>    extending URNs beyond 2141 and going back to the "URNs are
>    not URIs" model.
>
> [4] There have also been some claims that the concept is
>    meaningless or impossible in practice, but I'll leave that
>    discussion for other notes.
>

-- 

-------------------------------------------------------------------
Leslie Daigle
Principal, ThinkingCat Enterprises
ldaigle@thinkingcat.com
-------------------------------------------------------------------