Re: [urn] Fwd: Fwd: I-D Action: draft-ietf-urnbis-rfc2141bis-urn-12.txt

"Hakala, Juha E" <juha.hakala@helsinki.fi> Tue, 01 September 2015 07:31 UTC

From: "Hakala, Juha E" <juha.hakala@helsinki.fi>
To: Peter Saint-Andre - &yet <peter@andyet.net>, Melinda Shore <melinda.shore@gmail.com>, "urn@ietf.org" <urn@ietf.org>
Date: Tue, 01 Sep 2015 07:30:58 +0000
Message-ID: <AMSPR07MB454C6DB429C2FECD283B24EFA6A0@AMSPR07MB454.eurprd07.prod.outlook.com>
References: <55804085.5000105@gmail.com> <5584CD07.4050906@gmail.com> <AMSPR07MB4542A8B2C582B8F1772FD0DFAA80@AMSPR07MB454.eurprd07.prod.outlook.com> <55E50FC2.8050106@andyet.net>
In-Reply-To: <55E50FC2.8050106@andyet.net>
Archived-At: <http://mailarchive.ietf.org/arch/msg/urn/wOhAG3f-Tw_mEa6YsrjAGYklrd8>
Cc: Tobias Weigel <weigel@dkrz.de>, "stella@isbn-international.org" <stella@isbn-international.org>, "isabelxiang@hotmail.com" <isabelxiang@hotmail.com>
Subject: Re: [urn] Fwd: Fwd: I-D Action: draft-ietf-urnbis-rfc2141bis-urn-12.txt

Hello Peter, 

Comments inline below. 

> -----Original Message-----
> From: Peter Saint-Andre - &yet [mailto:peter@andyet.net]
> Sent: 1 September 2015 5:39
> To: Hakala, Juha E <juha.hakala@helsinki.fi>; Melinda Shore
> <melinda.shore@gmail.com>; urn@ietf.org
> Cc: Tobias Weigel <weigel@dkrz.de>; stella@isbn-international.org;
> isabelxiang@hotmail.com
> Subject: Re: [urn] Fwd: Fwd: I-D Action: draft-ietf-urnbis-rfc2141bis-urn-12.txt
> 
> > In the second paragraph of chapter 3.3 we say that unless defined for
> > a particular namespace, use of q-, r- and f-components is disallowed.
> >
> > This is OK for q- and r-components, but I am not sure if such
> > limitation is required for F-component, since it has no impact on URN
> > resolution. Moreover, F-component is not namespace specific but
> > depends on file format. It can be added to the URN:ISBN if the
> > identified book is e.g. a PDF document, but not to every URN:ISBN
> > (even if the ISBN identifies an e-book; printed books certainly do not
> > have URI fragments). So it is likely that f-component can never be
> > applied to all URNs in a certain namespace, but it may be possible to
> > apply it to some URNs in most (actionable) namespaces.
> 
> As I recall, the purpose of the text you have cited was to maintain strict
> compatibility with RFC 2141 for all existing namespaces; only if a namespace
> explicitly allows q-, r-, and f-components after 2141bis is published should it
> allow and use those components. This is an "opt-in"
> model, if you will, and seems like the safest approach (i.e., do not assume that
> it's OK to add an f-component to an "old", 2141-compatible namespace just
> because 2141bis says that f-components are now allowed).
> This is a form of Postel's Law, I think.

The problem I have with this is that anyone can add an f-component to an existing URN at any time after the URN has been minted. For instance, the URN:ISBN namespace does not allow the use of components at the moment, but we cannot prevent someone from adding an f-component to a URN:ISBN if the identified book has URI fragments. To "legalize" this, I would revise the URN:ISBN namespace registration request to say that an f-component can be used if the file format of the identified book allows it. But a similar statement could be added to RFC 2141bis, and it would then apply retroactively to all existing URN namespaces.
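
To make the point concrete, here is a minimal sketch in Python of the "opt-in" check discussed above. It assumes only that an f-component follows a "#", as a URI fragment does. The ALLOWED_COMPONENTS table, its entries, the function names and the "#page=3" fragment are hypothetical illustrations, not part of any registration or of the draft.

```python
from typing import Optional, Tuple

# Hypothetical opt-in table: namespace identifier (NID) -> components that the
# namespace registration explicitly allows. Entries are illustrative only.
ALLOWED_COMPONENTS = {
    "isbn": set(),      # assumption: no components allowed at the moment
    "example": {"f"},   # assumption: a namespace that has opted in
}


def split_f_component(urn: str) -> Tuple[str, Optional[str]]:
    """Split a URN string at the first '#' into (urn, f-component or None)."""
    base, sep, fragment = urn.partition("#")
    return base, (fragment if sep else None)


def nid_of(urn: str) -> str:
    """Return the lowercased namespace identifier of 'urn:<NID>:<NSS>'."""
    scheme, nid, _nss = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError(f"not a URN: {urn!r}")
    return nid.lower()


def f_component_permitted(urn: str) -> bool:
    """True if the URN carries no f-component or its namespace opted in."""
    base, fragment = split_f_component(urn)
    if fragment is None:
        return True
    return "f" in ALLOWED_COMPONENTS.get(nid_of(base), set())
```

Note that the check can only flag an appended f-component after the fact; nothing in the syntax stops anyone from writing "URN:ISBN:978-952-10-7060-0#page=3" in the first place, which is the concern raised above.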

Keith's problem with components being namespace-specific can be avoided if implementers can agree on a common syntax for the r-component. This requires a registry of resolution services and service parameters, and a syntax for requesting these services. Whether these services are supported depends on the individual resolver; even within the same namespace there will be diversity in this respect.

Within the persistent identifier community there is already a lot of diversity in resolution services and in the syntax used for requesting them. Whether the Handle, ARK and URN syntaxes can be aligned in the future remains to be seen, but agreeing on how URNs use components could be the first step down this road.

> 
> > The example in the last paragraph of chapter 3.3.1 uses a real ISBN
> > but in wrong (urn:example) namespace. It is better to replace the ISBN
> > with a random NSS such as "bar", because the URN is not actionable and
> > even if it were, the fragment would not be applicable.
> 
> Yes, we'll fix that to prevent confusion.

Fine. 
> 
> > Although libraries' primary intention was to use q-component for
> > passing resolution related information to URN resolvers, I see no
> > problem in using r-component instead to this purpose. IMO it is good
> > design to make a difference between requests to resolvers
> > (r-component) and requests to resources themselves or applications
> > managing them (q-component). Identified resources will become
> > increasingly large and complex (scientific data sets are a good
> > example of this)  and it may be necessary to perform various
> > operations before the resource is ready to be sent to the user.
> 
> Do you think that the text or example in Section 3.3.1 needs to change in
> order to make that clear?

Well, I would like to change the example a bit: 

   This could perhaps be accomplished by specifying the desired
   metadata field (e.g., "identifier") in the q-component, resulting in
   URNs such as
   "urn:example:foo?operation=search&field=identifier".
   However, this primary purpose is not intended to forestall other
   potential uses for q-components.
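
As an illustration of how an application might read such a q-component, here is a minimal Python sketch. It assumes, as in the example URN above, that the q-component follows the first "?" and consists of key=value pairs separated by "&"; the function name parse_q_component is hypothetical, and the draft may settle on a different delimiter.

```python
from urllib.parse import parse_qs


def parse_q_component(urn: str) -> dict:
    """Return the q-component of a URN as a dict of key -> list of values."""
    # Drop any f-component first, then take everything after the first "?".
    without_fragment = urn.partition("#")[0]
    _name, sep, q_component = without_fragment.partition("?")
    if not sep:
        return {}  # no q-component present
    return parse_qs(q_component, keep_blank_values=True)


params = parse_q_component("urn:example:foo?operation=search&field=identifier")
print(params)
```

Under these assumptions the example URN yields the two parameters "operation" and "field", which the application managing the resource could then act on.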

> 
> > In 3.3.3 we say that f-component need not be semantically equivalent
> > to the URI fragment (component). IMO URN f-component is never
> > semantically equivalent with URI fragment since f-component does not
> > have a role in identification. We avoid making this difference
> > explicit by saying that f-component is intended to "distinguish
> > integral parts of resources", which is OK to me in spite of being
> > somewhat vague.
> 
> I think that "constituent parts" might be clearer.

+ 1 

> 
> > If we want to be more specific we could say that what is being
> > distinguished are physical (encoded) parts of resources.
> > F-component cannot be applied to logical parts of resources unless
> > they coincide with physical ones. It might be a good idea to be more
> > specific about what f-component does because there is an increasing
> > need within my community (libraries, archives and museums) to identify
> > component parts of resources. If and when we start developing a new
> > standard identifier for logical components of resources (or if we
> > extend existing systems such as NBN in such a way that they do the
> > job) it is important to know precisely what the role of f-component
> > is. The present wording does not make that very clear.
> 
> I am not well versed in information science, so I am not sure how the distinction
> between logical and physical parts is applied in that field.
> However, from my own limited perspective I hesitate to say that f-components
> are necessarily tied to physical representation because even though the
> "preface" and "chapter 3" and "afterword" might be constituent parts of a
> book, in an electronic book those parts are not represented physically as they
> would be in a paper book.

Constituent parts of an electronic book can be represented physically in the (XML/HTML/SGML/...) encoding of the document. Sometimes this encoding matches the physical structure of the printed book; it may also be simpler or more detailed. Over the years there has been a shift from simple formats such as ASCII documents toward more complex, structured document formats. The preface, chapters and afterword, and perhaps even smaller components of the text, can either be identified separately (we use URN:NBN for this purpose) or be made citable through the encoding of the document.
 
> 
> > The example in the last paragraph of chapter 3.3.3 uses a real ISBN
> > but in wrong (urn:example) namespace. It is better to replace the ISBN
> > with a random NSS such as "bar". URN:ISBN is actionable (as
> > http://urn.fi/URN:ISBN:978-952-10-7060-0) but urn:example is not. In
> > both cases fragment is not applicable.
> 
> Yes, we will fix that too.

OK. 
> 
> > In chapter 6 (top of page 14) the draft states that "uniqueness
> > constraint means that an identifier within the namespace is never
> > assigned to more than one resource and never reassigned to a different
> > resource". We might consider saying "one resource (as defined in the
> > namespace)".  Every journal article published in e.g.
> > Scientific American has the same ISSN, since in the URN:ISSN namespace
> > "resources" are serials. In SICI (Serial Item and Contribution
> > Identifier) namespace resources are journal issues and articles.
> > URN:NBN could be used for identification of still images within these
> > articles.
> 
> I see your point. How about this?
> 
>     The "uniqueness" constraint means that an identifier within the
>     namespace is never assigned to more than one resource and never
>     reassigned to a different resource (for the kind of "resource"
>     identified by URNs assigned within the namespace).

Fine with me! 
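
For what it is worth, the constraint as reworded can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class NamespaceAssignments, its method names and the placeholder identifier and resource strings are hypothetical, and what counts as a "resource" is whatever the namespace defines (a serial in URN:ISSN, for instance, not an individual article).

```python
class NamespaceAssignments:
    """Tracks which resource each namespace-specific string identifies."""

    def __init__(self):
        self._resource_by_nss = {}

    def assign(self, nss: str, resource: str) -> None:
        """Assign nss to resource; refuse to point it at a different one."""
        existing = self._resource_by_nss.get(nss)
        if existing is not None and existing != resource:
            # Uniqueness: never assigned to more than one resource and
            # never reassigned to a different resource.
            raise ValueError(f"{nss!r} already identifies {existing!r}")
        self._resource_by_nss[nss] = resource

    def resource_for(self, nss: str) -> str:
        return self._resource_by_nss[nss]


serials = NamespaceAssignments()
serials.assign("placeholder-issn", "serial A")
serials.assign("placeholder-issn", "serial A")  # same resource again: allowed
```

Every article in "serial A" would share the one identifier, because in this hypothetical namespace the resource is the serial as a whole.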

> > In chapter 6.1, the last paragraph of page 16 the draft requires that
> > particular attention should be paid to strings that might imply
> > association with well-known trademarks. More exhaustive formulation
> > would be "... well-known identifier systems and trademarks". In this
> > context we need to be even more concerned about NID strings like
> > "handle" and "doi" than "pepsi".
> 
> Good point.
> 
> > In 7.3.1 (Purpose of the URN registration) it is IMO necessary to
> > provide information whether resolution services are or will be
> > available, and if so, what the existing / anticipated services are.
> > Registrants might even be able to provide examples for q- and
> > r-component semantics and syntax, even if they are registered
> > elsewhere later.
> 
> That seems fine.
> 
> > It might be a good idea to ask the registrants to clarify formal
> > status of the identifier (de jure standard / de facto standard /
> > other) and current / anticipated extent of its use (international /
> > national / local usage).
> 
> In my experience and (see RFC 6648) the experience of the Internet
> community, it is not especially helpful to draw a distinction between
> standardized and unstandardized identifiers, because unstandardized
> identifiers have a tendency to leak into the space of standardized identifiers.

Still, it would be helpful to know the status of the identifier at the time the namespace was registered. Users should be able to see which namespaces are based on standards and which are not. And registrants of non-standard systems could say whether there is a plan to standardize the identifier in the future.

IMO standardized identifiers such as ISBN are more reliable than other identifier systems, since the former usually have, for example, a stable user community and well-formalized identifier assignment rules.

> > Identifiers with statuses "other" and "local usage" should usually
> > have informal rather than formal namespaces registered for them.
> 
> Here again, see RFC 6648.
> 
> > The current scope question (number 4 on the list) is a little bit of a
> > mixed bag as it covers both geographical and organizational issues; it
> > might be better to target scope on various organizational issues only
> > (public / private sector, book trade / libraries, military / civilian
> > use etc.).
> 
> It's not fully clear to me why some of those dimensions are relevant, but in
> any case I think the answers would emerge from asking about the community
> of use. Thus I propose:
> 
>         The scope and applicability of the URNs assigned within the
>         namespace; this might include information about the community of
>         use (e.g., a particular nation, industry, technology, or
>         organization), whether the assigned URNs will be used on public
>         networks or private networks, etc.

This is OK for me. 
 
> > As regards the current second (later perhaps third) question on the
> > list, I would drop the request for telling why it is preferable to use
> > URN rather than some other technology. IETF does not need to know
> > this. Also asking the registrant to explain why existing URN
> > namespaces are not a good fit sounds a bit negative. I would ask the
> > registrant to describe how the namespace to be registered relates to
> > existing URN namespaces and how the new namespace complements them.
> 
> I like that more positive focus. How is this?
> 
>        How the namespace relates to and complements existing URN
>        namespaces, URI schemes, and identifier systems.

Works for me.

All the best, 

Juha

> 
> Thanks again for the review.
> 
> Peter