Re: [urn] Names and derived identifiers

Juha Hakala <juha.hakala@helsinki.fi> Wed, 01 October 2014 12:05 UTC

Message-ID: <542BEDFE.8050501@helsinki.fi>
Date: Wed, 01 Oct 2014 15:05:18 +0300
From: Juha Hakala <juha.hakala@helsinki.fi>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.8.0
MIME-Version: 1.0
To: urn@ietf.org
References: <201409060307.s86371n6031692@hobgoblin.ariadne.com> <8CCBED81E6F16B40F76AD624@JcK-HP8200.jck.com>
In-Reply-To: <8CCBED81E6F16B40F76AD624@JcK-HP8200.jck.com>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: http://mailarchive.ietf.org/arch/msg/urn/Lp97bq8T_q9OoPItxMa1mjyq6rg
Subject: Re: [urn] Names and derived identifiers

Hello Dale; John,

Comments on the three examples:

>> Example 1: Adding a "fragment" part to a URN
>>
>> Let us assume that <urn:oid:1.3.6.1.4.1.14490.21.137> is the
>> name of an HTML document.  (I have administrative authority
>> over that part of the OID tree, so I can make it so!)  And let
>> us suppose that one of the <a> tags in the document has a
>> "name" attribute with value "section1". Then (given how
>> "fragment" is used in HTML URLs), it would be reasonable to
>> define that <urn:oid:1.3.6.1.4.1.14490.21.137#section1>
>> references that location within that (abstract) HTML document.

+1.

> But 3986 explicitly binds fragments to media type, so, unless
> either every OID or your particular OID has a media type, that
> discussion leads to a very odd place.  In addition, as
> previously discussed on-list and reflected in Appendix B.3 of
> the current draft, there is a case to be made that the fragment
> to which you refer is really a property of the HTML document and
> not the OID or URN.  As such, the syntax you are looking for
> might be more like
>
> <http://retrieval.mechanism.domain/?urn:oid:1.3.6.1.4.1.14490.21.137#section1>
> I don't believe that, but the distinction is very fine indeed
> and one that the WG may need to address.

Fragments can be added to URNs if the identifier assignment rules of the
namespace in question (or of a subset of that namespace) require that
each identifier be assigned to only one manifestation of the resource.
ISBN meets that requirement (unless someone is misusing the system, but
we do not need to consider that case here).

In the URN context, the philosophically most acceptable solution is to
say that the fragment indicates a location within the resource
identified by the URN. A user can add a fragment at any time after the
URN has been assigned, provided that the media type (file format) of the
identified resource allows it.
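
As a rough illustration of the point above, the following Python sketch
separates the fragment from the URN proper and checks whether a given
media type is assumed to support fragments. The function names and the
FRAGMENT_CAPABLE table are hypothetical, chosen only for this example;
no URN specification defines them.

```python
# Hypothetical table: media types assumed, for this sketch only,
# to define fragment semantics.
FRAGMENT_CAPABLE = {"text/html", "application/pdf"}


def split_fragment(urn):
    """Split 'urn:...#frag' into (urn_without_fragment, fragment_or_None)."""
    base, sep, fragment = urn.partition("#")
    return base, (fragment if sep else None)


def may_carry_fragment(media_type):
    """Return True if the media type is listed as fragment-capable."""
    return media_type in FRAGMENT_CAPABLE


base, fragment = split_fragment("urn:oid:1.3.6.1.4.1.14490.21.137#section1")
```

The URN itself stays unchanged; the fragment is only interpreted once
the media type of the identified resource is known.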
>
>> Example 2: Footnoting a page
>>
>> Sitting in front of me is a copy of the book edition with ISBN
>> 978-1-59403-698-9.  That book edition has the URN
>> <urn:isbn:978-1-59403-698-9>.
>>
>> Suppose we define an extension to the syntax to allow
>> reference to a specific page of a book denoted by an ISBN:
>> <urn:isbn:978-1-59403-698-9^p=67>.  (Here, I deliberately use
>> the character "^" for the extension; "^" is not valid in URIs,
>> and so I cannot be thought to be advocating any particular
>> syntax.)  This construction would be useful for scholarly
>> citations.
> Of course, your doing that as part of the URN requires that you
> have an agreement with the particular publisher that all
> instances of objects that they associate with the ISBN have the
> same pagination properties.  I don't believe that is a
> requirement of the ISBN standard.  Whether it is or not, I know
> (as Juha has also pointed out), publisher assignments of ISBNs
> are sometimes more flexible.

Just an aside: with new electronic formats such as HTML5 and CSS3,
page numbers are becoming meaningless. For referencing purposes it will
be better to use logical components of resources, such as chapters and
paragraphs.

>
>> Example 3: Retrieving bibliographic information
>>
>> Suppose there is a standard for providing the metadata
>> (bibliographic data) about a resource.  This has been called
>> "URC" (Uniform Resource Characteristics) in RFC 2169.  Then we
>> could standardize that
>> <urn:oid:1.3.6.1.4.1.14490.21.312^metadata> is the identifier
>> of the metadata regarding <urn:oid:1.3.6.1.4.1.14490.21.312>.
>> (Of course, this does not address how we obtain the metadata,
>> it only defines a standard identifier with which to refer to
>> it.)
> I'll let you and Juha (and others who are interested) debate
> whether that form is an identifier or a service request.  I hope
> that, in the process, you will explain to the rest of us how the
> distinction moves the work of the WG forward.

Libraries etc. do not use the same identifier system for resources and
for the metadata records that describe those resources. Such a solution
would be cumbersome to maintain. One cannot safely assume that there is
a 1:1 relation between resources and the metadata records about them.
Of course, metadata records do contain the identifier(s) of the
resources or works they describe.

There are a lot of standards for providing (bibliographic and
administrative) metadata about resources. The IETF tried to develop one
(URC) but decided not to, which was wise. Building and maintaining a
metadata format is a major commitment.

It is feasible to create a service request for retrieving metadata about 
the identified resource. But a user must be able to specify the 
preferred metadata format.

So instead of this:

urn:oid:1.3.6.1.4.1.14490.21.312^metadata

you might say something like this:

urn:oid:1.3.6.1.4.1.14490.21.312?s=I2C&p=DC

or even

urn:oid:1.3.6.1.4.1.14490.21.312?s=I2C&p=DC&p=simple

to indicate that you want a (simple) Dublin Core metadata record
describing the identified resource. This URN is then passed on to a
resolver, which translates it into something that the relevant target
system understands as a search request.
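
A minimal Python sketch of what such a resolver step might look like,
under stated assumptions: "s" is read as the requested service and "p"
as one or more format parameters, which is only one possible reading of
the examples above. The function names, the target URL
https://resolver.example/metadata and its "id" and "format" parameters
are all hypothetical.

```python
from urllib.parse import parse_qs, urlencode


def parse_service_request(urn):
    """Split 'urn:...?s=..&p=..' into the plain URN and its parameters."""
    name, _, query = urn.partition("?")
    params = parse_qs(query)  # repeated keys such as p=DC&p=simple are kept
    return {
        "urn": name,
        "service": params.get("s", [None])[0],
        "profile": params.get("p", []),
    }


def to_target_query(request, base_url="https://resolver.example/metadata"):
    """Translate the parsed request into a (hypothetical) search URL."""
    pairs = [("id", request["urn"])]
    if request["profile"]:
        # Assumed convention: join format parameters, e.g. DC + simple
        pairs.append(("format", "-".join(request["profile"]).lower()))
    return base_url + "?" + urlencode(pairs)


request = parse_service_request(
    "urn:oid:1.3.6.1.4.1.14490.21.312?s=I2C&p=DC&p=simple"
)
target = to_target_query(request)
```

The identifier and the service request stay separable here: the resolver
can always recover the plain URN, and the format choice travels as a
parameter rather than as part of the name.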

Juha


>
>> ...
>    best,
>       john


-- 

  Juha Hakala
  Senior advisor

  The National Library of Finland
  Library Network Services
  P.O.Box 26 (Teollisuuskatu 23)
  FIN-00014 Helsinki University
  Tel. +358 9 191 44293
  Mobile +358 50 3827678