Re: [urn] resolver profiles: a generative POWER-UP for r-component

"Hakala, Juha E" <juha.hakala@helsinki.fi> Mon, 05 October 2015 10:54 UTC

From: "Hakala, Juha E" <juha.hakala@helsinki.fi>
To: Sean Leonard <dev+ietf@seantek.com>, "urn@ietf.org" <urn@ietf.org>
Date: Mon, 05 Oct 2015 10:53:35 +0000
Message-ID: <AMSPR07MB454D3EC8D83AD35A2B3B931FA480@AMSPR07MB454.eurprd07.prod.outlook.com>
References: <56054AEF.7000800@seantek.com>
In-Reply-To: <56054AEF.7000800@seantek.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/urn/2K3hnuTz3u-_RakEkW6naC8M-v4>

Hello,

Some comments below.

> -----Original Message-----
> From: urn [mailto:urn-bounces@ietf.org] On Behalf Of Sean Leonard
> Sent: 25 September 2015 15:24
> To: urn@ietf.org
> Subject: [urn] resolver profiles: a generative POWER-UP for r-component
> 
> URNies:
> 
> What URNs really need is a way to ensure that the resolution process
> delivers more consistent and predictable results.

All PIDs must be made "smarter". If resolution only links an identifier (name) to 1-n locations, one can argue that it is an extra step that can or should be avoided. But if resolution can provide persistent access to all kinds of information about the identified resource, it is hard to achieve similar results with, e.g., HTTP (which is itself not persistent).

It has not been easy to improve the URN syntax so that smart resolution is possible. One problem is the URI syntax, which has left us with a limited tool set (? for query and # for fragment). In these circumstances, using ?? for the r-component is the best we can do. Alas, other PID systems may not be able to do even this.

> Since the introduction of the r-component, URNs are now able to address
> (speak to) the resolver directly to control aspects of the resolution process.
> Yet the specifics are woefully absent in draft-ietf-urnbis-rfc2141bis-urn-13.

The intention is to proceed in two steps: first to establish the r-component, and then to establish a mechanism by which the r-component syntax and semantics are maintained. Since the technical infrastructure within which resolution happens changes all the time, it is not possible to create a fixed list of resolution services and service parameters. But we can have, for instance, an IANA registry for shared services (and their parameters) and some other arrangements for namespace-specific services (if any).

> I propose that the r-component begin with a resolver profile (or resolution
> profile, or directive), which is an opaque string that identifies the nature of
> the resolution services being requested by the particular urn: URI. It's a
> POWER-UP 🍄 for the r-component!

This is one option that we can consider. 

> Consider our favorite urn:isbn. Given an ISBN, what can we do with it?
> What do we want to get for it? And how can we make the f-component
> meaningful, if the urn: URI string is not going to give us back some kind of
> predictable media type/type of resource?

And also: what can we do now, and what will be possible 50 or 100 years from now? We have no idea what library systems and digital archives will be capable of in the distant future, but we do know that books will still have the same ISBNs by then. Of course, all manifestations valid now will be useless for most users in 2115; only those with specialized tools will still be able to render the original documents. Everyone else will use migrated modern versions of the works.

> Answer: the namespace registration can include profiles, and these profiles
> better specify what happens when resolution occurs. In our ISBN case,
> profiles can include:
> showbook
> librarycatalog
> pricing
> buybook
> covershot
> bibliographicinfo
> referencedworks
> ...and a potentially infinite range of other resolution things to do.

Indeed; I could easily add a lot of stuff into this list ;-). 
 
> The absence of a profile string does not imply a "default" action. If you
> specify urn:isbn:0385537670, you are left with just the identifier.
> The implementation (context) can figure out whatever it wants to do.

This is the current status, and definitely not one that we should be happy with. It is important to give the user a choice between different options. IMO the role of metadata will be essential: How much will this resource cost? Is there a Finnish translation? What can I do with this resource once I have downloaded it? What tools will I need to render it? Who owns the rights to this work? And so on.

> To the extent that the resolver profile needs or accepts additional data, this
> data can follow the profile string.
> 
> Examples include:
> 
> urn:isbn:0385537670?productphoto:dpi=600
> urn:isbn:0385537670?pricing:au;used

Although these information needs are definitely valid, it is not necessary to nail down the syntax yet.

> urn:oid:1.3.6.1.5.5?info
> urn:ietf:rfc:3986?getspec:ct=text/html#section-2.1
> 
> The syntax is:
> 
> 
>        namestring    = assigned-name
>                        [ "?" r-component ]
>                        [ "#" f-component ]
>        assigned-name = "urn" ":" NID ":" NSS
>        NID           = (alphanum) 0*30(ldh) (alphanum)
>        ldh           = alphanum / "-"
>        NSS           = word
>        r-component   = profile [":" query]
>        profile       = NID
>        ; query is from RFC 3986
>        word          = pchar *(pchar / "/")
>        f-component   = fragment  
> 
> Unlike the NSS, which needs to be well-defined, durable, immutable, etc.,
> profiles are intended to be very lightweight and capricious. To the extent
> that a namespace uses profiles at all, registration of profiles would be either
> First-Come, First-Served, or Specification Required. The namespace
> registration can impose restrictions on profiles (including disallowing profiles
> altogether), but anyone should be permitted to invent a new profile,
> document it, and register it. [This is a generative proposition--see David G.
> Post, The Theory of Generativity,
> <http://ir.lawnet.fordham.edu/flr/vol78/iss6/2/>.]

Whatever rules a URN namespace applies to naming apply only to the NSS. The components can be added by anybody at any time. I suppose that they will often be machine generated. A GUI which enables the user to retrieve metadata about the resource will modify the actual component accordingly when protocols and metadata formats change. Users do not need to know anything about these changes. And we standards makers only need to provide new services and parameters when requested.

As regards the syntax of the r-component, IMO it is useful to provide some generic services. Backwards compatibility with RFC 2483 requires that. But we also know that RFC 2483 took too simple a view of services; for instance, there can be many different metadata formats, and requesting each of them via a different service would not make sense.
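
To make the syntax under discussion concrete, here is a rough Python sketch of how a client might split a urn: URI according to the grammar quoted above (profile, optional ":" query, optional "#" fragment). It is only an illustration of the proposal, not anything taken from the draft: the names ParsedUrn and parse_urn are hypothetical, and the NSS is simply taken as everything up to the first "?" or "#" rather than being validated against pchar.

```python
import re
from typing import NamedTuple, Optional

# NID per the quoted grammar: alphanum, 0-30 ldh characters, alphanum.
_NID = r"[A-Za-z0-9][A-Za-z0-9-]{0,30}[A-Za-z0-9]"

# Assumption: the NSS is not checked against pchar here; it is whatever
# precedes the first "?" or "#".
_URN = re.compile(
    r"^urn:(?P<nid>" + _NID + r"):(?P<nss>[^?#]+)"
    r"(?:\?(?P<profile>" + _NID + r")(?::(?P<query>[^#]*))?)?"
    r"(?:#(?P<fragment>.*))?$",
    re.IGNORECASE,
)


class ParsedUrn(NamedTuple):
    """Hypothetical container for the parts named in the quoted grammar."""
    nid: str
    nss: str
    profile: Optional[str]
    query: Optional[str]
    fragment: Optional[str]


def parse_urn(text: str) -> Optional[ParsedUrn]:
    """Split a urn: URI into its parts, or return None if it does not match."""
    match = _URN.match(text)
    if match is None:
        return None
    return ParsedUrn(
        nid=match.group("nid"),
        nss=match.group("nss"),
        profile=match.group("profile"),
        query=match.group("query"),
        fragment=match.group("fragment"),
    )


if __name__ == "__main__":
    # Examples taken from the quoted proposal.
    print(parse_urn("urn:isbn:0385537670?productphoto:dpi=600"))
    print(parse_urn("urn:ietf:rfc:3986?getspec:ct=text/html#section-2.1"))
```

A URN without an r-component, such as urn:isbn:0385537670, parses with profile and query both unset, which matches the "just the identifier" case in the proposal.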

> Resolver profiles are namespace-specific, so "info" for the oid: URN would
> have totally different semantics from "info" for the nato: URN or
> service: URN.
> 
> The profile functions as a contract on how specific resolvers should behave,
> including how the resolver should interpret the query parameters
> (string) and translate the information into resources (or into URLs that can be
> dereferenced into resources).

I am afraid that making resolver profiles solely namespace-specific might create problems. But it is certain that different namespaces will support different services. For instance, ISBN and ISNI (International Standard Name Identifier) will not have much in common from a service point of view. But the same people will be using both, provided that ISNI acquires a URN namespace.
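
One way to picture a mix of shared and namespace-specific services is a two-level lookup, sketched below in Python. Everything in it is an assumption made for illustration: the registry contents are placeholders (no such values are registered), "isni" is used as if it already had a URN namespace, and letting a namespace-specific entry take precedence over a shared one is just one possible policy.

```python
from typing import Dict, Optional

# Hypothetical shared registry (e.g. something IANA could maintain).
SHARED_SERVICES: Dict[str, str] = {
    "metadata": "shared: return metadata about the identified resource",
}

# Hypothetical per-namespace registries, keyed by NID.
NAMESPACE_SERVICES: Dict[str, Dict[str, str]] = {
    "isbn": {"pricing": "isbn: return pricing information"},
}


def find_service(nid: str, name: str) -> Optional[str]:
    """Look up a service, preferring a namespace-specific definition.

    Returns None when neither registry knows the service name.
    """
    specific = NAMESPACE_SERVICES.get(nid.lower(), {})
    if name in specific:
        return specific[name]
    return SHARED_SERVICES.get(name)


if __name__ == "__main__":
    print(find_service("isbn", "pricing"))   # namespace-specific entry
    print(find_service("isni", "metadata"))  # falls back to the shared entry
    print(find_service("isni", "pricing"))   # unknown for this namespace
```

With this arrangement a client that uses both ISBN and ISNI could rely on the shared names behaving the same everywhere, while each namespace still adds its own services.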

Best regards, 

Juha

> With this definition, there is no longer a q-component. Everything is on the r-
> component. Effectively the r-component is the query, but it's addressed to
> and interpreted by the resolver.
> 
> That's the proposal. Comments welcome.
> 
> Sean