Re: [urn] Benjamin Kaduk's Discuss on draft-hakala-urn-nbn-rfc3188bis-01: (with DISCUSS and COMMENT)
Benjamin Kaduk <kaduk@mit.edu> Fri, 08 June 2018 20:32 UTC
Date: Fri, 08 Jun 2018 15:32:30 -0500
From: Benjamin Kaduk <kaduk@mit.edu>
To: Peter Saint-Andre <stpeter@mozilla.com>
Cc: The IESG <iesg@ietf.org>, draft-hakala-urn-nbn-rfc3188bis@ietf.org, "urn@ietf.org" <urn@ietf.org>
Message-ID: <20180608203227.GD16349@kduck.kaduk.org>
References: <152837409539.30768.4568779645299135020.idtracker@ietfa.amsl.com> <6a1a100c-3bc0-76d3-3ae4-047d37906bfc@mozilla.com>
In-Reply-To: <6a1a100c-3bc0-76d3-3ae4-047d37906bfc@mozilla.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/urn/ZEKRmVH0PNEtshWAcGzmXMSQ-Pg>
Subject: Re: [urn] Benjamin Kaduk's Discuss on draft-hakala-urn-nbn-rfc3188bis-01: (with DISCUSS and COMMENT)

I'm happy to see the main point of discussion progressing with input
from people who know more about the subject than me ... that said, I
can comment on some of the other points, inline.

On Thu, Jun 07, 2018 at 02:02:23PM -0600, Peter Saint-Andre wrote:
> [ + cc urn@ietf.org for broader discussion ]
>
> Document shepherd here. I expect the document author (and perhaps my
> co-author on RFC 8141) to provide further thoughts.
>
> On 6/7/18 6:21 AM, Benjamin Kaduk wrote:
> > Benjamin Kaduk has entered the following ballot position for
> > draft-hakala-urn-nbn-rfc3188bis-01: Discuss
> >
> > When responding, please keep the subject line intact and reply to all
> > email addresses included in the To and CC lines. (Feel free to cut this
> > introductory paragraph, however.)
> >
> >
> > Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> > for more information about IESG DISCUSS and COMMENT positions.
> >
> >
> > The document, along with other ballot positions, can be found here:
> > https://datatracker.ietf.org/doc/draft-hakala-urn-nbn-rfc3188bis/
> >
> >
> >
> > ----------------------------------------------------------------------
> > DISCUSS:
> > ----------------------------------------------------------------------
> >
> > I think this document may benefit from an Internationalization
> > Considerations section, but am not entirely sure how needed it is.
> > So let's discuss it...
> >
> > In particular, the URN:NBN lexical equivalence rules include several
> > case-insensitive comparisons, for the prefix and for the case of the
> > hex digits in any percent-encoded values, but do not specify any
> > operation on the decoded percent-encoded values/characters.
>
> As a reminder, RFC 8141 does state:
>
>    In particular, with regard to characters outside the ASCII range,
>    URNs that appear in protocols or that are passed between systems MUST
>    use only Unicode characters encoded in UTF-8 and further encoded as
>    required by RFC 3986.
>    To the extent feasible and consistent with the
>    requirements of names defined and standardized elsewhere, as well as
>    the principles discussed in Section 1.2, the characters used to
>    represent names SHOULD be restricted to either ASCII letters and
>    digits or to the characters and syntax of some widely used models
>    such as those of Internationalizing Domain Names in Applications
>    (IDNA) [RFC5890], Preparation, Enforcement, and Comparison of
>    Internationalized Strings (PRECIS) [RFC7613], or the Unicode
>    Identifier and Pattern Syntax specification [UAX31].
>
>    In order to make URNs as stable and persistent as possible when
>    protocols evolve and the environment around them changes, URN
>    namespaces SHOULD NOT allow characters outside the ASCII range
>    [RFC20] unless the nature of the particular URN namespace makes such
>    characters necessary.
>
> By my reading of draft-hakala-urn-nbn-rfc3188bis and RFC 8141, the
> allowable case-sensitivity for nbn_string constructs generated by a
> national library applies to the percent-encoded string because that is
> where any comparison or equivalence-matching would occur for these
> identifiers. Venturing into case matching of percent-decoded strings
> would (IMHO) unnecessarily open up an ugly can of worms.
>
> > In many
> > (perhaps even most?) cases, ignoring such encoded characters for
> > purposes of case-insensitive comparison is the wrong thing to do,
> > but if I understand correctly, it actually is the correct thing to
> > do in this case. Namely, an NBN (or URN:NBN), once assigned, is
> > essentially static data and consumers of it should not attempt to
> > perform modification, Unicode normalization, etc. on it -- that
> > would potentially change what is being identified (or render the
> > identifier invalid).
>
> Well, Unicode normalization would be used as part of equivalence
> operations (as in IDNA or PRECIS), but in general you are right about
> modification.
> These are identifiers or even numbers, not malleable strings.
>
> > On the other hand, a national library or
> > delegated institution that is assigning NBNs may wish to take into
> > account Unicode normalization rules and other similar considerations
> > while assigning NBNs (in particular, the nbn_string component), as
> > part of their allocation policy.
>
> It could, but as far as I know none of the national libraries have yet
> gone down that path or seen the need to. Juha can tell us if I'm wrong.
>
> > Because these can be subtle, it
> > may be worth explicitly pointing out the potential issues for
> > registration authorities.
>
> "There be dragons and don't go there" seems like fine advice.
>
> > That, plus the directive to consumers to
> > not normalize, seems like it would be appropriate content for an
> > Internationalization Considerations section.
>
> By "normalize" you mean perform equivalence matching of percent-decoded
> strings (of which Unicode normalization might be one step), right? Here
> again I think the answer is "don't do that" because equivalence
> matching is done on the percent-encoded strings.

I did not have a terribly concrete scenario in mind when I wrote this;
I think the one Adam described is probably enough to get us thinking
about the right things.

> > Separately, in Section 4.2.1 where we cover r-components, I noted
> > that RFC 8141 rather discourages actually using r-components until
> > their semantics are standardized. The text here seems to be giving
> > free rein for national libraries to assign their own semantics
> > without any coordination with a broader community.
>
> Juha and perhaps John can clarify, but as I understand it the scope of a
> URN resolver for NBNs would likely be within a particular national
> library system, not even necessarily across all national libraries (this
> is how things are deployed now in the absence of URN resolution, in any
> case).
>
> > Do we really
> > want to advocate for this, as opposed to attempting to get broadly
> > unified semantics for r-components Internet-wide? (Perhaps we
> > already have and I just missed it; if so, a reference here would be
> > appropriate.)
>
> The semantics of r-components are yet to be defined. I would venture
> that the IETF is probably not the right place to do that work, given how
> little energy remained in the URN WG at the end (and we probably didn't
> have the right people in the room in the first place).

I won't argue with that. Does it make sense to say something like
"There are not any broadly accepted semantics for r-components at the
time of this writing, which may be grounds to be cautious with their
use" in this document?

> > ----------------------------------------------------------------------
> > COMMENT:
> > ----------------------------------------------------------------------
> >
> > I'm a little confused on some of the places in the text that talk
> > about URN:NBNs being "generated from" NBNs (and non-reuse
> > thereafter) or restrictions on URN:NBN assignment (e.g.,
> > uniqueness). The procedure seems to be basically deterministic for
> > creating a URN:NBN once an NBN is assigned, and potentially
> > something that could be done by any party in possession of the NBN
> > (i.e., not necessarily the registration authority that created the
> > NBN). So I'm not sure why the act of generating the URN:NBN has any
> > significance, if anyone could do it -- the restrictions would need
> > to apply at NBN assignment time in order to be useful. (This kind
> > of gets into Ben's DISCUSS point, too, in the sense that we can only
> > say what prerequisites there are for national library NBN allocation
> > policies in order for them to be useful with URN:NBN, but they can
> > in principle do whatever they like and choose to not use URN:NBN.)
>
> Yes, the process of creating a URN from an NBN is trivial (modulo
> potentially interesting encoding of non-ASCII characters). I think the
> point of the text is that an NBN URN is not exactly the same as an NBN.
> Perhaps that could be worded more clearly.

Okay. (I don't think I have any suggestions for different text.)

> > Section 3.2
> >
> >    From the library community point of view it is important that the
> >    f-component is not a part of the NSS and therefore f-component
> >    attachment does not mean that the relevant component part is
> >    identified. Moreover, the resolution process still retrieves the
> >    entire resource even if there is an f-component. The fragment
> >    selection is applied by the resolution client (e.g., browser) to the
> >    media returned by the resolution process. In other words, in this
> >    latter case the fragments are logical and physical components of the
> >    identified resource whereas in the former cases these "fragments" are
> >    actually complete, independently named entities.
> >
> > I'm not sure I'm understanding this correctly -- is the "former
> > case" the thing that libraries should not do, namely, including the
> > f-component in the NSS?
>
> Now that you point it out, I'm not sure what the former case is.
> Formally speaking the f-component simply is not part of the NSS, see the
> ABNF in RFC 8141. I guess we should wait for Juha to clarify.
>
> >    If an NBN identifies a work, descriptive metadata about the work
> >    SHOULD be supplied. The metadata record MAY contain links to
> >    Internet-accessible digital manifestations of the work.
> >
> > This left me confused. Is it only intended to apply in the case
> > described in the previous paragraph, where the resource identified
> > by the NBN is not available in the Internet? Or does it always
> > apply, forcing the metadata to take precedence over delivering the
> > actual work?
> > (Or maybe I'm just confused, and there's an easy way
> > to deliver both metadata and the actual work alongside each other
> > with no ambiguity.)
>
> Juha can clarify this.
>
> > Section 4.1
> >
> >    National Bibliography Number (NBN) is a generic term referring to a
> >    group of identifier systems administered by the national libraries
> >    and institutions authorized by them.
> >
> > "the national libraries" implies a specific set -- which ones? It
> > may be better to hedge with "some national libraries".
>
> Or remove "the" ... "by national libraries".

That's probably better :)

Thanks,

Benjamin

> > Section 4.2.2
> >
> > Do we need to say anything about a URN-to-URI step before talking
> > about URI-to-resource services?
> >
> > I'm also wondering about any relationship between "component
> > resource" NBNs and f-components of the containing work. If there
> > are NBNs assigned to both an image within a work and that containing
> > work, and an NBN with f-resource is used to refer to the image
> > within the containing work, is there any relationship between the
> > f-resource and the image-specific NBN?
> >
> > Section 4.3
> >
> >    Expressing NBNs as URNs is usually straightforward, as only ASCII
> >    characters are allowed in NBN strings. If necessary, NBNs MUST be
> >    translated into canonical form as specified in RFC 8141.
> >
> > When is it necessary?
>
> It seems that in theory an NBN itself could contain non-ASCII
> characters, whereas an NBN URN and its nbn_string construct can contain
> only ASCII characters. At least that is my understanding.
>
> >    Being part of the prefix, sub-namespace identifier strings are case-
> >    insensitive. They MUST NOT contain any hyphens.
> >
> > This MUST seems to just duplicate a syntactic requirement from the
> > ABNF; is RFC 2119 language really necessary?
>
> /me shrugs
>
> > Section 8
> >
> >    John Klensin provided significant editorial and advisory support for
> >    late versions of the draft.
> >
> > Presumably that's "later versions"?
>
> Yes.
>
> Peter
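
The lexical-equivalence behaviour discussed in the DISCUSS portion of
this message (case-insensitive prefix, case-insensitive hex digits in
percent-encoded octets, and no operation at all on the decoded
characters) can be illustrated with a minimal Python sketch. It is not
taken from the draft. It assumes the layout "urn:nbn:" prefix "-"
nbn_string, with the first hyphen ending the prefix (the thread notes
that sub-namespace identifiers contain no hyphens), and it assumes the
rest of the nbn_string is compared as-is. The function names are
hypothetical, and r-, q-, and f-components are not handled.

```python
import re

# A percent-encoded octet: "%" followed by two hex digits.
_PCT_OCTET = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_urn_nbn(urn: str) -> str:
    """Return a comparison form of a URN:NBN (hypothetical helper).

    Assumption: the first hyphen separates the prefix from nbn_string.
    """
    head, sep, nbn_string = urn.partition("-")
    if not sep or not head.lower().startswith("urn:nbn:"):
        raise ValueError(f"not a URN:NBN: {urn!r}")
    # Only the hex digits of percent-encoded octets are case-folded.
    # The octets are never decoded, so no Unicode normalization or
    # case mapping is applied to the characters they represent.
    nbn_string = _PCT_OCTET.sub(lambda m: m.group(0).upper(), nbn_string)
    # "urn:nbn:" and the prefix are compared case-insensitively.
    return head.lower() + "-" + nbn_string


def lexically_equivalent(a: str, b: str) -> bool:
    """Compare two URN:NBNs on their percent-encoded form only."""
    return normalize_urn_nbn(a) == normalize_urn_nbn(b)


if __name__ == "__main__":
    # Same identifier, differing only in prefix case.
    print(lexically_equivalent("URN:NBN:FI-abc123", "urn:nbn:fi-abc123"))
    # Precomposed vs. decomposed "a with diaeresis": different octets,
    # so the two are not treated as equivalent.
    print(lexically_equivalent("urn:nbn:se-%C3%A4", "urn:nbn:se-a%CC%88"))
```

Comparing only the encoded form keeps an assigned identifier static:
two strings that a Unicode-aware comparison might consider the same
remain distinct here, which is the "don't normalize" behaviour
suggested for consumers in this thread.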