Using National Bibliography Numbers as Uniform Resource Names The National Library of Finland
P.O. Box 15, Helsinki University Helsinki MA FIN-00014 Finland juha.hakala@helsinki.fi
National bibliography numbers, Uniform resource names
National Bibliography Numbers (NBNs) are used by national libraries and other organizations to identify resources in their collections. Generally, NBNs are applied to resources that are not catered for by established (standard) identifier systems such as ISBN. A URN (Uniform Resource Name) namespace for NBNs was established in 2001 in RFC 3188. Since then, several European national libraries have implemented URN:NBN-based systems. This document replaces RFC 3188 and defines how NBNs can be supported within the updated URN framework. A revised namespace registration (version 4) compliant with RFC 8141 is included. This draft replaces draft-ietf-urnbis-rfc3188bis-nbn-urn-04, posted 2012-10-22.
One of the basic permanent URI schemes (cf. RFC 3986) is 'URN' (Uniform Resource Name), originally defined in RFC 2141 and given new definitions and a new registration procedure in 2017 by RFC 8141. Any traditional identifier, when used within the URN system, must have a namespace of its own, registered with IANA [IANA-URN]. National Bibliography Number (NBN) is one such namespace, specified in 2001 in RFC 3188. URN:NBNs are in production use in several European countries including (in alphabetical order) Austria, Finland, Germany, Hungary, Italy, the Netherlands, Norway, Sweden, and Switzerland. The URN:NBN namespace is collectively managed by these national libraries. URN:NBNs have been applied to diverse content including Web archives, digitized materials, research data, and doctoral dissertations. They can be used by the national libraries and organizations co-operating with them.

As a part of the initial development of the URN system in the late 1990s, the IETF URN working group agreed that it was important to demonstrate that the URN syntax can accommodate existing identifier systems. RFC 2288 investigated the feasibility of using ISBN, ISSN and SICI (see below) as URNs, with positive results; however, it did not formally register corresponding URN namespaces. This was in part due to the still evolving process to formalize criteria for namespace definition documents and registration, consolidated later in the IETF, first into RFC 2611, then into RFC 3406, and now given by RFC 8141.

URN Namespaces have been registered for NBN (National Bibliography Number), ISBN (International Standard Book Number), and ISSN (International Standard Serial Number) in RFCs 3188, 3187, and 3044, respectively. The ISBN and ISSN namespaces were made compliant with RFC 8141 in 2017 by publishing revised ISSN and ISBN namespace registrations.
The term "National Bibliography Number" encompasses persistent local identifier systems that the national libraries and their partner organizations use in addition to the more formally (and internationally) established identifiers. These partner organizations include university libraries, universities and other research organizations, and governmental organizations. Some national libraries have many of these liaisons; for instance, the German National Library had almost 400 by early 2018.

In practice, NBN differs from standard identifier systems such as ISBN and ISSN because it is not a single identifier system with standard-specified scope and syntax. Each NBN implementer creates its own system with its own syntax and assignment rules. Each user organization is also obliged to keep track of how NBNs are being used, but within the generic framework set in this document, local NBN assignment policies may vary considerably.

Historically, NBNs have been applied in the national bibliographies to identify the resources catalogued into them. Prior to the emergence of bibliographic standard identifiers in the early 1970s, national libraries assigned NBNs to all catalogued publications. Since the late 1990s, the NBN scope has been extended to cover a vast range of digitized and born-digital resources. Only a small subset of these resources is catalogued in the national bibliographies or other bibliographic databases. Digitized resources and their component parts (such as still images in books, or journal articles) are examples of resources that may get NBNs.

It is possible to extend the scope of the NBN much further. The National Library of Finland is using them in the Finnish National Ontology Service Finto to identify corporate names (see http://finto.fi/cn/en/). Using NBNs to identify metadata elements provides a stable basis for the creation of linked data.

Simple guidelines for using NBNs as URNs and the original namespace registration were published in RFC 3188.
The RFC at hand replaces RFC 3188; sections discussing the methods by which URN:NBNs should be resolved have been updated, unused features have been eliminated, and the text is compliant with the stipulations of the revised URN specification (RFC 8141). The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Comments are welcome and should be directed to the urn@ietf.org mailing list or the authors. RFC-Editor: this subsection to be deleted before RFC publication.
"NBN" refers to any National Bibliography Number identifier system used by the national libraries and other institutions, which use these identifiers with national libraries' support and permission. In this memo, "URN:NBN" is used as a shorthand for "NBN-based URN".
NBNs are widely used to identify both hand-held and digital resources in the collections of national libraries and other institutions that are responsible for preserving the cultural heritage of their constituents. Resources in these collections are usually preserved for a long time (i.e., for centuries).

While the preferred methods for digital preservation may vary over time and depending on the content, the currently favored method is migration. Whenever necessary, a resource in an outdated file format is migrated into a more modern file format. All old versions of the resource are also kept, in order to alleviate the negative effects of partially successful migrations and the gradual loss of original look and feel that may accompany even fully successful migrations. When there are multiple manifestations of a digital object, each one SHOULD have its own NBN.

NBNs SHOULD only be used for objects when standard identifiers such as ISBN are not applicable. However, NBNs MAY be used for component resources even when the resource as a whole qualifies for a standard identifier. For instance, even if a digitized book has an ISBN, JPEG image files of its pages may get NBNs. These URN:NBNs can be used as persistent links to the pages.

The scope of standard identifier systems such as ISBN and ISSN is limited; they are applicable only to certain kinds of resources. Generally speaking, the role of the NBN is to fill in the gaps. Collectively, the standard identifiers and NBNs cover all resources the national libraries and their partners need to preserve for the long term.

A later section of this document presents a more detailed overview of the structure of the NBN namespace, related institutions, and the identifier assignment principles used.
National libraries are the key organizations providing persistent URN resolution services for resources identified with NBNs, independent of their form. National libraries MAY allow other organizations such as university libraries or governmental organizations to assign NBNs to the resources they preserve for the long term. In such cases, the national library MUST co-ordinate the use of NBNs at the national level. National libraries can also provide URN resolution services and technical services to other NBN users. These organizations are supposed to either establish their own URN resolution services or use the technical infrastructure provided by the national library.

In the URN:NBN namespace, each persistent identifier SHOULD be resolvable through one or more resolution services.

NBNs MAY be used to identify component resources, but the NBN Namespace does not specify a generic, intrinsic syntax for doing that. However, there are at least two different ways in which component resources can be taken into account within the NBN namespace.

The simplest and probably the most common approach is to assign a separate NBN to each component resource, such as a file containing a digitized page of a book, and to make no provisions for making such NBNs systematically discernible from others. The URN:NBN assigned to the component resource enables direct and persistent access to the page, which might otherwise be available only via browsing the book from the title page to the page wanted.

Second, if the stipulations of the URI Generic Syntax and the Internet media type specification are met, in accordance with the provisions in RFC 8141, the URN f-component MAY be attached to URN:NBNs in order to indicate the desired location within the resource supplied by URN resolution. From the library community point of view, it is important that the f-component is not a part of the NSS and therefore f-component attachment does not mean that the relevant component part is identified.
Moreover, the resolution process still retrieves the entire resource even if there is an f-component. The fragment selection is applied by the resolution client (e.g., browser) to the media returned by the resolution process. In other words, in this latter case the fragments are logical and physical components of the identified resource, whereas in the former case these "fragments" are actually complete, independently named entities.

Resources identified by NBNs are not always available on the Internet. In that case, URN:NBNs can resolve to surrogates such as metadata records describing the identified resources.

If an NBN identifies a work, descriptive metadata about the work SHOULD be supplied. The metadata record MAY contain links to Internet-accessible digital manifestations of the work.

A later section of this document presents a detailed overview of the application of the URN:NBN Namespace as well as the principles of, and systems used for, the resolution of NBN-based URNs.
National Bibliography Number (NBN) is a generic term referring to a group of identifier systems administered by the national libraries and institutions authorized by them. The NBN assignment is typically performed by the organization hosting the resource. National libraries are committed to permanent preservation of their deposit collections.

Each national library uses NBNs independently of other national libraries; apart from this document, there is no global authority that specifies or controls NBN usage. NBNs as such are unique only on the national level. When used as URNs, base NBN strings MUST be augmented with a controlled prefix, which is the particular nation's ISO 3166-1 alpha-2 two-letter country code. These prefixes guarantee uniqueness of the URN:NBNs at the global scale [Iso3166MA].

A national library using URN:NBNs SHOULD specify a local assignment policy; such a policy SHOULD limit URN:NBN usage to the information resources stored permanently in the national library's digital collections or databases. A more liberal URN:NBN assignment policy MAY be applied, but NBNs assigned to short-lived resources SHOULD NOT be made URN:NBNs.

The URN:NBN assignment policy SHOULD also clarify the local policy concerning identifier assignment to component parts of resources, and specify in sufficient detail the syntax of local component identifiers (if there is one as a discernible part of the NBNs). The policy SHOULD also cover any employed extensions to the default NBN scope (e.g., to cover identification of metadata elements).
Expressing NBNs as URNs is usually straightforward, as traditionally only ASCII characters have been used in NBN strings. If necessary, NBNs must be translated into canonical form as specified in RFC 8141.

When an NBN is used as a URN, the namespace-specific string (NSS) MUST consist of three parts: (1) a prefix, structured as a primary prefix, which is a two-letter ISO 3166-1 country code of the library's country, and zero or more secondary prefixes, each indicated by a delimiting colon character (:) and a sub-namespace identifier; (2) a hyphen (-) as a delimiting character; and (3) the NBN string.

The prefix is case-insensitive. An NBN string can be either case-sensitive or case-insensitive, depending on the NBN syntax applied. Future implementers of NBNs SHOULD make their NBN strings case-insensitive.

Different delimiting characters are not semantically equivalent. Use of colon as the delimiting character is allowed if and only if the country code-based NBN namespace (identified by the respective ISO 3166-1 country code used as the primary part of the prefix) is split further into smaller sub-namespaces, in which case the colon separates the ISO 3166-1 country code from the sub-namespace identifier. These subdivisions (including the colon separator) form an optional part of the prefix. A colon MUST NOT be used for any other purpose in the prefix.

A hyphen MUST be used for separating the prefix and the NBN string, or the part of the NBN string that is assigned to the identified object by a sub-division authority.

If there are several national libraries in one country, these libraries MUST agree on how to divide the national namespace between themselves using this method before the URN:NBN assignment begins in any of these libraries. A national library MAY also assign to trusted organization(s) -- such as a university or a government institution -- its own NBN sub-namespace.
The sub-namespace MAY be further divided by the partner organization (the national library MUST be informed about these sub-namespaces). Being part of the prefix, sub-namespace identifier strings are case-insensitive. They MUST NOT contain any hyphens. The sub-namespace identifiers used beneath a country-code-based namespace MUST be registered on the national level by the national library that assigned the code. The national register of these codes SHOULD be made available online.

Models (indicated linebreak inserted for readability):

   URN:NBN:<ISO 3166 alpha-2 country code>-<assigned NBN string>

   URN:NBN:<ISO 3166 alpha-2 country code>:<sub-namespace code>-\
   <assigned NBN string>

Examples:

   URN:NBN:fi-fe201003181510
   urn:nbn:ch:bel-9039
   urn:nbn:se:uu:diva-3475
   urn:nbn:hu-3006
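As an illustration only, the prefix and delimiter rules above can be sketched as a small parser. This is not part of the specification: the names parse_urn_nbn and UrnNbn are hypothetical, the restriction of sub-namespace codes to ASCII letters and digits is an assumption, and the NBN string itself is deliberately not validated because its syntax is a local matter.

```python
import re
from typing import NamedTuple, Tuple

# Assumed shape: "urn:nbn:", a two-letter country code, zero or more
# ":<sub-namespace code>" parts (assumed alphanumeric, never containing a
# hyphen), a hyphen, and then the locally defined NBN string.
_URN_NBN = re.compile(
    r"urn:nbn:(?P<cc>[a-z]{2})(?P<subs>(?::[a-z0-9]+)*)-(?P<nbn>.+)",
    re.IGNORECASE,
)


class UrnNbn(NamedTuple):
    country: str                    # lowercased, the prefix is case-insensitive
    subnamespaces: Tuple[str, ...]  # lowercased for the same reason
    nbn_string: str                 # left untouched, may be case-sensitive


def parse_urn_nbn(urn: str) -> UrnNbn:
    """Split a URN:NBN into country code, sub-namespaces and NBN string."""
    match = _URN_NBN.fullmatch(urn)
    if match is None:
        raise ValueError(f"not a well-formed URN:NBN: {urn!r}")
    subs = tuple(s.lower() for s in match.group("subs").split(":") if s)
    return UrnNbn(match.group("cc").lower(), subs, match.group("nbn"))
```

Applied to the examples above, parse_urn_nbn("urn:nbn:se:uu:diva-3475") yields the country code "se", the sub-namespaces ("uu", "diva"), and the NBN string "3475". Because sub-namespace identifiers cannot contain hyphens, the first hyphen always marks the end of the prefix.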
Eventually, URNs might be resolved with the help of a resolver discovery service (RDS). Since no such system has been installed yet on the Internet, URN:NBNs are usually embedded in HTTP URIs in order to make them actionable in the present Internet. In these HTTP URIs, the authority part must point to the appropriate URN resolution service. For instance, in Finland, the address of the national URN resolver is <http://urn.fi>. Thus the HTTP URI for the Finnish URN in the example above is <http://urn.fi/URN:NBN:fi-fe201003181510>.

The country code-based prefix part of the URN:NBN namespace-specific string will provide a hint needed to find the correct resolution service for URN:NBNs from the global resolver discovery service when it is established.

There are three inter-related aspects of persistence that need to be discussed: persistence of the objects themselves, persistence of the identifier, and persistence of the URN resolvers.

NBNs have traditionally been assigned to printed resources, which tend to be persistent. In contrast, digital resources require frequent migrations to guarantee accessibility. Although it is impossible to estimate how often migrations are needed, hardware and software upgrades take place frequently, and a lifetime exceeding 10-20 years can be considered long.

However, it is common practice to also keep the original and previously migrated versions of resources. Therefore even outdated versions of resources can be available, no matter how old or difficult to use they have become. If all versions of a resource are kept, a user who requires authenticity may retrieve the original version of the resource, whereas a user to whom ease of use is a priority is likely to be satisfied with the latest version. In order to enable the users to find the best match, an archive can link all manifestations of a resource to each other (possibly via a work level metadata record) so as to make the users aware of them.
Thus, even if specific versions of digital resources are not normally persistent, persistent identifiers such as URN:NBNs support information architectures that enable persistent access to any version of the resource, including ones which can only be utilized by using digital archeology tools such as custom made applications to render the resource. Persistence of URN resolvers themselves is mainly an organizational issue, related to the persistence of organizations maintaining them. As URN:NBN resolution services will be supplied (primarily) by the national libraries, these services are likely to be long-lived.
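The embedding of URN:NBNs in HTTP URIs described earlier in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the function name to_http_uri and the RESOLVERS table are hypothetical, and only the Finnish entry is taken from the text. A real deployment would consult the register kept by the relevant national library.

```python
# Hypothetical lookup table from country code to resolver base address.
# Only the "fi" entry comes from this document; others would have to be added.
RESOLVERS = {"fi": "http://urn.fi"}


def to_http_uri(urn: str) -> str:
    """Embed a URN:NBN in an HTTP URI pointing at its national resolver."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn" or parts[1].lower() != "nbn":
        raise ValueError(f"not a URN:NBN: {urn!r}")
    # The country code at the start of the NSS is the hint used to pick
    # the resolution service; the prefix is case-insensitive.
    country = parts[2][:2].lower()
    if country not in RESOLVERS:
        raise LookupError(f"no resolver known for country code {country!r}")
    return f"{RESOLVERS[country]}/{urn}"
```

For the Finnish example, to_http_uri("URN:NBN:fi-fe201003181510") returns "http://urn.fi/URN:NBN:fi-fe201003181510", the HTTP URI given in the text. The URN is appended unchanged, so the persistent identifier stays recoverable from the HTTP URI even if the resolver address changes.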
URN:NBNs (or other persistent identifiers) SHOULD be applied to all resources which have been prioritized in the organization's digital preservation plan. URN:NBNs SHOULD NOT be assigned to resources that are known to not be persistent. URN:NBNs MAY however be applied to resources that have a low-level preservation priority and will not be migrated to more modern file formats. If the identified version of a resource has disappeared, the resolution process SHOULD supply a surrogate if one exists, such as the original printed version of a resource, or a more modern digital version of that resource.
This URN Namespace registration describes how National Bibliography Numbers (NBNs) can be supported within the URN framework; it uses the updated IANA template specified in RFC 8141.

This Namespace ID was formally assigned to the National Bibliography Number in October 2001, when the namespace was officially registered. Utilization of URN:NBNs had started in demo systems already in 1998. Since 2001, tens of millions of URN:NBNs have been assigned. The number of users of the namespace has grown in two ways: new national libraries have started using NBNs, and many national libraries using the system have formed new liaisons.

Resources (digital or otherwise) in the collections of national libraries and their partner organizations. Component parts of identified resources. Metadata records describing the identified resources. Individual data elements in identified metadata records.

Version: 4
Date: 2018-04-09

Name: Juha Hakala
Affiliation: Senior Adviser, The National Library of Finland
Email: juha.hakala@helsinki.fi
Postal: P.O.Box 15, 00014 Helsinki University, Finland
Web URL: http://www.nationallibrary.fi/

The National Library of Finland registered the namespace on behalf of the Conference of the European National Librarians (CENL) and Conference of Directors of National Libraries (CDNL).

The NBN namespace is available for free for the national libraries. They MAY allow other organizations to assign URN:NBNs and use the resolution services established by the library for free or for a fee. The fees, if collected, should be based on, e.g., the maintenance costs of the system.

The namespace-specific string (NSS) will consist of three parts: (1) a prefix, consisting of an ISO 3166-1 alpha-2 country code and optional sub-namespace code(s) separated by colon(s); (2) a hyphen (-) as the delimiting character; and (3) an NBN string assigned by the national library or sub-delegated authority. This definition uses ABNF (RFC 5234).
Colon MAY be used within the prefix only as a delimiting character between the ISO 3166-1 country code and sub-namespace code(s), which split the national namespace into smaller parts. The structure (if any) of the nbn_string is determined by the authority for the prefix. Whereas the prefix is regarded as case-insensitive, NBN strings MAY be case-sensitive at the preference of the assigning authority; parsers therefore MUST treat these as case-sensitive; any case mapping needed to introduce case-insensitivity is the responsibility of the relevant resolution system. Hyphen MUST be used as the delimiting character between the prefix and the NBN string. Within the NBN string, hyphen MAY be used for separating different sections of the identifier from one another. All two-letter codes are reserved by the ISO 3166 Maintenance Agency for either existing or possible future ISO country codes (or for private use). Sub-namespace identifiers MUST be registered on the national level by the national library that assigned the code. The list of such identifiers SHOULD be available via the Web. See Section 4.2 of RFC XXXX for examples.
National Bibliography Number (NBN) is a generic name referring to a group of identifier systems used by the national libraries and their partner organizations for identification of resources (and their component parts) that lack a 'canonical' identifier. The scope of NBN has been extended to also include, e.g., metadata records and their elements. Each national library uses NBNs independently of other national libraries; there is neither a general standard defining the NBN syntax nor a global authority to control the use of these identifier systems.

The syntax of NBN strings is specified locally. NBNs used in national bibliographies contain only characters that belong to the US-ASCII character set. Following the expansion of the NBN scope and semi- and fully automated NBN assignment processes, some future NBNs MAY contain characters that MUST be translated into canonical form according to the specifications in RFC 8141.

The NSS syntax specified in this registration is in full conformance with RFC 8141 and its predecessors.

The prefix, consisting of an ISO 3166-1 country code and its (optional) sub-divisions, is case-insensitive. The NBN string MAY be case-sensitive or case-insensitive, depending on the rules chosen by the NBN authority designated by the prefix; therefore, general-purpose resolver clients (i.e., clients without sub-namespace-specific knowledge) MUST treat NBN strings as case-sensitive. Syntax requirements expressed in RFC 8141 MUST be taken into account.

Formally, two URN:NBNs are lexically equivalent if they are octet-by-octet equal after the following (conceptual) preprocessing: (1) normalize the case of the leading "urn:nbn:" token; (2) normalize the case of the prefix (country code and its optional sub-divisions); (3) normalize the case of any percent-encoding.

Note: The case used in the normalization steps is a local matter; implementations can normalize to lower or upper case as they see fit; they only need to do it consistently.
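The three preprocessing steps for lexical equivalence can be sketched as below. This is an illustrative sketch rather than normative text: the function names are hypothetical, lower case is chosen for the "urn:nbn:" token and the prefix and upper case for percent-encoding hex digits (the text leaves the choice open), and r-, q-, and f-components are assumed to have been removed beforehand.

```python
import re

# Matches one percent-encoded octet such as "%2f" or "%2F".
_PCT = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_urn_nbn(urn: str) -> str:
    """Apply the three case-normalization steps to a URN:NBN."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn" or parts[1].lower() != "nbn":
        raise ValueError(f"not a URN:NBN: {urn!r}")
    # Sub-namespace identifiers never contain hyphens, so the first hyphen
    # in the NSS separates the prefix from the NBN string.
    prefix, hyphen, nbn_string = parts[2].partition("-")
    if not hyphen:
        raise ValueError(f"missing hyphen delimiter: {urn!r}")
    # Step 3: only the hex digits of percent-encodings change case; the
    # rest of the NBN string is kept as-is because it may be case-sensitive.
    nbn_string = _PCT.sub(lambda m: m.group(0).upper(), nbn_string)
    # Steps 1 and 2: lower-case the "urn:nbn:" token and the whole prefix.
    return f"urn:nbn:{prefix.lower()}-{nbn_string}"


def lexically_equivalent(a: str, b: str) -> bool:
    """Compare two URN:NBNs after normalization."""
    return normalize_urn_nbn(a) == normalize_urn_nbn(b)
```

With this sketch, "URN:NBN:FI-fe201003181510" and "urn:nbn:fi-fe201003181510" compare as equivalent, while "urn:nbn:fi-FE201003181510" does not, since a general-purpose client has to treat the NBN string as case-sensitive.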
URN:NBN resolvers MAY support several services. Some of them have been formally specified in RFC 2483; some remain unspecified.

Examples of existing relevant services are URI to URL or URLs, URI to URN or URNs, URI to resource or resources, and URI to resource metadata. In the last case, it is important to be able to indicate the preferred metadata format, the completeness of the metadata record, or the metadata content requested, such as a table of contents. A URN resolver maintained by a national library can utilize, for instance, the national bibliography, digital asset management systems, and digital preservation systems to supply these services.

Examples of services that can be specified and implemented in the future: request the oldest and most original version of the resource; request the latest version of the resource; and request rights metadata related to the resource.

Depending on the technical infrastructure within which digital resources are preserved and made available, any service can be provided either via q-component, r-component, or both.

If URI-to-resource service is used and the media type of a resource supports the use of an f-component, it can be used to indicate a location within the identified resource because NBNs SHOULD be assigned to one and only one version of a resource, such as a PDF version of an article. The URN:NBN Namespace does not impose any restrictions of its own on f-component usage.

NBNs as such are not unique; different national libraries can assign the same NBN to different resources. Therefore, to guarantee the uniqueness of URN:NBNs, a prefix, based on the ISO country code, is added to the NBN.

An NBN, once it has been assigned to a resource, MUST be persistent, and therefore URN:NBNs are persistent as well.

A URN:NBN, once it has been generated from an NBN, MUST NOT be reused for another resource.

Users of the URN:NBN namespace MUST ensure that they do not assign the same URN:NBN twice.
Different policies can be applied to guarantee this. For instance, NBNs and corresponding URN:NBNs MAY be assigned sequentially by programs in order to avoid human mistakes. It is also possible to use printable representations of checksums such as SHA-1 as NBNs.

Assignment of NBN-based URNs MUST be controlled on the national level by the national library (or national libraries, if there is more than one). National guidelines can differ, but the identified resources themselves SHOULD be persistent.

Different URN:NBN assignment policies have resulted in varying levels of control of the assignment process. Manual URN assignment by the library personnel provides the tightest control, especially if the URN:NBNs cover only resources catalogued into the national bibliography. In most national libraries, the scope of URN:NBN is already much broader than this. Usage rules can vary within one country, from one URN:NBN sub-namespace to the next. As of yet, there are no international guidelines for URN:NBN use beyond those expressed in this document.

See Section 4.3 of RFC XXXX.

None specified on the global level (beyond a routine check of those characters that require special encoding when employed in URIs). NBNs may have a well-specified and rich syntax (including, e.g., fixed length and checksum). In such cases, it is possible to validate the correctness of NBNs programmatically.

NBNs are applied to resources held in the collections of national libraries and their partner organizations. NBNs may also be used to identify, e.g., component parts of these resources or metadata records describing resources or their component parts.
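The two assignment policies mentioned in this registration, sequential assignment by a program and checksum-based NBNs, can be sketched as follows. This is an illustration under assumptions, not a description of any library's actual system: the function names, the "fi" default prefix, and the plain decimal counter format are all hypothetical.

```python
import hashlib
import itertools
from typing import Iterator


def checksum_urn_nbn(content: bytes, prefix: str = "fi") -> str:
    """Build a URN:NBN whose NBN string is the SHA-1 hex digest of the content."""
    digest = hashlib.sha1(content).hexdigest()  # 40 lowercase hex characters
    return f"urn:nbn:{prefix.lower()}-{digest}"


def sequential_urn_nbns(prefix: str = "fi", start: int = 1) -> Iterator[str]:
    """Yield URN:NBNs with strictly increasing numeric NBN strings."""
    for number in itertools.count(start):
        yield f"urn:nbn:{prefix.lower()}-{number}"
```

A checksum-based NBN ties the identifier to one exact byte sequence, which fits the rule that each manifestation gets its own NBN, while a sequential generator never repeats a value within one run. A production assigner would additionally have to persist its counter so that uniqueness holds across restarts.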
IANA is asked to update the existing registration of the Formal URN Namespace 'NBN' using the template given above.
This document proposes means of encoding NBNs as URNs. A URN resolution service for NBN-based URNs is depicted, but only at a generic level; thus, questions of secure or authenticated resolution mechanisms and authentication of users are out of scope of this document. It does not deal with means of validating the integrity or authenticating the source or provenance of URN:NBNs. Issues regarding intellectual property rights associated with objects identified by the URN:NBNs are also beyond the scope of this document, as are questions about rights to the databases that might be used to construct resolution services. Beyond the generic security considerations laid out in the underlying documents listed in the Normative References (Section 10.1), no specific security threats have been identified for NBN-based URNs.
Revision of RFC 3188 started during the project PersID. Later the revision was included in the charter of the URNbis working group and worked on in that group in parallel with what became RFC 8141 and RFC 8254. The author wishes to thank his colleagues in the PersID project and the URNbis participants for their support and review comments. Tommi Jauhiainen has provided feedback on an early version of this draft. The author wishes to thank Tommi Jauhiainen, Bengt Neiss, and Lars Svensson for the comments they have provided to various versions of this draft. John Klensin provided significant editorial and advisory support for late versions of the draft.
This document would not have been possible without contributions by Alfred Hoenes.
RFC 2119, RFC 3986, RFC 5234, RFC 8141, RFC 1321, RFC 2046, RFC 2141, RFC 2288, RFC 2611, RFC 3044, RFC 3187, RFC 3188, RFC 3406, RFC 6234, RFC 8254
persid: Building a persistent identifier infrastructure PersID initiative, 2009-2011
URN:NBN Resolver fuer Deutschland und Schweiz: Information ueber Partner Institutionen Deutsche Nationalbibliothek
Namespace Registration for International Standard Book Number (ISBN) ISO 2108:2017 The International ISBN Agency
Namespace Registration for International Standard Serial Number (ISSN) and Linking ISSN (ISSN-L) based on ISO 3297:2007 The ISSN International Centre
URI Schemes Registry IANA
URN Namespace Registry IANA
ISO Maintenance agency for ISO 3166 country codes ISO
Numerous clarifications based on a decade of experience with RFC 3188. Non-ISO 3166 (country code) based NBNs have been removed due to lack of usage. In accordance with established practice, the whole NBN prefix is now declared case-insensitive. The document is based on the new URN Syntax specification, RFC 8141. Use of query components and fragment components with this Namespace is now specified, in accordance with RFC 8141.
RFC-Editor: Please delete this whole section before RFC publication.
formal updates for a WG draft; no more "Updates: 2288"; introduced references to other URNbis WG documents; changes based on review by Tommi Jauhiainen; Sect. 3 restructured into namespace and community considerations; old Sect. 7 incorporated in new Sect. 3.1; Security Considerations: old Section 4.5 merged into Section 5; added guidelines for when two manifestations of the same work should get different URN:NBNs; clarified role of ISO 3166/MA for ISO 3166-1 country codes; clarified role of non-ISO prefix registry maintained by the LoC; resolved inconsistency in lexical equivalence rules: as already specified for ISO alpha-2 country-codes, and in accordance with established practice, the whole NBN prefix is now declared case-insensitive; registration template adapted to rfc3406bis[-00]; numerous editorial fixes and enhancements.
Numerous changes to accommodate the outcome of the discussions on the urn list; three different ways of identifying fragments specified; removed some redundant/irrelevant paragraphs/subsections; the "one manifestation, one URN" principle strengthened; introduced the idea of interlinking manifestations; extended the scope of the NBN explicitly to works; added reference to S4.2 in namespace registration; numerous editorial fixes and enhancements.
Removed the possibility of using prefixes not based on country codes; replaced all instances of the word object with resources; removed some redundant/irrelevant paragraphs/subsections; allowed the possibility for identifying data elements with NBNs; a few editorial fixes and enhancements.
improved text related to "prefix" in NSS; addressed issues with text related to case-sensitivity of NSS strings; addressed comments and open details on requirements language; switched language to talk about "resource" instead of "object"; several more editorial fixes and enhancements.
specification of how to use URN query and fragment part based on the revised versions of rfc2141bis and rfc3406bis; various textual improvements and clarifications, including: textual alignments with rfc3187bis draft vers. -03; multiple editorial fixes and improvements.
Conversion of document to XML2RFC format, change of name (not a WG task). Adjusted for changes to 2141bis, consolidation of RFC 3406bis, creation of transition document. Made a number of changes to reflect publication of RFC 8141 (previously 2141bis and 3406bis) and update terminology, references, and current status to early 2018.