Per: https://tools.ietf.org/html/rfc8141
Namespace ID:
gbs
Registration Information:
Version: 1
Date: 2019-09-27
Declared registrant of the namespace:
Name: Ryffine Inc.
Address: 445 N Broadway, Denver, CO 80203
Contact: Philip R Brenan
E-mail: philiprbrenan@gmail.com
www: http://www.ryffine.com
Purpose:
To allow organizations to share content written in Xml to the Dita Standard:
http://docs.oasis-open.org/dita/dita/v1.3/os/part2-tech-content/dita-v1.3-os-part2-tech-content.html
while avoiding the exponential duplication that occurs in the absence of the
namespace standardization provided by a URN.
Dita is a technical documentation standard promulgated by OASIS: a nonprofit
consortium that drives the development, convergence and adoption of open
standards for the global information society as noted at
https://www.oasis-open.org/org
A major goal of Dita is to enable authors to build documents from small
reusable components called topics and then to share and reuse these topics
via collections to enable other documents to be built more rapidly.
As a consequence of the current addressing mechanism used to link Dita
topics together within a document, the number of such topics in existence
tends to grow exponentially over time as documents evolve. Typically, when a
new version of a product is documented, the author takes the existing set of
linked topic files comprising the documentation of the product, duplicates
all of these files to preserve the complex linkage structure between these
topics, then makes a small number of changes to a few of the duplicated
files, leaving the bulk of the topic files unchanged. At the moment it is
difficult to reuse the original topic files in situ because of the need to
maintain the links between them.
The GB Standard as currently implemented at:
https://metacpan.org/pod/Dita::GB::Standard
seeks to reduce this exponential growth of topic files by giving each topic
a unique deterministic name so that links between topics can be expressed in
a way that endures as the topic files are copied over time.
As proposed, the GB Standard allows a collection of Dita topics to quickly
determine whether it already has a copy of an incoming topic by computing
the GB Standard name of the topic and comparing it to the names of all such
topics already collected locally ready for publication. If the name already
exists then the incoming topic is discarded and the existing topic is
reused; if the name does not exist in the collection then the collection
adds the incoming topic to its list of topics available for publication.
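The check described above can be sketched as follows. This is a minimal
illustration only, not the applicant's implementation: the class name
TopicCollection and the method name offer are hypothetical, and the GB
Standard name is assumed to have been computed already by the caller.

```python
class TopicCollection:
    """Hypothetical in-memory collection of topics keyed by GB Standard name."""

    def __init__(self):
        # Maps each GB Standard name to the content of the topic it names.
        self._topics = {}

    def offer(self, gb_name: str, content: bytes) -> bool:
        """Offer an incoming topic to the collection.

        Returns True if the topic was added, or False if a topic with the
        same name was already present, in which case the incoming copy is
        discarded and the existing one is kept for reuse.
        """
        if gb_name in self._topics:
            return False
        self._topics[gb_name] = content
        return True

    def __len__(self) -> int:
        return len(self._topics)
```

Because the lookup is by name alone, the collection never needs to compare
the incoming content against the content of every topic it already holds.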
At the same time, the GB Standard provides a human-readable name for each
topic which assists authors in selecting topics from each collection for
reuse.
The GB Standard has been used by the applicant since 2016 to successfully
build and maintain several large collections of topics.
The purpose of this application then is to formalize the GB Standard naming
convention as a globally recognized URN to enable standardized topic naming
among organizations collaborating on the production of collections of
technical documentation using Dita. The proposed URN will not, as it
stands, provide immediate global location of topics so named; instead, it
provides a standardized method by which both humans and computers can
efficiently query one or more collections of such topics.
Syntax:
urn:gbs:<type>:<name>:<md5sum>
where:
The type component is a string of one or more characters drawn from:
[a-zA-Z0-9_] which identifies the type of content being classified. At this
point in time only one such type is in active use: the "dita" type. Further
types might be required in the future; if so, this document will be updated
to reflect these new types.
The name component is a string of 1 to 64 characters drawn from:
[a-zA-Z0-9_]. When the type component has the value: "dita" (currently the
only permissible value), the name component is computed by concatenating the
text between whichever of the following Xml tags exist in the Dita topic, in
the order in which they appear in that
topic:
The text between these tags is used to form the name component after
converting each run of characters other than a-zA-Z0-9 to a single
underscore and truncating the result to 64 characters if it is longer than
that. This method was chosen based on operational experience
as it produces readable names that are closely aligned with what authors
expect to see as a topic name.
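The conversion step can be sketched as follows. This is a minimal
illustration under stated assumptions: the function name name_component and
the constant MAX_NAME_LENGTH are hypothetical, the input is assumed to be the
already concatenated text taken from the topic, and the implementation
referenced elsewhere in this document may treat edge cases differently.

```python
import re

# Maximum length of the name component, as stated in the text above.
MAX_NAME_LENGTH = 64


def name_component(text: str) -> str:
    """Derive a name component from the text extracted from a topic.

    Each run of characters other than a-zA-Z0-9 becomes a single
    underscore, and the result is truncated to 64 characters.
    """
    collapsed = re.sub(r"[^a-zA-Z0-9]+", "_", text)
    return collapsed[:MAX_NAME_LENGTH]
```

For example, name_component("Introduction to the GB Standard") returns
"Introduction_to_the_GB_Standard".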
The MD5 component is the MD5 sum https://en.wikipedia.org/wiki/MD5 of the
content being identified, presented as a 32 character lowercase hexadecimal
string matching: [0-9a-f]{32} . Presenting the MD5 sum in lowercase, last and
therefore to the right, has the beneficial side effect of allowing authors to
visually ignore it and concentrate instead on the name component in the
majority of cases where the name component happens to be (almost) unique.
This arrangement makes the GB Standard name useful to both humans and
computers.
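Computing the MD5 sum in the stated form can be sketched as follows. The
function name md5_component is hypothetical, and the content is assumed to be
available as bytes; how a topic is serialized to bytes before hashing is not
specified here.

```python
import hashlib


def md5_component(content: bytes) -> str:
    """Return the MD5 sum of the content as 32 lowercase hexadecimal digits."""
    return hashlib.md5(content).hexdigest()
```

For example, md5_component(b"") returns "d41d8cd98f00b204e9800998ecf8427e",
the well known MD5 sum of empty input.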
Assignment:
Identifier uniqueness considerations:
Uniqueness is provided by the MD5 component: being an MD5 sum, it is
guaranteed to be identical for identical content and very probably
different for differing content.
Identifier persistence considerations:
Persistence is guaranteed by the immutability over time of the MD5 sum
of the content.
Process of identifier assignment:
The type component is currently set to "dita".
The name component is chosen algorithmically, depending on the value of the
type component, using the topic as input as described above.
The MD5 component is chosen by computing the MD5 sum of the content.
For example:
urn:gbs:dita:Introduction_to_the_GB_Standard:dddb7e2c29d2c8b9d87187fdf52a2702
Resolution:
Content cannot be directly located by this standard. However, URNs are
not necessarily required to provide location services initially: providing
a globally unique name is valuable in its own right because it encourages
the development of, and convergence on, a small number of large, shared,
inter-operable, global collections of topics within each of which the
uniqueness of the URN is sufficient to provide a location service.
Equivalence is determined by comparing (ignoring case) the MD5 components
of the two topics to be compared. If they are equal the two topics are
considered to be equal. Otherwise they are considered to be unequal even if
the underlying content is in fact identical. The characteristics of the MD5
sum ensure that only a small number of topics will be unnecessarily
duplicated as a result of such false positive equivalences.
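The comparison can be sketched as follows. The function names md5_of and
equivalent are hypothetical, and the MD5 sum is assumed to be the final
colon-separated component of the URN, as in the example given under
Assignment.

```python
def md5_of(urn: str) -> str:
    """Return the final component of the URN, lower-cased for comparison."""
    return urn.rsplit(":", 1)[-1].lower()


def equivalent(urn_a: str, urn_b: str) -> bool:
    """Two URNs are equivalent when their MD5 components match, ignoring case."""
    return md5_of(urn_a) == md5_of(urn_b)
```

Under this sketch two URNs that differ in their other components, or in
letter case, still compare as equivalent provided their MD5 sums agree.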
Security and Privacy:
The validity of the URN can be checked as follows:
Check that the type component is "dita".
Check that the name component is computed correctly as described above.
Check that the MD5 component matches the MD5 sum of the content.
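These checks can be sketched as follows. This is an illustration under
stated assumptions rather than a definitive validator: the names URN_PATTERN
and is_valid are hypothetical, the pattern is inferred from the character
sets given under Syntax, letter case is ignored, and the expected name
component is supplied by the caller because its computation depends on the
topic's tags.

```python
import hashlib
import re
from typing import Optional

# Assumed shape: urn:gbs:type:name:md5sum, matched without regard to case.
URN_PATTERN = re.compile(
    r"urn:gbs:([a-z0-9_]+):([a-z0-9_]{1,64}):([0-9a-f]{32})",
    re.IGNORECASE,
)


def is_valid(urn: str, content: bytes, expected_name: Optional[str] = None) -> bool:
    """Check a URN against the content it is supposed to identify."""
    match = URN_PATTERN.fullmatch(urn)
    if match is None:
        return False
    content_type, name, md5sum = match.groups()
    # Check 1: the type must be "dita", the only value currently in use.
    if content_type.lower() != "dita":
        return False
    # Check 2: the name must equal the one computed by the caller, if given.
    if expected_name is not None and name.lower() != expected_name.lower():
        return False
    # Check 3: the MD5 component must match the MD5 sum of the content.
    return md5sum.lower() == hashlib.md5(content).hexdigest()
```

When expected_name is omitted, only the syntax, the type and the MD5 sum are
checked.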
Inter-operability:
The case of the letters chosen is immaterial and can be safely ignored in
all computations on the proposed URN as only the MD5 component is used for
comparisons.
Dita topics that do not contain ASCII characters suitable for constructing
the name component will be accommodated by adding a new value to the list of
values accepted by the type component and specifying the corresponding
algorithm for computing the name component in an update to this document.
Additional Information:
An implementation in Perl of the GB Standard as specified above, for the case
where the type component is equal to "dita", is located at:
https://metacpan.org/pod/Dita::GB::Standard
References:
ASCII: https://en.wikipedia.org/wiki/ASCII
Dita specification: http://docs.oasis-open.org/dita/dita/v1.3/os/part2-tech-content/dita-v1.3-os-part2-tech-content.html
MD5 Sum: https://en.wikipedia.org/wiki/MD5
XML: https://en.wikipedia.org/wiki/XML