Re: [urn] gbs Name space identifier

Philip R Brenan <philiprbrenan@gmail.com> Wed, 25 September 2019 18:46 UTC

MIME-Version: 1.0
References: <CALhwFR=5Y3gjTX62P10HT_fHGWZV5t9ov=siWmWKD9MaA4EUhA@mail.gmail.com> <87r24m4614.fsf@hobgoblin.ariadne.com> <trinity-ca77aa47-8a00-419e-bfe8-867543668e08-1569325868991@3c-app-webde-bap33>
In-Reply-To: <trinity-ca77aa47-8a00-419e-bfe8-867543668e08-1569325868991@3c-app-webde-bap33>
From: Philip R Brenan <philiprbrenan@gmail.com>
Date: Wed, 25 Sep 2019 19:46:25 +0100
Message-ID: <CALhwFRmtVK_xjZZQcw7JRyuW7PEr4n0keb3CnAyfsGyJjfxf3Q@mail.gmail.com>
To: lars.svensson@web.de
Cc: "Dale R. Worley" <worley@ariadne.com>, urn@ietf.org
Content-Type: multipart/alternative; boundary="000000000000f4c0dc05936512ef"
Archived-At: <https://mailarchive.ietf.org/arch/msg/urn/KMe_dzmKBqFQ3nAGvw80ByxgSUI>
Subject: Re: [urn] gbs Name space identifier
Precedence: list

I have removed the link in question as the explanation of the derivation of
the *<T>* component was deemed unsatisfactory.  Here is what I was trying
to achieve:

It is anticipated that the GB Standard represented by the *urn:* *gbs* name
space could be usefully applied to a number of different document types,
such as Dita, DocBook, Word, Html, etc.  The <T> component is designed to
separate these various name spaces.  At the moment the only <T> in active
use is *dita* for Dita documents.  Within the Dita space the algorithm for
computing the *<G>* component is included in:

https://metacpan.org/pod/Dita::GB::Standard

as gbStandardFileName().

The <G> component is computed by examining the text between whichever of
the following *xml* tags exist in a particular Dita document, in the order
in which they appear:

 title mainbooktitle booktitlealt

The text between these tags is used to form the <G> component after
converting runs of all characters other than a-zA-Z0-9 to single
underscores. This method was chosen because it produces the most readable
names that are closely aligned with what authors expect to see as a file
name.
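
To make the description above concrete, here is a minimal Python sketch of
that step.  It is not the code of gbStandardFileName() in Dita::GB::Standard
and may differ from it; the function name g_component is hypothetical.  It
assumes one reading of the description: the text of every matching element
is taken in document order and joined with a space before the substitution
is applied.

```python
import re
import xml.etree.ElementTree as ET

# Tags named in the description above.
TITLE_TAGS = ("title", "mainbooktitle", "booktitlealt")


def g_component(dita_xml: str) -> str:
    """Sketch of the <G> computation (hypothetical helper, not the module's API)."""
    root = ET.fromstring(dita_xml)
    parts = []
    # root.iter() walks the elements in document order.
    for element in root.iter():
        if element.tag in TITLE_TAGS:
            # itertext() also collects text held in nested inline elements.
            parts.append("".join(element.itertext()))
    # Assumption: the collected texts are joined with a single space.
    text = " ".join(parts)
    # Convert each run of characters other than a-zA-Z0-9 to one underscore.
    return re.sub(r"[^a-zA-Z0-9]+", "_", text)


print(g_component("<concept><title>Installing the Widget</title></concept>"))
```

Whether leading or trailing underscores should be trimmed is not covered by
the description, so the sketch leaves them in place.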

The purpose of the GB Standard is to control the explosion of duplicate
Dita topics that tends to occur as documents evolve.  Typically when a new
product is documented, the author takes the existing set of linked topic
files comprising the documentation of the product, duplicates all of these
files to preserve the linkage structure, then makes a small number of
changes to a few of the duplicated files, leaving the bulk of the topic
files unchanged.  It is difficult to reuse the original topic files in situ
because of the need to maintain the links between them.

The GB Standard seeks to reduce this exponential growth of topic files by
giving each topic a unique deterministic name so that links between topics
can be expressed in a way that endures as the topic files are copied over
time.

As proposed, the GB Standard allows a server to quickly determine whether
it has a copy of a file by computing the GB Standard name of an incoming
file and comparing it to the names of all such files stored locally.  If
the name already exists then that file is reused; if the name does not
exist on the server then the server adds the incoming file to its list of
available files.
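
The server-side check described above could be sketched in Python as
follows.  This is only an illustration: the class TopicStore, its methods
and the stand-in naming function are all hypothetical, and a real server
would plug in the GB Standard name computation and persistent storage.

```python
from typing import Callable, Dict, Tuple


class TopicStore:
    """In-memory sketch of a server that keeps one file per GB Standard name."""

    def __init__(self, name_of: Callable[[str], str]) -> None:
        # name_of stands in for the GB Standard name computation (assumption).
        self._name_of = name_of
        self._files: Dict[str, str] = {}

    def receive(self, content: str) -> Tuple[str, bool]:
        """Return (name, added); added is False when an existing file is reused."""
        name = self._name_of(content)
        if name in self._files:
            return name, False  # name already known: reuse the stored file
        self._files[name] = content  # new name: add the incoming file
        return name, True

    def count(self) -> int:
        """Number of distinct files currently stored."""
        return len(self._files)


# Stand-in naming function used purely for demonstration.
store = TopicStore(lambda text: text.strip().lower())
print(store.receive("Topic A"))    # first arrival is stored
print(store.receive(" topic a "))  # same name, so the stored file is reused
```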

It is not the current intention to use the GB Standard name to locate
off-site copies of a file: as things stand this could only be achieved by
querying, in turn, each server known to store files in this manner.  Please
tell me whether it is necessary for a *urn* to be able to uniquely locate
files as well as classify them.  If it is a requirement that a *urn* can be
used to locate a topic file anywhere in the world then I need to rethink
this aspect of the GB Standard and update my application for the *gbs*
namespace accordingly.  If location is not necessarily required then the
description of the computation of the <G> component and adequate
documentation of the standard names in the <T> component would seem to be
the elements that need work to progress this application further.

On Tue, Sep 24, 2019 at 12:51 PM <lars.svensson@web.de> wrote:

> > >    <T> is a string of one or more characters drawn from: [a-zA-Z0-9_]
> which
> > >    identifies the type of content from a list of types published by the
> > >    registrant at https://metacpan.org/pod/Dita::GB::Standard::Types .
> >
> > I attempted to obtain the list of valid types at the given URL, but was
> > unsuccessful.  That page seemed to be a very top-level discussion of
> > "The GB Standard".
>
> That URL gives me a 404...
>
> Best,
>
> Lars
>

-- 
Thanks,

Phil <https://opentokrtc.com/room/phil>

Philip R Brenan <https://opentokrtc.com/room/phil>
