Re: [urn] gbs Name space identifier

"Hakala, Juha E" <juha.hakala@helsinki.fi> Thu, 26 September 2019 04:16 UTC

From: "Hakala, Juha E" <juha.hakala@helsinki.fi>
To: Philip R Brenan <philiprbrenan@gmail.com>, "lars.svensson@web.de" <lars.svensson@web.de>
CC: "urn@ietf.org" <urn@ietf.org>, "Dale R. Worley" <worley@ariadne.com>
Date: Thu, 26 Sep 2019 04:15:51 +0000
Message-ID: <HE1PR07MB30972990D54C3FF07D4D5712FA860@HE1PR07MB3097.eurprd07.prod.outlook.com>
References: <CALhwFR=5Y3gjTX62P10HT_fHGWZV5t9ov=siWmWKD9MaA4EUhA@mail.gmail.com> <87r24m4614.fsf@hobgoblin.ariadne.com> <trinity-ca77aa47-8a00-419e-bfe8-867543668e08-1569325868991@3c-app-webde-bap33> <CALhwFRmtVK_xjZZQcw7JRyuW7PEr4n0keb3CnAyfsGyJjfxf3Q@mail.gmail.com>
In-Reply-To: <CALhwFRmtVK_xjZZQcw7JRyuW7PEr4n0keb3CnAyfsGyJjfxf3Q@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/urn/ojB4Hu8X3FD1g96zdJ4q1FjzJhY>
Subject: Re: [urn] gbs Name space identifier
List-Id: Revisions to URN RFCs <urn.ietf.org>

Hello Philip,

as regards this:

> Please tell me whether it is necessary for a urn to be able to uniquely locate files as well as classify them?

URNs don’t have to provide resolution services, so the resolver (if any) does not need to know the location or locations of the identified resource, or to link the URN to its URL or URLs. You may want to mention in the urn:gbs namespace registration request that no resolution services are anticipated.

It might be useful to add to the request the sentences below on document types and <T> values, with a note that for the time being only Dita documents are within scope. And some background information about Dita might be useful as well, for those who are not familiar with it.

Best regards,

Juha

From: urn <urn-bounces@ietf.org> On Behalf Of Philip R Brenan
Sent: Wednesday, 25 September 2019 21:46
To: lars.svensson@web.de
Cc: urn@ietf.org; Dale R. Worley <worley@ariadne.com>
Subject: Re: [urn] gbs Name space identifier

I have removed the link in question as the explanation of the derivation of the <T> component was deemed unsatisfactory.  Here is what I was trying to achieve:

It is anticipated that the GB Standard represented by the urn:gbs namespace could be usefully applied to a number of different document types, such as Dita, DocBook, Word, HTML, etc. The <T> component is designed to separate these various namespaces. At the moment the only <T> in active use is dita, for Dita documents. Within the Dita space, the algorithm for computing the <G> component is included in:

https://metacpan.org/pod/Dita::GB::Standard

as gbStandardFileName().

The <G> component is computed by examining the text between whichever of the following XML tags exist in a particular Dita document, in the order in which they appear:

 title mainbooktitle booktitlealt

The text between these tags is used to form the <G> component after converting runs of all characters other than a-zA-Z0-9 to single underscores. This method was chosen because it produces the most readable names that are closely aligned with what authors expect to see as a file name.
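The normalization described above can be sketched roughly as follows. This is only an illustration in Python of the steps as worded in this message, not the code of gbStandardFileName() in Dita::GB::Standard, which may well do more. The function name `g_component`, the choice to join the text of every matching tag with a space, and the decision to leave leading or trailing underscores in place are all assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Tags named in the message as the source of the <G> text.
TITLE_TAGS = ("title", "mainbooktitle", "booktitlealt")


def g_component(dita_xml: str) -> str:
    """Hypothetical sketch: derive a <G>-style string from a Dita document."""
    root = ET.fromstring(dita_xml)
    parts = []
    # iter() walks the tree in document order, so matching tags are
    # collected in the order in which they appear.
    for element in root.iter():
        if element.tag in TITLE_TAGS:
            parts.append("".join(element.itertext()))
    # Assumption: text from several matching tags is joined with a space.
    text = " ".join(parts)
    # Convert each run of characters outside a-zA-Z0-9 to one underscore.
    return re.sub(r"[^a-zA-Z0-9]+", "_", text)


if __name__ == "__main__":
    sample = "<topic><title>Installing the Widget (v2)</title><body/></topic>"
    print(g_component(sample))
```

Because the result depends only on the title text, two topic files with the same title would map to the same string under this sketch; anything the real algorithm adds to distinguish them is outside what this message describes.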

The purpose of the GB Standard is to control the explosion of duplicate Dita topics that tends to occur as documents evolve.  Typically when a new product is documented, the author takes the existing set of linked topic files comprising the documentation of the product, duplicates all of these files to preserve the linkage structure,  then makes a small number of changes to a few of the duplicated files, leaving the bulk of the topic files unchanged.  It is difficult to reuse the original topic files in situ because of the need to maintain the links between them.

The GB Standard seeks to reduce this exponential growth of topic files by giving each topic a unique deterministic name so that links between topics can be expressed in a way that endures as the topic files are copied over time.

As proposed, the GB Standard allows a server to quickly determine whether it has a copy of a file by computing the GB Standard name of an incoming file and comparing it to the names of all such files stored locally. If the name already exists, that file is reused; if it does not, the server adds the incoming file to its list of available files.
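A minimal sketch of that server-side check, in Python, might look like the following. The class `TopicStore`, its `receive` method, and the in-memory dictionary are hypothetical names chosen for illustration. The naming function is passed in as a parameter because the actual GB Standard name computation is not reproduced here; `placeholder_name` is a stand-in only and is not the GB Standard algorithm.

```python
import hashlib
from typing import Callable, Dict, Tuple


def placeholder_name(content: bytes) -> str:
    """Stand-in naming function (NOT the GB Standard algorithm)."""
    return hashlib.sha256(content).hexdigest()


class TopicStore:
    """Hypothetical server-side store keyed by a deterministic file name."""

    def __init__(self, name_of: Callable[[bytes], str]) -> None:
        self._name_of = name_of
        self._files: Dict[str, bytes] = {}

    def receive(self, content: bytes) -> Tuple[str, bool]:
        """Return (name, added); added is False when the file is reused."""
        name = self._name_of(content)
        if name in self._files:
            # Name already known: reuse the stored file.
            return name, False
        # Name not seen before: add the incoming file to the store.
        self._files[name] = content
        return name, True

    def __len__(self) -> int:
        return len(self._files)


if __name__ == "__main__":
    store = TopicStore(placeholder_name)
    print(store.receive(b"<topic><title>A</title></topic>")[1])
    print(store.receive(b"<topic><title>A</title></topic>")[1])
```

Keying the store by name makes the lookup a single dictionary access, which matches the "quickly determine" goal, but it also means the check is only as reliable as the naming function's ability to give distinct files distinct names.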

It is not the current intention to use the GB Standard name to locate off-site copies of a file; as things stand, this could only be achieved by querying, in turn, each server known to store files in this manner. Please tell me whether it is necessary for a urn to be able to uniquely locate files as well as classify them? If it is a requirement that a urn can be used to locate a topic file anywhere in the world, then I need to rethink this aspect of the GB Standard and update my application for the gbs namespace accordingly. If location is not necessarily required, then the description of the computation of the <G> component and adequate documentation of the standard names in the <T> component would seem to be the elements that need work to progress this application further?

On Tue, Sep 24, 2019 at 12:51 PM <lars.svensson@web.de> wrote:
> >    <T> is a string of one or more characters drawn from: [a-zA-Z0-9_] which
> >    identifies the type of content from a list of types published by the
> >    registrant at https://metacpan.org/pod/Dita::GB::Standard::Types .
>
> I attempted to obtain the list of valid types at the given URL, but was
> unsuccessful.  That page seemed to be a very top-level discussion of
> "The GB Standard".

That URL gives me a 404...

Best,

Lars


--
Thanks,
Phil
Philip R Brenan <https://opentokrtc.com/room/phil>