Re: [Ltru] Preliminary Investigation into Application of ISO 11179

Karen_Broome@spe.sony.com Wed, 28 June 2006 00:07 UTC

In-Reply-To: <200606270822.k5R8MIZP015428@mta6.iomartmail.com>
To: Debbie Garside <debbie@ictmarketing.co.uk>
Subject: Re: [Ltru] Preliminary Investigation into Application of ISO 11179
MIME-Version: 1.0
X-Mailer: Lotus Notes Release 6.5.4 March 27, 2005
Message-ID: <OF23E9DDCE.3B39B676-ON8825719A.007E55B6-8825719B.0000A7F3@spe.sony.com>
From: Karen_Broome@spe.sony.com
Date: Tue, 27 Jun 2006 17:06:18 -0700
Cc: 'Doug Ewell' <dewell@adelphia.net>, 'LTRU Working Group' <ltru@ietf.org>

Debbie,

Did you review all of ISO 11179 or just 11179-6? 

There is a lot more to ISO 11179 than just the administrative practices
(-6), and the earlier parts describe a hierarchical metadata model not
mentioned in your review below. I think you're confusing the terms "value"
and "representation"; the other parts provide clarity on this.  The
top of the hierarchy is the Data Concept, which is an abstract description
independent of its representation -- a pure semantic layer.

The hierarchy, as I understand it, would be something like this:

Data Concept
        Language Code: A standardized code used to identify a
        particular language or dialect.

Data Elements [Data Concept + Representation Class] related to the
"Language Code" data concept:

ISO 639-1 Language Code:  A two-letter code assigned by the ISO 639-1
standard to identify a particular language.
ISO 639-2/B Language Code
ISO 639-2/T Language Code
ISO 639-3 Language Code
ISO 639-6 Language Code

Value Domain
        Each data element in this case has a "Value Domain", and the
        Value Domain contains the individual values, such as "en" for
        the ISO 639-1 Language Code.
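
As a rough illustration of the hierarchy above, here is a minimal Python
sketch. The class names, field names, and sample values are assumptions
made for illustration; they are not taken from ISO 11179 or from the
Registry, and the value domains are tiny subsets of the real code lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataConcept:
    """Abstract meaning, independent of any representation (hypothetical name)."""
    name: str
    definition: str


@dataclass
class DataElement:
    """A data concept paired with one representation and its value domain."""
    concept: DataConcept
    name: str
    value_domain: frozenset = frozenset()

    def permits(self, value: str) -> bool:
        # A value is valid only if it belongs to this element's value domain.
        return value in self.value_domain


language_code = DataConcept(
    name="Language Code",
    definition="A standardized code used to identify a particular "
               "language or dialect.",
)

# Illustrative subsets only, not the full ISO 639 code lists.
iso_639_1 = DataElement(
    language_code, "ISO 639-1 Language Code", frozenset({"en", "fr", "cy"})
)
iso_639_2_t = DataElement(
    language_code, "ISO 639-2/T Language Code", frozenset({"eng", "fra", "cym"})
)
```

Both data elements point at the same data concept while keeping separate
value domains, which is the split between meaning and representation
described above.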

...

For the Language Code data concept, the ISO 11179 structure seems relevant 
and useful. But when we look at some of the other concepts, it seems less 
useful and perhaps problematic:

Data Concept = Script Code
Data Element = ISO 15924 Script Code

Data Concept = Country Code
Data Element = ISO 3166 Country Code

Note that "Description" or even "Subtag Description" is too vague by ISO 
11179 rules (subjective judgment, but there are a lot of examples that 
support this view) and I think you would need to break these out as:

En-US Language Name
Fr-FR Language Name
En-US Script Name
En-US Country Name

etc.

These are unique data elements relating to several data concepts, so the 
current model with its "Description" field would need serious revision to 
be compliant, I think.
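
To make the point about "Description" concrete, here is a small Python
sketch of the kind of split I mean. The record layout, the function name,
and the assumption that an existing description is an En-US name are all
hypothetical; this is not how the Registry or ISO 11179 defines anything.

```python
# Hypothetical flat records, loosely modelled on registry entries.
records = [
    {"Type": "language", "Subtag": "cy", "Description": "Welsh"},
    {"Type": "region", "Subtag": "CY", "Description": "Cyprus"},
]

# Assumed mapping from record type to the data concept being named.
CONCEPT_LABEL = {"language": "Language", "script": "Script", "region": "Country"}


def split_description(record, locale="En-US"):
    """Replace the generic Description with a locale-qualified name element."""
    element_name = f"{locale} {CONCEPT_LABEL[record['Type']]} Name"
    return {element_name: record["Description"]}


named = [split_description(r) for r in records]
```

Each generic "Description" becomes a distinct, named data element such as
"En-US Language Name" or "En-US Country Name".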

I'm not opposed to further discussion of this moving forward. I only 
question how valuable this is for a standard that has so few data 
elements. It is a good thing that you're familiar with the section of the 
standard I've spent the least time reviewing.  :)

Best regards,

Karen Broome
Sony Pictures Entertainment




"Debbie Garside" <debbie@ictmarketing.co.uk> 
06/27/2006 01:22 AM

To: "'LTRU Working Group'" <ltru@ietf.org>
cc: 'Doug Ewell' <dewell@adelphia.net>
Subject: [Ltru] Preliminary Investigation into Application of ISO 11179

Findings of a preliminary investigation into the application of ISO 11179
to the RFC3066bis Registry

Cost
Initial investigations suggest that ISO 11179 can be applied to the
Registry at a base level for very little cost.  The main cost is in
mapping the ISO 11179 terminology to the existing Registry terminology
and a number of additional data elements would be required.  The Registry
already incorporates a system of metadata elements that are consistent
with the model presented within ISO 11179.

In particular the value of the following aspects of ISO 11179-6 should be
investigated: 

Identification
The attributes registration authority identifier (RAI), data identifier
(DI), and version identifier (VI) constitute the international
registration data identifier (IRDI). At least one IRDI is required for an
administered item.

Data identifiers are assigned by a Registration Authority; data
identifiers shall be unique within the domain of a Registration Authority.

Requirements for a Registration Authority, and a discussion of the IRDI,
appear in ISO/IEC 11179-6.

As each Registration Authority may determine its own DI assignment scheme,
there is no guarantee that the DI by itself will uniquely identify an
administered item. For example, if two authorities both use sequential
6-digit numbers, there may be two administered items with the same DI;
however, the administered items will almost certainly not be the same.

If one administered item appears in two registers, it will have two DIs.
Therefore, both the DI and the RAI are necessary for identification of an
administered item.

If particular attributes of an administered item change, then a new
version of the administered item shall be created and registered. The
registrar shall determine these attributes. In such a case, a VI is
required to complete the unique identification of an administered item.

For further guidance, see ISO/IEC 11179-6. 

An IRDI can serve as a key when exchanging data among information systems,
organizations, or other parties who wish to share a specific administered
item, but might not utilize the same names or contexts.
 
ISO/IEC 11179 does not specify the format or content of a unique DI.
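
The identification rules above can be sketched in Python as a composite
key. This is an illustration only: the class name, the authority
identifiers, and the item values are hypothetical, and plain strings are
used because ISO/IEC 11179 does not fix the format of a DI.

```python
from typing import NamedTuple


class IRDI(NamedTuple):
    """Composite identifier: no single part is unique on its own."""
    rai: str  # registration authority identifier
    di: str   # data identifier, unique only within one authority
    vi: str   # version identifier


# Two hypothetical authorities that both number items sequentially.
item_a = IRDI(rai="RA-ONE", di="000001", vi="1")
item_b = IRDI(rai="RA-TWO", di="000001", vi="1")

# A changed attribute yields a new version of the same administered item.
item_a_v2 = item_a._replace(vi="2")

# The full tuple works as a dictionary key for exchange between systems.
register = {
    item_a: "first authority, item 000001, version 1",
    item_b: "second authority, item 000001, version 1",
    item_a_v2: "first authority, item 000001, version 2",
}
```

The two items share a DI but remain distinct because the RAI differs, and
the VI separates successive versions of one item.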

The IETF (or LTRU) would need to apply for an International Code
Designator (ICD), a four-digit code; this, coupled with the organization
name and a "department" identifier (OPI), forms the RAI,
e.g. 1234.IETF.LTRU. The ICD would be registered, for use in the
Registration Authority Identifier, with the RA for ISO/IEC 6523
organization codes, which is currently BSI.

Implications for the LTRU Registry
The DI (or UI - Unique Identifier) cannot be the Subtag as there are
already conflicting Subtags within the registry (e.g. cy/CY).  It is
preferable that the unique identifier be the chosen
language/country/script name (please note, this is not the preferred
name).  This would fit with the current ISO 639-3, -5 and -6 models and
open the Registry to adoption by meta-data knowledge grids. (I will take
a good look at the naming conventions within ISO 3166-1 at a later date
but prior to publication of FDIS 3166-1).

Anomalies within ISO naming conventions of standards issued prior to the
adoption of ISO 11179 can be dealt with on a case by case basis via set
rules. 

The Subtag would become a "Representation" with the name being the unique
"Data Identifier". This would involve having a "Primary Description" which
would form the DI.
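
The cy/CY conflict and the proposed alternative can be sketched in Python
as follows. The entry layout and variable names are hypothetical, and the
sketch assumes subtags are compared without regard to case.

```python
# Illustrative entries: (primary description, subtag, type).
entries = [
    ("Welsh", "cy", "language"),
    ("Cyprus", "CY", "region"),
]

# Keyed on the case-folded subtag, the second entry collides with the first.
by_subtag = {}
collisions = []
for name, subtag, kind in entries:
    key = subtag.lower()
    if key in by_subtag:
        collisions.append(key)
    else:
        by_subtag[key] = name

# Keyed on the primary description, the subtag becomes a representation.
by_name = {
    name: {"representation": subtag, "type": kind}
    for name, subtag, kind in entries
}
```

With the name as the identifier, both entries coexist and each keeps its
subtag as a representation.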

In reviewing the "Required Metadata Attributes" for a "Preferred Standard"
Status administered item, preliminary investigations reveal no serious
additional requirements other than those already mentioned here. Some
manipulation and interpretation of registry data and standard mandatory
requirements would be required but no difficulties are envisaged. I would
refer the WG to ISO/IEC 11179-6:2005(E); Table B-8 (p.34)
 
Benefits
The ISO 11179 model allows for there being conflicting codes between
different meta-data registries in conformity with ISO 11179; that is part
of the conceptual model.  ("in conformity with" is correct - there are
essential parts of the standard).

In essence, the ISO 11179 meta-model supports linkage to other ISO 11179
conformant meta-data registries thus facilitating data
exchange/interchange whilst giving the LTRU Registry ownership of the
data elements contained therein - they become LTRU elements giving room
for manoeuvre should ISO get it wrong.

This will make language tags more meaningful in the future.  The key word
here is "linkage".  ISO 11179 conformant meta-data registries facilitate
the creation of knowledge grids, grid computing and the semantic web!

Conclusion
At first glance the cost/value ratio favours ISO 11179; there appears to
be very little cost yet the true benefits of future interoperability and
data exchange are unknown.

It is recommended that further investigation be conducted before
application of ISO 11179 can be discussed at WG level.

Further benefits with regard to data interchange should be explored.

It is further recommended that the "investigation into application of ISO
11179 Meta Data Registries to the Registry and its registration procedures
be conducted by nominated members of the WG with a view to application" be
added to the new LTRU Charter. 

Best regards


Debbie Garside



_______________________________________________
Ltru mailing list
Ltru@ietf.org
https://www1.ietf.org/mailman/listinfo/ltru

