[Ietf-languages] Errors in Registry

"Doug Ewell" <doug@ewellic.org> Tue, 08 January 2019 23:58 UTC

From: Doug Ewell <doug@ewellic.org>
To: ietf-languages <ietf-languages@iana.org>
Date: Tue, 08 Jan 2019 16:57:41 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/_uxUBEppvHL1Xkq1JKWb2lpk_Fo>

I've done some experimenting recently with an XML version of the
Language Subtag Registry, which allowed me to build an XML Schema (XSD)
and check the Registry much more robustly than is possible with the
record-jar format.
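
For readers unfamiliar with record-jar, here is a minimal Python sketch of reading it into a structure that can be checked mechanically. It assumes the layout described in RFC 5646 (records separated by "%%" lines, "Name: value" fields, folded lines starting with whitespace, some fields repeatable). The function name and the sample records are made up for illustration; this is not the XML conversion described above.

```python
def parse_record_jar(text):
    """Return a list of records; each maps a field name to a list of values."""
    records, current, last = [], {}, None
    for line in text.splitlines():
        if line.strip() == "%%":
            # "%%" on its own line separates records.
            if current:
                records.append(current)
            current, last = {}, None
        elif not line.strip():
            continue
        elif line[0].isspace() and last is not None:
            # A folded line continues the previous field's value.
            current[last][-1] += " " + line.strip()
        else:
            name, _, body = line.partition(":")
            last = name.strip()
            # Fields such as Description may repeat, so keep a list.
            current.setdefault(last, []).append(body.strip())
    if current:
        records.append(current)
    return records


# Hypothetical sample, not copied from the real Registry.
SAMPLE = """\
File-Date: 2019-01-01
%%
Type: language
Subtag: xx
Description: First made-up language
Description: Alternate name
%%
Type: extlang
Subtag: yyy
Description: Made-up sign language with a
  folded description line
Preferred-Value: yyy
Prefix: sgn
"""

for record in parse_record_jar(SAMPLE):
    print(record)
```

Keeping every field as a list makes repeated fields uniform to handle, at the cost of indexing with [0] for single-valued fields.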
 
In the process, I've discovered a few cases where the current Registry
is not compliant with the RFC 5646 definition. We need to examine these
cases and decide what to do about them.
 
The issues are as follows:
 
 
1. Extlangs 'lsg', 'rsi', 'yds' have no Preferred-Value
 
Section 2.2.2 says:
 
"3.  Extended language subtag records MUST include a 'Preferred-Value'. 
The 'Preferred-Value' and 'Subtag' fields MUST be identical."
 
Section 3.1.7 goes on to say:
 
"Changes to one subtag can affect other subtags as well: when proposing
changes to the registry, the Language Subtag Reviewer MUST review the
registry for such effects and propose the necessary changes using the
process in Section 3.5, although anyone MAY request such changes.  For
example: Suppose that subtag 'XX' has a 'Preferred-Value' of 'YY'.  If
'YY' later changes to have a 'Preferred-Value' of 'ZZ', then the
'Preferred-Value' for 'XX' MUST also change to be 'ZZ'."
 
This was clearly written with one use case in mind: a subtag whose
Preferred-Value (P-V) changes from one value to another. The intent was
to prevent "chaining" of Preferred-Values, in which 'xxx' has a P-V of
'yyy', which in turn has a P-V of 'zzz', and so forth.
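
A rough Python sketch of detecting such chains follows. It assumes records are already available as simple dicts and only looks at records of a single type; the function name and the 'xx'/'yy'/'zz' data are placeholders taken from the RFC's example wording, not real Registry entries.

```python
def find_chains(records, record_type="language"):
    """Return (subtag, preferred, next_preferred) wherever a P-V has its own P-V."""
    preferred = {
        r["Subtag"]: r["Preferred-Value"]
        for r in records
        if r["Type"] == record_type and "Preferred-Value" in r
    }
    return [
        (subtag, pv, preferred[pv])
        for subtag, pv in preferred.items()
        if pv != subtag and pv in preferred
    ]


# Placeholder records mirroring the 'XX' -> 'YY' -> 'ZZ' example.
RECORDS = [
    {"Type": "language", "Subtag": "xx", "Preferred-Value": "yy"},
    {"Type": "language", "Subtag": "yy", "Preferred-Value": "zz"},
    {"Type": "language", "Subtag": "zz"},
]

print(find_chains(RECORDS))  # [('xx', 'yy', 'zz')]
```

Under Section 3.1.7, each tuple reported here would be fixed by pointing the first subtag directly at the last value.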
 
However, a subtag can be deprecated with no Preferred-Value. This occurs
frequently when ISO 639-3 withdraws a language code element and the
decision doesn't provide a single replacement element.
 
In BCP 47, subtags for individual languages that are encompassed by an
ISO 639-3 macrolanguage have corresponding extlang subtags, with a
Preferred-Value of the language subtag (per 2.2.2). An important
extension to this scenario is that sign languages are considered to be
encompassed by [sgn], and also get extlangs.
 
Three sign languages have been withdrawn by ISO 639-3, all as
"nonexistent": [lsg] for Lyons Sign Language, [rsi] for Rennellese Sign
Language, and [yds] for Yiddish Sign Language. In an attempt to follow
Section 3.1.7, we removed the Preferred-Value from the extlang records
for 'lsg', 'rsi', and 'yds'. But this breaks the requirement in 2.2.2
that all extlangs must have a P-V which matches the subtag value.
 
Proposed action: Restore the P-V for these three deprecated extlang
subtags. They will point to deprecated language subtags, which is the
"chaining" we were trying to prevent, but that is less of a problem IMHO
than the violation of 2.2.2.
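
The 2.2.2 requirement is easy to test mechanically. Below is a minimal Python sketch, assuming records are held as plain dicts; the function name is invented, the 'lsg' record is reduced to the fields discussed here, and the other two records are hypothetical.

```python
def extlang_violations(records):
    """List extlang records that break the rule quoted from Section 2.2.2."""
    problems = []
    for r in records:
        if r.get("Type") != "extlang":
            continue
        pv = r.get("Preferred-Value")
        if pv is None:
            problems.append((r["Subtag"], "missing Preferred-Value"))
        elif pv != r["Subtag"]:
            problems.append((r["Subtag"], "Preferred-Value differs from Subtag"))
    return problems


RECORDS = [
    # As described above: the extlang record currently has no P-V.
    {"Type": "extlang", "Subtag": "lsg"},
    # Hypothetical compliant and non-compliant records for contrast.
    {"Type": "extlang", "Subtag": "aaa", "Preferred-Value": "aaa"},
    {"Type": "extlang", "Subtag": "bbb", "Preferred-Value": "ccc"},
]

for subtag, problem in extlang_violations(RECORDS):
    print(subtag, problem)
```

With the proposed action applied, the 'lsg' record would carry a P-V of 'lsg' and drop out of the report.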
 
 
2. Variants 'arevela' and 'arevmda' have language subtags as their P-V
 
Section 3.1.2 says:
 
"Preferred-Value's field-body contains a canonical mapping from this
record's value to a modern equivalent that is preferred in its place. 
Depending on the value of the 'Type' field, this value can take
different forms: [...] For fields of type 'script', 'region', or
'variant', 'Preferred-Value' contains the subtag of the same type that
is preferred for forming the language tag."
 
We deprecated variant subtags 'arevela' for Eastern Armenian and
'arevmda' for Western Armenian when ISO 639-3 added language code
element [hyw], because now subtag 'hy' could refer specifically to
Eastern Armenian and 'hyw' would refer to Western Armenian, making the
variants unnecessary. But giving a variant a language subtag as its P-V
violates Section 3.1.2: the P-V is not of the same type as the subtag.
 
Proposed action: Remove the P-V for each of these variant subtags, and
add a Comments field along the lines of "Preferred tag is hy", analogous
to 'heploc'.
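
The same-type rule in 3.1.2 can be sketched as a check in Python. This assumes records are plain dicts and that the two variants currently point at 'hy' and 'hyw' respectively, which is my reading of the description above; the function name and the 'oldvar'/'newvar' records are hypothetical.

```python
SAME_TYPE_REQUIRED = {"script", "region", "variant"}


def cross_type_preferred_values(records):
    """Report script/region/variant records whose P-V is not a subtag of the same type."""
    by_type = {}
    for r in records:
        by_type.setdefault(r["Type"], set()).add(r["Subtag"])
    problems = []
    for r in records:
        pv = r.get("Preferred-Value")
        if r["Type"] in SAME_TYPE_REQUIRED and pv is not None:
            if pv not in by_type[r["Type"]]:
                problems.append((r["Type"], r["Subtag"], pv))
    return problems


RECORDS = [
    {"Type": "language", "Subtag": "hy"},
    {"Type": "language", "Subtag": "hyw"},
    {"Type": "variant", "Subtag": "arevela", "Preferred-Value": "hy"},
    {"Type": "variant", "Subtag": "arevmda", "Preferred-Value": "hyw"},
    # Hypothetical variant pair that satisfies the rule.
    {"Type": "variant", "Subtag": "oldvar", "Preferred-Value": "newvar"},
    {"Type": "variant", "Subtag": "newvar"},
]

print(cross_type_preferred_values(RECORDS))
```

Removing the P-V from the two Armenian variants, as proposed, would leave nothing for this check to report.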
 
 
If there is general agreement to make these changes, I'll post forms. I
do feel it's important to bring the Registry in line with its
specification.
 
 
--
Doug Ewell | Thornton, CO, US | ewellic.org