Re: [Ltru] Process for creating 4646bis Registry

Martin Duerst <duerst@it.aoyama.ac.jp> Mon, 18 September 2006 01:26 UTC

Message-Id: <6.0.0.20.2.20060917161407.088b7ce0@localhost>
Date: Sun, 17 Sep 2006 16:29:34 +0900
To: Doug Ewell <dewell@adelphia.net>, LTRU Working Group <ltru@ietf.org>
From: Martin Duerst <duerst@it.aoyama.ac.jp>
Subject: Re: [Ltru] Process for creating 4646bis Registry
In-Reply-To: <010201c6da1a$3405ad70$6401a8c0@DGBP7M81>
References: <010201c6da1a$3405ad70$6401a8c0@DGBP7M81>

Hello Doug,

Many thanks for your work.

As a co-chair, I strongly suggest that you submit your text below
and your data as an Internet-Draft as soon as possible. This is
NOT because I want us to meet the deadline in September, but
because this is the way work is done in the IETF. Also, I'd
like to see how well the Internet-Drafts system takes this
(almost) mega-draft.

As a filename, my suggestion is draft-ietf-ltru-4645bis-00.txt,
but any other suggestion is welcome. The main problem with the
above name is that it may be too close to draft-ietf-ltru-4646bis-00.txt.

Some comments as a technical contributor below.

At 14:29 06/09/17, Doug Ewell wrote:
>Here is my first pass at describing the process for generating the RFC 4646bis Language Subtag Registry.  This is not necessarily the exact wording that will appear in RFC 4645bis, but parts of it may end up there with little or no change.
>
>This needs to be evaluated by the WG on at least two levels:
>
>* the process itself
>* the way the process is described (logic, clarity, ambiguity, mechanics, etc.)
>
>1.  Starting point
>
>The existing Registry serves as the starting point for the new Registry. RFC 4645bis will include a reference (probably normative)

No, this should be informative, because there is no need to go back
to RFC 4645 except for informational purposes.

>to RFC 4645 for those who wish to trace the path all the way back to the ISO and UN standards and RFC 3066 tag registry.
>
>The new inputs to the Registry are the ISO/FDIS 639-3 online code tables.  This includes the code set dated 2006-04-21 and the macrolanguage mappings dated 2005-06-14, which are currently (2006-09-16) available on the ISO 639-3 home page.  Peter has said there are changes in the FDIS document not reflected in these files, and more changes will be coming, but I am using these files because they are the most up-to-date information available to me.

Great.

>2.  Language subtags modified
>
>The existing Registry includes 489 primary language subtags, including some for "language collections" not included in 639-3.  In cases where the reference name(s) in 639-3 for these languages differ(s) from the Description fields in the Registry, the new names are added to the existing Registry entry, after all existing descriptions.  For example, the subtag "gd" currently has the descriptions "Gaelic" and "Scottish Gaelic"; the new description "Gaelic (Scots)" from 639-3 will be added. This includes inverted forms of existing names, such as "Frisian, Northern".
>
>This also includes language names that include a country name, to differentiate them from other languages of the same name associated with a different country.  For example, the subtag "bas" currently has the description "Basa"; to this would be added "Basa (Cameroon)" from 639-3. The new subtag "bzw" would have the single description "Basa (Nigeria)".
>
>Reference names in the 639-3 files that contain the strings "(generic)" or "(specific)" are changed to "(macrolanguage)" and the empty string, respectively, in accordance with Peter's statement on September 12 that the FDIS already includes this change.  The single exception is "Zande (specific)" (zne), which is left unchanged because there is also a 639-3 code element "Zande" (znd) and neither of these is a macrolanguage.

Can we expect this to be fixed (i.e. the two Zande entries to be
disambiguated) in a newer version of 639-3? Peter?

>No existing Description fields are changed or deleted.
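The rules in section 2 can be sketched roughly as follows. This is only an illustration in Python, not the tooling actually used to build the prototype Registry; the function names and the placeholder code "xxx" with the name "Example" are hypothetical.

```python
def normalize_reference_name(code, name):
    """Apply the (generic)/(specific) rule to a 639-3 reference name."""
    # Single exception named in the process: "Zande (specific)" (zne)
    # is left unchanged.
    if code == "zne":
        return name
    name = name.replace("(generic)", "(macrolanguage)")
    name = name.replace("(specific)", "")
    return name.strip()


def merge_descriptions(existing, new_names):
    """Append new names after all existing descriptions.

    Existing descriptions are never changed or deleted; a name that is
    already present is not added a second time.
    """
    merged = list(existing)
    for name in new_names:
        if name not in merged:
            merged.append(name)
    return merged


# Examples taken from the text above.
gd = merge_descriptions(["Gaelic", "Scottish Gaelic"], ["Gaelic (Scots)"])
bas = merge_descriptions(["Basa"], ["Basa (Cameroon)"])
```

With these definitions, `gd` ends up with the three descriptions in the order described for "gd", and `bas` gains "Basa (Cameroon)" after "Basa".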
>
>3.  New subtags added
>
>All 639-3 code elements not in the LSR are added as primary language subtags if they are not included under a 639-3 macrolanguage, or as extended language subtags if they are (with the macrolanguage as Prefix).  This is consistent with John Cowan's description of the process, which was generally approved by the list.
>
>As a special case, all 639-3 languages with the words "Sign Language" in their name are added as extlangs, with "sgn" as the Prefix.  This is the same as if 639-3 had listed them under a macrolanguage "sgn".  (The entry "sgn", being a collection code, is not present in 639-3 at all.)
>
>All subtags added to the Registry are added in alphabetical order within their type, with the 2-letter language subtags appearing before the 3-letter subtags.  IANA previously added two 3-letter subtags, "anp" and "frr", in the section for 2-letter subtags, which is allowed by RFC 4646 but may surprise some readers.  No attempt is made here to change this.

I think that if we want things in a certain order, we should put them
into that order, and say so in 4646bis.
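
For discussion, the classification and ordering rules in section 3 could be written down along these lines. This is a sketch under assumptions: the function names are made up, the macrolanguage table is assumed to be already keyed to Registry subtags (e.g. "zh"), and the language names passed in are illustrative.

```python
def classify(code, name, macrolanguage_of):
    """Return (type, prefix) for a 639-3 code element not yet in the Registry."""
    # Special case: sign languages become extlangs under "sgn", as if
    # 639-3 had listed them under a macrolanguage "sgn".
    if "Sign Language" in name:
        return ("extlang", "sgn")
    # Code elements included under a macrolanguage become extlangs,
    # with the macrolanguage as Prefix.
    macro = macrolanguage_of.get(code)
    if macro is not None:
        return ("extlang", macro)
    # Everything else becomes a primary language subtag.
    return ("language", None)


def registry_order(subtags):
    """Sort subtags alphabetically, 2-letter subtags before 3-letter ones."""
    return sorted(subtags, key=lambda s: (len(s), s))


# Hypothetical excerpt of the macrolanguage mappings.
macrolanguage_of = {"hak": "zh", "nan": "zh"}
```

Under this ordering, "anp" and "frr" would sort after all 2-letter subtags rather than among them.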

>4.  Grandfathered and redundant tags
>
>As a consequence of adding these new subtags, the following grandfathered tags are deprecated, with the indicated tag specified as Preferred-Value:
>
>i-ami      -> ami
>i-bnn      -> bnn
>i-pwn      -> pwn
>i-tao      -> tao
>i-tay      -> tay
>i-tsu      -> tsu
>sgn-CH-de  -> sgn-sgg
>zh-hakka   -> zh-hak
>zh-min     -> (no Preferred-Value, just deprecated)
>zh-min-nan -> zh-nan
>zh-xiang   -> zh-hns
>
>Each of these grandfathered tags has a Comment field indicating the ISO 639-3 code element (for example, "replaced by ISO code ami").  This duplicates the Preferred-Value information and is not strictly necessary, but is added for consistency with other grandfathered tags. This comment is also added to the tag "zh-guoyu", which is already deprecated in favor of another grandfathered tag, "zh-cmn" (which will now be redundant; see below).
>
>The following grandfathered tags are now fully composable and are moved to redundant:
>
>zh-cmn
>zh-cmn-Hans
>zh-cmn-Hant
>zh-gan
>zh-wuu
>zh-yue
>
>The following redundant sign-language tags are deprecated (requiring a change in the RFC 4646 rules, which currently disallow this).
>
>sgn-BR -> sgn-bzs
>sgn-CO -> sgn-csn
>sgn-DE -> sgn-gsg
>sgn-DK -> sgn-dsl
>sgn-ES -> sgn-ssp
>sgn-FR -> sgn-fsl
>sgn-GB -> sgn-bfi
>sgn-GR -> sgn-gss
>sgn-IE -> sgn-isg
>sgn-IT -> sgn-ise
>sgn-JP -> sgn-jsl
>sgn-MX -> sgn-mfs
>sgn-NI -> sgn-ncs
>sgn-NL -> sgn-dse
>sgn-NO -> sgn-nsl
>sgn-PT -> sgn-psr
>sgn-SE -> sgn-swl
>sgn-US -> sgn-ase
>sgn-ZA -> sgn-sfs
>
>No change is made to the Description field for these tags, so (for example) "sgn-US" will still have the description "American Sign Language".  This is to avoid unintended changes of scope.

Great.
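
As an illustration of how a consumer might use the Preferred-Value mappings from section 4, here is a small Python sketch. The table is only a subset of the lists above, and the function name `resolve` is hypothetical; the proposal itself does not define any such lookup.

```python
# Subset of the deprecated tags listed above. None marks a tag that is
# deprecated without a Preferred-Value.
PREFERRED_VALUE = {
    "i-ami": "ami",
    "i-bnn": "bnn",
    "i-pwn": "pwn",
    "i-tao": "tao",
    "i-tay": "tay",
    "i-tsu": "tsu",
    "sgn-CH-de": "sgn-sgg",
    "zh-hakka": "zh-hak",
    "zh-min": None,
    "zh-min-nan": "zh-nan",
    "zh-xiang": "zh-hns",
    "sgn-BR": "sgn-bzs",
    "sgn-US": "sgn-ase",
}

# Language tags are compared case-insensitively, so index by lowercase.
_LOOKUP = {tag.lower(): pref for tag, pref in PREFERRED_VALUE.items()}


def resolve(tag):
    """Return the Preferred-Value for a deprecated tag if one exists.

    Tags that are not in the table, or that are deprecated without a
    Preferred-Value (such as "zh-min"), are returned unchanged.
    """
    preferred = _LOOKUP.get(tag.lower())
    return preferred if preferred is not None else tag
```

Note that only the tag is replaced; the Description of the original entry (for example "American Sign Language" for "sgn-US") is untouched by this lookup.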

>5.  Availability
>
>A prototype 4646bis Registry built using these rules, with an arbitrary File-Date (and Added date for all new subtags) of 2007-01-01, is available at:
>
>http://users.adelphia.net/~dewell/lstreg-4646bis.txt   (640,265 bytes)
>http://users.adelphia.net/~dewell/lstreg-4646bis.zip   (78,943 bytes, PKZip format)
>
>Please use these files as necessary to evaluate the process.  Please do not distribute them or build anything for release based on them; remember that the rules are not final.  They will not be changed until a revised version of this process is posted to LTRU, so please download them only once per revision.

Thanks again.   Regards,   Martin.



#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     

