[Ltru] Proposal: include new "Language-Type:" field

John Cowan <cowan@ccil.org> Wed, 04 October 2006 13:46 UTC

Date: Wed, 04 Oct 2006 09:42:00 -0400
To: ltru@ietf.org
Message-ID: <20061004134200.GC15633@ccil.org>

A proposal:

The 4646bis registry should capture the language type information from
ISO 639-3.  Each language, including each macrolanguage, is labeled in -3
as one of living, extinct, ancient, historic, or constructed.  These
terms are defined precisely at http://www.sil.org/iso639-3/types.asp .
Though informative rather than normative, this information is very useful
for narrowing the large number of languages down to a more manageable set.

The current draft of 639-3 contains 6989 living languages, 417
extinct languages, 114 ancient languages, 53 historic languages,
and 24 constructed languages.  The codes 'mul', 'und', and 'zxx' are
special cases.

The registry should also capture the distinction between individual
languages and language collections from ISO 639-2.  If a code element
appears in -2 but not in -3, it is a language collection; there are 68
such code elements.  Now that we have 639-3 code elements for essentially
every language on the planet, language-collection subtags are extremely
vague and provide little guidance to the recipient.

(I'm not really happy with the vagueness of "Language-Type", and would
prefer "Language-Status", but it's the term used in the FDIS.)


I propose the following language for 4646bis section 3.1.2:

o Language-Type
        o Language-Type's field-body contains one of the values
          'collection', 'extinct', 'ancient', 'historic', 'constructed',
          or 'special'.  This field MUST NOT appear except in records
          of type 'language'.


And here's a draft of the new section 3.1.3.8:

3.1.3.8.  Language-Type field

        The field 'Language-Type' MUST NOT appear except in records
        whose 'Type' field-body is 'language'.  This field MUST NOT
        appear more than once in a record.  Most of the language records
        in the registry represent individual living languages.  This
        field identifies those that do not.

        The value 'collection' indicates a language collection appearing
        in ISO 639-2 but not ISO 639-3.  The values 'extinct', 'ancient',
        'historic', and 'constructed' indicate languages which are so
        designated in ISO 639-3; precise definitions of these terms can
        be found in that standard.  The value 'special' is used for the
        three subtags 'mul', 'und', and 'zxx', which do not actually
        designate languages at all.
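
As a rough illustration of how the field-body described above could be
derived, here is a minimal Python sketch.  The function name, the table
names, and the tiny sample tables are all hypothetical; real data would
come from the ISO 639-2 and 639-3 code tables, and the sketch works on
three-letter code elements rather than registry subtags.

```python
from typing import Dict, Optional, Set

# Subtags that the draft text labels 'special'.
SPECIAL = {"mul", "und", "zxx"}
# ISO 639-3 types that would be recorded; 'living' is the unmarked default.
RECORDED_TYPES = {"extinct", "ancient", "historic", "constructed"}


def language_type(code: str,
                  iso639_2_codes: Set[str],
                  iso639_3_types: Dict[str, str]) -> Optional[str]:
    """Return the proposed Language-Type field-body for a code element,
    or None when the field would be omitted (individual living language)."""
    if code in SPECIAL:
        return "special"
    if code in iso639_3_types:
        iso_type = iso639_3_types[code]
        return iso_type if iso_type in RECORDED_TYPES else None
    if code in iso639_2_codes:
        # Present in 639-2 but absent from 639-3: a language collection.
        return "collection"
    raise KeyError(f"{code!r} is in neither sample table")


# Tiny hand-picked sample for illustration, not the real tables.
SAMPLE_639_2 = {"eng", "lat", "epo", "cel", "und"}
SAMPLE_639_3 = {"eng": "living", "lat": "ancient", "epo": "constructed"}
```

With these samples, 'cel' comes out as a collection, 'lat' as ancient,
and 'eng' gets no Language-Type field at all.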


Finally, here's a rule for section 4.1:

        8.  Language subtags with a 'Language-Type' field of 'collection'
        do not represent specific languages, and SHOULD NOT be used
        unless more specific information is unavailable.
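
A tag checker could surface this rule as a warning rather than an error,
since it is only a SHOULD NOT.  The following is a sketch under
assumptions: the function name, the message wording, and the in-memory
shape of the records are hypothetical, and it treats the first subtag as
the primary language subtag without handling private-use or
grandfathered tags.

```python
from typing import Dict, Optional


def collection_warning(tag: str,
                       records: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Return a warning if the tag's primary language subtag is recorded
    with Language-Type 'collection'; otherwise return None."""
    primary = tag.split("-")[0].lower()
    record = records.get(primary)
    if record is not None and record.get("Language-Type") == "collection":
        return (f"'{primary}' is a language collection; "
                "prefer a more specific subtag if one is known")
    return None


# Hypothetical records keyed by subtag, for illustration only.
SAMPLE_RECORDS = {
    "cel": {"Type": "language", "Subtag": "cel",
            "Language-Type": "collection"},
    "en": {"Type": "language", "Subtag": "en"},
}
```

Here collection_warning("cel", SAMPLE_RECORDS) returns a message, while
collection_warning("en-US", SAMPLE_RECORDS) returns None.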

Appropriate adjustments would be needed to 3.3, 3.4, and 3.5 as well.
We should be able to set this field if and when we ever register
a language subtag directly, and change it when 639-3 changes.

-- 
John Cowan
        cowan@ccil.org
                I am a member of a civilization. --David Brin
