Re: [Ltru] Proposal: include new "Language-Type:" field

"Mark Davis" <mark.davis@icu-project.org> Wed, 04 October 2006 15:41 UTC

Message-ID: <30b660a20610040840l570c5a6cv852aaa6faf14cb94@mail.gmail.com>
Date: Wed, 04 Oct 2006 08:40:57 -0700
From: Mark Davis <mark.davis@icu-project.org>
To: John Cowan <cowan@ccil.org>
Subject: Re: [Ltru] Proposal: include new "Language-Type:" field
In-Reply-To: <20061004134200.GC15633@ccil.org>
MIME-Version: 1.0
References: <20061004134200.GC15633@ccil.org>
Cc: ltru@ietf.org
List-Id: Language Tag Registry Update working group discussion list <ltru.ietf.org>

I think this is out of scope for the registry; our goal should be to have
sufficient description to identify the language (that is, to distinguish it
from others), and no more -- not to be the kitchen sink of information about
languages.

In addition, I don't see much need to duplicate information that is already
in ISO 639, and I doubt that this additional information is "very useful to
help shrink the large number of languages to a more manageable size". Most
people will be picking among living languages, and this classification
separates out only the 8% of entries that are not living. Making a list of
7K languages 8% smaller is not the key to an effective UI ;-). Those who need
this information can go to ISO 639 for it.

  Type          Count   Share
  living         6989   92.0%
  extinct         417    5.5%
  ancient         114    1.5%
  historic         53    0.7%
  constructed      24    0.3%
  oddball           3    0.0%
  total          7600
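
For anyone who wants to re-check the arithmetic, here is a minimal sketch that recomputes the shares from the counts quoted in this thread. The dictionary name and layout are just for illustration; "oddball" stands for the three special codes 'mul', 'und', and 'zxx'.

```python
# Counts as quoted in this thread from the current ISO 639-3 draft.
# "oddball" is an informal label for the special codes 'mul', 'und', 'zxx'.
counts = {
    "living": 6989,
    "extinct": 417,
    "ancient": 114,
    "historic": 53,
    "constructed": 24,
    "oddball": 3,
}

total = sum(counts.values())

# Share of each type as a percentage of all code elements.
shares = {name: 100.0 * n / total for name, n in counts.items()}

# Everything the proposed field would label, i.e. everything that is not living.
non_living_share = 100.0 - shares["living"]

for name, n in counts.items():
    print(f"{name:<12}{n:>5}{shares[name]:>7.1f}%")
print(f"{'total':<12}{total:>5}")
print(f"non-living share: {non_living_share:.1f}%")
```

The non-living share comes to roughly 8% of the total, which is the figure used above.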

Mark

On 10/4/06, John Cowan <cowan@ccil.org> wrote:
>
> A proposal:
>
> The 4646bis registry should capture the language type information from
> ISO 639-3.  Each language, including macrolanguages, is labeled in -3
> as either living, extinct, ancient, historic, or constructed.  These
> terms are defined precisely at http://www.sil.org/iso639-3/types.asp .
> Though informative rather than normative, this is very useful to help
> shrink the large number of languages to a more manageable size.
>
> The current draft of 639-3 contains 6989 living languages, 417
> extinct languages, 114 ancient languages, 53 historic languages,
> and 24 constructed languages.  The codes 'mul', 'und', and 'zxx' are
> special cases.
>
> The registry should also capture the individual language vs.  language
> collection information from ISO 639-2.  If a code element appears in
> -2 but not in -3, it is a language collection; there are 68 such code
> elements.  Now that we have 639-3 code elements for essentially every
> language on the planet, language-collection subtags are extremely vague
> and provide little guidance to the recipient.
>
> (I'm not really happy with the vagueness of "Language-Type", and would
> prefer "Language-Status", but it's the term used in the FDIS.)
>
>
> I propose the following language for 4646bis section 3.1.2:
>
> o Language-Type
>         o Language-Type's field-body contains one of the values
>           'collection', 'extinct', 'ancient', 'historic', 'constructed',
>           or 'special'.  This field MUST NOT appear except in records
>           of type 'language'.
>
>
> And here's a draft of the new section 3.1.3.8:
>
> 3.1.3.8.  Language-Type field
>
>         The field 'Language-Type' MUST only appear in records whose
>         'Type' field-body is 'language'. This field MUST NOT appear
>         more than once in a record.  Most of the language records in
>         the registry represent individual living languages.  This field
>         indicates those which are not.
>
>         The value 'collection' indicates a language collection appearing
>         in ISO 639-2 but not ISO 639-3.  The values 'extinct', 'ancient',
>         'historic', and 'constructed' indicate languages which are so
>         designated in ISO 639-3; precise definitions of these terms can
>         be found in that standard.  The value 'special' is used for the
>         three subtags 'mul', 'und', and 'zxx', which do not actually
>         designate languages at all.
>
>
> Finally, here's a rule for section 4.1:
>
>         8.  Language subtags with a 'Language-Type' field of 'collection'
>         do not represent specific languages, and SHOULD NOT be used
>         unless more specific information is unavailable.
>
> Appropriate adjustments would be needed to 3.3, 3.4, and 3.5 as well.
> We should be able to set this field if and when we ever register
> a language subtag directly, and change it when 639-3 changes.
>
> --
> John Cowan
>         cowan@ccil.org
>                 I am a member of a civilization. --David Brin
>
> _______________________________________________
> Ltru mailing list
> Ltru@ietf.org
> https://www1.ietf.org/mailman/listinfo/ltru
>
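
To make the constraints in the quoted proposal concrete, here is a rough sketch of how a registry checker might test them. It is not part of any draft: the record representation (a list of field-name/field-body pairs), the function names, and the tiny code sets in the usage example are all assumptions made for illustration.

```python
# Values permitted by the quoted proposal for the 'Language-Type' field-body.
ALLOWED_LANGUAGE_TYPES = {
    "collection", "extinct", "ancient", "historic", "constructed", "special",
}


def check_language_type(record):
    """Return a list of problems with 'Language-Type' in one registry record.

    `record` is a hypothetical representation: a list of
    (field-name, field-body) pairs in the order they appear.
    """
    problems = []
    types = [body for name, body in record if name == "Type"]
    lang_types = [body for name, body in record if name == "Language-Type"]

    if not lang_types:
        # The field is optional; its absence means an individual living language.
        return problems

    # The field MUST NOT appear except in records of type 'language'.
    if types != ["language"]:
        problems.append("Language-Type present in a non-language record")

    # The field MUST NOT appear more than once in a record.
    if len(lang_types) > 1:
        problems.append("Language-Type appears more than once")

    # The field-body must be one of the listed values.
    for value in lang_types:
        if value not in ALLOWED_LANGUAGE_TYPES:
            problems.append(f"unknown Language-Type value: {value!r}")

    return problems


def is_collection(code, iso639_2_codes, iso639_3_codes):
    """A code element in ISO 639-2 but not in ISO 639-3 is a language collection."""
    return code in iso639_2_codes and code not in iso639_3_codes


# Usage with made-up inputs (the sets are tiny stand-ins, not the real code lists).
record = [
    ("Type", "language"),
    ("Subtag", "tlh"),
    ("Description", "Klingon"),
    ("Language-Type", "constructed"),
]
print(check_language_type(record))
print(is_collection("afa", {"afa", "eng"}, {"eng"}))
```

Note that 'living' is deliberately not an allowed value here, since the proposal uses the field only to mark records that are not individual living languages.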