Re: [Ltru] Consensus call: extlang

"Mark Davis" <mark.davis@icu-project.org> Thu, 29 May 2008 18:52 UTC

Message-ID: <30b660a20805291151t61cbe69bm49fbb227f7b2429d@mail.gmail.com>
Date: Thu, 29 May 2008 11:51:42 -0700
From: Mark Davis <mark.davis@icu-project.org>
To: "Broome, Karen" <Karen_Broome@spe.sony.com>
In-Reply-To: <E19FDBD7A3A7F04788F00E90915BD36C13C251B437@USSDIXMSG20.spe.sony.com>
MIME-Version: 1.0
References: <422633.90603.qm@web31813.mail.mud.yahoo.com> <E19FDBD7A3A7F04788F00E90915BD36C13C2528ABB@USSDIXMSG20.spe.sony.com> <30b660a20805282114v642c07dawa905112dbd6a35f5@mail.gmail.com> <E19FDBD7A3A7F04788F00E90915BD36C13C251B437@USSDIXMSG20.spe.sony.com>
Cc: "ltru@ietf.org" <ltru@ietf.org>
Subject: Re: [Ltru] Consensus call: extlang

Here is a brief description:

   - Map deprecated codes to the canonical codes*. Thus "zh-yue" maps to
   "yue", and identifies Cantonese, "zh-hakka" maps to "hak", and so on.
      - * There are a few exceptions to this where we use the deprecated
      form as the canonical form, for backwards compatibility, eg "iw" for "he".
      - We have yet to decide what to do with codes that are illegal in
      4646 but where the intent is clear, like "zh-yue-Hant-HK"; I suspect
      that we'll remap those as well, eg to "yue-Hant-HK".
   - We don't want to deal with irregular codes, so any that can't be
   resolved by this mapping are disallowed (there is a separate thread on
   that). Luckily in draft 14 there are only a small handful of those.
   - For lookup, use the kind of fallback that Addison outlined, taking into
   account backward compatibility considerations like the fact that "zh-TW" was
   used to represent "zh-Hant".
      - cmn-Hant-TW
      - zh-Hant-TW
      - cmn-Hant
      - zh-Hant
      - cmn-TW
      - zh-TW
      - cmn
      - zh
   - We have fallbacks that are unrelated to macro/micro, like
   Romanian-Moldavian, Filipino-Tagalog, and I'd expect to add to that list
   over time as we support additional languages. Mandarin might be a good
   fallback for, say, Gan; but Spanish might be a better fallback for the
   various languages encompassed by Quechua. Each case has to be decided on its
   own merits as it comes up: we've found that macrolanguage is not very useful
   for the goal to "give the user what s/he wants/needs".
   - Because extlang brings no benefit and only complications, if BCP 47
   were to adopt it, I suspect we would cease trying to be conformant to BCP 47
   internally.
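
To make the mapping and lookup steps above concrete, here is a rough sketch in Python. It is only one reading of the list, not the actual implementation: the table contents and the function names (canonicalize, fallback_chain) are made up for illustration, the "iw"/"he" exception is not modeled, and tags are assumed to arrive in their conventional casing.

```python
# Hypothetical tables for illustration only; a real implementation would
# derive them from the IANA Language Subtag Registry.
CANONICAL = {"zh-yue": "yue", "zh-hakka": "hak"}  # deprecated -> canonical
MACROLANGUAGE = {"cmn": "zh"}  # encompassed language -> macrolanguage


def canonicalize(tag):
    """Replace a deprecated leading form with its canonical code."""
    for old, new in CANONICAL.items():
        # Match the whole tag or a prefix that ends on a subtag boundary.
        if tag == old or tag.startswith(old + "-"):
            return new + tag[len(old):]
    return tag


def fallback_chain(language, script=None, region=None):
    """List lookup candidates, most specific first.

    At each level of specificity the encompassed language is tried
    before its macrolanguage, which reproduces the order in the list.
    """
    languages = [language]
    if language in MACROLANGUAGE:
        languages.append(MACROLANGUAGE[language])

    # Drop the region first, then the script, then both.
    levels = [(script, region), (script, None), (None, region), (None, None)]

    chain = []
    for scr, reg in levels:
        for lang in languages:
            tag = "-".join(part for part in (lang, scr, reg) if part)
            if tag not in chain:  # skip duplicates when a subtag is absent
                chain.append(tag)
    return chain


if __name__ == "__main__":
    print(canonicalize("zh-yue-Hant-HK"))
    print(fallback_chain("cmn", "Hant", "TW"))
```

Trying the script-only form before the region-only form is what puts "zh-Hant" ahead of "zh-TW" in the chain; the fallbacks that are unrelated to macro/micro (Romanian-Moldavian and so on) would need a separate table decided case by case.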

Mark

On Thu, May 29, 2008 at 9:42 AM, Broome, Karen <Karen_Broome@spe.sony.com>
wrote:

> Mark writes:
>
> >If identification is what you want, then "cmn" IS unambiguously
> >Mandarin (indeterminate script);
>
> But is "cmn" the tag you will use, Mark?
>
> >The string "zh-yue" is really a different semantic; it is not just
> >Cantonese, it is "Cantonese-but-fall-back-to-ambiguously-Chinese-of-
> >indeterminate-script" in lookup (but not some other operations like
> >RFC 4646 filtering).
>
> The fallback is *NOT*, I repeat *NOT*, part of the semantic of that tag.
> ISO defines the semantics of the tags and fallback has nothing to do with
> identifying languages.
>
> My issue is this: If I look at my real-world language list and the codes
> I've assigned, I think everyone on this list would agree that the codes I've
> chosen based on the IANA registry and RFC 4646 are the best tags for that
> content. I have codes for everything I need today, thanks to the patient
> guidance of everyone on this list.
>
> However, when RFC 4646bis is released, it seems like I will be using
> different tags than everyone else on this list and we will no longer agree
> on what the best tag is. I fear the subtleties in the rules may prove
> tougher to explain than today's rules which already cause some eyerolling.
> Considering this, I start to wonder if the release of RFC 4646bis is a step
> backward or forward. While I'm happy there's a tag for Broome Pearling
> Lugger Pidgin in ISO 639-3, I don't have any personal need for it.
>
> Regards,
>
> Karen Broome
>



-- 
Mark