Re: [Ltru] my technical position on extlang

Leif Halvard Silli <lhs@malform.no> Sun, 25 May 2008 09:10 UTC

Message-ID: <48392CB4.2050505@malform.no>
Date: Sun, 25 May 2008 11:09:08 +0200
From: Leif Halvard Silli <lhs@malform.no>
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1b1) Gecko/20060724 Thunderbird/2.0a1 Mnenhy/0.7.4.666
MIME-Version: 1.0
To: Mark Davis <mark.davis@icu-project.org>, John Cowan <cowan@ccil.org>, LTRU Working Group <ltru@ietf.org>
References: <30b660a20805181149u2e1e3fb9y1a3b5b751c3e6998@mail.gmail.com> <20080523044305.GB7960@mercury.ccil.org> <30b660a20805230851r519f5d14wd93a92494d1db1c9@mail.gmail.com> <20080523160905.GD21554@mercury.ccil.org> <30b660a20805231405q56b156c4vbb3b6abda4af3893@mail.gmail.com> <20080523225400.GB13152@mercury.ccil.org> <30b660a20805231639w1de0fda8w116662738f8c5d6a@mail.gmail.com> <20080523234427.GC13152@mercury.ccil.org> <30b660a20805231655r34486205m9362e8fe65193ae6@mail.gmail.com> <20080524001151.GD13152@mercury.ccil.org> <30b660a20805240943o44a5719r50eb8f0eaf721dca@mail.gmail.com>
In-Reply-To: <30b660a20805240943o44a5719r50eb8f0eaf721dca@mail.gmail.com>
Subject: Re: [Ltru] my technical position on extlang

Mark Davis 2008-05-24 18.43:
> Let's suppose that content tagged with "fr", "gsw", and
> "yue" (or "zh-yue") are available, and nothing else. Let's suppose also
> that I understand German, French, and Mandarin, but not any of the
> following.
>   

> No Extlang Model
>
>    - If I specify "zh cmn fr", I'll get "fr" under either model. Fine, I can
>    understand "fr".
>    - If I wanted to also request Cantonese as well as Mardarin, I would use
>    "yue cmn zh fr" and get "yue"
>    - I only get "yue" if I request it. That's good, because I don't
>    understand "yue".
>    - I need to include "zh", and will for the indefinite future because the
>    vast majority of Mandarin content is tagged that way.
>    - I might get content tagged "zh" but which is actually "yue" but the
>    chances of that are extremely remote.
>   

A fundamental question: no matter how "extremely remote" the chance of
meeting "yue" content tagged as "zh", why do you place "zh" first? What
is the benefit of 'zh cmn' over 'cmn zh'?  If you are against Extlang,
why do you give up the benefit of not having Extlang by placing 'zh'
first?

> Extlang Model
>
>    - If I specify "zh zh-cmn fr", I would get "zh-yue". Broken, I don't
>    understand Cantonese!
>   

Ok, so you want to discuss your 0.00001% of Chinese resources which are 
in Cantonese ...

    * Firstly, you should not need to say 'zh' when you have said
      'zh-cmn', since Web servers will see 'zh-cmn' as a partial
      match for 'zh'. So I disagree that you really need to include
      'zh' here.
    * Secondly, as I understand it, a Mandarin user very likely
      understands some Cantonese. It all depends on the kind of
      Cantonese - if we are to believe what was mentioned on this list
      earlier.  Does the unlikely event that a Mandarin user meets a
      Cantonese page he/she is completely unable to comprehend really
      justify "jumping through hoops"?
          o Thus I feel that this need for a Mandarin speaker to avoid
            Cantonese at all costs is not a real issue for a native
            Chinese speaker. I.e. you describe an edge case.
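
The first point above can be sketched as follows. This is only an
illustration, assuming the server does an RFC 4647 "lookup"-style match,
where a requested range is shortened from the right until something
available matches. The function name `lookup` and the tag sets are made
up for the example; q-values and wildcards are ignored.

```python
def lookup(priority_list, available, default=None):
    """Simplified RFC 4647-style lookup (assumption: no q-values, no '*').

    For each requested range, in order, drop subtags from the right
    until the remaining prefix equals an available tag.
    """
    # Language tags compare case-insensitively; remember the original form.
    available_lc = {tag.lower(): tag for tag in available}
    for lang_range in priority_list:
        subtags = lang_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in available_lc:
                return available_lc[candidate]
            subtags.pop()
            # A trailing single-character subtag is dropped together
            # with the subtag that followed it.
            if subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


# A request for 'zh-cmn' alone still finds content tagged plain 'zh'.
print(lookup(["zh-cmn", "fr"], ["zh", "fr"]))  # prints "zh"
```

Under this assumption 'zh-cmn' falls back to 'zh' without 'zh' being
listed. A server that matches in some other way may behave differently.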

>    - In order to specify that I want "zh" or "cmn" but no other languages, I
>    have to use "zh-cjy;q=0, zh-cpx;q=0, zh-czh;q=0, zh-czo;q=0, zh-gan;q=0,
>    zh-hak;q=0, zh-hsn;q=0, zh-mnp;q=0, zh-nan;q=0, zh-wuu;q=0, zh-yue;q=0 zh
>    zh-cmn fr".
>   

A simpler solution, for laymen, would be this order of preference: 
"zh-cmn fr zh-yue zh-cjy zh-cpx zh-czh zh-czo zh-gan zh-hak zh-hsn 
zh-mnp zh-nan zh-wuu"

In other words, to completely control the languages of a macrolanguage, 
you must list them all.

This is the benefit, and downside, of macrolanguages.
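
As a sketch of "list them all": the snippet below builds the order of
preference given above from a hand-written table. The table holds only
the codes named in this thread and is not an authoritative copy of the
registry; `ZH_MEMBERS` and `build_priority_list` are hypothetical names.

```python
# Encompassed languages of 'zh' as named in this thread (illustrative only).
ZH_MEMBERS = ["cmn", "yue", "cjy", "cpx", "czh", "czo",
              "gan", "hak", "hsn", "mnp", "nan", "wuu"]


def build_priority_list(macro, members, preferred, others=()):
    """Preferred member first, then unrelated languages, then every
    remaining member of the macrolanguage, each prefixed with `macro`."""
    head = [f"{macro}-{preferred}"]
    tail = [f"{macro}-{m}" for m in members if m != preferred]
    return head + list(others) + tail


print(" ".join(build_priority_list("zh", ZH_MEMBERS, "cmn", ["fr"])))
```

If another encompassed language is added under 'zh' later, the table has
to be extended by hand, which is exactly the downside mentioned above.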

>    - Moreover, as you said, if another encompassed language shows up for zh,
>    I could get that inadvertently unless I add this to my list.
>    - Again, I need to include "zh", and will for the indefinite future
>    because the vast majority of Mandarin content is tagged that way.
>   

All users of the Chinese macrolanguage will, eventually, need to include 
'zh'.
-- 
leif halvard silli