Re: [Ltru] my technical position on extlang

"Gerard Meijssen" <gerard.meijssen@gmail.com> Sun, 25 May 2008 10:14 UTC

Message-ID: <41a006820805250312k41b9c228s95c5dce9213549d2@mail.gmail.com>
Date: Sun, 25 May 2008 12:12:48 +0200
From: Gerard Meijssen <gerard.meijssen@gmail.com>
To: Leif Halvard Silli <lhs@malform.no>
In-Reply-To: <48392CB4.2050505@malform.no>
MIME-Version: 1.0
References: <30b660a20805181149u2e1e3fb9y1a3b5b751c3e6998@mail.gmail.com> <20080523160905.GD21554@mercury.ccil.org> <30b660a20805231405q56b156c4vbb3b6abda4af3893@mail.gmail.com> <20080523225400.GB13152@mercury.ccil.org> <30b660a20805231639w1de0fda8w116662738f8c5d6a@mail.gmail.com> <20080523234427.GC13152@mercury.ccil.org> <30b660a20805231655r34486205m9362e8fe65193ae6@mail.gmail.com> <20080524001151.GD13152@mercury.ccil.org> <30b660a20805240943o44a5719r50eb8f0eaf721dca@mail.gmail.com> <48392CB4.2050505@malform.no>
Cc: LTRU Working Group <ltru@ietf.org>
Subject: Re: [Ltru] my technical position on extlang

Hoi,
I am lost when I try to understand you. For me it is simple: with a zh
article as it is tagged today, we know nothing for sure. It can be anything.
This is the sad truth of the matter. Keeping the zh tag only prolongs this
for the content that will be tagged in the future. What we need is to
understand what the historic tags mean and interpret them for what they
stand for. This interpretation is best done by computers.

As long as you expect people to be able to tag manually, you would in my
opinion do best to cut this Gordian knot. This leaves you with ends of rope
that you can call "cmn", "yue", et al. When you are sure that people use
software that hides the issues involved and presents the choices in natural
language, then and only then does it not matter what the codes say; they
could even be a binary code.
Thanks,
     Gerard


On Sun, May 25, 2008 at 11:09 AM, Leif Halvard Silli <lhs@malform.no> wrote:

> Mark Davis 2008-05-24 18.43:
> > Let's suppose that content tagged with "fr", "gsw", and
> > "yue" (or "zh-yue") are available, and nothing else. Let's suppose also
> > that I understand German, French, and Mandarin, but not any of the
> > following.
> >
>
> > No Extlang Model
> >
> >    - If I specify "zh cmn fr", I'll get "fr" under either model. Fine, I
> >    can understand "fr".
> >    - If I wanted to also request Cantonese as well as Mandarin, I would
> >    use "yue cmn zh fr" and get "yue"
> >    - I only get "yue" if I request it. That's good, because I don't
> >    understand "yue".
> >    - I need to include "zh", and will for the indefinite future because
> >    the vast majority of Mandarin content is tagged that way.
> >    - I might get content tagged "zh" but which is actually "yue" but the
> >    chances of that are extremely remote.
> >
>
> A fundamental question:  No matter how "extremely remote" the "yue"
> content which is tagged as "zh" is, why do you place "zh" first? Where
> is the benefit in having 'zh cmn' instead of 'cmn zh'?  If you are
> against Extlang, why are you rejecting the benefit of not having Extlang
> by placing 'zh' first?
>
> > Extlang Model
> >
> >    - If I specify "zh zh-cmn fr", I would get "zh-yue". Broken, I don't
> >    understand Cantonese!
> >
>
> Ok, so you want to discuss your 0.00001% of Chinese resources which are
> in Cantonese ...
>
>    * Firstly, you should not need to say 'zh' when you have said
>      'zh-cmn', since Web servers will see that zh-cmn as a partial
>      match for 'zh'. So I disagree that you really need to include 'zh'
>      here.
>    * Secondly, as I understand it, a Mandarin user very likely
>      understands some Cantonese. It all depends on the kind of
>      Cantonese - if we are to believe what was mentioned on this list
>      earlier.  Does the unlikelihood that the Mandarin user could meet a
>      Cantonese page he/she was completely unable to comprehend really
>      justify "going through hoops"?
>          o Thus I feel that this need for a Mandarin speaker to avoid
>            Cantonese at all cost is not a real issue for a native
>            Chinese speaker. I.e. you describe an edge case.
>
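
To make the scenarios quoted above concrete, here is a minimal Python
sketch. It assumes the server walks the priority list in order and uses
plain prefix matching (a range matches a tag if they are equal, or if the
range is a prefix of the tag followed by "-", as in RFC 4647 basic
filtering). The function names are hypothetical, and real servers may
match differently.

```python
def matches(language_range, tag):
    """Prefix match: equal, or the range is a prefix of the tag followed by '-'."""
    r, t = language_range.lower(), tag.lower()
    return t == r or t.startswith(r + "-")


def first_match(priority_list, available):
    """Return the first available tag matched by the earliest range, else None."""
    for language_range in priority_list.split():
        for tag in available:
            if matches(language_range, tag):
                return tag
    return None


# Without extlang, Cantonese content is tagged "yue".
print(first_match("zh cmn fr", ["fr", "gsw", "yue"]))        # fr
print(first_match("yue cmn zh fr", ["fr", "gsw", "yue"]))    # yue

# With extlang, the same content is tagged "zh-yue", so "zh" matches it.
print(first_match("zh zh-cmn fr", ["fr", "gsw", "zh-yue"]))  # zh-yue
```

Under these assumptions the bare "zh" range is what pulls in "zh-yue";
dropping it from the list, or moving it after "fr", would return "fr".
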
> >    - In order to specify that I want "zh" or "cmn" but no other
> >    languages, I have to use "zh-cjy;q=0, zh-cpx;q=0, zh-czh;q=0,
> >    zh-czo;q=0, zh-gan;q=0, zh-hak;q=0, zh-hsn;q=0, zh-mnp;q=0,
> >    zh-nan;q=0, zh-wuu;q=0, zh-yue;q=0 zh zh-cmn fr".
> >
>
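
A rough sketch of how such q=0 exclusions could be evaluated, assuming
the HTTP Accept-Language rule that the quality of a tag is the q-value of
the longest range that matches it (RFC 2616, section 14.4). The header
below is a shortened, comma-separated variant of the quoted list, and the
function names are hypothetical; the "*" wildcard is not handled.

```python
def parse_accept_language(header):
    """Parse 'range;q=0.5, range2' into (range, q) pairs; q defaults to 1.0."""
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        parts = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            if param.lower().startswith("q="):
                q = float(param[2:])
        ranges.append((parts[0].lower(), q))
    return ranges


def quality(ranges, tag):
    """q of the longest range that prefix-matches tag; 0.0 if none matches."""
    tag = tag.lower()
    best_len, best_q = -1, 0.0
    for language_range, q in ranges:
        if tag == language_range or tag.startswith(language_range + "-"):
            if len(language_range) > best_len:
                best_len, best_q = len(language_range), q
    return best_q


# Assumption: only two of the exclusions are listed, to keep the example short.
ranges = parse_accept_language("zh-yue;q=0, zh-wuu;q=0, zh, zh-cmn, fr")
for tag in ["zh", "zh-cmn", "zh-yue", "zh-gan", "fr"]:
    print(tag, quality(ranges, tag))
```

In this sketch "zh-yue" gets quality 0.0 and is rejected, while "zh-gan",
which is not excluded explicitly, is still accepted through the bare "zh"
range. That is why the exclusion list has to name every encompassed
language.
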
> A simpler solution, for laymen, would be this order of preference:
> "zh-cmn fr zh-yue zh-cjy zh-cpx zh-czh zh-czo zh-gan zh-hak zh-hsn
> zh-mnp zh-nan zh-wuu"
>
> In other words, to completely control the languages of a macrolanguage,
> you must list them all.
>
> This is the benefit, and downside, of macrolanguages.
>
> >    - Moreover, as you said, if another encompassed language shows up for
> >    zh, I could get that inadvertently unless I add this to my list.
> >    - Again, I need to include "zh", and will for the indefinite future
> >    because the vast majority of Mandarin content is tagged that way.
> >
>
> All users of the Chinese macrolanguage will, eventually, need to include
> 'zh'.
> --
> leif halvard silli
> _______________________________________________
> Ltru mailing list
> Ltru@ietf.org
> https://www.ietf.org/mailman/listinfo/ltru
>