Re: [Ltru] extlang

"Randy Presuhn" <randy_presuhn@mindspring.com> Tue, 28 August 2007 22:30 UTC

From: Randy Presuhn <randy_presuhn@mindspring.com>
To: LTRU Working Group <ltru@ietf.org>
References: <30b660a20708281459r6000d746qe007f2882fae6d73@mail.gmail.com>
Subject: Re: [Ltru] extlang
Date: Tue, 28 Aug 2007 15:33:23 -0700

Hi -

As a technical contributor...

> From: "Mark Davis" <mark.davis@icu-project.org>
> To: "John Cowan" <cowan@ccil.org>
> Cc: "LTRU Working Group" <ltru@ietf.org>
> Sent: Tuesday, August 28, 2007 2:59 PM
> Subject: [Ltru] extlang
...
> You are not revealing some important hidden assumptions in your statements.
> 
>    1. The macrolanguage is always a better fallback for every encompassed
>    language than other alternatives. Out of the many encompassed languages, you
>    are implying that a speaker of every encompassed language will be able to
>    understand the macrolanguage, or at least better than the alternatives.

I don't see how the use of extlang would require this as an assumption.
Making such an assumption seems a bit like assuming that all languages whose
tags begin in "a" are somehow related.

>    2. People don't lose anything by having the fallback. I dispute this
>    as well. As previously noted:
>    1. Truncation fallback from zh-cmn-Hant-SG to "zh" loses the Hant and
>       the SG; falling back from ar-arb-SA to 'ar' loses the "SA".

It is the nature of truncation fallback to lose information.  No matter
what order the subtags are trimmed off, someone will be able to argue
that for some particular case, a different order might have been better.
This isn't an argument against extlang; it's an argument against unrealistically
high expectations for truncation fallback.
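To make the information loss concrete, here is a minimal sketch of right-to-left truncation fallback in the style of RFC 4647 lookup. The function names `truncation_chain` and `lookup` are hypothetical, chosen for illustration only; this is not code from any deployed matcher.

```python
def truncation_chain(tag):
    """Yield the tag, then each shorter prefix, trimming subtags right to left."""
    subtags = tag.split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()
        # RFC 4647 lookup also drops a trailing single-character subtag
        # (for example "x") together with the subtag that followed it.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()


def lookup(tag, available):
    """Return the first available tag reached by truncation, or None."""
    by_lower = {a.lower(): a for a in available}  # tags compare case-insensitively
    for candidate in truncation_chain(tag):
        hit = by_lower.get(candidate.lower())
        if hit is not None:
            return hit
    return None


if __name__ == "__main__":
    print(list(truncation_chain("zh-cmn-Hant-SG")))
    print(lookup("ar-arb-SA", ["ar", "en"]))
```

Each step discards the rightmost subtag, so by the time "zh-cmn-Hant-SG" reaches "zh" the script and region are gone regardless of whether an extlang subtag was present.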

>       2. It introduces ambiguous language names. Right now, in the
>       overwhelming majority of practice, standard Arabic is "ar"; after the
>       change, standard Arabic is "ar-arb".

This would not be desirable.  However, I wonder whether the semantics that most
taggers and users of tags associate with "ar" is "standard Arabic" or merely "Arabic".

>    3. People can't get along without this.
>    1. Anyone who has to deal with language issues on all but a trivial
>       level must already have a mechanism to deal with sh, sr, hr; with no, nb,
>       nn. Those are macrolanguages and encompassed languages. They exist
>       right now WITHOUT an extlang mechanism, and people deal with them. The
>       proposed mechanism won't handle these, and anyone who can handle these
>       doesn't need extlang.

I think this argument is flawed in that it neglects the cost of supporting
such constellations of languages.  There are already some messes that we're
stuck with, and that have to be handled in an ad hoc manner.  This doesn't
justify requiring ad hoc handling for every other such constellation of
languages.

>       2. With the Macrolanguage field, there is sufficient information
>       for *anyone* who wants to implement extlang-like fallback (including
>       for sh or no), *without* encumbering the IDs with superfluous information.

This would be a compelling argument, if fallback were the sole reason for extlang.
However, fallback is not the sole motivation for extlangs; they are also
of use to taggers with incomplete knowledge of the languages used in the
materials they are tagging.  The library staff in my home town would be
doing well if they correctly recognized material as "zh" or "ar".  It would
be quite unrealistic to expect them to be any more precise.

Of course we'd all like everyone who has to tag material to have perfect
knowledge of the languages involved so that tags with sufficient precision
and accuracy would be used.  But we also know that in reality people work
with incomplete knowledge.  Consequently, I think we should allow people
who by necessity tag with low precision to nonetheless do so accurately.
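The fallback half of this exchange can be sketched in a few lines. The example below assumes a small hand-written macrolanguage table (it is not read from the registry) and a hypothetical helper named `fallback_chain`. It shows that a bare "cmn" tag reaches "zh" only when the matcher consults such a table, while "zh-cmn" reaches "zh" by plain truncation.

```python
# Illustrative subset of encompassed-language -> macrolanguage relations.
# Hand-written for this sketch; a real implementation would load the
# Macrolanguage field from the language subtag registry.
MACROLANGUAGE = {"cmn": "zh", "yue": "zh", "arb": "ar", "nb": "no", "nn": "no"}


def fallback_chain(tag, macro=MACROLANGUAGE):
    """Truncate right to left, then append the macrolanguage if one is known."""
    subtags = tag.split("-")
    chain = []
    while subtags:
        chain.append("-".join(subtags))
        subtags.pop()
    # Only the primary language subtag is looked up in the table.
    parent = macro.get(tag.split("-")[0].lower())
    if parent is not None:
        chain.append(parent)
    return chain


if __name__ == "__main__":
    print(fallback_chain("cmn-Hant-SG"))     # needs the table to reach "zh"
    print(fallback_chain("zh-cmn-Hant-SG"))  # reaches "zh" by truncation alone
```

Either route lets a request for Mandarin find material that a tagger could only identify as "zh"; the difference is whether the relationship lives in the tag or in data the matcher must carry.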

> > >    2. it is sufficiently better to warrant making the language tags more
> > >    complicated by the addition of this mechanism.
> >
> > Language tags become more complicated *if* it is desired to make them
> > so.  Those who find "zh" sufficient may continue to use it while still
> > interoperating with "zh-cmn", "zh-yue", and so on.  Existing deployed
> > matchers will continue to work, as will existing deployed software
> > that understands specific tags; they will not need to become more
> > complicated to understand the out-of-band relationship between "zh",
> > "cmn", "yue", etc.
> 
> 
> This is untrue. As soon as we implemented extlang in prototype, we ran into
> the problems listed above. It *didn't* work out of the box.

I'm missing something.  Precisely what scenario was it that was expected to
work that did not?

Randy


