[Ltru] Re: extlang
John Cowan <cowan@ccil.org> Tue, 28 August 2007 22:35 UTC
Date: Tue, 28 Aug 2007 18:35:36 -0400
To: Mark Davis <mark.davis@icu-project.org>
Message-ID: <20070828223536.GB31670@mercury.ccil.org>
References: <30b660a20708281459r6000d746qe007f2882fae6d73@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <30b660a20708281459r6000d746qe007f2882fae6d73@mail.gmail.com>
User-Agent: Mutt/1.5.13 (2006-08-11)
From: John Cowan <cowan@ccil.org>
Cc: LTRU Working Group <ltru@ietf.org>
Subject: [Ltru] Re: extlang
Mark Davis scripsit:

> Not single-handedly: Addison has also commented on it.

AFAIK he has acquiesced in, but not defended, your position.
(I may be wrong.)

> And on the other side, I've heard 3 strong proponents. So not exactly
> overwhelming either way.

FWIW, I have heard privately from several normally active listmembers
supporting my position.  I assume that you know I wouldn't lie about this.

> I've said before, and will say again: it was never a foregone conclusion.

I have never said so.  I do say that it is the status quo, and it's
a change to the status quo that requires defending.

> Do I really have to repeat this over and over????

Alas, we do seem to be looping.  I wish I saw the way ahead.

>    1. The macrolanguage is always a better fallback for every encompassed
>    language than other alternatives. Out of the many encompassed languages,
>    you are implying that a speaker of every encompassed language will be
>    able to understand the macrolanguage, or at least better than the
>    alternatives.

RFC 1766 already said that fallback doesn't always work, and sometimes
produces something unintelligible to the requester.

But I believe you are misusing the term "macrolanguage".  A macrolanguage
is not to be equated with a "main" or standardized variety.  Some
macrolanguages like Chinese and Arabic have such varieties, others
like Quechua and Nahuatl do not.  Rather, a macrolanguage is a group
of languages that *in some domain* is recognized as a single language.
By definition, if you speak Sudan Arabic, you are speaking something that
is part of the Arabic macrolanguage, even if you cannot speak Modern
Standard Arabic (MSA) at all.  Thus, while it may be empirically sound to
assume that anything (or at least any text) tagged "ar" is in MSA, it is
not certain.

>    1. Truncation fallback from zh-cmn-Hant-SG to "zh" loses the Hant and
>    the SG; falling back from ar-arb-SA to 'ar' loses the "SA".

So it does.

>    2. It introduces ambiguous language names. Right now, in the
>    overwhelming majority of practice, standard Arabic is "ar"; after the
>    change, standard Arabic is "ar-arb".

You propose in the alternative that "ar" and "arb" both be understood as
Modern Standard Arabic.  I submit that that is worse.

>    1. Anyone who has to deal with language issues on all but a trivial
>    level must already have a mechanism to deal with sh, sr, hr;
>    with no, nb, nn. Those are macrolanguages and encompassed
>    languages. They exist right now WITHOUT an extlang mechanism,
>    and people deal with them. The proposed mechanism won't handle
>    these, and anyone who can handle these doesn't need extlang.

True, but not everyone does.  As I have said before, the majority
of language-sensitive applications, I believe, deal only with
recognizing a short list of languages that they know what to do with,
and all others have the semantics of "unknown language".  It's perfectly
compliant just to look at the first subtag, decide if it is 'en', 'fr',
or 'ar', and throw away all other information; furthermore, I believe
this to be typical.  Most applications just don't involve processing
almost every document known to man.

I want this sort of application to continue to work without having to
be modified to add "arb" as a fourth alternative.

>    2. With the Macrolanguage field, there is sufficient information
>    for *anyone* who wants to to implement extlang-like fallback
>    (including for sh or no), *without* encumbering the IDs with
>    superfluous information.

I support the Macrolanguage: field for the uses which you mention.
Furthermore, I support extlangs *only* in the context of using newly
introduced 639-3 identifiers that are encompassed by a macrolanguage
registered in 639-2.  Thus, I do *not* support changes to language
tags based on shifting macrolanguage information as SIL learns more,
just the 350+ specific code elements of 639-3, and no more.

> This is untrue. As soon as we implemented extlang in prototype, we ran into
> the problems listed above. It *didn't* work out of the box.

I can well understand that in certain applications having an unusual
breadth of scope, both in the matter of documents and in the matter of
languages, it might well not.

-- 
Dream projects long deferred             John Cowan <cowan@ccil.org>
usually bite the wax tadpole.            http://www.ccil.org/~cowan
        --James Lileks

_______________________________________________
Ltru mailing list
Ltru@ietf.org
https://www1.ietf.org/mailman/listinfo/ltru
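
The two behaviours discussed in the message above, first-subtag recognition and truncation fallback, can be sketched roughly as follows. This is only an illustration in Python, not code from the thread: the names `classify`, `truncation_fallback`, and `KNOWN` are hypothetical, and the truncation shown is a simplified form that ignores the special handling of single-character subtags in RFC 4647 lookup.

```python
# Hypothetical short list of languages a simple application knows how to handle.
KNOWN = {"en", "fr", "ar"}


def classify(tag):
    """Look only at the first subtag; anything not on the short list is 'unknown'."""
    primary = tag.split("-")[0].lower()
    return primary if primary in KNOWN else "unknown"


def truncation_fallback(tag):
    """Return the tag followed by progressively shorter truncations of it.

    Simplified: subtags are dropped one at a time from the right.
    """
    parts = tag.split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]


if __name__ == "__main__":
    # With an extlang-style tag, the first subtag is still the macrolanguage.
    print(classify("ar-arb-SA"))   # ar
    # A bare 639-3 code is not on the short list of this application.
    print(classify("arb"))         # unknown
    # Each truncation step loses information such as script and region.
    print(truncation_fallback("zh-cmn-Hant-SG"))
```

Under these assumptions, an application of the kind described keeps recognizing "ar-arb-SA" as Arabic without modification, whereas a bare "arb" falls into the "unknown language" bucket unless "arb" is added to its list.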