Re: extlang (was Re: Suggested language for "mis" (Re: [Ltru] RE: ISO 639-2 decision: "mis"))
John Cowan <cowan@ccil.org> Wed, 20 June 2007 17:15 UTC
Date: Wed, 20 Jun 2007 13:15:02 -0400
To: Mark Davis <mark.davis@icu-project.org>
Subject: Re: extlang (was Re: Suggested language for "mis" (Re: [Ltru] RE: ISO 639-2 decision: "mis"))
From: John Cowan <cowan@ccil.org>
Cc: LTRU Working Group <ltru@ietf.org>, Doug Ewell <dewell@roadrunner.com>

Mark Davis scripsit:

> My point was that anyone who wants to deal with macro languages, has to
> already deal with sr, hr, etc. as primary subtags, not as secondary.

It's not a MUST, it's a MAY.  Matchers are allowed to take into account
any information outside the purely syntactic match algorithms that they
happen to have -- and find useful.  For example, a fallback from Scots
to English probably makes sense everywhere -- if you know Scots, you
know English.  On the other hand, a fallback from Swedish to Finnish
would be a Really Bad Idea, except within Finland, where it might make
all the sense in the world.  (Falling back from English to Scots or
Finnish to Swedish essentially never makes sense.)

So while there is no reason to prevent such nonsyntactic matching, and
4647 explicitly licenses it, there is no reason to require that every
matcher use it either.  The point of the extlang tags, like their
currently-grandfathered predecessors, is to make additional matches
easy.

> 1. The reason for making microlanguages be extlang instead of primary
> sublanguages is so that truncation-style matching will have better
> results.

Agreed.

> 2. Fallback works when there is mutual comprehensibility (not
> necessarily 100%, but to a high degree); if you fallback to something
> that is not comprehensible, then fallback has failed.

Two caveats: (a) fallback is one-way, so *mutual* comprehensibility is
not required; (b) Mohawk and English aren't comprehensible at all,
mutually or asymmetrically, but it so happens that the few hundred
Mohawk-speakers know English too.

Also, fallback is a matter of best effort; it does not have to work in
every case to be useful in many cases.  In particular, a failed fallback
leaves you no worse off than before.

> Option A.
>
> 1. Thus for extlang to work for microlanguages, the speakers of any
> microlanguages sharing a macrolanguage need to be able to understand
> the speakers of any other microlanguages sharing that macrolanguage.

Not necessarily all, though the more the better, of course.

> 2. Peter and the ISO JAC can verify that A1 is true; that every
> speaker of Hakka can understand Jinyu; every speaker of Shihhi Arabic
> can understand Cypriot Arabic; and so on).

Of course that's not the case, as you must know.  Even if true in any
particular case, "every speaker" is an inappropriate standard.  If there
are one or two ancient Mohawks who don't have sufficient command of
English, it can't be helped.

> 3. Everything is hunky-dory.

"Hunky-dory" is the wrong standard.

> Option B.

This doesn't differ much from Option A, except that it changes the
semantics of macrolanguages in a way inconsistent with 639-3:

> 1. The macrolanguage alone is always assumed to be the "standard",

Otherwise my comments above apply.

> After all, it is trivial to make a 4647bis that adds an optional step
> for microlanguages, which is that when you get to a microlanguage,
> the next step is to look at its macrolanguage before falling back to
> the default. That has the same result (and same problems) as extlang,
> but is something that is not baked into the standard -- is something
> that people can implement if they want without impacting matching for
> everyone else.

True enough, but it disregards the behavior of the large number of naive
matchers already out there that lack nonsyntactic information.  It also
disregards the well-chosen grandfathered tags (which will become for the
most part redundant in 4646bis along the current lines), which were
picked precisely because they worked tolerably, if not perfectly, with
naive matchers.  This is the primary argument for using extlangs: they
are a conservative extension of what has already been done and what tags
have already been assigned to documents.

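To make the contrast concrete, here is a minimal sketch of the two behaviors under discussion: a naive, purely syntactic truncation lookup in the style of RFC 4647 Section 3.4, and the optional macrolanguage step suggested in the quoted text.  The names truncations, lookup and macro_of are hypothetical, and the tags zh-hak and hak are chosen only for illustration (Hakka is mentioned above); this is not any particular matcher's implementation.

```python
def truncations(tag):
    """Yield the tag, then progressively shorter prefixes of it."""
    subtags = tag.lower().split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()
        # RFC 4647 lookup also drops a single-character subtag (such as
        # "x") that is left at the end after a truncation.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()


def lookup(requested, available, default=None, macro_of=None):
    """Return the best available tag for `requested`, else `default`.

    `macro_of` is an assumed, optional table mapping an individual
    language subtag to its macrolanguage; a naive matcher has none.
    """
    by_lower = {tag.lower(): tag for tag in available}
    # Purely syntactic step: truncate from the right until something matches.
    for candidate in truncations(requested):
        if candidate in by_lower:
            return by_lower[candidate]
    # Optional nonsyntactic step: try the macrolanguage before the default.
    if macro_of:
        primary = requested.lower().split("-")[0]
        macro = macro_of.get(primary)
        if macro is not None and macro.lower() in by_lower:
            return by_lower[macro.lower()]
    return default


available = ["zh", "en"]
print(lookup("zh-hak", available, default="en"))
print(lookup("hak", available, default="en"))
print(lookup("hak", available, default="en", macro_of={"hak": "zh"}))
```

With the extlang form, the naive matcher reaches zh by truncation alone.  With the bare primary subtag it drops straight to the default unless it happens to carry the extra table, which is the situation of the naive matchers described above.
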
Your argument also proves too much: we find it useful to fall back from
az-Cyrl, az-Latn, and az-Arab to plain az (or vice versa for filtering)
without requiring that these be mutually intercomprehensible (your
Option A) or that one of them is the standard (your Option B).  The
results may be less than ideal, but we live with it.

> We and everyone else already have to deal with equivalences with
> grandfathered and irregular tags anyway; these are not a real problem.

Again, everyone else does not *have to* deal with them; they MAY treat
them just like all other tags, at the expense of missing some reasonable
matches.  4647 makes such matches entirely optional.

> I disagree strongly. If you can't make a compelling case that extlang
> will make BCP 47 better instead of worse, and won't even look at the
> reasons not to do it, nor even bother to set out a case for it, then
> why should we add it?

It's true, as Dr. Johnson said: "Of an opinion which is no longer
doubted, the evidence ceases to be examined."  I've done my best above.
But I don't understand yet how extlangs make BCP 47 worse, and I think
they actually help given legacy documents and legacy matchers.

-- 
John Cowan  cowan@ccil.org  http://www.ccil.org/~cowan
Is it not written, "That which is written, is written"?

_______________________________________________
Ltru mailing list
Ltru@ietf.org
https://www1.ietf.org/mailman/listinfo/ltru