Re: [Ltru] Re: Macrolanguage and extlang
"Doug Ewell" <dewell@roadrunner.com> Wed, 18 July 2007 05:41 UTC
Message-ID: <009d01c7c8fe$43f0ee60$6a01a8c0@DGBP7M81>
From: Doug Ewell <dewell@roadrunner.com>
To: LTRU Working Group <ltru@ietf.org>
References: <E1I9ghp-0006L9-0h@megatron.ietf.org> <013b01c7c6a8$55cb4a20$6401a8c0@DGBP7M81> <20070715152301.GY9402@mercury.ccil.org> <30b660a20707151612k14b1e578q7cc7887c68ccc785@mail.gmail.com> <00d701c7c738$841e6930$6a01a8c0@DGBP7M81> <30b660a20707171219q4c824654h7ad9063f23ba26ad@mail.gmail.com>
Subject: Re: [Ltru] Re: Macrolanguage and extlang
Date: Tue, 17 Jul 2007 22:41:18 -0700
Mark Davis wrote:

> zh = Mandarin since that is what everyone means by "zh" currently.

Hold that thought.

> Scenario 1. The user's browser has the proposed "zh-yue-Hant-US". My
> lookup falls back to zh, so I serve it up to the user. So even if the
> target of the match (zh) is not Cantonese, you want a fallback to zh.
> I'm guessing that you see this as better than if we defined the tag as
> "yue-Hant-US", since it gets to some fallback that the user is likely
> to understand. But I don't see this as much different than if we had
> fr-br-BE (meaning Breton, but fall back to French), or ro-mo (meaning
> Moldavian, but falling back to Romanian).

You're right: providing a fallback from encompassed languages back to
their macrolanguages (as defined by Ethnologue) doesn't extend to all
other imaginable fallback scenarios. I'm pretty sure nobody ever
claimed that it would.

> And note that in the fallback, the script and region are completely
> lost.
>
> Scenario 2. The user's browser has zh-cmn-Hant-US. In matching, we
> fall back to zh. Note than in the fallback, the script and region are
> completely lost.

Both of these scenarios assume that script and/or region subtags are
likely to be present. You haven't addressed my scenario, where they
are not.

> We have essentially just introduced a synonym for zh which causes
> fallback to lose information, for no good reason.

But according to your earlier statement, "cmn" is already effectively
a synonym for "zh".

> The problem with extlang is that the fallback from encompassed
> language to macrolanguage is fundamentally different in kind than a
> fallback from region to script to base language. In the case of
> script, like uz-Arab and uz-Latn, or en-US vs en-GB, we really have
> variations on the same language, and fallback makes sense. We ordered
> the subtags so that it works optimally overall.

Agreed.

> The encompassed languages, on the other hand, are not just dialects,
> not just variants. They are languages in their own right.

Agreed to a certain point. The whole idea of encompassed languages is
that people sometimes consider them to be languages in their own
right, and sometimes as "dialects" or "variants" of a macrolanguage.

I hate to keep harping on Chinese, but try searching on "Chinese
dialects" and "Chinese languages" and see which search is more likely
to tell you more about Mandarin vs. Cantonese vs. Wu vs. Hakka. There
are a *lot* of people who think of these as variants of a single
language. And the same is likely to be true for Standard Arabic vs.
Algerian Arabic vs. Libyan Arabic vs. Uzbeki Arabic.

> Trying to insert them into the fallback process just screws things up,
> because they need a "sideways" matching not just simple truncation
> fallback. If you want to do any fallback with extlang, it would be to
> fall back from zh-yue-<other stuff> to zh-<other stuff>. That means
> that in order to do reasonable fallback, you can't just use truncation
> fallback anyway.

Agreed. Then again, I'm not the biggest fan of truncation fallback.

> So I see the situation this way:
>
> * The only reason for adding the complication of the extlang mechanism
> is to make truncation fallback work better.
>
> * Truncation fallback with extlang doesn't work better.

I claim it does work better if there are no script or region or
variant subtags, and is no worse even if there are.

> So there is no need to make encompassed languages be "secondary"
> languages by making them be "secondary" subtags.

I do agree with the possible public perception that extended languages
are second-class in some way, although of course that is not the
intent.

> So instead of adding the extlang mechanism to RFC 4646, what we really
> need to do is to point people to how to handle yue and other
> encompassed languages along with mo/ro, tl/fil, and other edge cases
> in a reasonable way, by augmenting matching.

Well, you'll never get me to disagree that matching should be more
sophisticated than RFR truncation. But I hope you're not suggesting
that we add an RFC 4647bis to our plate at this late date. Look how
much time we burned on RFC 4647, and look at the end result: we're
still designing everything under the assumption that matching engines
will be limited to RFR.

--
Doug Ewell * Fullerton, California, USA * RFC 4645 * UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages

_______________________________________________
Ltru mailing list
Ltru@ietf.org
https://www1.ietf.org/mailman/listinfo/ltru
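
For readers following the thread, the two fallback behaviours argued over in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything proposed on the list: the function names (truncation_fallback, lookup, drop_extlang) are hypothetical, the truncation loop is modelled loosely on the RFC 4647 lookup scheme, and the "sideways" step assumes an extlang is a three-letter alphabetic subtag directly following the primary language subtag, as in the proposed "zh-yue-Hant-US".

```python
def truncation_fallback(tag):
    """Yield the tag, then each shorter prefix, removing subtags from the right."""
    subtags = tag.split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()
        # A single-letter subtag (e.g. "x") left at the end is dropped as well,
        # as in the RFC 4647 lookup scheme.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()


def lookup(tag, available, default=None):
    """Return the first available tag reached by truncation, else default."""
    by_lower = {a.lower(): a for a in available}
    for candidate in truncation_fallback(tag):
        hit = by_lower.get(candidate.lower())
        if hit is not None:
            return hit
    return default


def drop_extlang(tag):
    """Hypothetical 'sideways' step: zh-yue-<other stuff> -> zh-<other stuff>.

    Assumption: the second subtag is an extlang if it is exactly three letters.
    """
    subtags = tag.split("-")
    if len(subtags) >= 2 and len(subtags[1]) == 3 and subtags[1].isalpha():
        return "-".join(subtags[:1] + subtags[2:])
    return tag


if __name__ == "__main__":
    content = ["zh-Hant", "zh", "en"]
    # Plain truncation skips zh-Hant: script and region are lost.
    print(lookup("zh-yue-Hant-US", content))
    # Dropping the extlang first keeps the script subtag in play.
    print(lookup(drop_extlang("zh-yue-Hant-US"), content))
    # With no script or region present, truncation alone reaches zh.
    print(lookup("zh-yue", content))
    # A bare "yue" tag has no truncation path to zh at all.
    print(lookup("yue-Hant-US", content))
```

The last two calls correspond to the case made in the message: without script or region subtags, truncation of an extlang tag reaches the macrolanguage, whereas a standalone "yue" tag does not unless the matcher is augmented.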