Re: [Ltru] Consensus call: extlang
"Mark Davis" <mark.davis@icu-project.org> Fri, 30 May 2008 16:13 UTC
From: Mark Davis <mark.davis@icu-project.org>
To: "Broome, Karen" <Karen_Broome@spe.sony.com>
Cc: LTRU Working Group <ltru@ietf.org>
Subject: Re: [Ltru] Consensus call: extlang

Peter's addressed some of the questions.

Back to your original question. For backward compatibility, we'll continue to represent Mandarin as "zh", Standard Arabic as "ar", and so on. Note that this is independent of whether extlang is used or not. That is, if extlang exists, we'll treat incoming "zh-cmn" as if it were "zh"; if it doesn't, we'll treat "cmn" as if it were "zh". And under either scenario it is conformant to tag Mandarin as 'zh'.

Why? Although 639-3 now specifies clearly that "de" means (for example) just Standard German, while "zh" means any Chinese, this clarity of specification was not present earlier. The code "zh" has been used in the past for Mandarin, overwhelmingly so; not just 99% or 99.9%, but many 9's. As you said, the tendency was to use illegal (or private-use) codes for non-Mandarin content. All of our internal software and any external software that we talk to will expect Mandarin to be tagged as 'zh' for the foreseeable future. Of course, we recognize that others may end up using 'zh-cmn' / 'cmn', so we're prepared to deal with that.

Note also that really the whole premise of extlang is that 'zh' continues to normally map to Mandarin. After all, if 'zh' really meant that you were as likely to get Gan or Hakka as Mandarin, then having "zh-yue" in order to get some kind of automatic fallback wouldn't make any sense.

Other comments below.

On Thu, May 29, 2008 at 6:04 PM, Broome, Karen <Karen_Broome@spe.sony.com> wrote:

> Mark,
>
> One thing I think you aren't acknowledging is that "treat as synonyms"
> means something very different to the vast numbers of content creators who
> use this standard than it does the handful of search engines that use the
> fuzzy logic associated with companion standards. As you note in your
> document, "It is clear that companies like Google or Yahoo can work around
> the problems with extlang." How many other users need and can afford to
> implement the extended fallback and filtering logic?
> Enough that this logic should be the primary driver behind the chosen
> solution?
>
> Before I spend too much time picking apart your lengthy screed involving a
> scenario where the BBC presents its web site in Sudanese Creole Arabic with
> rotating languages code logic for each day of the week ... (ahem) ... here's
> my real-world Chinese language list:
>
> Chinese (Variant Unknown)
> Chinese (Cantonese, Spoken)
> Chinese (Cantonese, Written)
> Chinese (Mandarin, Spoken)
> Chinese (Mandarin, Spoken Taiwanese)
> Chinese (Mandarin, Simplified)
> Chinese (Mandarin, Traditional)
> Chinese (Taiwanese, Spoken)
> Chinese (Taiwanese, Written)

Sorry you consider it a screed. I realize that the emails have sometimes gotten heated -- email really is a poor substitute for audio discussions in controversial issues; I've seen many, many issues in Unicode and other standards flare for months in email, and be resolved in a few hours of discussion.

My real point is that if a query for 'ar' really means "give me any kind of Arabic", then a query for 'ar' would be almost meaningless, since it could return any of a number of mutually incomprehensible alternatives. Although 639-3 now defines it to be "any Arabic", in practice what users will expect to get back is Standard Arabic, and they would be unpleasantly surprised to get back other varieties. And our purpose should be to avoid our users' getting unpleasant surprises.

>
> (Apologies, this is hard to represent in ASCII. I have a mini-spreadsheet
> if someone wants it.)
>
>      1            2         3        4
> a.   zh           zh        zh       zh
> b.   zh-yue       yue       yue      yue
> c.   zh-yue       yue       yue      yue
> d.   zh-cmn       cmn       zh       cmn
> e.   zh-cmn-TW    cmn-TW    zh-TW    cmn-TW
> f.   zh-cmn-Hans  cmn-Hans  zh-Hans  zh-Hans
> g.   zh-cmn-Hant  cmn-Hant  zh-Hant  zh-Hant
> h.   zh-min-nan   nan       nan      nan
> i.   zh-min-nan   nan       nan      nan

above modified slightly to add row references.

>
> * Option #1 (RFC 4646) contains the codes as I have them today.

Note that this is not actually RFC4646 conformant: zh-cmn-TW is not valid.

>
> * Option #2 (RFC 4646bis) contains the codes if I choose to go against the
> grain and use "cmn".
> * Option #3 (RFC 4646bis) treats "zh" and "cmn" as synonyms; avoids using
> "cmn" for compatibility.
> * Option #4 (RFC 4646bis) contains the codes "cmn" for spoken context
> (where distinction is essential) and "zh" for written context.
>
> Comments:
>
> * Option #1 is unambiguous and shows that there is a relationship between
> these languages. It also preserves the legacy "zh" tag so developers that
> aren't hip to later versions of BCP 47 or 639-3 will have some idea what
> these tags mean. The tags are maybe longer than they need to be, but if I
> need a fixed-length tag, I can wait for 639-6. The languages may not be
> mutually intelligible in some contexts, but they are related.
>
> * Option #2 is unambiguous, but Microsoft, Google, and Amazon won't be
> using the same tags for Chinese that I do. Even if I don't follow their
> lead, others likely will. This worries me. Also, the rules for #2 must
> include fuzzy guidelines such as, "use the 'zh' tag except when you think
> it's a bad idea" and "use the shortest tag except when you don't want to."
> This presents complications in trying to explain some sort of consistent
> method to the LTRU madness to others. Given this, I start to wish ISO 639-6
> a safe and speedy passage.
>
> * Option #3 is what I believe you might suggest, but for me, that's the
> worst list of all. There are five ambiguous "zh" categories on that list. It
> follows the "always use the shortest tag" rule and respects history, but
> it's useless to me from an identification perspective.

Your list is already ambiguous for columns 1 and 2; you are using "yue" for two different things (written and spoken). The only change it really makes is that you don't have a term for "any chinese".
RFC 4646 lacks terms for many, many combinations of things: a term for "any German" (including de, gsw, ...), "any French", "any Scandinavian", or any one of the countless other possible sets of languages that people consider to be important for some particular purpose. That's why lists of languages are really the appropriate vehicle.

>
> * Option #4 has three ambiguous tags and means I have to explain to people
> who aren't in this industry about why I use different tags for the same
> language. This strategy is less ambiguous than #3, but I'm not sure I can
> explain it to other content creators for the same reasons as #2 and presents
> the spoken/written complication others may not want. In the long run, this
> seems messy and unclear enough that it will result in bad tagging.
>
> * Options #2,3,4: In general, it worries me that RFC 4646bis offers so many
> "preferred" options for the same thing. I really can't see how this
> simplifies things for anyone.
>
> I don't have a need for fuzzy fallback scenarios. I need precise tags and
> mostly simple lookup. I think if you take the fallback scenarios and
> absurdities out of the document you reference, I don't think there's much
> left.

The only purpose I have heard for extlang *is* for fallback; that's why the document goes into (painful) depth on that topic. For identification alone, "zh" and "zh-cmn" really mean just the same thing. It is only in the context of matching (filtering and lookup) that they differ in semantics *because of their behavior*: where "cmn" means simply Mandarin, "zh-cmn" effectively means "Mandarin, but fall back to any Chinese".
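
As a rough illustration of that behavioral difference, here is a minimal sketch, assuming RFC 4647-style lookup by progressive truncation of the language range. The function name `lookup` and the sample content list are hypothetical, chosen only for this example; this is not any particular implementation's matching code.

```python
def lookup(language_range, available, default=None):
    """Simplified lookup: strip trailing subtags until a tag matches.

    Illustrative only; `available` is a hypothetical list of content tags.
    """
    available_by_key = {tag.lower(): tag for tag in available}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in available_by_key:
            return available_by_key[candidate]
        subtags.pop()
        # A single-character subtag (such as "x") left at the end is dropped too.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


content = ["zh", "en"]            # hypothetical set of available content tags
print(lookup("zh-cmn", content))  # truncates to "zh" and matches
print(lookup("cmn", content))     # nothing to truncate to, so no match
```

With extlang, the "zh" prefix is what lets the request reach "zh" content by truncation; with a bare "cmn", the matcher would need an explicit equivalence (or a list of languages in the request) to get the same result.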

>
> Regards,
>
> Karen Broome
>
> >-----Original Message-----
> >From: ltru-bounces@ietf.org [mailto:ltru-bounces@ietf.org] On Behalf
> >Of Mark Davis
> >Sent: Thursday, May 29, 2008 4:00 PM
> >To: debbie@ictmarketing.co.uk
> >Cc: LTRU Working Group
> >Subject: Re: [Ltru] Consensus call: extlang
> >
> >What would be useful is to hear from the extlangistas what their
> >concerns are specifically; many have not given reasons for favoring
> >encompassed languages into extlang instead of into the primary
> >language subtag. It would be useful for them to give the scenarios
> >where they think extlang is an improvement. It would be useful to
> >find out why they think the scenarios such as in
> >http://docs.google.com/Doc?docid=dfqr8rd5_676kxxxjhd&hl=en are not a
> >problem.
> >
> >Clearly people think that using the extlang model solves more
> >problems than it causes, so it would be useful to examine specific
> >cases and see if that is, in fact, true.
> >
> >
> >Mark

--
Mark