[Ltru] Great Script Debate "the Next Generation"... (long)
Addison Phillips <addison@yahoo-inc.com> Thu, 05 October 2006 18:58 UTC
Message-ID: <452555BA.2040601@yahoo-inc.com>
Date: Thu, 05 Oct 2006 11:58:02 -0700
From: Addison Phillips <addison@yahoo-inc.com>
To: 'LTRU Working Group' <ltru@ietf.org>

All,

Taking up the gauntlet flung down by John Cowan :-), herewith my proposal for fixing Suppress-Scripts

----

First, some recap of the problem.

Suppress-Scripts are meant to identify languages that are written predominantly in a single script. This is to warn users and implementers not to form language tags using a script subtag that is usually redundant, for compatibility with tagging practices prior to RFC 4646.

S-S poses a number of interesting problems. I've cited previously the registration problems. Mainly the problem here is that most languages fit the pattern of "wanting" some form of S-S field. Even languages that might not have a clear relationship to a specific script subtag (cf. Doug's research on Korean) probably should not use a script subtag.

In particular, creating accurate values for even the ISO 639-1 and ISO 639-2 set of languages would require significant knowledge about the current and recent historical writing traditions of each given language, plus, possibly, some knowledge of public policy and/or potential suppression or abuse of minority tradition in regard to that language. S-S indicates that a given language is written predominantly in a specific script, so the burden of proof for a less common language might be very difficult to achieve, since the presence of many texts in a specific script does not "prove the negative", that is, that a significant body of texts or a specific writing tradition does not exist that uses a separate script.

The main alternative we've dealt with in the past would be an "Accept-Script" or "Recommend-Script" approach. That design, which was not adopted in RFC 4646, involves documenting the known cases in which a script subtag *should* be used, and, in effect, recommending that languages that do not have an "A-S" field not use a script subtag except when indicating a specific difference important within a given group of information items. A-S avoids the "proving the negative" problem of S-S.
Since it applies to a much smaller set of languages, it probably requires less registration overhead. The burden of proof may be just as difficult to achieve and is encumbered with essentially the same problems that attend S-S, since assertions about multiple script usage are just as potentially disruptive as assertions about the "single scriptness" of a language.

Removing script information from the registry altogether is appealing as an alternative. As Mark points out, script subtags are entirely voluntary and entirely valid. The informational nature of the S-S field is merely to help guide implementers and users to try and do the right things. If maintenance is a nightmare, why persist in maintaining somewhat fictional information?

I do think that guidance for users/implementers is a valid goal here. My experience as an "eminence grise" for language tags over the past couple of years is that the level of ill-informedness and mythology surrounding language tags is pretty deep. Anything we can do to help speed proper implementation of language tags will help.

The problem here is that I think we're miscasting the role of the script advisory field or fields in the registry.

If we only document the "do not use" case, users and implementers will remain ignorant of what to do for languages without the S-S field. Having explained the Chinese issue several times, it's clear to me that many implementers will not stumble over the right subtags by accident... and certainly not for languages such as Serbian, Uzbek, or Azerbaijani.

If we only document the "do use" cases, though, users and implementers may not notice the warnings against use of scripts elsewhere, leading to the problem we initially sought to prevent.

Thus, my proposal for solving the problem:

1. Include the strongest possible warnings about not using script subtags in 4646bis. This is probably embodied in Karen Broome's suggested texts.

2. Replace "Suppress-Script" with "Script".
If no script field is supplied, language tags/ranges should still not use a script subtag unless one is warranted by the information item or request. If the script field is present and contains a single item, the language is known to use that script predominantly. If two or more items are present, the language is commonly written in more than one script. Users are advised *not* to use a script subtag unless the language has more than one item in the Script field.

Potential issues:

1. Zero or one script subtags in the field have the same behavior. Acquiring a second script, however, requires extra scrutiny by ietf-languages because it changes the potential default behavior for tag formation for that language.

Reactions?

--
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.
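
The proposed "Script" field rule above can be sketched in code. This is only a minimal illustration under stated assumptions, not part of the proposal: the `SCRIPT_FIELD` table and its sample values, the function names `script_recommended` and `form_tag`, and the `warranted` flag (standing in for "warranted by the information item or request") are all hypothetical.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical sample of the proposed "Script" field, keyed by language subtag.
# Illustrative values only; these are not actual registry contents.
SCRIPT_FIELD: Mapping[str, Sequence[str]] = {
    "en": ["Latn"],          # one item: written predominantly in this script
    "sr": ["Cyrl", "Latn"],  # two items: commonly written in more than one script
}


def script_recommended(language: str,
                       registry: Mapping[str, Sequence[str]] = SCRIPT_FIELD) -> bool:
    """True only when the language lists more than one script.

    Zero items (no field) and one item behave the same way: the advice
    is to leave the script subtag out.
    """
    return len(registry.get(language, ())) >= 2


def form_tag(language: str,
             script: Optional[str] = None,
             registry: Mapping[str, Sequence[str]] = SCRIPT_FIELD,
             warranted: bool = False) -> str:
    """Form a language tag, keeping the script subtag only when advised.

    `warranted` is an assumed escape hatch for content or requests that
    genuinely need the script distinction.
    """
    if script and (warranted or script_recommended(language, registry)):
        return f"{language}-{script}"
    return language


print(form_tag("en", "Latn"))
print(form_tag("sr", "Cyrl"))
print(form_tag("xx", "Latn", warranted=True))
```

Under this sketch, adding a second script to a language's entry is the only registry change that alters default tag formation, which is why that step would call for the extra scrutiny mentioned under "Potential issues".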