[Ltru] Re: Non-Latin-1 Description fields in RFC 4645bis
"Doug Ewell" <dewell@roadrunner.com> Fri, 07 December 2007 06:26 UTC
Message-ID: <00de01c8389a$1a388590$6601a8c0@DGBP7M81>
From: Doug Ewell <dewell@roadrunner.com>
To: LTRU Working Group <ltru@ietf.org>
Date: Thu, 06 Dec 2007 22:26:28 -0800
Subject: [Ltru] Re: Non-Latin-1 Description fields in RFC 4645bis
When thinking about how to define "the Latin script" for purposes of meeting the requirement in RFC 4646bis, section 3.1.4, it might be useful to consider why we have such a requirement in the first place.

We have generally copied Description fields for language, script, and region subtags from the respective ISO standard. The various ISO MA's and RA's generally do not go overboard in character repertoire -- in particular, ISO 639-3/RA uses ordinary ASCII punctuation in place of true click letters -- but in this case they have seen fit to use "Māhārāṣṭri Prākrit" as a language name. (I don't know what, if anything, the plain-ASCII equivalent "Maharastri Prakrit" means; I hope it's something I can say in public.) We have committed ourselves to using the names from the ISO standards. This issue mainly concerns the variant subtags.

Many of the fields in the Registry are intended for some type of automated processing, but the Description is really meant for human consumption. Software certainly isn't going to do much with it, except display it. Therefore, the Description field ought to be designed with human usability in mind.

While the names of languages, writing systems, countries, etc. could obviously be written in a huge variety of scripts, most humans (professional translators and linguists aside) are unlikely to be able to read more than two or three scripts, so it makes sense to impose a limit to help ensure scrutability. Because of the nature of the rest of the Registry, some familiarity with the Latin script is more or less assumed. Consequently, we imposed a requirement that every subtag have at least one Latin-script description.

There is no written language on earth that uses all of the letters of the Latin script, for any reasonable definition of "Latin script" (i.e. more than just ASCII). Nevertheless, a reader can generally recognize Latin-script letters that are not part of that reader's alphabet. For example, a Francophone should be able to recognize n-tilde (ñ) as a letter belonging to the broader "Latin script" even though it is not in the narrower "French alphabet."

Technical limits should not be seen as the primary concern. The problem of limited font coverage, as explained months ago to CE Whitehead, is a temporary one that will not last forever. Programs and operating systems are being localized to more and more languages, requiring more comprehensive font coverage, and rendering engines are becoming smart enough to substitute glyphs from alternative fonts instead of displaying square boxes. On the other hand, an LTRU policy that restricts "the Latin script" to an artificial subset is something that will last until some future LTRU comes around to change it.

I suggest we adopt a liberal definition of "the Latin script," preferably one based on Unicode Standard Annex #24, "Script Names," rather than a narrow definition based on legacy character sets or subsets, keyboard layouts, or font coverage.

These are my opinions; yours may differ.

--
Doug Ewell * Fullerton, California, USA * RFC 4645 * UTN #14
http://home.roadrunner.com/~dewell
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages