[Ltru] Solving the UTF-8 problem; was Language Tag Modification 1694acad;
"CE Whitehead" <cewcathar@hotmail.com> Mon, 02 July 2007 15:04 UTC
Message-ID: <BAY114-F31053BEBAE0817D81180DFB30D0@phx.gbl>
From: CE Whitehead <cewcathar@hotmail.com>
To: ietf-languages@iana.org, ltru@ietf.org
Date: Mon, 02 Jul 2007 11:03:27 -0400
Cc: dewell@roadrunner.com
Subject: [Ltru] Solving the UTF-8 problem; was Language Tag Modification 1694acad;
Hi,

I'm unsure whether people who have only Thai Windows (do they exist? I thought so) can see all Latin-1 characters properly.

Also, I think adding a comment to ASCII transliterations that overlap would not be too much trouble. (But this is needed only if there are people anywhere on earth whose operating systems cannot display these characters.)

Finally, I do not think it is too much work to provide a few transliterations of non-ASCII characters, given all the other material that goes into language subtag entries.

And I do not mind whether the official registry is in ASCII and the unofficial one in UTF-8, or the other way around.

Thanks. (I've repeated all this below for people who like the context.)

--C. E. Whitehead

Doug Ewell dewell at roadrunner.com Mon Jul 2 00:58:48 CEST 2007

>What are we going to do when the ISO 639-3 code list is finalized and we
>have to deal with adding the following pairs of languages, whose names
>differ only by diacritical marks?
>aru Arua
>arx Aruá
>bfa Bari
>mot Barí
>kgm Karipúna
>kuq Karipuná
>sbe Saliba
>slc Sáliba
>wbf Wara
>tci Wára
>Are we going to include an ASCII version of every name that contains an
>accented letter? There are several hundred in ISO 639-3.

I have not seen ISO 639-3, but since it is already so much work to put in the comments, descriptions, etc., it should be trivial to add the ASCII name! As for the languages that differ only by a diacritical mark, you might need a comment about this somewhere; I think it can be handled! (This applies to non-Latin characters, and to Latin ones if needed, e.g. if there are people in the world who cannot see these.)

>Section 3.1 mentions transcription of non-Latin Description fields into the
>Latin script. It does not talk about providing a pure-ASCII equivalent for
>every non-ASCII French- or Spanish-language string, and I don't believe
>that was the WG's intention. Transcriptions are useful when the content is
>in Arabic or Cyrillic or Han, to make the material available to
>Latin-script-only readers. Providing "transcriptions" like "4&#xE8;me
>(4eme)" merely announces to the world that we can't solve our own technical
>character-encoding problems without resorting to unwieldy kludges.

Are there people who have only, say, Thai Windows, who would appreciate the transcriptions? That's what I was thinking; sorry if I'm wrong.

Stephane Bortzmeyer bortzmeyer at nic.fr Mon Jul 2 15:55:33 CEST 2007 wrote:

>On Sun, Jul 01, 2007 at 03:58:48PM -0700,
>Doug Ewell <dewell at roadrunner.com> wrote a message of 161 lines which
>said:
> > Another possibility is to have IANA post an official version of the
> > Registry in one encoding, such as UTF-8, and additional, unofficial
> > versions in other encodings, such as Latin-1 or hex NCRs.
>Why not? Currently, we do exactly the opposite: IANA publishes the
>official registry in hex NCR
>(http://www.iana.org/assignments/language-subtag-registry) and
>langtag.net publishes an unofficial version in UTF-8
>(http://www.langtag.net/registries/language-subtag-registry.utf8).

Fine with me!

--C. E. Whitehead
cewcathar@hotmail.com
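As an illustration of the overlap discussed in this message, here is a minimal Python sketch, not part of the original post. It folds the quoted ISO 639-3 names to ASCII and reports which codes end up with the same name, and it shows the same names written as hex NCRs, which round-trip without loss. The helper names (`ascii_fold`, `to_hex_ncr`, `collisions`) are hypothetical, and the folding rule (Unicode NFD decomposition, then dropping combining marks) is an assumption about what an "ASCII version" would mean; it does not handle letters such as "ø" that have no decomposition.

```python
import html
import unicodedata
from collections import defaultdict

# (ISO 639-3 code, name) pairs as quoted in the message.
NAMES = [
    ("aru", "Arua"), ("arx", "Aruá"),
    ("bfa", "Bari"), ("mot", "Barí"),
    ("kgm", "Karipúna"), ("kuq", "Karipuná"),
    ("sbe", "Saliba"), ("slc", "Sáliba"),
    ("wbf", "Wara"), ("tci", "Wára"),
]

def ascii_fold(name):
    """Assumed folding rule: decompose (NFD), then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def to_hex_ncr(name):
    """Write every non-ASCII character as a hex numeric character reference."""
    return "".join(
        ch if ord(ch) < 128 else "&#x{:X};".format(ord(ch)) for ch in name
    )

def collisions(names):
    """Map each folded name to its codes, keeping only names shared by 2+ codes."""
    groups = defaultdict(list)
    for code, name in names:
        groups[ascii_fold(name)].append(code)
    return {folded: codes for folded, codes in groups.items() if len(codes) > 1}

if __name__ == "__main__":
    for folded, codes in collisions(NAMES).items():
        print(folded, "is shared by", ", ".join(codes))
    # Hex NCRs stay pure ASCII yet decode back to the original name.
    ncr = to_hex_ncr("Aruá")
    print(ncr, "decodes to", html.unescape(ncr))
```

Under these assumptions, every pair in the quoted list collapses to a single ASCII name, which is why a disambiguating comment would be needed if ASCII-only names were added; the hex NCR form stays ASCII-safe and keeps the names distinct.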