Re: [Ltru] Re: Macrolanguages, countries & orthographies
"Mark Davis" <mark.davis@icu-project.org> Wed, 14 February 2007 02:50 UTC
Date: Tue, 13 Feb 2007 18:50:33 -0800
From: Mark Davis <mark.davis@icu-project.org>
To: Debbie Garside <debbie@ictmarketing.co.uk>
Subject: Re: [Ltru] Re: Macrolanguages, countries & orthographies
Cc: LTRU Working Group <ltru@ietf.org>

1. Your personal opinion contradicts what I heard as a rough consensus
earlier, that fr is not a superset of fro or frm. The highest priority is
that the interpretation be well-defined and consistent; I think your
opinion would be reasonable, but it doesn't seem to be the consensus.

2. If you asked me right now up-or-down on ISO 639-6, I'd say absolutely
not, since (a) it would introduce all kinds of duplicate encodings, and
(b) there has been no clear rationale given that the other information is
worth adding. For example, the second two of these, from your example,
would be duplicates.

kca   obgc  Khanty
kcal  kcaw  Khanty Written Latin Script
kcac  kcaw  Khanty Written Cyrillic Script

The only justification I've seen is that

"Needless to say, I think there is a need for ISO 639-6 within this RFC.
There is a need for variants. There is a need for a hierarchical system
that will facilitate the creation of pick lists; 7,200 entities are more
than a little unwieldy for the end user when setting language
preferences!"

But as per earlier messages, end users have very little idea of language
taxonomies. A far better interface for narrowing choices would be one that
lets them pick a country, and then languages in use in the country.

But all of that is beside the point -- UI is *not* the goal of BCP 47. A
hierarchy of languages may well be useful in some circumstances, but it is
orthogonal to the requirements of BCP 47.

Mark

On 2/13/07, Debbie Garside <debbie@ictmarketing.co.uk> wrote:
>
> Mark wrote:
>
> >> The principle is the same for any other language: do we presume that
> >> the code means only the modern variant, or covers all historical
> >> variations? We need to get an answer for that; without that answer,
> >> we can't know whether to accept or reject historic variant proposals.
>
> My personal opinion is that ISO 639-3 subtags cover the "whole language"
> as described; all of the language, every part of the language, written
> and spoken and... historical. Even when there is an ISO 639-3 historical
> subtag that covers part of it.
>
> My advice, accept urgent proposals for historic variants on the basis
> that they will be deprecated when ISO 639-6 comes into being - assuming
> it is incorporated within RFC4646bis or ter. Inform proposers of such
> variants that ISO 639-6 is currently being designed and if the need is
> not urgent delay until ISO 639-6 is published.
>
> I think this group needs to make a decision wrt ISO 639-6.
>
> Best regards
>
> Debbie
>
> ------------------------------
> *From:* Mark Davis [mailto:mark.davis@icu-project.org]
> *Sent:* 14 February 2007 00:22
> *To:* Lars Aronsson
> *Cc:* ietf-languages@iana.org; LTRU Working Group
> *Subject:* [Ltru] Re: Macrolanguages, countries & orthographies
>
> Saying that it is not as important is, I agree, your prejudice.
> Importance is in the eye of the beholder, and ISO 639-3 has 7,500
> languages, which make distinctions that to people concerned with Czech
> will be far less important than the difference between old Czech and
> modern Czech.
>
> Moreover, one cannot fixate on the exact example used. There are plenty
> of others, because very few languages have "Old" variants in 639-3. The
> principle is the same for any other language: do we presume that the
> code means only the modern variant, or covers all historical variations?
> We need to get an answer for that; without that answer, we can't know
> whether to accept or reject historic variant proposals.
>
> Mark
>
> On 2/13/07, Lars Aronsson <lars@aronsson.se> wrote:
> >
> > Mark Davis wrote:
> >
> > > Assume that old Czech is as different from modern as fro is from fr.
> >
> > But is this a real problem? How much total literature is written
> > and available in different variations of Czech? My prejudice says
> > that as a nation with a language and literature of its own, Czech
> > is about as young as Finnish, Norwegian or Serbian, i.e. 19th
> > century. Can you give any concrete examples when not having a
> > separate *code* for pre-renaissance Czech is a practical problem?
> >
> > Linguists of course have *names* for Swedish of all ages, but I
> > see no real use for having ISO or the IETF specify language
> > *codes*. I could be wrong, but if so please enlighten and correct
> > me. Nobody is going to translate OpenOffice or Mozilla to the
> > language spoken by vikings (Old Norse) or the Swedish used during
> > the Lutheran reformation (called New Swedish, ironically).
> >
> > Yes, there is now a branch of Wikipedia in Old English
> > (ang.wikipedia.org), but that is a rare exception. I don't expect
> > this to happen in other languages. Ang has now 744 articles,
> > compared to the 11,000 articles of the Latin Wikipedia.
> >
> > I'm scanning old books, and I'm starting to see a practical
> > problem with different orthographies and spelling reforms, similar
> > to those addressed with the IETF defined codes for German de-1901
> > and de-1996. Analogous to these codes, we could perhaps find use
> > for sv-1801, sv-1889, sv-1906, da-1775, da-1892 and da-1948,
> > because we now have *significant amounts* of text online in each
> > of these language versions. But before 1775/1801 the orthography
> > of Swedish and Danish varies so heavily with each work, that it
> > becomes pretty much useless to try to identify more versions.
> > And before that time, there is also so small amounts of literature
> > available, that any automatic processing becomes insignificant.
> >
> > --
> > Lars Aronsson (lars@aronsson.se)
> > Aronsson Datateknik - http://aronsson.se
>
> --
> Mark

--
Mark