Re: [Ltru] Re: Macrolanguages, countries & orthographies

"Mark Davis" <mark.davis@icu-project.org> Wed, 14 February 2007 02:50 UTC

Message-ID: <30b660a20702131850m6b045226q9229a98529d02f6a@mail.gmail.com>
Date: Tue, 13 Feb 2007 18:50:33 -0800
From: Mark Davis <mark.davis@icu-project.org>
To: Debbie Garside <debbie@ictmarketing.co.uk>
Subject: Re: [Ltru] Re: Macrolanguages, countries & orthographies
In-Reply-To: <45d2714f.311f7d7e.4ffe.2491SMTPIN_ADDED@mx.google.com>
MIME-Version: 1.0
References: <30b660a20702131622g2a3f7c4bu5651b3e7dd575075@mail.gmail.com> <45d2714f.311f7d7e.4ffe.2491SMTPIN_ADDED@mx.google.com>
Cc: LTRU Working Group <ltru@ietf.org>

1. Your personal opinion contradicts what I heard as a rough consensus
earlier, that fr is not a superset of fro or frm. The highest priority is
that the interpretation be well-defined and consistent; I think your opinion
would be reasonable, but it doesn't seem to be the consensus.
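
As an illustration only, here is a minimal Python sketch, assuming prefix
matching in the style of RFC 4647 basic filtering. The function name is
hypothetical. It shows that a matcher treating subtags as opaque strings
never sees fr as covering fro or frm; any such relationship would have to be
defined separately.

```python
def basic_filter_match(language_range: str, tag: str) -> bool:
    """Return True if `tag` matches `language_range` by prefix filtering.

    Assumption: a range matches a tag when the two are equal
    (case-insensitively) or when the tag starts with the range
    followed by a hyphen. The wildcard "*" matches every tag.
    """
    r = language_range.lower()
    t = tag.lower()
    if r == "*":
        return True
    return t == r or t.startswith(r + "-")


if __name__ == "__main__":
    for candidate in ("fr", "fr-CA", "fro", "frm"):
        print(candidate, basic_filter_match("fr", candidate))
```

Under this assumption fr and fr-CA match the range fr, while fro and frm do
not, because neither has "fr-" as a prefix.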

2. If you asked me right now for an up-or-down vote on ISO 639-6, I'd say
absolutely not, since (a) it would introduce all kinds of duplicate
encodings, and (b) no clear rationale has been given that the other
information is worth adding.

For example, the second two of these, from your example, would be
duplicates.
kca     obgc    Khanty
kcal    kcaw    Khanty Written Latin Script
kcac    kcaw    Khanty Written Cyrillic Script
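
A minimal Python sketch of the duplication, under stated assumptions: the
table below is hypothetical and built only from the three rows above, kca is
assumed to be usable as a primary language subtag, and Latn and Cyrl are the
ISO 15924 script codes. Presumably the duplication meant here is with tags
that can already be composed from a language subtag plus a script subtag.

```python
from typing import Optional, Tuple

# Hypothetical mapping for illustration: each proposed code is paired with
# the (language subtag, script subtag) it appears to describe.
PROPOSED_CODES: dict[str, Tuple[str, Optional[str]]] = {
    "kca": ("kca", None),     # Khanty
    "kcal": ("kca", "Latn"),  # Khanty Written Latin Script
    "kcac": ("kca", "Cyrl"),  # Khanty Written Cyrillic Script
}


def equivalent_tag(code: str) -> str:
    """Compose the language-script tag that says the same thing as `code`."""
    language, script = PROPOSED_CODES[code]
    return language if script is None else f"{language}-{script}"


if __name__ == "__main__":
    for code in PROPOSED_CODES:
        print(code, "->", equivalent_tag(code))
```

With this table, kcal and kcac come out as kca-Latn and kca-Cyrl, so each
would give a second spelling for something a script subtag can already say.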

The only justification I've seen is this: "Needless to say, I think
there is a need for ISO 639-6 within this RFC.  There is a need for
variants.  There is a need for a hierarchical system that will facilitate
the creation of pick lists; 7,200 entities are more than a little unwieldy
for the end user when setting language preferences!"

But as per earlier messages, end users have very little idea of language
taxonomies. A far better interface for narrowing choices would be one that
lets them pick a country, and then languages in use in the country. But all
of that is beside the point -- UI is *not* the goal of BCP 47.

A hierarchy of languages may well be useful in some circumstances, but it is
orthogonal to the requirements of BCP 47.

Mark

On 2/13/07, Debbie Garside <debbie@ictmarketing.co.uk> wrote:
>
>  Mark wrote:
>
> >> The principle is the same for any other language: do we presume that
> the code means only the modern variant, or covers all historical variations?
> We need to get an answer for that; without that answer, we can't know
> whether to accept or reject historic variant proposals.
>
> My personal opinion is that ISO 639-3 subtags cover the "whole language"
> as described; all of the language, every part of the language, written and
> spoken and... historical.  Even when there is an ISO 639-3 historical subtag
> that covers part of it.
>
> My advice: accept urgent proposals for historic variants on the basis that
> they will be deprecated when ISO 639-6 comes into being, assuming it is
> incorporated within RFC4646bis or ter.  Inform proposers of such variants
> that ISO 639-6 is currently being designed and, if the need is not urgent,
> delay until ISO 639-6 is published.
>
> I think this group needs to make a decision wrt ISO 639-6.
>
> Best regards
>
> Debbie
>
>
>
>  ------------------------------
> *From:* Mark Davis [mailto:mark.davis@icu-project.org]
> *Sent:* 14 February 2007 00:22
> *To:* Lars Aronsson
> *Cc:* ietf-languages@iana.org; LTRU Working Group
> *Subject:* [Ltru] Re: Macrolanguages, countries & orthographies
>
> Saying that it is not as important is, I agree, your prejudice. Importance
> is in the eye of the beholder, and ISO 639-3 has 7,500 languages, which make
> distinctions that to people concerned with Czech will be far less important
> than the difference between old Czech and modern Czech.
>
> Moreover, one cannot fixate on the exact example used. There are plenty of
> others, because very few languages have "Old" variants in 639-3. The
> principle is the same for any other language: do we presume that the code
> means only the modern variant, or covers all historical variations? We need
> to get an answer for that; without that answer, we can't know whether to
> accept or reject historic variant proposals.
>
> Mark
>
> On 2/13/07, Lars Aronsson <lars@aronsson.se> wrote:
> >
> > Mark Davis wrote:
> >
> > > Assume that old Czech is as different from modern as fro is from fr.
> >
> > But is this a real problem?  How much total literature is written
> > and available in different variations of Czech?  My prejudice says
> > that as a nation with a language and literature of its own, Czech
> > is about as young as Finnish, Norwegian or Serbian, i.e. 19th
> > century.  Can you give any concrete examples when not having a
> > separate *code* for pre-renaissance Czech is a practical problem?
> >
> > Linguists of course have *names* for Swedish of all ages, but I
> > see no real use for having ISO or the IETF specify language
> > *codes*.  I could be wrong, but if so please enlighten and correct
> > me.  Nobody is going to translate OpenOffice or Mozilla to the
> > language spoken by vikings (Old Norse) or the Swedish used during
> > the Lutheran reformation (called New Swedish, ironically).
> >
> > Yes, there is now a branch of Wikipedia in Old English
> > (ang.wikipedia.org), but that is a rare exception.  I don't expect
> > this to happen in other languages.  Ang has now 744 articles,
> > compared to the 11,000 articles of the Latin Wikipedia.
> >
> > I'm scanning old books, and I'm starting to see a practical
> > problem with different orthographies and spelling reforms, similar
> > to those addressed with the IETF defined codes for German de-1901
> > and de-1996.  Analogous to these codes, we could perhaps find use
> > for sv-1801, sv-1889, sv-1906, da-1775, da-1892 and da-1948,
> > because we now have *significant amounts* of text online in each
> > of these language versions. But before 1775/1801 the orthography
> > of Swedish and Danish varies so heavily with each work, that it
> > becomes pretty much useless to try to identify more versions.
> > And before that time, there is also so little literature
> > available that any automatic processing becomes insignificant.
> >
> >
> >
> > --
> >   Lars Aronsson (lars@aronsson.se)
> >   Aronsson Datateknik - http://aronsson.se
> > _______________________________________________
> > Ietf-languages mailing list
> > Ietf-languages@alvestrand.no
> > http://www.alvestrand.no/mailman/listinfo/ietf-languages
> >
>
>
>
> --
> Mark
>
>


-- 
Mark
_______________________________________________
Ltru mailing list
Ltru@ietf.org
https://www1.ietf.org/mailman/listinfo/ltru