Re: [Ietf-languages] Khmer orthographic reform

"Doug Ewell" <doug@ewellic.org> Thu, 31 October 2019 22:18 UTC

From: Doug Ewell <doug@ewellic.org>
To: Élie Roux <elie.roux@telecom-bretagne.eu>, IETF Languages Discussion <ietf-languages@iana.org>
Cc: Chris Tomlinson <chris.j.tomlinson@gmail.com>
Date: Thu, 31 Oct 2019 15:17:19 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/bHx_J6k2BivO6ruYyMXTg_OOPkY>

Élie Roux wrote:

> I have some data in Khmer that I need to use two tags for:
> - one for Khmer as written before the orthographic reforms of the XXth
> c.
> - one for Khmer written according to said reforms

This is a good starting point. At least you have established that the
works need to be tagged distinctly (as opposed to "we need to
distinguish these varieties," which is not always a reason for tagging).

It is not yet clear that two subtags are needed: if the current
orthography is overwhelmingly dominant, it can be assumed, and only the
older one would need a subtag. That discussion can come later.

> The reform is not exactly clear cut, unfortunately. Let me see if I
> can find any sources.

You'll certainly want to find a reference that explains and describes
the reform. The ones Richard supplied may help. It's not mandatory that
all contemporary writers switched to the new orthography simultaneously
or uniformly; that certainly wasn't the case for, say, '1606nict'.

> What is IANA policy with regards to this kind of situation? Is it
> reasonable for me to propose a "-pre1966" or "-pre20c" subtag for the
> strings that use the old orthography?

It's not "IANA policy," but rather ietf-languages practice. IANA merely
serves as the repository for the Registry.

It's certainly reasonable to request one or two variant subtags (see
above). Check the available references.

The subtag values themselves probably won't be any sort of abbreviation
for "pre-20th century" or the like. Such a subtag could theoretically
apply to dozens or hundreds of languages, with a different meaning for
each; and although the Prefix field is supposed to suggest the languages
for which the subtag is considered suitable, concerns are usually voiced
that this is not sufficient to discourage inappropriate use.
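
To illustrate the concern about the Prefix field, here is a minimal
sketch in Python. The records are simplified stand-ins rather than real
Registry entries: the 'frm' prefix for '1606nict' is assumed here, and
'oldorth' is a made-up subtag used only for illustration. The check
shows that a Prefix is a recommendation that software may test, while a
tag that ignores it is still syntactically well-formed.

```python
# Simplified, hypothetical variant records (NOT copied from the Registry).
# "oldorth" is an invented subtag; the "frm" prefix is an assumption.
VARIANTS = {
    "1606nict": {"prefixes": ["frm"]},
    "oldorth": {"prefixes": ["km"]},
}


def prefix_matches(tag: str, variant: str) -> bool:
    """Return True if the subtags preceding `variant` in `tag`
    begin with one of the variant's recorded prefixes."""
    subtags = tag.lower().split("-")
    if variant not in subtags:
        return False
    before = subtags[: subtags.index(variant)]
    for prefix in VARIANTS[variant]["prefixes"]:
        wanted = prefix.lower().split("-")
        if before[: len(wanted)] == wanted:
            return True
    return False


if __name__ == "__main__":
    for tag, variant in [
        ("frm-1606nict", "1606nict"),
        ("en-1606nict", "1606nict"),
        ("km-Khmr-oldorth", "oldorth"),
    ]:
        print(tag, prefix_matches(tag, variant))
```

Nothing in the tag syntax stops someone from writing 'en-1606nict'; the
mismatch shows up only if a tool bothers to run a check like this one,
which is why a generic name such as "pre20c" invites misuse.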

--
Doug Ewell | Thornton, CO, US | ewellic.org
