Re: [Ietf-languages] Forms for subtag kmpre20c

Richard Wordingham <> Mon, 02 December 2019 16:45 UTC

On Mon, 2 Dec 2019 15:36:46 +0100
Élie Roux <> wrote:

> > If that's truly the case, the proper tag is und-Khmr.  
> Why not. Generally speaking I think all lang tags should have a
> macrolanguage though. Most of our database is composed of Tibetan; most
> of it is Classical but we also have old and modern, but we don't care
> about the distinction in the lang tags. I certainly find und-Tibt (or
> actually und-x-ewts as we're using transliteration) quite ugly and I
> would much prefer bo to be a macrolanguage (like zh) instead of an
> individual one. But that's not a problem I can solve... In the
> meantime I'll use km as it's much more user friendly.
> > You then hit the
> > problem that language tagging doesn't handle exclusions.  At least,
> > Michael Everson said it doesn't and I have no reason to disbelieve
> > him.  It also makes sense to me as a policy.  
> Sure
> > And this immediately undermines the previous generality, as it
> > includes things like pi-Khmr.  
> What generality? I'm not following

You said that you would like to use "-kmpre20c" for Khmer script
materials that were "written in Khmer script using non-reformed
spelling: Old, middle, modern Khmer, all sorts of dialects, Pali,
Sanskrit, Chinese, etc".  Then you referenced your convention on
github, which sensibly gives "pi-Khmr" for Pali in Khmer script.

However, I may have misunderstood you.  I thought you meant you would
use "km-kmpre20c", but perhaps you just meant you would suffix it to
other languages.  On closer examination I saw pi-Khmr-x-kmpre20c.  The
general view here is that the latter should not happen without explicit

> > What about printed Khmer-script missionary texts from 1893 printed
> > in Hong Kong?  
> We don't have any yet and we have no acquisition plan for such texts.
> What about them?

You wrote, "The database I'm needing subtags for is mainly
bibliographical, it has the title for each text in two flavors: the
original (old) spelling as appearing on the manuscript, and the
equivalent in modern spelling (Chuon Nath style)".  I therefore
wondered if your database were restricted to manuscripts.  

> That's what we're trying to do, but we can't expect to stop all
> operations before we get the answers, and they probably won't be there
> in the next few decades... and I need to tag my data in the meantime.

You probably need to make provision for automatically converting the
tags later.
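
As a rough illustration of what such a provision might look like, here is
a minimal Python sketch.  It assumes tags of the simple shape
language[-script][-region][-variant...][-x-private...] with no extension
subtags, and it assumes a hypothetical mapping from private-use subtags to
variant subtags that might be registered later; neither the function name
nor the mapping comes from any registry or library.

```python
def convert_tag(tag, private_to_variant):
    """Promote mapped private-use subtags to variant subtags.

    Assumption: the tag has no extension subtags (e.g. "-u-..."), so
    variants can simply be appended before the "x" singleton.
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "x" not in lowered:
        return tag  # no private-use section, nothing to convert
    i = lowered.index("x")
    if i == 0:
        return tag  # wholly private-use tag such as "x-ewts": leave alone
    main, private = subtags[:i], subtags[i + 1:]
    # Split the private-use subtags into those we can promote and the rest.
    promoted = [private_to_variant[p.lower()] for p in private
                if p.lower() in private_to_variant]
    kept = [p for p in private if p.lower() not in private_to_variant]
    result = main + promoted
    if kept:
        result += ["x"] + kept
    return "-".join(result)


# Hypothetical mapping: valid only if such a variant were ever registered.
MAPPING = {"kmpre20c": "kmpre20c"}

print(convert_tag("pi-Khmr-x-kmpre20c", MAPPING))
print(convert_tag("und-x-ewts", MAPPING))
```

Keeping the mapping in one table means the stored data can stay as it is
until the registration question is settled, and the same pass can be rerun
if the outcome differs from what was expected.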