Re: [Ietf-languages] Forms for subtag kmpre20c

Richard Wordingham <richard.wordingham@ntlworld.com> Mon, 02 December 2019 16:45 UTC

Date: Mon, 02 Dec 2019 16:45:00 +0000
From: Richard Wordingham <richard.wordingham@ntlworld.com>
To: ietf-languages@ietf.org
Message-ID: <20191202164500.471f80da@JRWUBU2>
In-Reply-To: <CANfi1Jh+74EMcTdm7_7wdqKJrDPL87jkr26wkadXBeRpJL5peA@mail.gmail.com>
References: <20191121141336.665a7a7059d7ee80bb4d670165c8327d.9a3859061b.wbe@email03.godaddy.com> <CANfi1JjyouJV-CLXdKOwvRxcFPM0csTe8=+44hszSBhVTxd-qA@mail.gmail.com> <CANfi1JjeSo2-Ez52Nu3Lcb3jC9skPp2_YWza8Xnusu0Xi8vHuA@mail.gmail.com> <CANfi1JgVZ=rc1s=ELHoS=tv9HkwuzNCP0PUAZbjXWfWX0UtEXQ@mail.gmail.com> <000501d5a31c$cb6f52e0$624df8a0$@ewellic.org> <7AAF56F5-A51D-45B2-9400-86FB94625A06@gmail.com> <00C5B42F-0871-4A9D-913A-EABAF0344F68@evertype.com> <20191201185039.0ec4bf53@JRWUBU2> <CANfi1Jgz65Ohw8FXSBaLzVx=9Xi_GpMAeL+BnKUoRgG=ij7txQ@mail.gmail.com> <20191202141237.1724fc7c@JRWUBU2> <CANfi1Jh+74EMcTdm7_7wdqKJrDPL87jkr26wkadXBeRpJL5peA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/1eQ1D-UfQsinQy5svHjv77C4zCQ>

On Mon, 2 Dec 2019 15:36:46 +0100
Élie Roux <elie.roux@telecom-bretagne.eu> wrote:

> > If that's truly the case, the proper tag is und-Khmr.  
> 
> Why not. Generally speaking, I think all lang tags should have a
> macrolanguage though. Most of our database is composed of Tibetan; most
> of it is Classical but we also have old and modern, but we don't care
> about the distinction in the lang tags. I certainly find und-Tibt (or
> actually und-x-ewts as we're using transliteration) quite ugly and I
> would much prefer bo to be a macrolanguage (like zh) instead of an
> individual one. But that's not a problem I can solve... In the
> meantime I'll use km as it's much more user friendly.
> 
> > You then hit the
> > problem that language tagging doesn't handle exclusions.  At least,
> > Michael Everson said it doesn't and I have no reason to disbelieve
> > him.  It also makes sense to me as a policy.  
> 
> Sure
> 
> > And this immediately undermines the previous generality, as it
> > includes things like pi-Khmr.  
> 
> What generality? I'm not following

You said that you would like to use "-kmpre20c" for Khmer script
materials that were "written in Khmer script using non-reformed
spelling: Old, middle, modern Khmer, all sorts of dialects, Pali,
Sanskrit, Chinese, etc".  Then you referenced your convention on
github, which sensibly gives "pi-Khmr" for Pali in Khmer script.

However, I may have misunderstood you.  I thought you meant you would
use "km-kmpre20c", but perhaps you just meant you would suffix it to
other languages.  On closer examination I saw pi-Khmr-x-kmpre20c.  The
general view here is that the latter should not happen without explicit
sanction.

> > What about printed Khmer-script missionary texts from 1893 printed
> > in Hong Kong?  
> 
> We don't have any yet and we have no acquisition plan for such texts.
> What about them?

You wrote, "The database I'm needing subtags for is mainly
bibliographical, it has the title for each text in two flavors: the
original (old) spelling as appearing on the manuscript, and the
equivalent in modern spelling (Chuon Nath style)".  I therefore
wondered if your database were restricted to manuscripts.  

> That's what we're trying to do, but we can't expect to stop all
> operations before we get the answers, and they probably won't be there
> in the next few decades... and I need to tag my data in the meantime.

You probably need to make provision for automatically converting the
tags later.
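
As a rough illustration of such a provision, here is a minimal Python
sketch that moves a private-use subtag into the variant position of a
tag.  It assumes, purely for the sake of example, that "kmpre20c" ends
up registered as a variant under the same name; nothing in this thread
has settled that.  The function name convert_tag and its parameters are
hypothetical, and the code does no validation against the registry.

```python
def convert_tag(tag, private="kmpre20c", registered="kmpre20c"):
    """Move a private-use subtag into the variant position of a tag.

    Assumption: `registered` is a variant subtag that has been (or will
    be) registered; this is hypothetical for "kmpre20c".
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]

    # No private-use section, or a wholly private tag such as "x-foo":
    # there is nothing that can safely be converted.
    if "x" not in lowered or lowered[0] == "x":
        return tag

    x_at = lowered.index("x")
    main, private_use = subtags[:x_at], subtags[x_at + 1:]
    if private not in (s.lower() for s in private_use):
        return tag

    # Keep any other private-use subtags where they were.
    remaining = [s for s in private_use if s.lower() != private]

    # Variants must precede extensions, so insert before the first
    # singleton (one-character subtag) after the primary language.
    insert_at = next(
        (i for i, s in enumerate(main) if i > 0 and len(s) == 1),
        len(main),
    )
    result = main[:insert_at] + [registered] + main[insert_at:]
    if remaining:
        result += ["x"] + remaining
    return "-".join(result)


if __name__ == "__main__":
    for old in ("pi-Khmr-x-kmpre20c", "km-x-kmpre20c", "und-Khmr"):
        print(old, "->", convert_tag(old))
```

Tags without the private-use subtag are returned unchanged, so the
conversion can be run over the whole database more than once.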

Richard.