Re: [Ietf-languages] Forms for subtag kmpre20c

Élie Roux <elie.roux@telecom-bretagne.eu> Mon, 02 December 2019 14:37 UTC

MIME-Version: 1.0
References: <20191121141336.665a7a7059d7ee80bb4d670165c8327d.9a3859061b.wbe@email03.godaddy.com> <CANfi1JjyouJV-CLXdKOwvRxcFPM0csTe8=+44hszSBhVTxd-qA@mail.gmail.com> <CANfi1JjeSo2-Ez52Nu3Lcb3jC9skPp2_YWza8Xnusu0Xi8vHuA@mail.gmail.com> <CANfi1JgVZ=rc1s=ELHoS=tv9HkwuzNCP0PUAZbjXWfWX0UtEXQ@mail.gmail.com> <000501d5a31c$cb6f52e0$624df8a0$@ewellic.org> <7AAF56F5-A51D-45B2-9400-86FB94625A06@gmail.com> <00C5B42F-0871-4A9D-913A-EABAF0344F68@evertype.com> <20191201185039.0ec4bf53@JRWUBU2> <CANfi1Jgz65Ohw8FXSBaLzVx=9Xi_GpMAeL+BnKUoRgG=ij7txQ@mail.gmail.com> <20191202141237.1724fc7c@JRWUBU2>
In-Reply-To: <20191202141237.1724fc7c@JRWUBU2>
From: Élie Roux <elie.roux@telecom-bretagne.eu>
Date: Mon, 2 Dec 2019 15:36:46 +0100
Message-ID: <CANfi1Jh+74EMcTdm7_7wdqKJrDPL87jkr26wkadXBeRpJL5peA@mail.gmail.com>
To: IETF Languages Discussion <ietf-languages@iana.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/S7gNdwZ38jdTjWuiJ7UzuShLAi0>
Subject: Re: [Ietf-languages] Forms for subtag kmpre20c

> If that's truly the case, the proper tag is und-Khmr.

Why not. Generally speaking, though, I think all lang tags should have
a macrolanguage. Most of our database is composed of Tibetan; most
of it is Classical, but we also have old and modern, and we don't care
about the distinction in the lang tags. I certainly find und-Tibt (or
actually und-x-ewts, as we're using transliteration) quite ugly, and I
would much prefer bo to be a macrolanguage (like zh) instead of an
individual one. But that's not a problem I can solve... In the
meantime I'll use km, as it's much more user-friendly.
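
To make the tags mentioned above concrete, here is a minimal Python
sketch that splits them into subtags. It assumes a simplified subset of
the BCP 47 syntax (language, optional script, optional variants,
optional private use); regions, extensions and grandfathered tags are
deliberately left out. The names split_tag and TAG_RE are illustrative
only and do not come from any library.

```python
import re

# Simplified subset of BCP 47 (an assumption of this sketch):
# language [-script] [-variant]* [-x-privateuse]
TAG_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)"
    r"(?:-x(?P<private>(?:-[A-Za-z0-9]{1,8})+))?$"
)


def split_tag(tag):
    """Split a tag into its subtags; raise ValueError if it does not fit the subset."""
    m = TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"not handled by this simplified pattern: {tag!r}")
    script = m.group("script")
    return {
        "language": m.group("language").lower(),
        # Script subtags are conventionally written in title case (e.g. Khmr).
        "script": script.title() if script else None,
        "variants": [v for v in m.group("variants").split("-") if v],
        "private_use": [p for p in (m.group("private") or "").split("-") if p],
    }


if __name__ == "__main__":
    for tag in ("und-Khmr", "km", "und-Tibt", "und-x-ewts", "pi-Khmr"):
        print(tag, split_tag(tag))
```

With this split, und-Khmr and pi-Khmr share a script but differ in
language, while und-x-ewts carries no script at all and only records
the transliteration scheme as private use.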

> You then hit the
> problem that language tagging doesn't handle exclusions.  At least,
> Michael Everson said it doesn't and I have no reason to disbelieve
> him.  It also makes sense to me as a policy.

Sure.

> And this immediately undermines the previous generality, as it includes
> things like pi-Khmr.

What generality? I'm not following.

> What about printed Khmer-script missionary texts from 1893 printed in
> Hong Kong?

We don't have any yet and we have no acquisition plan for such texts.
What about them?

> The fact that Middle English as a whole certainly lacked a standard
> orthography is no bar to its being classified as a language.  The lack
> of standards is therefore not of itself a bar to adding variants for
> Old Khmer (if you truly have such materials), Middle Khmer and I
> suggest is not necessarily a bar to 'pre 20th century' Modern Khmer.

Sure.

> It might be necessary to provide evidence of a time depth to the
> variations - in which case we should try to get the experts involved.

That's what we're trying to do, but we can't be expected to stop all
operations until we get the answers, and the answers probably won't
arrive in the next few decades... and I need to tag my data in the
meantime.

Best,
-- 
Elie