Re: [Ietf-languages] Forms for subtag kmpre20c

Élie Roux <elie.roux@telecom-bretagne.eu> Tue, 03 December 2019 08:03 UTC

Return-Path: <roux.elie@gmail.com>
X-Original-To: ietf-languages@ietfa.amsl.com
Delivered-To: ietf-languages@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 71430120127 for <ietf-languages@ietfa.amsl.com>; Tue, 3 Dec 2019 00:03:54 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -0.733
X-Spam-Level:
X-Spam-Status: No, score=-0.733 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, FREEMAIL_FORGED_FROMDOMAIN=0.25, FREEMAIL_FROM=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, SPF_HELO_NONE=0.001, SPF_SOFTFAIL=0.665] autolearn=no autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id N6fr4mLEyMHO for <ietf-languages@ietfa.amsl.com>; Tue, 3 Dec 2019 00:03:52 -0800 (PST)
Received: from mork.alvestrand.no (mork.alvestrand.no [IPv6:2001:700:1:2::117]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 04C98120024 for <ietf-languages@ietf.org>; Tue, 3 Dec 2019 00:03:52 -0800 (PST)
Received: by mork.alvestrand.no (Postfix) id EED937C080E; Tue, 3 Dec 2019 09:03:48 +0100 (CET)
Delivered-To: ietf-languages@alvestrand.no
Received: from localhost (localhost [127.0.0.1]) by mork.alvestrand.no (Postfix) with ESMTP id C294D7C0A2B for <ietf-languages@alvestrand.no>; Tue, 3 Dec 2019 09:03:48 +0100 (CET)
X-Virus-Scanned: Debian amavisd-new at alvestrand.no
Received: from mork.alvestrand.no ([127.0.0.1]) by localhost (mork.alvestrand.no [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id TU0m8aaeKAjo for <ietf-languages@alvestrand.no>; Tue, 3 Dec 2019 09:03:47 +0100 (CET)
X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0
X-Comment: SPF skipped for whitelisted relay - client-ip=2620:0:2d0:201::1:72; helo=pechora2.lax.icann.org; envelope-from=roux.elie@gmail.com; receiver=ietf-languages@alvestrand.no
Received: from pechora2.lax.icann.org (pechora2.icann.org [IPv6:2620:0:2d0:201::1:72]) by mork.alvestrand.no (Postfix) with ESMTPS id 94DA87C080E for <ietf-languages@alvestrand.no>; Tue, 3 Dec 2019 09:03:46 +0100 (CET)
Received: from mail-il1-f177.google.com (mail-il1-f177.google.com [209.85.166.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by pechora2.lax.icann.org (Postfix) with ESMTPS id 5EE351E0377 for <ietf-languages@iana.org>; Tue, 3 Dec 2019 08:03:43 +0000 (UTC)
Received: by mail-il1-f177.google.com with SMTP id t9so2331944iln.4 for <ietf-languages@iana.org>; Tue, 03 Dec 2019 00:03:43 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=6bG7C2hDR/3h6jnoufNqEhCq8pvkSwG5w1a4j82tTB8=; b=FE4aAS6QvypFiEJjqtIjinCx07o13oR8Q1uzv512O2NfJMkexiL+fv0JP+Y55MLRIB UtKuxDuK9XWnGRM9h1Ai+RXRc+AmKBUe1BvxNVbLFSgdhVaamNsKW+qPunR/nHQHQHwX 5ViwTMdrNqsnwHntuMK6LLKg9C47ceRRYU8RhgRqK6bZQKJMCE/L/kdQlqMxHgqDkQaa C6++tCCkjBVWx0/oyqEfpPVw5gn1bZ2IM5LI8KTtMAGi7YuPBcKRtTcVNmVL1SwryCHS 1weZ42yAxUPFNcS+U7u+eRGa/mPj7pqiDSfFa3jqQ2DMid45siaFZq1KMtGaF3RPCv9j 7meA==
X-Gm-Message-State: APjAAAW+ta+HFjnfuO0h1rr27TGHr6iWckAf6S+KeDqugPvMmhKLSMna VkVZ3y/ZidzHzKMpP/5ew6vqSSbUUl0twaGgJAQzE4LRlNc=
X-Google-Smtp-Source: APXvYqzRXIcEnuFN4mkTDcw2ryD+PGgJ4PNbaxBfixTpZ88RG7TBnu0q532Hrmrphdoq/ipIAr/tZr4XkRg6iQnvtz4=
X-Received: by 2002:a92:3c17:: with SMTP id j23mr3534846ila.44.1575360202439; Tue, 03 Dec 2019 00:03:22 -0800 (PST)
MIME-Version: 1.0
References: <20191202165611.665a7a7059d7ee80bb4d670165c8327d.dba149222d.wbe@email03.godaddy.com>
In-Reply-To: <20191202165611.665a7a7059d7ee80bb4d670165c8327d.dba149222d.wbe@email03.godaddy.com>
From: Élie Roux <elie.roux@telecom-bretagne.eu>
Date: Tue, 03 Dec 2019 09:03:10 +0100
Message-ID: <CANfi1JgSUn3YG6M5HH1s3gqHCdKV8Y0CgiQUDi_c-YLNa3fi7A@mail.gmail.com>
To: IETF Languages Discussion <ietf-languages@iana.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/HLsJiMWWCv9aPxw7DdB4V9pvruY>
Subject: Re: [Ietf-languages] Forms for subtag kmpre20c
X-BeenThere: ietf-languages@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <ietf-languages.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/ietf-languages>, <mailto:ietf-languages-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/ietf-languages/>
List-Post: <mailto:ietf-languages@ietf.org>
List-Help: <mailto:ietf-languages-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/ietf-languages>, <mailto:ietf-languages-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 03 Dec 2019 08:03:54 -0000

> Can you elaborate on what you mean by this? On the surface, I couldn't disagree more, but I assume I'm missing something.

I think it comes from a few different angles:

1. My experience with databases in my field (Buddhist studies) is that
they use zh for Buddhist texts in Chinese (translated between roughly
the 4th and 11th c.). I'm quite happy to do that too: nobody in the
field needs to distinguish between the different flavors of Chinese,
so zh fits the purpose perfectly.

2. My experience with the same databases is that they all use the bo
lang tag for Tibetan. Unfortunately bo is not a macrolanguage; it's
supposed to be the language spoken in some areas today
(https://iso639-3.sil.org/code/bod). That language is very different
from most of the literature in our database, which is Classical
Tibetan and has its own tag (xct). I also struggle a bit to make
sense of the "bo" lang tag: someone from Amdo (thus speaking "adx",
not "bo") and someone from Lhasa (speaking "bo") can't understand
each other in speech, but the way they write is nearly identical. So
how do you tag a blog article? If you don't know the origin of the
article, you can say it's "Literary Tibetan", which has no tag, but
you can't say for sure what "language" it is. And for short sentences
(such as the titles we have in our database), there's a great deal of
overlap between Modern Literary Tibetan and Classical Tibetan. In our
applications we don't care about this distinction and don't want to
have to choose. But if we don't want to choose, the only option is
"und", which, to be honest, I find perfectly ridiculous. So we're
sticking with bo even though it's not accurate... But if there were a
macrolanguage, we would definitely use it.

3. I suspect the situation with Khmer is actually the same, and
probably with Tham, Khom, etc. as well.
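
To make the tagging dilemma in point 2 concrete, here is a minimal sketch in Python. The function names and the variety labels ("classical", "amdo", "central") are hypothetical and only for illustration; the tags themselves (bo, xct, adx, und) are the ones discussed above. It is not how any particular database is implemented.

```python
# Hypothetical mapping: the keys are illustrative labels, not registry entries.
KNOWN_VARIETIES = {
    "classical": "xct",  # Classical Tibetan
    "amdo": "adx",       # Amdo Tibetan
    "central": "bo",     # what "bo" strictly denotes (ISO 639-3 "bod")
}


def strict_tag(variety=None):
    """Tag only what is actually known; fall back to "und" otherwise."""
    return KNOWN_VARIETIES.get(variety, "und")


def pragmatic_tag(variety=None):
    """Use "bo" as a stand-in umbrella tag when the variety is unknown."""
    return KNOWN_VARIETIES.get(variety, "bo")


if __name__ == "__main__":
    # A short title whose variety cannot be determined:
    print(strict_tag())     # und
    print(pragmatic_tag())  # bo
```

The strict version is accurate but tags most short titles "und"; the pragmatic version matches what the databases do today, at the cost of using bo for text that is not strictly bo.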

And I don't know why some umbrella languages exist (such as zh, but
also inc or pra, which I find useful) and why others don't...

Anyway, this is not the IETF's concern; I should take it to SIL.

Best,
-- 
Elie