Re: [Ietf-languages] Adding prefixes with dialect variants to Occitan orthographic variants

Doug Ewell <doug@ewellic.org> Mon, 19 April 2021 19:49 UTC

From: Doug Ewell <doug@ewellic.org>
To: 'Mark Davis ☕️' <mark@macchiato.com>
Cc: ietf-languages@iana.org, info@locongres.org, b.dazeas@locongres.org, "'Phillips, Addison'" <addison=40lab126.com@dmarc.ietf.org>, 'David Mediavilla' <nkd595qbd4@liamekaens.com>
Date: Mon, 19 Apr 2021 13:49:39 -0600
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/YH8tstjs0VsqjbmzqBGYWeLp7c0>

Mark Davis wrote:

> However, since that time, truncation is far less of a concern, and the
> bigger issue is that BCP47 doesn't provide a well-defined canonical
> order of all prefixes. Without that, comparison of language tags is
> fundamentally flawed.

I agree with Addison that an ordering mechanism does exist: the text of BCP 47 (Section 4.1, item 6) together with the Prefix values in the Registry. It may have some gaps, which makes it less than perfect mathematically.
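
As a rough illustration of how Prefix values could guide ordering, here is a minimal Python sketch. It places a variant after any other variant in the same tag that appears in one of its Prefix values. The PREFIXES table is illustrative only and is not copied from the Registry; the function names and the alphabetical tie-break used to paper over the "gaps" are assumptions made for this sketch, not anything BCP 47 prescribes.

```python
# Illustrative Prefix data only; NOT copied from the Language Subtag Registry.
PREFIXES = {
    "gascon": ["oc"],
    "lengadoc": ["oc"],
    "grclass": ["oc", "oc-gascon", "oc-lengadoc"],
}


def prefix_variants(variant):
    """Return the subtags that follow the language in any Prefix of `variant`."""
    found = set()
    for prefix in PREFIXES.get(variant, []):
        found.update(prefix.lower().split("-")[1:])
    return found


def order_variants(variants):
    """Order variants so each one follows the variants named in its Prefixes.

    Ties (and variants with no Prefix data) are broken alphabetically,
    which is an assumption of this sketch.
    """
    remaining = sorted({v.lower() for v in variants})
    ordered = []
    while remaining:
        for candidate in remaining:
            # Variants from this candidate's Prefixes that are still unplaced.
            blockers = (prefix_variants(candidate) & set(remaining)) - {candidate}
            if not blockers:
                break
        else:
            # Circular Prefix data: fall back to alphabetical order.
            candidate = remaining[0]
        ordered.append(candidate)
        remaining.remove(candidate)
    return ordered


print("-".join(["oc"] + order_variants(["grclass", "lengadoc"])))
# prints oc-lengadoc-grclass
```

With this made-up table, "grclass" sorts before "lengadoc" alphabetically, but the Prefix "oc-lengadoc" moves it after "lengadoc". Two variants with no Prefix relationship to each other, such as "gascon" and "phonipa", are exactly the kind of gap where the Prefix data alone gives no answer.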

> So CLDR extends BCP47 to simply order the variants alphabetically. Any
> application that cares about variants (and there are very few) can
> treat those variants they care about simply as a set. Thus both
> oc-gascon-phonipa and oc-phonipa-gascon are both interpreted by CLDR
> as [language=oc, variantSet={gascon, phonipa}].

However, processes that use BCP 47 but not CLDR need to work with what is in BCP 47, and that includes adding Prefix values to guide the use of multiple variants.
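
For comparison, the set-based reading that Mark describes could be sketched as follows. This is a simplified Python illustration, not CLDR's implementation: the function names are hypothetical, the variant pattern is only the subtag shape (5 to 8 alphanumerics, or 4 starting with a digit), and nothing is checked against the Registry.

```python
import re

# Shape of a variant subtag: 5-8 alphanumerics, or a digit plus 3 alphanumerics.
VARIANT = re.compile(r"[0-9a-z]{5,8}|[0-9][0-9a-z]{3}")


def split_variants(tag):
    """Split a tag into (other subtags in order, unordered set of variants)."""
    subtags = tag.lower().split("-")
    others, variants = [], set()
    for i, subtag in enumerate(subtags):
        if len(subtag) == 1:
            # A singleton starts an extension or private-use sequence;
            # keep everything from here on in its original order.
            others.extend(subtags[i:])
            break
        if i > 0 and VARIANT.fullmatch(subtag):
            variants.add(subtag)
        else:
            others.append(subtag)
    return tuple(others), frozenset(variants)


def same_tag(a, b):
    """Compare two tags, ignoring the order of their variant subtags."""
    return split_variants(a) == split_variants(b)


print(same_tag("oc-gascon-phonipa", "oc-phonipa-gascon"))
# prints True
```

A process that relies on BCP 47 alone does not get this equivalence for free, which is why the Prefix values still matter there.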

>> on which combinations might be valid
>
> I know what you mean there, but this would be better phrased as "on
> which combinations are useful". All combinations of registered variant
> subtags are valid; it is just that some of them are pointless.

Yes, I was imprecise in my use of "valid", which is a term of art and needs to be used precisely. I meant combinations that would be "meaningful" in Occitan or have a snowball's chance of existing in non-artificial text samples, not "allowable" by the rules of BCP 47 or any other specification.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org