Re: [Ietf-languages] Language subtag registration form

Peter Constable <pgcon6@msn.com> Thu, 26 November 2020 07:06 UTC

From: Peter Constable <pgcon6@msn.com>
To: Mark Davis ☕️ <mark@macchiato.com>, Doug Ewell <doug@ewellic.org>
CC: "ietf-languages@iana.org" <ietf-languages@iana.org>, Sebastian Drude <drude@xs4all.nl>
Date: Thu, 26 Nov 2020 07:06:26 +0000
Message-ID: <MWHPR1301MB21129ED84D7FCC14287BC75386F90@MWHPR1301MB2112.namprd13.prod.outlook.com>
References: <CAKZQS29HBak-v6M2HLCpdgeZHJTFVc2W_w4G=qOK+mtPcXEenQ@mail.gmail.com> <4846f915-5706-e9dc-8b16-9f16362f82f0@xs4all.nl> <001001d6c2c0$dbbb4180$9331c480$@ewellic.org> <CAJ2xs_HXFgNmNZEnPd=FJP_JV1ioRTxFpJ3hB8scaar=qeOuaA@mail.gmail.com>
In-Reply-To: <CAJ2xs_HXFgNmNZEnPd=FJP_JV1ioRTxFpJ3hB8scaar=qeOuaA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/bpv9I2Th10E1lYs8y8VmkwbyGqo>

Right now, the draft for TR 21636 doesn’t provide any _list(s)_, other than the list of eight dimensions of variation. It doesn’t define a coding system; it defines a conceptual framework. A coding system could be developed based on the framework, but I think it would be a bit premature for it to do so in its initial publication (which is why a TR is more appropriate than an IS).

> the allowable values for each of the eight dimensions may not be open-ended; they must be specified concretely somewhere

Some of the dimensions would probably be amenable to defining an enumeration of values from the outset. E.g., maybe that would be possible for modality/medium; or perhaps a preliminary set of values could be defined for which there’s good confidence that they would be widely used, with other values added over time as there is more clarity.

But coded values for some dimensions really would need to be long-term extensible, just like the list of variant subtags in BCP 47.

Of course, neither Mark nor Doug has suggested that lists of coded values would need to be defined up-front before a coding system could be useful. Only that the code sets would need to be publicly and freely documented. The Language Subtag Registry itself is a good model for that, as well as ISO 639-3.
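For anyone who hasn't looked at the registry file, here is a minimal sketch of why that model is easy to consume and to audit. It assumes the registry's plain-text record format (records separated by `%%`, `Field: value` lines, folded continuation lines starting with whitespace) and checks the stability property Mark describes in the quoted message: entries may be added or deprecated, but not removed. The sample records and the names `parse_records`, `entry_keys`, and `removed_entries` are made up for illustration; they are not taken from the registry or from any library.

```python
# Illustrative records in the registry's record format; NOT real registry data.
OLD = """\
File-Date: 2020-01-01
%%
Type: variant
Subtag: sample1
Description: First illustrative variant
Added: 2020-01-01
%%
Type: variant
Subtag: sample2
Description: Second illustrative variant
  folded onto a continuation line
Added: 2020-01-01
"""

# A later version: sample1 is deprecated (but still present), sample3 is new.
NEW = """\
File-Date: 2020-06-01
%%
Type: variant
Subtag: sample1
Description: First illustrative variant
Added: 2020-01-01
Deprecated: 2020-06-01
%%
Type: variant
Subtag: sample2
Description: Second illustrative variant
  folded onto a continuation line
Added: 2020-01-01
%%
Type: variant
Subtag: sample3
Description: Third illustrative variant
Added: 2020-06-01
"""


def parse_records(text):
    """Split registry-style text into records (dicts of field -> list of values)."""
    records, record, field = [], {}, None
    for line in text.splitlines():
        if line.strip() == "%%":            # record separator
            if record:
                records.append(record)
            record, field = {}, None
        elif line[:1].isspace() and field:  # folded continuation of the last value
            record[field][-1] += " " + line.strip()
        elif ":" in line:
            field, _, value = line.partition(":")
            field = field.strip()
            # A field such as Description may repeat, so values are kept in a list.
            record.setdefault(field, []).append(value.strip())
    if record:
        records.append(record)
    return records


def entry_keys(records):
    """Identify each entry by (Type, Subtag-or-Tag); the File-Date record is skipped."""
    keys = set()
    for record in records:
        name = record.get("Subtag") or record.get("Tag")
        if "Type" in record and name:
            keys.add((record["Type"][0], name[0]))
    return keys


def removed_entries(old_text, new_text):
    """Return entries present in the old version but missing from the new one."""
    return entry_keys(parse_records(old_text)) - entry_keys(parse_records(new_text))


print(removed_entries(OLD, NEW))  # prints set(): nothing was removed
```

A code set published this way can be checked by anyone with a few lines of script, which is much of what makes the registry a good model.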

In principle, one might expect the registries for the “t” and “u” extensions to be good examples, though they aren’t as good as the above examples.


  *   The documentation of the “u” extension is OK: its subtags are defined in a specific section of the LDML spec, and it’s not too hard to discover (UTS #35: Unicode Locale Data Markup Language<http://www.unicode.org/reports/tr35/#Locale_Extension_Key_and_Type_Data>); but if you want a machine-readable data file, you’ll have to figure out how to navigate all of the data contained in a CLDR release, which is not going to be easy for a newcomer. Of course, the “u” extension is primarily (solely?) used in connection with CLDR, so that’s not surprising. But it still means it’s not the best model for something based on 21636 to follow.
  *   The “t” extension really is not a good example to follow: assignment of ‘t’ subtags is done as part of the Unicode CLDR project. When RFC 6497 was written, it defined certain values, and it allows for future assignments to be made in the LDML spec; but it’s harder to discover the documentation since there’s no direct trail from the RFC or from the ‘t’ extension registration form to a specific section of the LDML spec; so the reader has to figure out it’s somewhere in the LDML spec and go navigate that doc to find it. Manageable if you’re somewhat familiar, but probably not otherwise. And, like ‘u’, any machine-readable data is buried in the CLDR data. And that’s unfortunate, really, since the ‘t’ extension is potentially useful outside of the CLDR context, unlike the ‘u’ extension.
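Both extensions ride on the same BCP 47 mechanism: a single-character subtag (the singleton) introduces a run of subtags that belongs to that extension, and `x` introduces private use. A rough sketch of pulling those runs out of a tag follows. It assumes the tag is already well-formed and does no validation of subtag lengths or of the language/script/region part. The function name `split_extensions` is made up, and the tags are illustrative examples in the style of the “u” and “t” extensions.

```python
def split_extensions(tag):
    """Split a BCP 47 tag into (base subtags, {singleton: subtags}, private-use subtags).

    Sketch only: assumes a well-formed tag and does not validate it.
    """
    subtags = tag.lower().split("-")
    base, extensions, private_use = [], {}, []
    current = base
    for i, subtag in enumerate(subtags):
        # A one-character subtag after the first position starts a new section.
        if i > 0 and len(subtag) == 1:
            if subtag == "x":
                # Private use runs to the end of the tag.
                private_use = subtags[i + 1:]
                break
            if subtag in extensions:
                # BCP 47 does not allow the same singleton to appear twice.
                raise ValueError(f"duplicate singleton {subtag!r} in {tag!r}")
            current = extensions[subtag] = []
            continue
        current.append(subtag)
    return base, extensions, private_use


for example in ("de-DE-u-co-phonebk", "und-Cyrl-t-und-latn-m0-ungegn-2007"):
    base, extensions, private_use = split_extensions(example)
    print(example, base, extensions, private_use)
```

Finding which subtags belong to “t” or “u” is the easy part; what the values mean still has to be looked up in the LDML spec or the CLDR data, which is the discoverability problem described above.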


Peter

From: Ietf-languages <ietf-languages-bounces@ietf.org> On Behalf Of Mark Davis ☕️
Sent: Wednesday, November 25, 2020 9:26 AM
To: Doug Ewell <doug@ewellic.org>
Cc: ietf-languages@iana.org; Sebastian Drude <drude@xs4all.nl>
Subject: Re: [Ietf-languages] Language subtag registration form

> they must be specified concretely somewhere

I'd like to emphasize that the list must be freely available (not behind the common ISO paywall), versioned, and stable (in the sense that new items can be added, and existing items can be deprecated, but nothing can be removed). That is, similar to what we see in https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry and in https://iso639-3.sil.org/code_tables/download_tables for ISO 639-3.

Mark


On Tue, Nov 24, 2020 at 4:21 PM Doug Ewell <doug@ewellic.org> wrote:
I'd be willing to work with the ISO TC 21636 folks, as a specialist in encoding things rather than in languages or linguistics per se.

Keep in mind that in order to make 21636 work with BCP 47, it really has to be framed as an extension, and the allowable values for each of the eight dimensions may not be open-ended; they must be specified concretely somewhere. See https://tools.ietf.org/html/rfc5646#section-2.2.6 for more information about BCP 47 extensions. Changing the syntax of BCP 47 tags to add new subtag types is almost certainly a non-starter.

Regarding "sociolect," Ben may be open to changing that one instance of the word. Alternatively, it could be argued that users of Arcaicam Esperantom do indeed belong to a particular social group: that subset of Esperanto users who wish to explore a posited "early form."

--
Doug Ewell, CC, ALB | Thornton, CO, US | ewellic.org


From: Ietf-languages <ietf-languages-bounces@ietf.org> On Behalf Of Sebastian Drude
Sent: Tuesday, November 24, 2020 16:36
To: ietf-languages@iana.org
Subject: Re: [Ietf-languages] Language subtag registration form

Hi everyone,
This is the first time I have replied directly to such a request; please forgive me if I make mistakes or do not know the correct procedures.
I agree with the proposal; it makes sense to me to have a subtag for this variety of Esperanto.
The only thing I do not agree with is the characterization as a "Sociolect" -- a variety of a language which is characteristic of a certain (social) group within the whole speaker community.  As the applicant explains, the same speakers/writers may use it in the same text, so this is (according to the upcoming ISO TC 21636) clearly some kind of register, not a sociolect.
Which brings us back to the question of how in the future the 8 dimensions identified in ISO TC 21636 can be reflected/applied in the current frameworks (in particular BCP 47).  Are there members of this group who would like to participate with me in a working group that would work out proposals for that?
(Now I see that in point (4) of the proposal 'Arcaicam Esperantom' is also correctly identified as such, so the characterization as sociolect in (3) is probably a mistake or misunderstanding.)
