[Ietf-languages] Proto-languages

Richard Wordingham <richard.wordingham@ntlworld.com> Sat, 02 December 2023 12:03 UTC

Date: Sat, 02 Dec 2023 12:03:13 +0000
From: Richard Wordingham <richard.wordingham@ntlworld.com>
To: IETF Languages Discussion <ietf-languages@iana.org>
Message-ID: <20231202120313.6cb2007a@JRWUBU2>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/k8PNi3pNkJF-LsY7NxMsThW5vYE>
Subject: [Ietf-languages] Proto-languages

Greetings,

Some while ago, we had a brief discussion of the English Wiktionary's
non-conformant usage of language tags.  I have since discovered that
these non-conformant tags were being used as language tags in
the HTML of the web site!  However, sentiment amongst the editors has
moved towards it being a good idea to emit conformant language tags, to
the extent that we are talking about generating a table converting
Wiktionary language tags to BCP 47-conformant tags, not necessarily
preserving Wiktionary distinctions.

There are roughly 624 non-compliant Wiktionary language tags, whereas
there are only about 520 private use language subtags available.  A
programme to eliminate the need for private-use subtags will result in
a flow of registration requests.
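The private-use figure can be checked by enumeration.  The following is
a minimal sketch, assuming the RFC 5646 private-use range of primary
language subtags 'qaa' through 'qtz'; the count of 624 is the rough
figure quoted above, and the variable names are only illustrative.

```python
import string

letters = string.ascii_lowercase

# RFC 5646 reserves the primary language subtags 'qaa'..'qtz' for
# private use: 'q', then a letter from 'a' to 't', then any letter.
second_letters = letters[: letters.index("t") + 1]
private_use = [f"q{b}{c}" for b in second_letters for c in letters]

# Rough count of non-compliant Wiktionary tags quoted in this message.
NON_COMPLIANT_TAGS = 624

# Number of tags that could not be given a private-use language subtag.
shortfall = NON_COMPLIANT_TAGS - len(private_use)

print(len(private_use), shortfall)
```

On these numbers, about a hundred tags would be left without a
private-use language subtag of their own.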

A lot of these non-compliant language tags are for proto-languages, and
I foresee major problems in registering them.  ISO 639-3 prohibits
reconstructed proto-languages, and the number of documents produced in
proto-languages is quite low.  Is there any point in trying to register
them with ISO 639? If so, how do we do it?

Another possibility I see is to register a variant subtag 'proto'
prefixable by any language subtag denoting a language family (formally,
we might have to generalise to anything with the scope of 'collection')
to denote the proto-language of the family.  If ISO 639 automatically
rejects mere proto-languages, is this in principle acceptable?  We
might have to list individual prefixes for the variant subtag, as
roa-proto would denote some variant of Latin (cf. la-peano for
Interlingua).
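To make the variant idea concrete, here is a minimal sketch, assuming a
variant subtag 'proto' were registered with an explicit prefix list.
The subtag is not registered, and ALLOWED_PREFIXES and proto_tag() are
hypothetical names chosen for illustration.  The pattern follows the
RFC 5646 syntax for variant subtags (5 to 8 alphanumerics, or 4
starting with a digit).

```python
import re

# RFC 5646 variant syntax: 5*8alphanum / (DIGIT 3alphanum).
VARIANT_RE = re.compile(r"^(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3})$")

# Hypothetical prefix list: Indo-European, Germanic and Slavic family
# codes.  'roa' is deliberately left out, since roa-proto would denote
# some variant of Latin.
ALLOWED_PREFIXES = {"ine", "gem", "sla"}


def proto_tag(family: str) -> str:
    """Build a tag for the proto-language of a listed family."""
    if family not in ALLOWED_PREFIXES:
        raise ValueError(f"'proto' has no registered prefix {family!r}")
    return f"{family}-proto"


# 'proto' is syntactically acceptable as a variant subtag.
assert VARIANT_RE.match("proto")
print(proto_tag("gem"))
```

Listing prefixes individually keeps cases such as 'roa' out, at the
cost of one registry update per family.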

A third possibility is to go for BCP 47's 5- to 8-character language
subtags.  Would formal ISO 639 rejections be required?
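For reference, the RFC 5646 grammar sorts primary language subtags by
length alone.  The sketch below assumes only that grammar; the function
name is illustrative and 'protogem' is a made-up subtag, not a proposal.

```python
import re


def classify_primary_subtag(subtag: str) -> str:
    """Classify a primary language subtag by its RFC 5646 shape."""
    if not re.fullmatch(r"[A-Za-z]{2,8}", subtag):
        return "invalid"
    if len(subtag) <= 3:
        return "ISO 639 code"
    if len(subtag) == 4:
        return "reserved for future use"
    return "registered language subtag"


# 'protogem' is a hypothetical 8-letter subtag used only as input here.
for candidate in ("gem", "abcd", "protogem"):
    print(candidate, "->", classify_primary_subtag(candidate))
```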

Richard.