Re: [Ietf-languages] Punjabi language code fix recommendations

Doug Ewell <doug@ewellic.org> Fri, 12 August 2022 06:29 UTC

From: Doug Ewell <doug@ewellic.org>
To: "ietf-languages@ietf.org" <ietf-languages@ietf.org>
CC: "bgo_eiu (OSM mailing list email)" <bee.yourself100@protonmail.com>
Date: Fri, 12 Aug 2022 06:29:02 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/7k99WEcZ_cg3wK02IxGUHEMA2Y4>

bgo_eiu (OSM mailing list email) wrote:

> * Remove "Guru" as suppress script on code "pa". The majority of
> Punjabi speakers do not read or write in Gurmukhi. This was already
> requested in 2012 and not followed through on for unclear reasons.

You may submit a form to remove the Suppress-Script property from this subtag, in accordance with Section 3.5 of BCP 47 (https://datatracker.ietf.org/doc/html/rfc5646#section-3.5). However, the request is probably based on incorrect assumptions; to explain, I will jump to your next point.

> * Rename the English language label for "pnb" to be just "Punjabi".

Language subtags in the IANA Language Subtag Registry are based on the ISO 639-1, 639-2, 639-3, and 639-5 standards. BCP 47 requires that the Registry accept the judgment of the respective ISO Registration Authorities in determining what is a language, and what a language is called. Human language is a complex field of study and there are many well-known differences of opinion; BCP 47 does not aim to add even more opinions and controversy to this situation.

The ISO 639 family of standards considers these two to be separate languages, and has encoded them separately. Therefore, they will remain as two entities in the Registry, even though some people may disagree with this analysis.

The Description field in question could not be changed as you suggest in any case, because that would cause a conflict between 'pa' and 'pnb' (Section 3.1.5). Doing this intentionally to try to show an equivalence between the two entries which is not supported by ISO 639-3 is not an option.
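
As a rough illustration of the conflict described above, the following Python sketch checks whether any Description value is shared by two records of the same type. The record layout, the function names, and the Description values are assumptions made for this example (simplified, hand-written entries, not a copy of the live Registry), and exact string matching is assumed as the comparison rule.

```python
from collections import defaultdict

# Simplified, hand-written records for illustration only.
# The Description values are assumed; check the current Registry before relying on them.
RECORDS = [
    {"Type": "language", "Subtag": "pa", "Description": ["Panjabi", "Punjabi"]},
    {"Type": "language", "Subtag": "pnb", "Description": ["Western Panjabi"]},
]


def description_conflicts(records):
    """Map (type, description) to the subtags sharing it, for shared descriptions only."""
    owners = defaultdict(list)
    for rec in records:
        for desc in rec["Description"]:
            owners[(rec["Type"], desc)].append(rec["Subtag"])
    return {key: subtags for key, subtags in owners.items() if len(subtags) > 1}


def with_renamed(records, subtag, new_descriptions):
    """Return a copy of the records with one record's descriptions replaced (hypothetical helper)."""
    return [
        dict(rec, Description=list(new_descriptions)) if rec["Subtag"] == subtag else rec
        for rec in records
    ]


# The records as written above do not collide.
print(description_conflicts(RECORDS))

# Renaming 'pnb' to just "Punjabi" collides with the existing 'pa' description.
print(description_conflicts(with_renamed(RECORDS, "pnb", ["Punjabi"])))
```

Under these assumptions, the first call reports no conflicts and the second reports "Punjabi" as shared by 'pa' and 'pnb', which is the situation the proposed rename would create.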

> This language has two codes, which are used differently for different
> software platforms. For example, pa is used on Android for both
> Punjabi in Gurmukhi script and Shahmukhi script, whereas on Wikimedia
> pa is used for Gurmukhi and pnb is used for Shahmukhi.

Wikimedia appears to be closer to the ISO 639-3 intent here, encoding (Eastern) Punjabi in Gurmukhi as "pa", and (Western) Punjabi in Shahmukhi as "pnb".

The ISO 639 determinations of "what is a language" are not intended to be influenced by how a given application, operating system, or website chooses to encode languages.

> They do not refer to different languages or varieties though and
> should just be maintained as necessary duplicates of each other.

ISO 639-3 encodes them separately, and so the Registry does so as well, and will continue to do so. The concept of "duplicates" that you describe does not exist in BCP 47.

> * Tying into that, "Western Punjabi" and "Lahnda" are not real
> languages, macrolanguages or varieties of any language. [...]
> In light of this, please remove any references to "lah" as a
> macrolanguage, from "pnb" and anywhere else it appears.

The concept and definition of "macrolanguage" belong to ISO 639-3. The Registry simply reflects the definitions and assignments made in 639-3, not any of our personal judgments. Therefore, these changes will not be made in the Registry either.

Be especially aware that the concept of "macrolanguage," which again is an ISO 639-3 concept, does not necessarily mean what many people believe it means.

If you wish to see changes in the core ISO 639-3 standards, you can contact the Registration Authority at https://iso639-3.sil.org/ .

Now, back to the Suppress-Script question. The discussion 10 years ago started with a request from Andrew Glass:
https://mailarchive.ietf.org/arch/msg/ietf-languages/CrDEpLsO_9_hMFeRJmA8BMX-RjY/

which was questioned by John Cowan along the same lines as this discussion: Andrew may have been thinking of Punjabi as a single language, but that is not how it is coded in ISO 639-3, nor (therefore) in the Registry. The discussion ended because Andrew did not follow up, and there was no formal proposal to make the change.
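
For readers unfamiliar with Suppress-Script, here is a minimal Python sketch of its usual effect on tag construction: when the script subtag equals the language's Suppress-Script value, it is normally left out of the tag. The lookup table and the `make_tag` function are hypothetical and contain only the single entry discussed in this thread; this is not a full BCP 47 implementation.

```python
# Hypothetical lookup table with only the entry discussed in this thread.
SUPPRESS_SCRIPT = {"pa": "Guru"}


def make_tag(language, script=None):
    """Build a language tag, omitting a script that matches the language's Suppress-Script."""
    if script is None or SUPPRESS_SCRIPT.get(language) == script:
        return language
    return f"{language}-{script}"


print(make_tag("pa", "Guru"))   # script is suppressed for 'pa'
print(make_tag("pa", "Arab"))   # a different script is kept
print(make_tag("pnb", "Arab"))  # no entry in this table, so the script is kept
```

In this sketch, removing the "pa" entry from the table would simply cause "pa-Guru" to be produced with its script subtag written out.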

> * Add script subcodes for the following varieties of the Perso-Arabic
> based scripts used in Pakistan:
>   - Urdu
>   - Punjabi Shahmukhi
>   - Saraiki Shahmukhi
>   - Sindhi
>   - Pashto

Just as language subtags are determined by ISO 639, script subtags are determined by ISO 15924. They are not added, changed, or deleted in the Registry by this group so as to create a mismatch with the standard. However, it is possible to propose the registration of variant subtags for script varieties. You should read Section 3.5 (and really, all of Section 3) for information on submitting such proposals.

> * If you wish to include variant codes for dialect distinctions within
> Punjabi,

Dialects, like script variations, can be captured in the Registry using variant subtags. The fluid nature of these dialects that you describe, however, may pose a challenge to getting them encoded.

I hope this clarifies some misconceptions about how BCP 47 and the Registry work.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org