Re: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry

Peter Constable <pgcon6@msn.com> Wed, 12 August 2020 21:37 UTC

From: Peter Constable <pgcon6@msn.com>
To: "ietf-languages@ietf.org" <ietf-languages@ietf.org>
Thread-Topic: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry
Thread-Index: AdZw0YwxrG6XW+9cQ1W5H4C6dzxxrAAF0WyAAAAbZnAAARceAAAAu+vQ
Date: Wed, 12 Aug 2020 21:37:08 +0000
Message-ID: <MWHPR1301MB21126A9D480224E7E175C85686420@MWHPR1301MB2112.namprd13.prod.outlook.com>
References: <CY4PR0401MB36203305BEFEBF938B654E8FC6420@CY4PR0401MB3620.namprd04.prod.outlook.com> <000201d670e8$d25e7e60$771b7b20$@ewellic.org> <CY4PR0401MB362045E1E4D11D92E1F89443C6420@CY4PR0401MB3620.namprd04.prod.outlook.com> <001a01d670ed$9c868530$d5938f90$@ewellic.org>
In-Reply-To: <001a01d670ed$9c868530$d5938f90$@ewellic.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/FhkF4OhtDM5_ESoAmOf6tpdr2h0>
Subject: Re: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry

> We can certainly check with ISO 15924/RA-JAC to see if there is any unstated expectation that ‘Arab’ implies the Naskh variant.

I am certain that ‘Arab’ does not imply _any_ style variant of Arabic script. One has to look no further than Unicode character properties to see that.


Peter

From: Ietf-languages <ietf-languages-bounces@ietf.org> On Behalf Of Doug Ewell
Sent: Wednesday, August 12, 2020 2:15 PM
To: 'Daniel LaVon Billings' <daniel=40ChurchofJesusChrist.org@dmarc.ietf.org>; ietf-languages@ietf.org
Subject: Re: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry

You are, of course, perfectly at liberty to tag content as “ur-Aran” to specify the Nastaliq variant, just as you can use “cmn-Latn” to specify Mandarin written in the Latin script. Neither BCP 47 nor the contents of the Registry is locking or holding back anyone in that regard.

Suppress-Script isn’t really meant as a font selection device for any language. There are hundreds or thousands of languages known to be written predominantly in a particular script, for which there is no Suppress-Script value.
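One way an application could keep font choice separate from the registry is sketched below in Python. Everything in it is an assumption for illustration: the function name `pick_font`, the two lookup tables, and the font family names are hypothetical application policy, not anything defined by BCP 47 or the Registry. An explicit script subtag in the tag wins; otherwise the application falls back to its own per-language default.

```python
# Hypothetical application policy; none of this comes from the Registry.
FONT_BY_SCRIPT = {
    "Aran": "Noto Nastaliq Urdu",  # assumed font family name
    "Arab": "Noto Naskh Arabic",   # assumed font family name
}
# The application's own choice: untagged Urdu is rendered in Nastaliq.
DEFAULT_SCRIPT_BY_LANGUAGE = {"ur": "Aran"}


def pick_font(tag: str, fallback: str = "sans-serif") -> str:
    """Choose a font family for a language tag.

    Simplified: assumes the shape language[-script][-...] with no extlang
    subtags; this is not a full BCP 47 parser.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    script = None
    # A script subtag is four letters and directly follows the language.
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        script = subtags[1].title()
    if script is None:
        script = DEFAULT_SCRIPT_BY_LANGUAGE.get(language)
    return FONT_BY_SCRIPT.get(script, fallback)


print(pick_font("ur"))
print(pick_font("ur-Arab"))
```

With a table like this, content tagged plain “ur” still gets a Nastaliq font, and the Suppress-Script value in the Registry never enters the decision.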

We can certainly check with ISO 15924/RA-JAC to see if there is any unstated expectation that ‘Arab’ implies the Naskh variant.

--
Doug Ewell | Thornton, CO, US | ewellic.org


From: Ietf-languages <ietf-languages-bounces@ietf.org> On Behalf Of Daniel LaVon Billings
Sent: Wednesday, August 12, 2020 14:49
To: Doug Ewell <doug@ewellic.org>; ietf-languages@ietf.org
Subject: Re: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry

It seems reasonable to assume that Arab was created to signify the Naskh variant, because otherwise there would be no reason to create the Aran script code. We need our applications to use a Nastaliq font whenever Urdu is requested, since that is the standard for Urdu, but the subtag registry currently conflicts with that approach. We have plenty of use cases for cmn-Latn (for Romanized Chinese text) and for other deviations from standard tagging where we know why they differ from the standard, but in Urdu’s case we would never want Urdu to use a non-Nastaliq font.

Daniel Billings |  Internationalization and Translation Systems Manager
Language Services and Area Support
Publishing Services Department
daniel@churchofjesuschrist.org

“We shall not fight our battles alone. There is a just God who presides over the destinies of nations, and who will raise up friends to fight our battles for us. The battle, sir, is not to the strong alone; it is for the vigilant, the active, the brave.” – Patrick Henry

From: Doug Ewell <doug@ewellic.org>
Sent: Wednesday, August 12, 2020 2:41 PM
To: Daniel LaVon Billings <daniel@ChurchofJesusChrist.org>; ietf-languages@ietf.org
Subject: RE: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry

Hi Daniel,

The purpose of Suppress-Script in BCP 47 is to improve compatibility between BCP 47 applications and those written to older specifications, which did not support script subtags.

There was a great deal of concern, in the mid-’00s when RFC 4646 was being developed, that the new script subtags would be overused, so that, for example, users who previously tagged English content as “en” or “en-US” would start tagging it as “en-Latn” or “en-Latn-US” instead. This would add virtually no information to the tag, because English is normally written in the Latin script; but it could cause compatibility problems with processes that did not understand the script subtag. Suppress-Script was devised as a way to discourage users from adding unnecessary script subtags like this. It is optional, pragmatic, and suggestive in nature; it does not attempt to provide a scholarly reference about the language.
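The effect described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular BCP 47 library works: the function name `drop_suppressed_script` is hypothetical, and the `SUPPRESS_SCRIPT` table is a two-entry excerpt (the ‘Arab’ value for “ur” is the one under discussion in this thread; ‘Latn’ for “en” matches the English example).

```python
# Illustrative excerpt of Suppress-Script values, not the full Registry.
SUPPRESS_SCRIPT = {"en": "Latn", "ur": "Arab"}


def drop_suppressed_script(tag: str) -> str:
    """Remove the script subtag when it equals the language's Suppress-Script.

    Simplified: assumes the shape language[-script][-...] with no extlang
    subtags; this is not a full BCP 47 parser.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    # A script subtag is four letters and directly follows the language.
    if len(subtags) > 1 and len(subtags[1]) == 4 and subtags[1].isalpha():
        script = subtags[1].title()
        if SUPPRESS_SCRIPT.get(language) == script:
            return "-".join([language] + subtags[2:])
    return tag


print(drop_suppressed_script("en-Latn-US"))  # en-US
print(drop_suppressed_script("ur-Aran"))     # ur-Aran
```

The second call shows why the current value does not block anyone: “ur-Aran” passes through unchanged, because only a script subtag equal to the Suppress-Script value is considered redundant.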

By changing the Suppress-Script for Urdu from ‘Arab’ to ‘Aran’, we would be essentially saying that the tag “ur-Arab” does add significant information beyond the tag “ur”, which is not true (most Urdu content is indeed written in the Arabic script) and in my opinion would be a step backward. Note that there is no corresponding script subtag for “Arabic script (Naskh variant).”

I suspect, somewhat echoing Peter, that most users do not even know ‘Aran’ exists or why it is separately encoded. While I understand some of the thought process behind this proposal, I agree that the change should not be made.

--
Doug Ewell | Thornton, CO, US | ewellic.org


From: Ietf-languages <ietf-languages-bounces@ietf.org> On Behalf Of Daniel LaVon Billings
Sent: Wednesday, August 12, 2020 11:56
To: ietf-languages@ietf.org
Subject: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry

Urdu is listed with the Arab script in the subtag registry, when it should have the newer, approved Aran designation:

https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

Urdu should be using the Aran script, not the Arab script:

%%
Type: language
Subtag: ur
Description: Urdu
Added: 2005-10-16
Suppress-Script: Arab
%%


%%
Type: script
Subtag: Aran
Description: Arabic (Nastaliq variant)
Added: 2014-12-11
%%
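For readers who want to inspect such records programmatically, here is a rough Python sketch that splits registry text on the “%%” separator lines and collects the “Field: value” pairs. The function name `parse_records` and the `SAMPLE` string are illustrative assumptions; the sample holds only the two records quoted above. Note that the real registry file begins with a File-Date record, which this sketch would return as an ordinary record.

```python
def parse_records(text: str) -> list[dict[str, list[str]]]:
    """Split registry text into records, mapping field names to value lists.

    Values are lists because a field such as Description may repeat.
    """
    records = []
    current = {}
    last_field = None
    for line in text.splitlines():
        if line.strip() == "%%":
            # Separator line: close the record in progress, if any.
            if current:
                records.append(current)
            current, last_field = {}, None
        elif not line.strip():
            continue  # tolerate stray blank lines
        elif line[0].isspace() and last_field is not None:
            # Folded continuation line: append to the previous value.
            current[last_field][-1] += " " + line.strip()
        elif ":" in line:
            name, _, value = line.partition(":")
            last_field = name.strip()
            current.setdefault(last_field, []).append(value.strip())
    if current:
        records.append(current)
    return records


SAMPLE = """\
%%
Type: language
Subtag: ur
Description: Urdu
Added: 2005-10-16
Suppress-Script: Arab
%%
Type: script
Subtag: Aran
Description: Arabic (Nastaliq variant)
Added: 2014-12-11
%%
"""

for record in parse_records(SAMPLE):
    print(record["Type"][0], record["Subtag"][0], record.get("Suppress-Script"))
```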

How can we get the subtag registry to be updated?

Daniel Billings |  Internationalization and Translation Systems Manager
Language Services and Area Support
Publishing Services Department
daniel@churchofjesuschrist.org

“We shall not fight our battles alone. There is a just God who presides over the destinies of nations, and who will raise up friends to fight our battles for us. The battle, sir, is not to the strong alone; it is for the vigilant, the active, the brave.” – Patrick Henry