Re: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry

r12a <ishida@w3.org> Thu, 13 August 2020 10:54 UTC

To: Doug Ewell <doug@ewellic.org>, 'Daniel LaVon Billings' <daniel=40ChurchofJesusChrist.org@dmarc.ietf.org>
Cc: ietf-languages@ietf.org
References: <CY4PR0401MB36203305BEFEBF938B654E8FC6420@CY4PR0401MB3620.namprd04.prod.outlook.com> <000201d670e8$d25e7e60$771b7b20$@ewellic.org> <CY4PR0401MB362045E1E4D11D92E1F89443C6420@CY4PR0401MB3620.namprd04.prod.outlook.com> <001a01d670ed$9c868530$d5938f90$@ewellic.org>
From: r12a <ishida@w3.org>
Message-ID: <f4fa9f5c-3bb6-6b27-f294-7df9e0afa3d4@w3.org>
Date: Thu, 13 Aug 2020 11:54:24 +0100
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:52.0) Gecko/20100101 PostboxApp/7.0.25
MIME-Version: 1.0
In-Reply-To: <001a01d670ed$9c868530$d5938f90$@ewellic.org>
Content-Type: multipart/alternative; boundary="------------A496956A7176112D77138D92"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/N0aB3Bj2yjDRuINO6SScKWIEu7w>
Subject: Re: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry

Doug Ewell wrote on 12/08/2020 22:14:
> We can certainly check with ISO 15924/RA-JAC to see if there is any 
> unstated expectation that ‘Arab’ implies the Naskh variant.
>

I would hope not, since Naskh is only one of several writing styles used 
for Arabic.  These include Naskh, Nastaliq (Aran), Ruq'a, Kano, Kufi, 
and so on.  If Arab were equated with Naskh only, we'd be stuck for what 
to use to represent text written in the other styles.

I would have thought that, generally speaking, the presence of ur would 
already indicate that an application should by default use a nastaliq 
font for Urdu (and ks for Kashmiri), without the need to further 
qualify.  Additional subtags are mostly useful for modifying the default 
assumptions that come with the language, rather than completing the intent.

It seems to me that Aran might appeal for languages such as Persian, 
which are commonly written in naskh style, but can be written in a kind 
of nastaliq, so the -Aran marker could help to indicate that 
distinction. In a similar way, then, -Arab could be used after ur to 
indicate that a non-nastaliq font should be used.  But the problem here 
seems to be that -Arab and -Aran only cover a tiny subset of the 
writing-style identifiers that are actually needed. 
There are also other places where it would be useful to distinguish 
between particular styles. For example, Hausa in Arabic script can be 
written with the Hafs or Warsh orthographies (typically requiring 
different fonts because they include different character repertoires), 
but in Nigeria Hausa also uses the Kano writing style.  One might also 
want to label text that uses a Maghrebi-style font in North Africa, etc.
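
To make the defaulting idea concrete, here is a minimal Python sketch of how 
an application might pick a writing style from a tag. Everything in it is an 
assumption for illustration: the DEFAULT_STYLE table, the preferred_style 
name, and the style labels are not defined by BCP 47 or the registry, and 
the script-subtag detection is deliberately simplified.

```python
# Assumed per-language default writing styles (illustrative, not normative).
DEFAULT_STYLE = {"ur": "nastaliq", "ks": "nastaliq", "fa": "naskh"}

def preferred_style(tag):
    """Return a style hint for a language tag, or None if nothing is known."""
    subtags = tag.lower().split("-")
    # Start from the default that comes with the language subtag.
    style = DEFAULT_STYLE.get(subtags[0])
    for sub in subtags[1:]:
        # Simplification: treat the first 4-letter alphabetic subtag as the script.
        if len(sub) == 4 and sub.isalpha():
            if sub == "aran":
                style = "nastaliq"
            elif sub == "arab" and style == "nastaliq":
                # -Arab after a nastaliq-default language modifies the default.
                style = "non-nastaliq"
            break
    return style

print(preferred_style("ur"))       # default for Urdu
print(preferred_style("ur-Arab"))  # default overridden by -Arab
print(preferred_style("fa-Aran"))  # default overridden by -Aran
```

Note that the sketch can only ever distinguish nastaliq from non-nastaliq. 
It has no way to ask for Kano, Ruq'a, or a Maghrebi style, which is the 
limitation described above.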

I think the -Arab subtag is mostly useful for languages (such as the 
many in Central Asia and nearby) that can be written in more than one 
script, and where you need to explicitly distinguish whether, say, 
Latin, Arabic, or Cyrillic, should be used for a given bit of text. But 
even there my personal preference is only to use the script tag when I 
need to make a meaningful distinction, not all the time.
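
As a rough illustration of that preference, the following Python sketch adds 
a script subtag only when it differs from an assumed default for the 
language. The ASSUMED_DEFAULT_SCRIPT table and the minimal_tag function are 
hypothetical names with made-up example entries. The idea is similar in 
spirit to the registry's Suppress-Script field, but the sketch does not read 
the registry.

```python
# Hypothetical default scripts per language, for illustration only.
ASSUMED_DEFAULT_SCRIPT = {"ur": "Arab", "ru": "Cyrl", "en": "Latn"}

def minimal_tag(lang, script):
    """Return lang alone when script is the assumed default, else lang-Script."""
    if ASSUMED_DEFAULT_SCRIPT.get(lang) == script:
        return lang
    # No known default, or a non-default script: the distinction is meaningful.
    return f"{lang}-{script}"

print(minimal_tag("ur", "Arab"))
print(minimal_tag("uz", "Cyrl"))
```

A language with no entry in the table, such as one written in several 
scripts, always keeps its script subtag.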

I find myself wondering whether Aran ought really to have been a variant 
subtag, to which we could add others for different writing styles, 
particularly because some of the usage distinctions just mentioned can't 
be expressed by combining language and region tags.


ri