Re: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry

Hugh Paterson III <sil.linguist@gmail.com> Thu, 13 August 2020 12:49 UTC

References: <CY4PR0401MB36203305BEFEBF938B654E8FC6420@CY4PR0401MB3620.namprd04.prod.outlook.com> <000201d670e8$d25e7e60$771b7b20$@ewellic.org> <CY4PR0401MB362045E1E4D11D92E1F89443C6420@CY4PR0401MB3620.namprd04.prod.outlook.com> <001a01d670ed$9c868530$d5938f90$@ewellic.org> <f4fa9f5c-3bb6-6b27-f294-7df9e0afa3d4@w3.org>
In-Reply-To: <f4fa9f5c-3bb6-6b27-f294-7df9e0afa3d4@w3.org>
From: Hugh Paterson III <sil.linguist@gmail.com>
Date: Thu, 13 Aug 2020 14:49:27 +0200
Message-ID: <CAE=3Ky-ZR1py3+Ok1i+YjDR-WUH1Q=0bahZhcAC_Y+i+xc80Cw@mail.gmail.com>
To: r12a <ishida@w3.org>
Cc: Daniel LaVon Billings <daniel=40ChurchofJesusChrist.org@dmarc.ietf.org>, Doug Ewell <doug@ewellic.org>, ietf-languages@ietf.org
Content-Type: multipart/alternative; boundary="000000000000b7874a05acc1bc58"
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/1vEkn-3CFR3kePisq1Spo7EPdkU>
Subject: Re: [Ietf-languages] Suggestion to update Urdu Script Designation in the subtag registry

ri,

In Nigeria, Hausa can also be written with the Latin script. Where can I go
to find what the basic default assumptions are for a language tag? Is the
default always Latin?

Here is an interesting case where I have done some research. The Ivory
Coast and Liberia share a language called Dan. I know of four orthographies
used in print for Dan, all of them Latin or Latin with borrowings from
Cyrillic: one in Liberia, three in CI. So dnj-CI is not sufficient to
distinguish the three in the Ivory Coast. My work focuses on the production
of optimized keyboard layouts which are orthography-specific, so I use the
-x- component of BCP 47 to distinguish the texts and the tools. This seems
to be a valid way, but is it the best way? It seems to me that the script
and the orthography layers are independent (and perhaps also the writing
style), and orthography is only explicitly addressed by the sub-tag
registry, e.g. the German language tags, including the 1996-related tag.
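
To make the -x- approach concrete, here is a rough sketch of how such tags
could be built and taken apart again. It assumes hyphen-separated BCP 47
tags (dnj-CI) and relies on the rule that each private-use subtag is 1 to 8
ASCII letters or digits. The function names and the orthography labels
(orth1, orth2, orth3) are placeholders invented for illustration, not the
labels actually in use.

```python
import re

# Each private-use subtag in BCP 47 is 1 to 8 ASCII letters or digits.
_PRIVATE_SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def with_orthography(base_tag, orthography):
    """Append a private-use orthography label (hypothetical) to a tag."""
    if not _PRIVATE_SUBTAG.fullmatch(orthography):
        raise ValueError(f"not a valid private-use subtag: {orthography!r}")
    if "-x-" in f"-{base_tag.lower()}-":
        # The tag already has a private-use section, so extend it
        # instead of adding a second "x" singleton.
        return f"{base_tag}-{orthography}"
    return f"{base_tag}-x-{orthography}"


def split_private_use(tag):
    """Return (part before the "x" singleton, list of private-use subtags)."""
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "x" not in lowered:
        return tag, []
    i = lowered.index("x")
    return "-".join(subtags[:i]), subtags[i + 1:]


# Three placeholder labels standing in for the three orthographies in CI.
tags = [with_orthography("dnj-CI", label) for label in ("orth1", "orth2", "orth3")]
for tag in tags:
    print(tag, split_private_use(tag))
```

A tool that does not know the private labels can still fall back to the
part before -x-, which is the main attraction of this layout. The cost is
that the labels mean nothing outside a private agreement, unlike a
registered variant subtag.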

All the best,
- Hugh


On Thu, Aug 13, 2020 at 12:54 PM r12a <ishida@w3.org> wrote:

> Doug Ewell wrote on 12/08/2020 22:14:
>
> We can certainly check with ISO 15924/RA-JAC to see if there is any
> unstated expectation that ‘Arab’ implies the Naskh variant.
>
>
> I would hope not, since Naskh is only one of several writing styles used
> for Arabic.  These include Naskh, Nastaliq (Aran), Ruq'a, Kano, Kufi, and
> so on.  If Arab was equated with naskh only, we'd be stuck for what to use
> to represent text written in the other styles.
>
> I would have thought that, generally speaking, the presence of ur would
> already indicate that an application should by default use a nastaliq font
> for Urdu (and ks for Kashmiri), without the need to further qualify.
> Additional subtags are mostly useful for modifying the default assumptions
> that come with the language, rather than completing the intent.
>
> It seems to me that Aran might appeal for languages such as Persian, which
> are commonly written in naskh style, but can be written in a kind of
> nastaliq, so the -Aran marker could help to indicate that distinction. In a
> similar way, then, -Arab could be used after ur to indicate that a
> non-nastaliq font should be used.  But the problem here seems to be that
> -Arab and -Aran only work for a tiny subset of the actual list of
> writing-style identifiers that are actually needed. There are also other
> places where it would be useful to distinguish between particular styles.
> For example, Hausa in Arabic script can be written with the hafs or warsh
> orthographies (typically requiring different fonts because they include
> different character repertoires), but in Nigeria Hausa also uses the Kano
> writing-style.  One might also want to label text that uses a Maghrebi style
> font in North Africa. Etc.
>
> I think the -Arab subtag is mostly useful for languages (such as the many
> in Central Asia and nearby) that can be written in more than one script,
> and where you need to explicitly distinguish whether, say, Latin, Arabic,
> or Cyrillic, should be used for a given bit of text. But even there my
> personal preference is only to use the script tag when I need to make a
> meaningful distinction, not all the time.
>
> I find myself wondering whether Aran ought really to have been a variant
> subtag, to which we could add others for different writing styles.  In
> particular, because some of the usage distinctions just mentioned can't be
> expressed by combining language and region tags.
>
>
> ri