Re: [Ietf-languages] Between language and script in Burmese

Martin Hosken <martin_hosken@sil.org> Thu, 18 November 2021 02:05 UTC

Date: Thu, 18 Nov 2021 09:05:11 +0700
From: Martin Hosken <martin_hosken@sil.org>
To: Simon Cozens <simon@simon-cozens.org>
Cc: Peter Constable <pgcon6@msn.com>, "ietf-languages@ietf.org" <ietf-languages@ietf.org>
Message-ID: <20211118090511.305c280c@silmh9>
In-Reply-To: <f90cc1e1-69ef-006a-bace-51b95070b59d@simon-cozens.org>
References: <6690448e-380c-e7a7-9d0a-320066e20eae@simon-cozens.org> <MWHPR1301MB2112CAE05F699F78DFA28489869A9@MWHPR1301MB2112.namprd13.prod.outlook.com> <f90cc1e1-69ef-006a-bace-51b95070b59d@simon-cozens.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/Qe08fqRHY2HjP5u5b9LP4z8x3gw>
Subject: Re: [Ietf-languages] Between language and script in Burmese

Dear All,

Sorry, I was away for a few days.

> > You've touched on the easier cases that can be supported now in BCP 47. For Mon, you suggested mnw-TH could be used, though I would also consider mnw-Thai versus mnw-Mymr  
> 
> I'm not sure this is right. mnw-Thai would be Mon written in the Thai 
> script (not what we want), and mnw-Mymr would be Mon (not Thai Mon) 
> written in Myanmar script; neither of which would be referring to the 
> Thai Mon language written in the Thai Mon orthographic variant of 
> Myanmar script.

I agree with Simon here. mnw-Thai is Mon in Thai script for which there is no evidence (which seems odd to me).

> > For Pali written in Myanmar script, just that much information should be tagged using pi-Mymr, and that tag alone would convey _nothing_ about orthographic or typographic variants.
> 
> Right. So that's obviously not enough.
> 
> > If there are finer distinctions to be made (certainly for orthographic differences), then additional subtags would be needed. If orthographic distinctions correlate closely with region differences, then region subtags could be used to capture that. But for the cases you mention, region is not a good correlate. For those cases, variant subtags would be needed.
> 
> I think we are going to need variant subtags to denote orthographic 
> forms of the Myanmar script preferred by the Shan, Khamti and Mon 
> communities. Advice on how to get these variants registered would be 
> appreciated!

Yes. I'm not sure why there is a need to register all these Pali orthographies (since a change in underlying orthography is a different orthography for the language concerned). What's wrong with simply tagging it shn? Is there a need to distinguish pi in this way?

Please do also bear in mind that there is a small plethora of Shan orthographies, which need their own variants and in which Pali may be represented. For example: Cushing, Common, Modern, Old, etc. Perhaps we need to cut the Shan pie and register variants for those before we address Pali.

Khamti is recommended to be encoded using variation selectors, which would alleviate the need for a contrastive tag, although there is still the font variation between filled and unfilled dots, which re-raises the need. Mon has no stylistic variation from Burmese that I know of, although there may be spelling/encoding differences between pi-Mymr-x-mnw and pi-Mymr.
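
To make the tags in this thread concrete, here is a rough sketch of how they break down into subtags. It assumes a simplified BCP 47 shape (language, then optional script, region, variants and private use) and ignores extlang, extension and grandfathered tags. The names TAG_RE and split_tag are made up for illustration and come from no particular library.

```python
import re

# Simplified shape: language[-Script][-REGION][-variant...][-x-private...]
# Assumption: extlang, extension and grandfathered tags are not handled.
TAG_RE = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)"
    r"(?:-x(?P<private>(?:-[a-z0-9]{1,8})+))?$",
    re.IGNORECASE,
)


def split_tag(tag):
    """Split a language tag into its subtags (hypothetical helper)."""
    m = TAG_RE.match(tag)
    if m is None:
        raise ValueError("not a tag this sketch understands: %r" % tag)
    return {
        "language": m["language"].lower(),
        "script": m["script"].title() if m["script"] else None,
        "region": m["region"].upper() if m["region"] else None,
        # Variant and private-use subtags are kept as lists, in order.
        "variants": [v.lower() for v in m["variants"].split("-") if v],
        "private": [p.lower() for p in (m["private"] or "").split("-") if p],
    }


if __name__ == "__main__":
    for tag in ("mnw-TH", "mnw-Thai", "mnw-Mymr", "pi-Mymr", "pi-Mymr-x-mnw"):
        print(tag, split_tag(tag))
```

Under this sketch pi-Mymr and pi-Mymr-x-mnw differ only in the private-use part (['mnw']), which is exactly the information a registered variant subtag would carry instead.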

> > Of course, mapping from language tags to OpenType Layout language system tags to implement typographic distinctions is a related but separate matter.  
> 
> Well, yes. I'll be pestering you about registering some new OTL language 
> system tags once we've got something for them to map from!

It all depends on how much of a hurry you are in and how important it is to distinguish pi as a separate language based on these orthographies. There's no free lunch here, and given the free and easy code switching that happens between Pali and these other languages, I merely raise the question of how important marking pi is for them.

Yours,
Martin

> 
> Simon
> 