[Ietf-languages] Between language and script in Burmese
Simon Cozens <simon@simon-cozens.org> Wed, 17 November 2021 09:20 UTC
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/iZ8jb3d1PpNbfWSIQSDcCZHYzgw>
Hello!

I've been working on a system font which covers a number of minority languages and scripts of Burma. Some of these are not currently addressable, either because they lack IETF (and OpenType) script/language tags or because the correct tag combination is not obvious.

The Burmese script has many language-specific and context-specific variant forms (see UTN11 - https://www.unicode.org/notes/tn11/UTN11_4.pdf - for examples), and the boundary between script and language is not always obvious. Some of these differences in letterforms are encoded separately in Unicode and others are treated as allographs. It's all a bit of a mess.

The easy problem we have is the Thai Mon language. This is a variant of the Mon language used by Mon people in Thailand. It has its own distinct script tradition. (https://www.unicode.org/L2/L2020/20163-arakanese-mon.pdf) There's no distinct language subtag, but I believe mnw-TH is enough to distinguish this language - although we may have to pull some OpenType strings to enable that distinction to select Thai Mon specific orthographic forms.

The hard problem we have is that some of these language-specific variant orthographies are used to write text in *other* languages. In that sense, they are essentially functioning as *different scripts* to standard Myanmar. For example:

A document written in the Shan language using the Shan variant orthography of Burmese is clearly shn-Mymr, and setting the Shan language in a document should be enough to activate the Shan variant forms. No problems here.

A document written in Pali using the standard Burmese orthography is obviously pi-Mymr, and because it's standard Burmese, a Burmese font doesn't need to do any magic to get the right glyphs.

But what is a document written in the Pali language using the Shan (or Khamti, or Mon) orthography? Do we need variant tags to distinguish the flavour of Burmese script used in these cases? Shouldn't Shan, Khamti and Mon actually be separate scripts? And if not, how on earth are we going to get browsers to choose the Shan forms for this document, without pretending that it's actually written in the Shan language?

Any advice would be helpful!

Thanks,
Simon
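To make the gap concrete, here is a minimal Python sketch that splits the tags discussed above into language, script and region subtags. It is a deliberately simplified reading of BCP 47 (no extlang, no grandfathered tags, no validation against the registry), and the helper names `Tag` and `parse_tag` are made up for illustration. The tag "pi-Mymr-x-shan" uses a private-use subtag purely as a hypothetical placeholder for "Pali in the Shan orthography"; it is not a registered or proposed tag.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Tag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]
    rest: Tuple[str, ...]  # variants, extensions, private use (left unparsed)


def parse_tag(tag: str) -> Tag:
    """Split a simple BCP 47 tag into language, script, region and leftovers.

    Simplified sketch: ignores extlang subtags and grandfathered tags,
    and does not check subtags against the IANA registry.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    script = None
    region = None
    i = 1
    # A script subtag is four letters, e.g. "Mymr".
    if i < len(parts) and re.fullmatch(r"[A-Za-z]{4}", parts[i]):
        script = parts[i].title()
        i += 1
    # A region subtag is two letters or three digits, e.g. "TH".
    if i < len(parts) and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", parts[i]):
        region = parts[i].upper()
        i += 1
    return Tag(language, script, region, tuple(p.lower() for p in parts[i:]))


if __name__ == "__main__":
    # "pi-Mymr-x-shan" is a hypothetical private-use tag, for illustration only.
    for example in ("mnw-TH", "shn-Mymr", "pi-Mymr", "pi-Mymr-x-shan"):
        print(example, "->", parse_tag(example))
```

With only language, script and region to work with, "pi-Mymr" says nothing about which flavour of the Myanmar script is intended; any such information would have to live in the leftover subtags (a variant or private-use subtag) or in a separate script code, which is exactly the question raised above.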