Re: [Ietf-languages] Between language and script in Burmese
r12a <ishida@w3.org> Fri, 19 November 2021 12:50 UTC
To: Simon Cozens <simon@simon-cozens.org>
Cc: Martin Hosken <martin_hosken@sil.org>, Peter Constable <pgcon6@msn.com>, ietf-languages@ietf.org, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
References: <20211118090511.305c280c@silmh9> <1F8F5822-FD34-4456-9B87-AE9AF3EABAD4@simon-cozens.org> <6bebcf1c-c4a5-85f4-4170-add5cfddcc7f@it.aoyama.ac.jp>
From: r12a <ishida@w3.org>
Message-ID: <40c6e9ae-7c88-d322-7d0f-b04436468bbd@w3.org>
Date: Fri, 19 Nov 2021 12:49:59 +0000
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.16; rv:52.0) Gecko/20100101 PostboxApp/7.0.52
MIME-Version: 1.0
In-Reply-To: <6bebcf1c-c4a5-85f4-4170-add5cfddcc7f@it.aoyama.ac.jp>
Content-Type: multipart/alternative; boundary="------------F8C2CF94A2AEE6F000775214"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/hVz6RdKB2-wTHif_388Z8duRAhQ>
fwiw, here are some thoughts for my 2p:

It seems to me that the less you have to remember that a particular combination of tags means x or y, the better. If we can use tags that are easy to understand and guess without having to remember special rules or conventions, then we're much better off.

I think the language tag should always reflect the actual language of the text, regardless of script or orthography. Labelling something as shn when it's not will mess up things like voice browsers, spell checkers, and possibly layout, opentype features, etc.

We actually appear to have a few script tags that are in fact orthography tags, such as aran, syrn, and syrj (see https://r12a.github.io/app-subtags/?lookup=aran,syrn,syrj) but it's never been clear to me why that is a good idea (except for hant, and hans, but the situation is somewhat different there). I'm inclined to think that script tags should just be distinctive at the script level.

I'm also wary of the idea of using region tags to specify a particular orthography (like we used to with zh-TW, and zh-CN), unless you really want to identify the orthography only on the basis of a particular regional standard (like en-GB), because (a) the usage may not be cleanly defined by a region tag (eg. things like Azeri, Kurdish, etc. or use of zh-CN for Singapore) and (b) it may be difficult to know/remember which region tag should be used. These relationships with region may also change in the future, as people migrate or countries mutate.

On the other hand, we have *loads* of variant tags that are related to specific orthographies. See https://r12a.github.io/app-subtags/?find=ortho which lists around 50.

So I'm inclined to think that perhaps the solution is to submit a request for new orthographic variant subtags.

ri

Martin J. Dürst wrote on 19/11/2021 10:21:
> Hello Simon, others,
>
> On 2021-11-19 17:10, Simon Cozens wrote:
>
>> But Shan Pali is a useful case because it does ask the question of
>> whether something like Shan - and all these variant forms of Burmese
>> orthography - is a language or a script (or a script variant). Some
>> of them diverge quite strongly from the standard Burmese letterforms
>> but IETF considers them all Mymr. I’d question that.
>
> It's not the IETF that decides on script codes. The IETF just pulls
> together various ISO standards into a single language tag, and has
> some escape hatches such as variant subtags when the ISO standards are
> not enough.
>
> Regards, Martin.
>
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages@ietf.org
> https://www.ietf.org/mailman/listinfo/ietf-languages
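
For readers less familiar with how the subtag positions discussed above fit together, here is a minimal Python sketch that splits a simple tag into its language, script, region and variant parts by the shape of each subtag. The function name split_tag is made up for illustration. The sketch assumes only the plain language-script-region-variant form, so it ignores extended language subtags, extensions, private use and grandfathered tags, and it does not check anything against the subtag registry. The sample tags (shn-Mymr, zh-Latn-pinyin, de-1996, en-GB) are only examples of the syntax, not proposals.

```python
import re


def split_tag(tag):
    """Split a simple BCP 47 tag into language, script, region, variants.

    Illustrative only: subtags are classified purely by shape, with no
    registry lookup and no support for extlang, extensions or private use.
    """
    parts = tag.split("-")
    if not re.fullmatch(r"[A-Za-z]{2,3}", parts[0]):
        raise ValueError(f"unsupported language subtag: {parts[0]!r}")

    result = {
        "language": parts[0].lower(),
        "script": None,
        "region": None,
        "variants": [],
    }
    rest = parts[1:]

    # Script: exactly four letters, directly after the language.
    if rest and re.fullmatch(r"[A-Za-z]{4}", rest[0]):
        result["script"] = rest.pop(0).title()

    # Region: two letters or three digits.
    if rest and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", rest[0]):
        result["region"] = rest.pop(0).upper()

    # Variants: 5 to 8 alphanumerics, or 4 characters starting with a digit.
    for subtag in rest:
        if re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", subtag):
            result["variants"].append(subtag.lower())
        else:
            raise ValueError(f"unsupported subtag: {subtag!r}")

    return result


if __name__ == "__main__":
    for example in ("shn-Mymr", "zh-Latn-pinyin", "de-1996", "en-GB"):
        print(example, split_tag(example))
```

The point of the sketch is that each position answers a different question (which language, which script, which region, which variant), so an orthography expressed as a variant subtag leaves the language and script subtags free to say what they literally mean.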
- [Ietf-languages] Between language and script in B… Simon Cozens
- Re: [Ietf-languages] Between language and script … Peter Constable
- Re: [Ietf-languages] Between language and script … Simon Cozens
- Re: [Ietf-languages] Between language and script … Martin Hosken
- Re: [Ietf-languages] Between language and script … Richard Wordingham
- Re: [Ietf-languages] Between language and script … Simon Cozens
- Re: [Ietf-languages] Between language and script … Martin J. Dürst
- Re: [Ietf-languages] Between language and script … r12a
- Re: [Ietf-languages] Between language and script … John Cowan
- Re: [Ietf-languages] Between language and script … Richard Wordingham
- Re: [Ietf-languages] Between language and script … Simon Cozens
- Re: [Ietf-languages] Between language and script … Richard Wordingham
- Re: [Ietf-languages] Between language and script … Peter Constable
- Re: [Ietf-languages] Between language and script … Peter Constable
- Re: [Ietf-languages] Between language and script … Doug Ewell