Re: [Ietf-languages] ISO TR 21636 Dimensions

Richard Wordingham <richard.wordingham@ntlworld.com> Fri, 27 November 2020 23:43 UTC

Date: Fri, 27 Nov 2020 23:43:15 +0000
From: Richard Wordingham <richard.wordingham@ntlworld.com>
To: ietf-languages@ietf.org
Message-ID: <20201127234315.32daf238@JRWUBU2>
In-Reply-To: <920a1960-ab18-aaaa-ae3e-b547314b66d7@xs4all.nl>
References: <CAKZQS29HBak-v6M2HLCpdgeZHJTFVc2W_w4G=qOK+mtPcXEenQ@mail.gmail.com> <4846f915-5706-e9dc-8b16-9f16362f82f0@xs4all.nl> <CAD2gp_SMJ_P7kyKT13Ax_ae+nr9rbTOHNn+rRp7=EKOKVSVq_g@mail.gmail.com> <236b86db-3dfb-b9e0-ab82-fa31753d0459@xs4all.nl> <20201126005439.76a9cc1d@JRWUBU2> <920a1960-ab18-aaaa-ae3e-b547314b66d7@xs4all.nl>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/eMLfTDCuv1oT4gmpyys707sVw7U>

On Wed, 25 Nov 2020 23:23:08 -0300
Sebastian Drude <drude@xs4all.nl> wrote:

> thanks for your thorough thinking and your comments and questions.  I
> will try to address them below in your text as well as I can.

> ISO 21636 mainly proposes to differentiate between the 8 dimensions
> clearly.  BCP 47 is fine for dialects if they coincide with usual
> countries or regions, and for writing systems (scripts and
> orthographies) and their variants.  But for time periods, sociolects,
> registers, even modalities (signed, whistled, drummed speech of
> non-sign-languages)?  And learners' varieties and the 'anomalies' (we
> use the less pejorative and pathologizing 'communicative
> functioning')?  There ISO 21636 can make a contribution, I believe,
> (see the case at hand: sociolect? register?) and that may even help
> with those dialects that do not coincide with regional administrative
> borders.

BCP 47 can handle things like the Scanian dialect of Bornholm
(da-bornholm).  It can't handle street-by-street variations.

On the other hand, BCP 47 can tempt one to some dubious practices.  For
example, I might tag Unicode-encoded Pali in the Tham script as
pi-Lana-TH for material from Northern Thailand but as pi-Lana-LA
for material from North-eastern Thailand.  The latter may come unstuck
for text to speech - I can imagine a Siamese accent being used in NE
Thailand.
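
As a rough illustration of what these tags encode, here is a minimal
Python sketch that splits a tag into language, script, region and
variant subtags.  It assumes a simplified subset of the BCP 47 syntax
(no extlang, extension, private-use or grandfathered forms), and the
name split_tag is hypothetical rather than part of any library.

```python
import re

# Simplified BCP 47 shape: language[-Script][-REGION][-variant...].
# Assumption: extlang, extensions, private use and grandfathered tags
# are deliberately left out of this sketch.
_TAG = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)$",
    re.IGNORECASE,
)


def split_tag(tag):
    """Split a simple language tag into its subtags (hypothetical helper)."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"not a tag this sketch understands: {tag!r}")
    script = match.group("script")
    region = match.group("region")
    return {
        "language": match.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        # Variants arrive as "-a-b"; drop the empty piece before the first "-".
        "variants": [v.lower() for v in match.group("variants").split("-") if v],
    }


for example in ("da-bornholm", "pi-Lana-TH", "pi-Lana-LA"):
    print(example, split_tag(example))
```

The two Pali tags differ only in the region subtag, which is why using
LA for material from inside Thailand is a stretch: the region is the
only place the distinction is carried.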

>> After that comes the question of orthography within the script.
>> I'm not sure that differences with a political tint (Russian, Lao)
>> come within the time dimension, and the gross differences in Thai
>> script Northern Thai (Thai names thap sap v. rup pariwat) definitely
>> don't.  

> I do not know what you are referring to here, so I do not understand
> at all why the time dimension would be involved.

BCP 47 has variant subtags for spelling schemes, such as
"1901" and "1996" for German.  These give a strong hint as to the
period of the text described by them, so I had thought they were
categorisations along the time dimension.  You now tell me that they
apply along the medium dimension.
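
To make the distinction concrete, the following Python sketch looks up
what a variant subtag names.  The table is a hypothetical, hand-written
excerpt covering only the two German variants mentioned above, with
descriptions as I understand the registry to give them; orthography_of
is an illustrative name, not an existing API.

```python
# Hypothetical excerpt: variant subtag -> the spelling scheme it names.
# Assumption: only the two German variants discussed here are listed.
ORTHOGRAPHY_VARIANTS = {
    "1901": "Traditional German orthography",
    "1996": "German orthography of 1996",
}


def orthography_of(tag):
    """Return the description of the first known variant in tag, else None."""
    # Skip the primary language subtag; compare the rest case-insensitively.
    for subtag in tag.lower().split("-")[1:]:
        if subtag in ORTHOGRAPHY_VARIANTS:
            return ORTHOGRAPHY_VARIANTS[subtag]
    return None


for example in ("de-1901", "de-CH-1996", "de"):
    print(example, "->", orthography_of(example))
```

The lookup yields a spelling scheme, not a date of composition: a text
written today in the traditional orthography would still be tagged
de-1901, so the year in the subtag is only a hint about period.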

Some spelling differences have a political implication.  Russian
exiles between the World Wars generally ignored the spelling reforms
introduced after the Russian Revolution.  Rejection of the Lao
replacement of LO LING ('lo lot' in Lao) by LO LOOT ('lo ling' in Lao)
is reportedly a sign of political leanings.  I now gather such
differences are to be categorised along the medium dimension.

>> The same passage in the same script in Pali can have quite a
>> variation in punctuation system.  
> Wouldn't that go into the same category as different orthographic
> rules?

>> Perhaps that's a separate subdimension within 'time'.  
> I confess I do not know enough about Asian languages to grasp why the 
> time dimension would be involved here -- did these rules for
> punctuation change over time?  Then see my comment on Turkish, above.

Yes, many Asian writing systems have recently adopted European
punctuation.  It can be quite uneven.  A novel in a Thai magazine will
have far more non-blank punctuation than a Thai textbook.

>> Similarly, Pali chants in Thai script use different writing systems
>> for the masses and for more academic use - the former is an
>> 'alphabet' by Daniels' definition and the latter is an abugida. Is
>> this difference on the 'communicative functioning' dimension?  
> No, this would all go into medium, just like script and orthography,
> but I reckon that these differences are already covered by BCP 47,
> and that no implementation of ISO 21636 would even attempt to address
> them.

It's in scope for BCP 47, but no subtags have been assigned for this
case.

>> Old manuscript European documents can be full of abbreviations - bars
>> for Vr and rV survived quite late in Modern English.  The
>> abbreviations are usually expanded when such documents are
>> non-palaeographically transcribed.  Is the use of these abbreviations
>> on the 'communicative functioning' dimension?  
> No, I would guess, same thing, orthography --> medium dimension if 
> needed, hopefully already covered by BCP 47.

It's in scope for BCP 47, but no subtags have been assigned for this
case.

Hoping I've fleshed out my use cases,

Richard.