Re: [Ietf-languages] ISO 21636 Dimensions

Sebastian Drude <drude@xs4all.nl> Thu, 26 November 2020 02:23 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/gSK8C9-8RbwW7XGXgYvGlOrC3Ew>

Dear Richard, all,

Thanks for your thorough thinking and your comments and questions.  I
will try to address them inline below as well as I can.

I will check with the ISO people whether I am allowed to share the
draft document with this group; that would be really useful.

Best,

Sebastian

-- 

Museu P.E. Goeldi, CCH, Linguistica ▪ Av. Perimetral, 1901
Terra Firme, CEP: 66077-530 ▪ Belém do Pará – PA ▪ Brazil
drude@xs4all.nl ▪ +55 (91) 3217 6024 ▪ +55 (91) 983733319
Priv: Tv. Juvenal Cordeiro, 184, Apt 104 ▪ 66070-300 Belém

On 25/11/2020 21:54, Richard Wordingham wrote:
> On Tue, 24 Nov 2020 23:40:37 -0300
> Sebastian Drude <drude@xs4all.nl> wrote:
>
>
>> The eight dimensions we identified for the purposes of standardized
>> coding:
>>
>>   1. Space (dialects & over-regional standard varieties)
>>   2. Time (epochs, periods, stages)
>>   3. Social group (sociolects, including more specific technolects)
>>   4. Medium (modalities: oral/multimodal, written, signed, whistled,
>>      drummed...)
>>   5. Situation (registers, e.g. of different formality, including
>>      genres and the like)
>>   6. Person (“personal varieties” ~ elsewhere sometimes called
>>      “idiolects”)
>>   7. Proficiency (learner varieties)
>>   8. Communicative functioning (constrained communicative functioning
>>      varieties, 'anomalies')
> I'm having a bit of trouble envisaging how one would apply this model
> to certain situations.
>
> The first one is orthography.  The simplest one is that of script.  For
> example, articles in the Serbian Wikipedia may be delivered in Cyrillic
> or Latin, at the reader's choice.  Which dimension does that difference
> sit on?

These details of script and orthography would systematically go to the 
medium dimension, as sub-categories of the written modality.  But in 
fact, as there is already an ISO standard for Scripts (ISO 15924:2004), 
and as dealing with such different writing systems and their variants is 
already excellently done in BCP 47, the future TR 21636 points to these
documents and does not intend to go into any further detail on
writing systems.

If different writing conventions are actually the consequence of 
belonging to a certain social group (e.g., in written African American 
English, some /-er/ endings are replaced by /-a/) or of being in an 
informal setting (such as contractions), then that would belong to the 
social or situation dimensions, respectively, of course.

ISO 21636 mainly proposes to differentiate clearly between the 8
dimensions.
BCP 47 is fine for dialects if they coincide with the usual countries or
regions, and for writing systems (scripts and orthographies) and their
variants.  But for time periods, sociolects, registers, even modalities
(signed, whistled, drummed speech of non-sign-languages)?  And learner
varieties and the 'anomalies' (we use the less pejorative and less
pathologizing 'communicative functioning')?  There, I believe, ISO 21636
can make a contribution (see the case at hand: sociolect?  register?),
and that may even help with those dialects that do not coincide with
regional administrative borders.

> After that comes the question of orthography within the script.  I'm not sure that differences with a political tint (Russian, Lao) come within the time dimension, and the gross differences in Thai script Northern Thai (Thai names thap sap v. rup pariwat) definitely don't.

I do not know what you are referring to here, so I do not understand at 
all why the time dimension would be involved.  In principle, the time 
dimension is there to capture differences such as Middle vs. Modern 
English, or 17th-century English vs. Contemporary English (and it can be 
even more fine-grained, of course).
If a certain language, like Turkish, changes its script (or the 
orthography) at a certain point in time, the different written 
sub-modality (medium dimension) and the different 'chronolects' coincide 
in most cases, but they are still independent, as people can choose, for 
example, to continue to use the older system.

> For the oral medium, where does tempo come in?  That significantly affects a dropping of distinctions, so may be relevant for converting text to speech and possibly vice versa.

That would be a typical application for the communicative functioning
dimension.  That dimension contains 'varieties' for phenomena such as
stuttering, lisping, and the like, or being drunk, breathless etc.  (They
are actually not real varieties, as they concern performance and not the
system/competence, where idiolects => varieties lie, but we include this
as an additional dimension for the purposes of ISO anyway.)  This
dimension also includes very slow or very fast speaking, if that needs
to be tagged.  True, these are all features which apply mainly to speech,
i.e. the oral modality, but you can have similar categories for the
written modality -- think of an unedited text written by a dyslexic,
or of someone writing with many errors due to stress, emotions etc., or
of handwriting done in a hurry and almost illegible -- these are
categories ('varieties') in the communicative functioning dimension.

> Would 'communicative functioning' include matters like punctuation?
No, I would not think so (see previous comment).

> The same passage in the same script in Pali can have quite a variation in punctuation system.
Wouldn't that go into the same category as different orthographic rules?

> Perhaps that's a separate subdimension within 'time'.
I confess I do not know enough about Asian languages to grasp why the 
time dimension would be involved here -- did these rules for punctuation 
change over time?  Then see my comment on Turkish, above.

> Similarly, Pali chants in Thai script use different writing systems for the masses and for more academic use - the former is an 'alphabet' by Daniels' definition and the latter is an abugida.
> Is this difference on the 'communicative functioning' dimension?
No, this would all go into medium, just like script and orthography, but
I reckon that these differences are already covered by BCP 47, and that
no implementation of ISO 21636 would even attempt to address them.

> Old manuscript European documents can be full of abbreviations - bars for Vr and rV survived quite late in Modern English.  The abbreviations are usually expanded when such documents non-palaeographically transcribed.  Is the use of these abbreviations on the 'communicative functioning' dimension?
No, I would guess, same thing, orthography --> medium dimension if 
needed, hopefully already covered by BCP 47.

> Comparing the English of Sebastian's post that I'm replying to with this reply, I noticed only a few differences:
>
> SD's regional variety could be British, using the spelling preferred by
> the Oxford English Dictionary (viz. '-ize' rather than '-ise'), whereas
> mine is, I think, clearly British (shibboleth: 'palaeographic').
It's true, I learned English in the UK (almost 40 years ago), and
have never lived in the US, but as for many years I have been much more
frequently exposed to and interacting with speakers of American English,
I assumed that my English would be closer to the American variety.  But
it's true, I use (and defend) the Oxford rules when it comes to ...-ize,
...-ization.

> Mine might be older - 1960s or 1970s judging by the writing "viz.", though the vocabulary is later.
I believe we all adapt to the time, so we all speak "2020 English". The
phenomenon of younger people using slang and new expressions that we
older ones do not use (or even understand) belongs, for me, in the
category sociolect: these are social (age-based) groups in the current
English-speaking society.  When today's teenagers are 40 or 50,
they certainly will not use most of those youth-specific expressions any
more.

> 'Situation' is difficult to name.  I think it's fairly formal, apart from the seemingly mandatory use of contracted auxiliary verbs and the use of first names.  Perhaps 'formal but for obligatory informality'. Or are the deviations from formality covered by 'communicative functioning'?  I could be wrong; perhaps my use of 'a bit of trouble' makes it informal.

I guess we, like most people most of the time, use the neutral register,
which is appropriate in formal as well as in informal situations.  I
probably imitate the English I read in texts, many of them scientific
papers or exchanges like this one, others news-related, where
contractions of auxiliaries are fairly common.  And indeed, I cannot
remember having been addressed as "Mr. Drude" in a gathering of
colleagues, however formal, so I use first names, too.

> SD's 'proficiency' is within the usual native range.  The one grammar error I spotted when scanning for Americanisms, "whether if", looks like a case of incomplete editing.

Thanks, but as a German who learned English in school (and during two
longer stays in the UK in the holidays during high school), I am
certainly not a native speaker.  Although until recently I used English
on an almost daily basis on most of my work days for some 8 years, I know
that my English (in particular orally) is far from perfect (my Portuguese
is better, I assume).
Yes, "whether if" was an editing error for sure.

>
> Richard.