Re: Pending requests

Michael Everson <everson@evertype.com> Fri, 27 November 2015 11:39 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 9.1 \(3096.5\))
Subject: Re: Pending requests
From: Michael Everson <everson@evertype.com>
In-Reply-To: <D27DB15C.33F68%kent.karlsson14@telia.com>
Date: Fri, 27 Nov 2015 11:39:00 +0000
Content-Transfer-Encoding: quoted-printable
Message-Id: <CCEECD33-091B-4016-99E5-097198266836@evertype.com>
References: <D27DB15C.33F68%kent.karlsson14@telia.com>
To: ietflang IETF Languages Discussion <ietf-languages@iana.org>
Cc: "Amir E. Aharoni" <amir.aharoni@mail.huji.ac.il>

On 27 Nov 2015, at 06:06, Kent Karlsson <kent.karlsson14@telia.com> wrote:

>> 
>> The Wikipedia is a big and important application, is it not?
> 
> Yes, so. en-levelB2 (if that corresponds to what they are targeting) would work just fine for Wikipedia and many others.

It is not within our scope to assign CEFR levels to identify different kinds of language variants. 

Moreover, it is really unlikely that they will wish to replace http://simple.wikipedia.org with http://en-levelB2.wikipedia.org 

> It is fine that Wikipedia has house rules specifying what the level is, but that is nothing "we" should encode.

We encode subtags which extend the codes for the representation of names of languages: English is en; Scouse English is en-scouse.

> Likewise for VoA "Learning English" levels (which I do think can
> be found to correspond to CEFR levels). They have house rules (I'd assume,
> though they don't appear to have published them), but "we" should not
> attempt to embody the house rules in LSR.

wpsimple points to Wikipedia which has its own rules. 

>> No, they aren’t. Our subtags describe linguistic entities, not hierarchies of language-learning and speaker competence.
> 
> Language-learning and speaker competence levels of a particular language are linguistic entities as well. "wpsimple" is just one instance.

Speaker competence is out of scope. Whether I can understand a sentence you write in Swedish is of no consequence. The subtag identifies it as Swedish. 

>> en-scouse points directly at Scouse. en-cornu points directly at Cornu-English/Anglo-Cornish/Cornish English. en-basiceng would point directly at Basic English. CEFR hierarchies have nothing to do with this. Our subtags point at things. I don’t think it is within our scope to pick a set of CEFR definitions and attempt to apply them (on the basis of no research) to one or more varieties of controlled vocabulary and syntax. The CEFR is ALL about learner competence with regard to standard language, and Basic English and Wikipedia Simple English are examples of controlled language (engineered language, not constructed language), not examples of standard language.
> 
> Disregarding Ogden's Basic English (which must NOT get the subtag 'basiceng'), which is not "simplified English", but rather a (strangely) "constrained English".

It isn’t called “simplified English”. The Wikipedia’s is. Basic English is called Basic English; it isn’t called anything else. Your “must NOT” is your opinion. I don’t share it. “basiceng” points to “Basic English”. It doesn’t point to “Wikipedia Simple English” or VOA or anybody else’s thing. The right thing to do is to use a name which is iconic and 

Your sentence was a fragment, by the way. 

> No, I don't say that *WE* should attempt to apply CEFR levels to simplified form so-and-so from Wikipedia, VoA, or anyone else. That should be up to Wikipedia, VoA, and anyone else (respectively) [and these I do think can be reasonably mapped; by THEM, not us].

What, you want me to go to Amir and say “Map this to the CEFR hierarchy, so we can use a name which none of your users will understand”? No, Kent. That’s not a good idea. Nor is it workable, because Amir isn’t going to be able to do that mapping either. Nor has the CEFR specified levels of difficulty. A text is a text. The CEFR specifies levels of user competence. That is a different thing.

> But I do not find it appropriate to encode house rules for company/organisation so-and-so in LSR, regardless of how big they are. BUT we should cater for this use-case (simplified, or learner's, language) in LSR, but in a general manner, not house rules. The latter are of course needed, but a matter for each "house" (company/organisation), not the LSR.

We aren’t encoding Wikipedia’s house rules. They have already done that in references which are linked to in the application. We are providing a label which identifies their usage.

Michael Everson * http://www.evertype.com/