Re: Language Subtag Registration

Michael Everson <everson@evertype.com> Thu, 29 October 2015 09:18 UTC

Subject: Re: Language Subtag Registration
From: Michael Everson <everson@evertype.com>
In-Reply-To: <19366746-228E-4039-BF48-6EE87B8FE890@w3.org>
Date: Thu, 29 Oct 2015 01:24:12 -0400
Message-Id: <B2CA9A39-FEE0-492D-A846-91CA6364CC4B@evertype.com>
References: <55F61E82.8030106@moisan.ca> <1BF02550-02CA-463D-B011-445966506C49@evertype.com> <FD0AA4FB-FB59-4AB8-8BD7-A5C6776CF750@w3.org> <D6BFD1B7-4D93-4458-AEC2-C29153D030FA@evertype.com> <3B678307-6F04-4FA0-B80C-E7FA4E86550A@w3.org> <FE62AC5A-E6DE-4802-A8FC-F759207813B5@evertype.com> <19366746-228E-4039-BF48-6EE87B8FE890@w3.org>
To: Felix Sasaki <fsasaki@w3.org>
Cc: ietflang IETF Languages Discussion <ietf-languages@iana.org>, amir.aharoni@mail.huji.ac.il

On 28 Oct 2015, at 22:45, Felix Sasaki <fsasaki@w3.org> wrote:
> 
>> Perhaps this can be finessed. Either we say “Wikipedia Simple Language Version” with en as the prefix, adding fr or de or ru later, or we keep “Wikipedia Simple English” and add “Wikipedia Simple French” etc. later as needed.
> 
> The issue is that the notion of simple language may differ severely among different Wikipedia language versions.

What, in morphology, vocabulary, and syntax? Sure: they’d be forms of distinct languages. Hence the prefix. 

> So if the purpose of the extension is to cover Wikipedia Simple English, this should be made explicit in the subtag itself and not in the prefix. There may then be a need later to create other subtags for other Wikipedia language versions.

The prefix proposed is en, since as yet there are no fr, de, or ru Simple Wikipedias, though these have been discussed. The subtag proposed is wpsimple because for any Simple Wikipedia there will be house-style guidelines which define the content. 

Those guidelines are less precise than Basic English, for example, but they are nevertheless defined and implemented.

> My point is that each community behind a selected Wikipedia language version will likely say: we want our own language (subtag) identifier.

I don’t see why you make this assumption. If an eventual Simple French Wikipedia were implemented, the Language Committee would simply tell them “Your tag will be fr-wpsimple.” There would be no need for such a community to apply for a new subtag.
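
To make the prefix mechanism concrete, here is a minimal Python sketch of how one variant subtag could serve several languages. It is an illustration only: the names make_tag and PROPOSED_VARIANTS are hypothetical, the registry is a stand-in dictionary rather than the real IANA registry, and wpsimple is the subtag proposed in this thread, not a registered one. The sketch also treats the prefix list as a hard requirement, which is stricter than RFC 5646, where using a variant only with its listed prefixes is a recommendation.

```python
import re

# RFC 5646 variant syntax: 5 to 8 alphanumerics, or 4 characters starting with a digit.
VARIANT_RE = re.compile(r"^(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3})$")

# Hypothetical stand-in for a registry record: variant subtag -> allowed prefixes.
# Only "en" is listed, matching the proposal discussed in this thread.
PROPOSED_VARIANTS = {"wpsimple": {"en"}}


def make_tag(language, variant, registry=PROPOSED_VARIANTS):
    """Build 'language-variant' if the variant is well-formed and lists the language as a prefix."""
    language = language.lower()
    variant = variant.lower()
    if not VARIANT_RE.match(variant):
        raise ValueError(f"{variant!r} is not a well-formed variant subtag")
    if language not in registry.get(variant, set()):
        # Assumption of this sketch: an unlisted prefix is rejected outright.
        raise ValueError(f"{language!r} is not a listed prefix for {variant!r}")
    return f"{language}-{variant}"


# Today: only the English prefix is accepted.
print(make_tag("en", "wpsimple"))

# Later: a Simple French Wikipedia only needs "fr" added to the prefix list.
extended = {"wpsimple": {"en", "fr"}}
print(make_tag("fr", "wpsimple", registry=extended))
```

The point of the design is that adding a language changes the prefix list of the existing record and leaves the subtag itself alone.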

> The generalization of simple language to cover several language versions is problematic, like the generalization of sign languages was problematic (and is now an approach of the past).

Scouse differs from standard English by having a set of lexical and phonological differences. Simple English differs from standard English by being defined and implemented according to certain defined strictures. Both should take the en prefix, and each should have its own subtag.

The sign language generalization is handy for librarians attempting to catalogue a class of items. That’s a different thing from what we’re doing with data, true.

Michael