Re: [Ltru] Macrolanguage usage

"Doug Ewell" <doug@ewellic.org> Sat, 24 May 2008 18:06 UTC

From: Doug Ewell <doug@ewellic.org>
To: LTRU Working Group <ltru@ietf.org>
Date: Sat, 24 May 2008 12:06:26 -0600

(Note 1: This message requests WG chair action.)
(Note 2: Nothing in this message is meant to express an opinion for or 
against extlang per se.)

Leif Halvard Silli <lhs at malform dot no> wrote:

> Don't Cantonese users understand Mandarin?

From what I've heard, in terms of spoken language: absolutely not.

> As you know, many tag Norwegian texts as 'no-no'.

Of course, because that means "Norwegian as used in Norway."  (Which is 
kind of redundant really, since there doesn't seem to be much evidence 
of variation in the Norwegian language associated with regions other 
than Norway; but some creators of language tags and locale identifiers 
feel it is important to apply region subtags consistently.)

> So it is obvious that when 'no-no' falls back to 'no', then of 
> course 'no-nn' or 'no-nno' would fall back just as well. Why should I 
> not believe so?

I ask the co-chairs to settle this matter with a third consensus-call 
question:

Q3: If we did go back to using "extlang," we could combine this subtag
    with the region subtag, and require that at most one of the two be
    used in a single tag.  Possible responses:  (pick ONE)
        A - I would like this.
        B - I could live with this.
        C - I would object to this.

Remember that if we did create such a "Leif rule" for the purpose of 
allowing two-letter extlangs, as in "no-nn", then:

1. The region/extlang subtag would have to come AFTER any script 
subtags, thus: "no-Latn-nn" rather than "no-nn-Latn", and "zh-Hant-cmn" 
rather than "zh-cmn-Hant" -- unless we wanted to change that existing 
BCP 47 syntactical rule as well.

2. It would be impossible to tell whether a non-initial two-letter 
subtag such as 'tw' referred to a region, as in "zh-TW" (Taiwan), or an 
extlang, as in "ak-tw" (Twi).  Case is not significant in language 
tags -- unless we wanted to change that existing BCP 47 syntactical rule 
as well.

3. It would be impossible to write a tag for, say, "Cantonese as used in 
Singapore" that also expressed the macrolanguage relationship --  
whatever that may be -- between 'zh' and 'yue'.

> In the draft you sent out you start by saying that "The arguments for 
> extlang are that they give superior results,". However, this is an 
> exaggeration of the standpoint that I for instance have. First, I 
> assume you meant "technically superior". Well, no, I can understand that 
> using short tags is easier to deal with, technically. And therefore 
> superior to extlang. (In my testing with Apache, 'nn' and 'nb' were 
> easier to deal with than 'no-nyn' and 'no-bok'.)
>
> But then a problem is that the users "in the wild" still are tagging 
> Norwegian as if we had an extlang system.

"no-nyn" and "no-bok" are grandfathered tags, registered under RFC 1766 
in 1995, long before anyone ever used the word "macrolanguage" or 
"extlang".  Their similarity to extlangs is to be considered 
coincidental.

> And this is a special kind of language negotiation. For a small 
> macrolanguage like Norwegian, we suddenly get 3 options. If instead we 
> had extlang for Norwegian, we would in reality only have two options.

This isn't new or sudden.  You've had 3 options since 2000, when ISO 639 
registered 'nb' and 'nn', and actually since 1995, when ietf-languages 
registered the whole tags "no-bok" and "no-nyn" which were not to be 
considered parsable.
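
As a rough sketch of what "not to be considered parsable" means for a 
tag consumer: a grandfathered tag is matched as a whole string before 
any attempt to split it into subtags. The set below holds only the two 
tags mentioned in this message, not the full list of grandfathered 
tags, and split_tag is a hypothetical helper name.

```python
# Only the two grandfathered tags discussed here; an illustrative subset.
GRANDFATHERED = {"no-bok", "no-nyn"}


def split_tag(tag):
    """Return the tag as one atomic unit if grandfathered, else its subtags."""
    t = tag.lower()  # case is not significant in language tags
    if t in GRANDFATHERED:
        return [t]  # treated as a whole; 'bok' is NOT read as an extlang
    return t.split("-")


print(split_tag("no-bok"))
print(split_tag("no-Latn-NO"))
```

With this ordering, "no-bok" is never decomposed into 'no' plus a 
three-letter subtag, so its resemblance to an extlang has no effect.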

> I have already been told that it is very important to read things out 
> of the tags without needing to look into the registry. And I agree. 
> That is a basic, and very good thing.

See my point 2 above.  Regions and encompassed languages are not at all 
the same, and the "Leif rule" would require tag producers and consumers 
to look in the Registry to see which is intended.  (If Mark can say "a 
la Ewell" then I can say "the Leif rule.")

--
Doug Ewell  *  Arvada, Colorado, USA  *  RFC 4645  *  UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages
