Re: extlang (was Re: Suggested language for "mis" (Re: [Ltru] RE: ISO 639-2 decision: "mis"))

John Cowan <cowan@ccil.org> Tue, 19 June 2007 01:34 UTC

Date: Mon, 18 Jun 2007 21:34:33 -0400
To: Mark Davis <mark.davis@icu-project.org>
Subject: Re: extlang (was Re: Suggested language for "mis" (Re: [Ltru] RE: ISO 639-2 decision: "mis"))
Message-ID: <20070619013433.GA15048@mercury.ccil.org>
References: <30b660a20706171252l3c61d451p464b96e864d1a515@mail.gmail.com> <007f01c7b166$8ef7bf10$6401a8c0@DGBP7M81> <30b660a20706181006x3efbf772t9a0751feb070a6cb@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <30b660a20706181006x3efbf772t9a0751feb070a6cb@mail.gmail.com>
User-Agent: Mutt/1.5.13 (2006-08-11)
From: John Cowan <cowan@ccil.org>
Cc: Doug Ewell <dewell@roadrunner.com>, LTRU Working Group <ltru@ietf.org>

Mark Davis scripsit:

> We added extlang to allow ourselves the freedom to make choices
> when 639-3 came along. We *very clearly did not define its meaning*,
> because we didn't know what 639-3 was finally going to look like,

Not so much.  We had an excellent idea of what 639-3 would look like,
and actually be like, when 4646 was finalized.  We couldn't include
639-3 or extlangs because 639-3 itself was not yet final.

> nor did we have agreement on what we should actually do.

We had at least the consensus of silence; I don't remember any
complaints at the time.  Remember that the development of 4646 started at
least a year before LTRU was formally created.

> We *already* had macrolanguages with ISO 639-2 in RFC 4646 and we *did
> not* use extlang for them: examples are "sr", "hr", "nb", etc.

(Rather, these are examples of languages *encompassed* by macrolanguages,
henceforth "encompassed languages".  I realize that's just a slip.)

> We are not going to (and cannot) be forcing users to encode nb as no-nb,
> nor sr as sh-sr.

Nobody has ever proposed that.  Language subtags coming from 639-1
or 639-2 will not change, even if 639-3 says they encode encompassed
languages.

> When we ("Google") tried implementing matching with "zh-yue" and others,
> we found it made things *more* difficult, not less.

Respectfully I suggest that because you ("Google") assign tags to incoming
rather than outgoing content, your use of BCP 47 is essentially private
rather than in interchange.  That makes it potentially important, but
definitely not prototypical.

> Matching "zh" and "yue" is not something you want to do
> automatically. Moreover, because of #2 we had to have a mechanism for
> dealing with macrolanguages in RFC 4646 *anyway*.

Very plausible in your circumstances.  But note that Yue (Cantonese)
content is *already* properly tagged "zh-yue" on precisely the theory
that's being applied to 639-3 encompassed languages.

I realize that this cause is probably not important to you, because
you can (comparatively) easily change all "zh-yue" tags to "yue", but
this is not the case for other users of BCP 47 on and off the Internet,
who will never even hear about the change.
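
To make the matching question concrete, here is a minimal sketch of
prefix matching in the style of RFC 4647 basic filtering, where a
language range matches a tag that is equal to it or that begins with it
followed by "-".  The function name basic_filter is my own, and the
sketch leaves out the "*" wildcard range.

```python
def basic_filter(language_range, tag):
    """Return True if tag matches language_range by case-insensitive prefix."""
    r = language_range.lower()
    t = tag.lower()
    # A match needs equality, or the range followed by a subtag boundary.
    return t == r or t.startswith(r + "-")


print(basic_filter("zh", "zh-yue"))  # True
print(basic_filter("zh", "yue"))     # False
print(basic_filter("zh", "zhx"))     # False: "zhx" is a different subtag
```

Under this rule a request for "zh" picks up content tagged "zh-yue"
but not content tagged "yue", which is the practical difference
between the two tagging choices.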

> Thus to make a proposed change from 4646 to use the extlang mechanism
> for languages that have macrolanguages, we need a very compelling
> case that the additional complication solves more problems than it
> creates. We haven't seen that yet, and certainly have no consensus
> that it is the case.

On the contrary, the burden of persuasion is with you.  Tags like
zh-yue are already present in 4646, and it's up to you to provide a
convincing argument to deprecate them.  Furthermore, LTRU and its ad hoc
predecessor have been assuming the extlang structure since at least 2004.
Derailing that is what will take a "very compelling case".

> A. [Don't use extlangs.]

I continue in opposition to this.

> B. (optional) Add a field Macrolanguage: to the language subtag
> registry.

I am not opposed to this, precisely because encompassed languages and
the corresponding macrolanguage cannot be identified syntactically.
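
As a rough illustration of why such a field helps: the relationship has
to be looked up as data, since nothing in the shape of "yue" or "nb"
points to "zh" or "no".  The table below is a hypothetical stand-in for
registry records, limited to pairs mentioned in this thread, and the
function names are mine, not anything defined by the registry.

```python
# Hypothetical stand-in for a "Macrolanguage:" field in registry records.
# Keys are encompassed languages, values are their macrolanguages.
MACROLANGUAGE = {"yue": "zh", "sr": "sh", "nb": "no"}


def macrolanguage_of(tag):
    """Return the macrolanguage of the tag's primary subtag, or None."""
    primary = tag.split("-")[0].lower()
    return MACROLANGUAGE.get(primary)


def covered_by(language_range, tag):
    """True if the tag's primary subtag is the range or is encompassed by it."""
    r = language_range.lower()
    primary = tag.split("-")[0].lower()
    return primary == r or MACROLANGUAGE.get(primary) == r


print(macrolanguage_of("nb-NO"))     # no
print(covered_by("zh", "yue-HK"))    # True
print(covered_by("zh", "sr"))        # False
```

Whether a matcher should call something like covered_by automatically
is a separate policy question; the field only makes the lookup possible.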

-- 
May the hair on your toes never fall out!       John Cowan
        --Thorin Oakenshield (to Bilbo)         cowan@ccil.org

