Flagging macrolanguages in the LSR (Was: extlang (was Re: Suggested language for "mis" (Re: [Ltru] RE: ISO 639-2 decision: "mis"))

Stephane Bortzmeyer <bortzmeyer@nic.fr> Thu, 21 June 2007 15:42 UTC

Date: Thu, 21 Jun 2007 17:41:04 +0200
From: Stephane Bortzmeyer <bortzmeyer@nic.fr>
To: Addison Phillips <addison@yahoo-inc.com>
Subject: Flagging macrolanguages in the LSR (Was: extlang (was Re: Suggested language for "mis" (Re: [Ltru] RE: ISO 639-2 decision: "mis"))
Message-ID: <20070621154104.GA25638@sources.org>
References: <30b660a20706171252l3c61d451p464b96e864d1a515@mail.gmail.com> <007f01c7b166$8ef7bf10$6401a8c0@DGBP7M81> <30b660a20706181006x3efbf772t9a0751feb070a6cb@mail.gmail.com> <20070619013433.GA15048@mercury.ccil.org> <003701c7b238$16124fc0$6401a8c0@DGBP7M81> <20070619140547.GB30227@mercury.ccil.org> <4677F589.3090003@yahoo-inc.com> <20070619205855.GA19853@sources.org> <46784742.9080009@yahoo-inc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <46784742.9080009@yahoo-inc.com>
Cc: LTRU Working Group <ltru@ietf.org>

On Tue, Jun 19, 2007 at 02:14:42PM -0700,
 Addison Phillips <addison@yahoo-inc.com> wrote 
 a message of 58 lines which said:

> I take it you're suggesting (or interpreted the proposal) to mean
> that the macrolanguage would contain a pointer or pointers to its
> sublanguages.

No, I was not suggesting it myself but it is a good opportunity to
discuss it.

> %%
> Type: language
> Subtag: zh
> Description: Chinese
> ...
> Encloses: yue,cmn,.....
> %%

It seems reasonable to enumerate a macrolanguage's "members" in the
macrolanguage record itself. But I see some issues with this proposal:

1) What will be the stability rules for Encloses? Can an extlang be
dropped from an Encloses field (if advances in linguistics show that
it was wrongly added there, for instance)? And can an extlang be added
to an Encloses field? If yes (which seems reasonable), it means that
the macrolanguage entry can change quite often (consider Zapotec with
its dozens of enclosed extlangs).

2) It introduces redundancy in the registry: the information is
already there and can be computed from the other records. Randy
Presuhn noted that this can be seen as a good thing ("it should be
flagged in its record, rather than requiring it to be computed by
examining all other records"). But more redundancy means more
opportunities to introduce contradictions.

3) 4646bis does not indicate an upper bound for the length of a field
("The content of this field is not restricted, except by [...]
reasonable practical size limitations."). Zapotec's "Encloses:" will
be at least 232 bytes long and sign languages will be worse. While
there is a long discussion of length issues for tags, there seems to
be none for registry-parsing applications. I suspect that some of them
may have hard-wired limits for field lengths.
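
To make points 2 and 3 concrete, here is a rough Python sketch of the
checks a registry tool would need if Encloses were adopted. It is an
illustration only. "Encloses" is the hypothetical field from the
proposal quoted above, and I assume that each extlang record names its
macrolanguage in a "Prefix" field. The sample records and the function
names are invented for the example, and the toy parser ignores folded
lines and repeated fields.

```python
# Sketch only. "Encloses" is the hypothetical field from the proposal;
# relying on "Prefix" in extlang records to name the macrolanguage is an
# assumption. SAMPLE is made-up test data, not real registry content.
SAMPLE = """\
%%
Type: language
Subtag: zh
Description: Chinese
Encloses: yue,cmn
%%
Type: extlang
Subtag: yue
Description: Cantonese
Prefix: zh
%%
Type: extlang
Subtag: cmn
Description: Mandarin Chinese
Prefix: zh
%%
Type: extlang
Subtag: hak
Description: Hakka Chinese
Prefix: zh
"""


def parse_records(text):
    """Split '%%'-separated records into a list of {field: value} dicts.

    Simplified: no folded continuation lines, and a repeated field
    name overwrites the earlier value.
    """
    records, current = [], {}
    for line in text.splitlines():
        if line.strip() == "%%":
            if current:
                records.append(current)
            current = {}
        elif ":" in line:
            name, _, value = line.partition(":")
            current[name.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def computed_members(records):
    """Derive {macrolanguage: set of extlangs} from the extlang records."""
    members = {}
    for r in records:
        if r.get("Type") == "extlang" and "Prefix" in r:
            members.setdefault(r["Prefix"], set()).add(r["Subtag"])
    return members


def check_encloses(records):
    """Return {macrolanguage: (missing, extra)} wherever Encloses disagrees
    with the membership derived from the extlang records."""
    derived = computed_members(records)
    problems = {}
    for r in records:
        if r.get("Type") != "language" or "Encloses" not in r:
            continue
        declared = {s.strip() for s in r["Encloses"].split(",") if s.strip()}
        actual = derived.get(r["Subtag"], set())
        if declared != actual:
            problems[r["Subtag"]] = (sorted(actual - declared),
                                     sorted(declared - actual))
    return problems


def longest_field(records):
    """Return (length in bytes, subtag, field name) of the longest field."""
    return max(
        (len(f"{name}: {value}".encode("utf-8")), r.get("Subtag", "?"), name)
        for r in records
        for name, value in r.items()
    )


records = parse_records(SAMPLE)
print(check_encloses(records))   # {'zh': (['hak'], [])}
print(longest_field(records))
```

In the sample, "hak" points to "zh" but is absent from the Encloses
line, which is exactly the kind of contradiction that the redundancy
makes possible. Running longest_field() over a real registry would
show whether any field comes close to a limit hard-wired in an
existing parser.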

