[Ltru] Re: Registry change of the day: prefix with more than one subtag

"Doug Ewell" <dewell@roadrunner.com> Tue, 10 July 2007 05:57 UTC

From: Doug Ewell <dewell@roadrunner.com>
To: LTRU Working Group <ltru@ietf.org>
References: <E1I7Chz-0004HK-Kx@megatron.ietf.org>
Date: Mon, 09 Jul 2007 22:56:33 -0700

Addison Phillips <addison at yahoo dash inc dot com> replied to Stéphane 
Bortzmeyer:

>> First subtag(s) in the registry with a prefix which is more than one 
>> subtag. (And, yes, it triggered a bug in my programs.)
>
> You've got to think ahead with those unit tests :-). I must admit that 
> we've improved the text in draft-4646bis. In 4646 there is an example 
> (and a paragraph of text) identifying this case, but it is the section 
> on variant rather than in section 3.

Now It Can Be Told:  Stéphane's bug finally motivated me to, um, add 
complex-prefix support to my own tag-generating and -analyzing program, 
which I'd talked about quite a bit but had never released, due to (a) 
lack of said support and (b) lack of an installer package (which may be 
solved shortly).

While adding support for complex prefixes, I noticed that there is still 
some wiggle room in the rules about which variants should and should not 
be used with which prefixes.  Here is the relevant passage from 
draft-ietf-ltru-4646bis-06, Section 2.2.5, which is unchanged from RFC 
4646:

<<
Most variants that share a prefix are mutually exclusive. For example, 
the German orthographic variations '1996' and '1901' SHOULD NOT be used 
in the same tag, as they represent the dates of different spelling 
reforms. A variant that can meaningfully be used in combination with 
another variant SHOULD include a 'Prefix' field in its registry record 
that lists that other variant. For example, if another German variant 
'example' were created that made sense to use with '1996', then 
'example' should include two Prefix fields: "de" and "de-1996".
>>

A human can easily see that "de-1901-1996" and "sl-rozaj-njiva" are 
meaningless, because the variants are clearly contradictory.  But a few 
variants with the same prefix can in fact be used together, such as 
"en-scouse-fonipa" or "en-boont-fonipa".  In fact, 'fonipa' and 'fonupa' 
have no prefix and can thus theoretically be used with any tag, but some 
combinations like "el-monoton-fonipa" are inappropriate nonetheless.

I can't think of a clean way to tell software which variants can be used 
together and which cannot, beyond the clear rule that all subtags in the 
prefix must be present -- for example, you can't use 'njiva' unless 'sl' 
and 'rozaj' are both present.
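
A minimal sketch of that one clear rule, in Python. The table 
VARIANT_PREFIXES and the function prefix_satisfied() are hypothetical 
names; the prefixes are hand-copied from (or inferred from) the examples 
in this message, not parsed from the actual Registry file.

```python
# Hypothetical, hand-copied subset of variant records. An empty list means
# the variant has no Prefix field at all.
VARIANT_PREFIXES = {
    "1901": ["de"],
    "1996": ["de"],
    "rozaj": ["sl"],
    "njiva": ["sl-rozaj"],
    "nedis": ["sl"],
    "scouse": ["en"],
    "boont": ["en"],
    "monoton": ["el"],
    "fonipa": [],
    "fonupa": [],
}


def prefix_satisfied(tag, variant):
    """Return True if all subtags of at least one Prefix of `variant`
    are present in `tag` (or if the variant has no Prefix field)."""
    present = set(tag.lower().split("-"))
    prefixes = VARIANT_PREFIXES[variant.lower()]
    if not prefixes:
        return True
    return any(set(p.lower().split("-")) <= present for p in prefixes)


print(prefix_satisfied("sl-rozaj", "njiva"))     # both 'sl' and 'rozaj' present
print(prefix_satisfied("sl", "njiva"))           # 'rozaj' missing
print(prefix_satisfied("de-1901", "1996"))       # rule alone does not object
print(prefix_satisfied("el-monoton", "fonipa"))  # no prefix, so always allowed
```

Note that the last two calls pass this check even though the resulting 
tags are contradictory or inappropriate, which is exactly the wiggle room 
described above.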

At http://users.adelphia.net/~dewell/prefix-anomaly.html you can find an 
illustration of this situation.  The drop-down list of variant subtags 
for "sl-rozaj" shows not only the reasonable choices, like 'fonipa' (no 
prefix) and 'njiva' (prefix = "sl-rozaj"), but also the inappropriate 
'nedis' (prefix = "sl").
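
The drop-down behavior can be reproduced with a short Python sketch. This 
is not the code behind that page; VARIANTS and candidate_variants() are 
hypothetical names, and the four records are copied by hand from the 
prefixes given in this message.

```python
# Hypothetical subset of variant records: variant -> list of Prefix values.
VARIANTS = {
    "rozaj": ["sl"],
    "njiva": ["sl-rozaj"],
    "nedis": ["sl"],
    "fonipa": [],  # no Prefix field
}


def candidate_variants(tag):
    """List variants a naive picker would offer for `tag`: every variant
    not already in the tag whose prefix subtags are all present."""
    present = tag.lower().split("-")
    offered = []
    for variant, prefixes in VARIANTS.items():
        if variant in present:
            continue  # already part of the tag
        if not prefixes or any(
            all(subtag in present for subtag in prefix.split("-"))
            for prefix in prefixes
        ):
            offered.append(variant)
    return sorted(offered)


print(candidate_variants("sl-rozaj"))  # ['fonipa', 'nedis', 'njiva']
```

Nothing in the records themselves tells the picker to drop 'nedis' once 
'rozaj' is in the tag.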

What information is available in either RFC 4646(bis) or in the Registry 
that would help software to make this decision?  Does a prefix of 
"language with no variant" mean that the variant should not be used in 
the presence of any other variant?  How does one determine that 
"en-scouse-fonipa" is OK while "el-monoton-fonipa" is not?

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages




