Re: [Ltru] Re: Test suite for language tags?

Addison Phillips <addison@yahoo-inc.com> Tue, 29 August 2006 17:09 UTC

Message-ID: <44F474AC.7070906@yahoo-inc.com>
Date: Tue, 29 Aug 2006 10:09:00 -0700
From: Addison Phillips <addison@yahoo-inc.com>
To: John Cowan <cowan@ccil.org>
Subject: Re: [Ltru] Re: Test suite for language tags?
References: <E1G8H4g-00035u-Ej@megatron.ietf.org> <001d01c6b78a$07b37890$040aa8c0@DGBP7M81> <44D36598.1020304@yahoo-inc.com> <20060829092257.GA22927@nic.fr> <20060829133301.GE8529@ccil.org>
In-Reply-To: <20060829133301.GE8529@ccil.org>
Cc: ltru@ietf.org

John Cowan wrote:
> 
>> Does it mean that a processor should (ordinary "should" not RFC
>> "SHOULD") define the grammar production "grandfathered" as a list of
>> possible grandfathered tags, rather than the grammar rules expressed
>> in draft-ietf-ltru-registry-14?
> 
> Yes, absolutely.  MUST would be the appropriate modal verb, in fact.

No, actually not.

There are two types of grandfathered tag, actually. There are those tags 
which are "well-formed" but for which some of the subtags are not 
registered (let us call these "regular" tags). And there are those tags 
which do not follow the grammar of RFC 3066bis (let us call these 
"irregular" tags).

A processor that checks only well-formedness needs just the list of 
irregular tags, but it MUST have that list or it will wrongly reject 
those tags as ill-formed.

A validating processor, by definition, needs the complete list.

Note that the list of irregular tags is permanent and immutable.
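
A minimal sketch of that split, in Python, might look like the following. 
It is an illustration only, not part of the draft: the regular expression 
is a simplified reading of the langtag grammar, IRREGULAR is a 
deliberately partial list used as an example, and the function names are 
hypothetical.

```python
import re

# Simplified reading of the generic langtag grammar (an assumption, not the
# normative ABNF from the draft).
_LANGTAG = re.compile(
    r"""
    (?: [a-z]{2,3} (?:-[a-z]{3}){0,3}          # language plus up to three extlangs
      | [a-z]{4,8} )                           # longer primary language subtags
    (?:-[a-z]{4})?                             # script
    (?:-(?:[a-z]{2}|[0-9]{3}))?                # region
    (?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*   # variants
    (?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*       # extensions
    (?:-x(?:-[a-z0-9]{1,8})+)?                 # private use
    """,
    re.VERBOSE | re.IGNORECASE,
)
_PRIVATE_ONLY = re.compile(r"x(?:-[a-z0-9]{1,8})+", re.IGNORECASE)

# Illustrative subset only: a real processor would carry the complete
# list of irregular grandfathered tags.
IRREGULAR = frozenset({"en-gb-oed", "i-default", "i-enochian"})


def matches_grammar(tag: str) -> bool:
    """True if the tag fits the generic grammar, ignoring the irregular list."""
    return bool(_LANGTAG.fullmatch(tag) or _PRIVATE_ONLY.fullmatch(tag))


def is_well_formed(tag: str) -> bool:
    """Well-formedness check: the generic grammar plus the fixed irregular list."""
    return tag.lower() in IRREGULAR or matches_grammar(tag)
```

Under these assumptions, matches_grammar("en-GB-oed") is false while 
is_well_formed("en-GB-oed") is true, which is why the irregular list has 
to be built in. A regular tag such as "cel-gaulish" passes the grammar 
without any list.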

It should be noted that Stephane's example of "fr-Lat" passing is not an 
error. A three-letter subtag in the second position is an extlang. The 
tag is not currently valid, but it is well-formed. A validating 
processor, of course, will catch this.
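
The difference between the two levels can be sketched as a registry 
lookup. This is a rough illustration under stated assumptions: REGISTERED 
is a tiny hypothetical stand-in for the subtag registry (with no extlang 
subtags registered), the helper names are invented for the example, and 
the classifier handles only simple language[-extlang][-script][-region] 
tags, labelling subtags by length alone without checking their order.

```python
# Hypothetical, tiny stand-in for the subtag registry (not real registry data).
REGISTERED = {
    "language": {"en", "fr", "zh"},
    "extlang": set(),      # assumption: no extlang subtags are registered
    "script": {"latn"},
    "region": {"fr", "gb"},
}


def subtag_kinds(tag: str) -> list[tuple[str, str]]:
    """Label each subtag of a simple tag by its length and character class."""
    parts = tag.lower().split("-")
    kinds = [("language", parts[0])]
    for sub in parts[1:]:
        if len(sub) == 3 and sub.isalpha():
            kinds.append(("extlang", sub))   # three letters: extlang slot
        elif len(sub) == 4 and sub.isalpha():
            kinds.append(("script", sub))
        elif (len(sub) == 2 and sub.isalpha()) or (len(sub) == 3 and sub.isdigit()):
            kinds.append(("region", sub))
        else:
            kinds.append(("other", sub))     # not handled by this sketch
    return kinds


def is_valid(tag: str) -> bool:
    """Validating check: every subtag must appear in the registry for its kind."""
    return all(sub in REGISTERED.get(kind, ()) for kind, sub in subtag_kinds(tag))
```

Here subtag_kinds("fr-Lat") labels "lat" as an extlang, so the tag is 
well-formed, but is_valid("fr-Lat") is false because no such extlang is 
registered. By contrast, is_valid("fr-Latn") is true in this toy registry.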

> The grammar rules are meant to cover the possibilities, not to serve as
> a test for recognizing grandfathered tags.

Yes. And the fact is that we can only list the irregular tags in the RFC, 
since the regular grandfathered tags can become redundant.
> 
> At the present time there are exactly 34 grandfathered tags, of which
> 8 are deprecated with replacements (but must still be recognized).
> Two more will probably be changed to redundant status before the adoption
> of 3066ter.

Which ones did you have in mind here, John? I don't think I see any that 
will become redundant at present (unless you have designs on en-scouse 
or en-boont). Deprecated, certainly, but not redundant, I think.

> The only tags that will remain both grandfathered and
> not deprecated are cel-gaulish (unless ISO 639-3 is extended to include
> it), en-GB-oed, i-default, i-enochian (unless ISO 639-3 is extended to
> include it), and the rather useless zh-min.  However, there will be no
> harm in continuing to bake the full list of 34 into validators.

Indeed, it remains important to have the full list of grandfathered tags 
in validating processors. Well-formed processors will continue to need 
the full list of irregular tags.

Note that "cel-gaulish" is a regular tag: it still has the potential to 
become redundant. Although "zh-min" is regular, the code "min" in ISO 
639-3 (at present) represents a wholly different language (Minangkabau) 
that will be a primary language subtag (and thus can't be an extlang).

> 
> This suggests that in 3066ter we should specify the ABNF definition of
> "grandfathered" explicitly as a choice of tags.

If we can dispose, somehow, of all the regular tags, then I would 
support such a scheme. If there are unresolved regular tags (whose 
status could still change to redundant), I think it would be more of a 
problem.

Addison

-- 
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.
