Re: [Ltru] Re: Test suite for language tags?

John Cowan <cowan@ccil.org> Tue, 29 August 2006 19:28 UTC

Date: Tue, 29 Aug 2006 15:28:41 -0400
To: Addison Phillips <addison@yahoo-inc.com>
Subject: Re: [Ltru] Re: Test suite for language tags?
Message-ID: <20060829192841.GE22065@ccil.org>
References: <E1G8H4g-00035u-Ej@megatron.ietf.org> <001d01c6b78a$07b37890$040aa8c0@DGBP7M81> <44D36598.1020304@yahoo-inc.com> <20060829092257.GA22927@nic.fr> <20060829133301.GE8529@ccil.org> <44F474AC.7070906@yahoo-inc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <44F474AC.7070906@yahoo-inc.com>
User-Agent: Mutt/1.3.28i
From: John Cowan <cowan@ccil.org>
Cc: ltru@ietf.org

Addison Phillips scripsit:

> There are two types of grandfathered tag, actually. There are those tags 
> which are "well-formed" but for which some of the subtags are not 
> registered (let us call these "regular" tags). And there are those tags 
> which do not follow the grammar of RFC 3066bis (let us call these 
> "irregular" tags).
> 
> A well-formed processor needs only a list of the irregular tags, but it 
> MUST have such a list or it will malfunction.
> 
> A validating processor, by definition, needs the complete list.
> 
> Note that the list of irregular tags is permanent and immutable.

+1

> It should be noted that Stephane's example of "fr-Lat" passing is not an 
> error. A three-letter subtag in the second position is an extlang. The 
> tag is not currently valid, but it is well-formed. A validating 
> processor, of course, will catch this.

My point is that fr-Lat is not a *grandfathered* tag.  Nothing can ever
be a grandfathered tag except the 34 existing examples.  The current ABNF
for a grandfathered tag, 1*3ALPHA 1*2("-" (2*8alphanum)), therefore
recognizes too much.
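
To make the over-matching concrete, here is a minimal sketch, not a
reference implementation. The regular expression is a direct transcription
of the ABNF production quoted above. The names matches_abnf,
is_grandfathered, and KNOWN_GRANDFATHERED are hypothetical, and the set
holds only the grandfathered tags named in this thread, not the full
registry list.

```python
import re

# Direct transcription of the production quoted above:
#   1*3ALPHA 1*2("-" (2*8alphanum))
GRANDFATHERED_ABNF = re.compile(r"[A-Za-z]{1,3}(?:-[A-Za-z0-9]{2,8}){1,2}")

# ASSUMPTION: partial, illustrative list only. These are the grandfathered
# tags mentioned in this thread; the real registry list is longer.
KNOWN_GRANDFATHERED = frozenset({"cel-gaulish", "en-boont", "en-scouse", "zh-min"})


def matches_abnf(tag: str) -> bool:
    """True if the whole tag matches the grandfathered ABNF production."""
    return GRANDFATHERED_ABNF.fullmatch(tag) is not None


def is_grandfathered(tag: str) -> bool:
    """True only if the tag is in the fixed list (compared case-insensitively)."""
    return tag.lower() in KNOWN_GRANDFATHERED


if __name__ == "__main__":
    for tag in ("fr-Lat", "zh-min", "cel-gaulish"):
        print(tag, matches_abnf(tag), is_grandfathered(tag))
```

Under these assumptions, fr-Lat satisfies the production but is not in
the list, which is the sense in which the production recognizes too much.
A lookup against a fixed list cannot make that mistake.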

> >The grammar rules are meant to cover the possibilities, not to serve as
> >a test for recognizing grandfathered tags.
> 
> Yes. And the fact is that we can only list the irregular tags in RFC, 
> since the regular grandfathered tags can become redundant.

Listing them would do no harm.

> Which ones did you have in mind here, John? I don't think I see any that 
> will become redundant at present (unless you have designs on en-scouse 
> or en-boont). Deprecated, certainly, but not redundant, I think.

See my last two ietf-languages postings.

> Note that "cel-gaulish" is a regular tag: it still has potential of 
> becoming redundant. Although "zh-min" is regular, the code "min" in ISO 
> 639-3 (at present) represents a wholly different language (Minangkabau) 
> that will be a primary language subtag (and thus can't be an extlang).

Correct; these are regular in form but effectively irregular.  (Gaulish
isn't a variant; if anything, it's a macrolanguage, but it is not treated
as such in ISO 639-3.)

> If we can dispose, somehow, of all the regular tags, then I would 
> support such a scheme. If there are unresolved regular tags (whose 
> stripes could change to redundant), I think it would be more of a problem.

I think there will be none:  cel-gaulish and zh-min are regular but
cannot be made redundant; the other remaining tags will be redundant,
deprecated, or both.

-- 
MEET US AT POINT ORANGE AT MIDNIGHT BRING YOUR DUCK OR PREPARE TO FACE WUGGUMS
John Cowan      cowan@ccil.org      http://www.ccil.org/~cowan
