Re: [Ltru] Re: Test suite for language tags?
"Mark Davis" <mark.davis@icu-project.org> Sun, 17 September 2006 21:49 UTC
Date: Sun, 17 Sep 2006 14:49:01 -0700
From: Mark Davis <mark.davis@icu-project.org>
To: Martin Duerst <duerst@it.aoyama.ac.jp>
Subject: Re: [Ltru] Re: Test suite for language tags?
Cc: Frank Ellermann <nobody@xyzzy.claranet.de>, ltru@lists.ietf.org

Fixed, thanks!

Mark

On 9/16/06, Martin Duerst <duerst@it.aoyama.ac.jp> wrote:
>
> There's a mistake in the regex, from the underlying
> langtagRegex.txt, that allows zh-zh-cmn-Hans.
>
> Regards, Martin.
>
> At 08:28 06/09/17, Mark Davis wrote:
> >BTW, I had updated my regex to the final spec for 4646. Here is a
> >single Perl or Java regex that does most of the parse:
> >
> >Regex:
> >((?: [a-z A-Z]{2,3} (?: [-] [a-z A-Z]{3} ){0,3} | [a-z A-Z]{4,8} ))
> >(?: [-] ((?: [a-z A-Z]{4} )) )?
> >(?: [-] ((?: [a-z A-Z]{2} | [0-9]{3} )) )?
> >(?: [-] ((?: (?: [0-9] [a-z A-Z 0-9]{3} | [a-z A-Z 0-9]{5,8} ) (?: [-] (?: [0-9] [a-z A-Z 0-9]{3} | [a-z A-Z 0-9]{5,8} ) )* )) )?
> >(?: [-] ((?: (?: [a-w y-z A-W Y-Z] (?: [-] [a-z A-Z 0-9]{2,8} )+ ) (?: [-] (?: [a-w y-z A-W Y-Z] (?: [-] [a-z A-Z 0-9]{2,8} )+ ) )* )) )?
> >(?: [-] ((?: [xX] (?: [-] [a-z A-Z 0-9]{1,8} )+ )) )?
> >| ( (?i) art [-] lojban
> >| cel [-] gaulish
> >| en [-] (?: boont | GB [-] oed | scouse )
> >| i [-] (?: ami | bnn | default | enochian | hak | klingon | lux | mingo | navajo | pwn | tao | tay | tsu )
> >| no [-] (?: bok | nyn)
> >| sgn [-] (?: BE [-] fr | BE [-] nl | CH [-] de)
> >| zh [-] (?: cmn | zh [-] cmn [-] Hans | cmn [-] Hant | gan | guoyu | hakka | min | min [-] nan | wuu | xiang | yue))
> >| ((?: [xX] (?: [-] [a-z A-Z 0-9]{1,8} )+ ))
> >
> >It checks for the grandfathered tags, since otherwise too much cruft
> >sneaks in. You can't check in regex that there are only single
> >instances of each singleton extension. (In retrospect we could have
> >allowed multiple singletons: we could have accepted
> >en-a-bcdef-ghijk-b-123-a-lmnop as equivalent to the canonical form
> >en-a-bcdef-ghijk-lmnop-b-123, but that's water under the bridge at
> >this point.) Of course, I didn't put this together by hand. The table
> >used to build it is much more readable, at
> >
> ><http://unicode.org/cldr/data/tools/java/org/unicode/cldr/util/data/langtagRegex.txt>
> >
> >and a test file that includes strings mentioned on this list is at:
> >
> ><http://unicode.org/cldr/data/tools/java/org/unicode/cldr/util/data/langtagTest.txt>
> >
> >Mark
>
>
> #-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
> #-#-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp