Re: [Ltru] Great Script Debate "the Next Generation"...

"Mark Davis" <mark.davis@icu-project.org> Sat, 14 October 2006 18:04 UTC

Message-ID: <30b660a20610141104p5ea6e30cye725c4dd26ebc5bd@mail.gmail.com>
Date: Sat, 14 Oct 2006 11:04:51 -0700
From: Mark Davis <mark.davis@icu-project.org>
To: John Cowan <cowan@ccil.org>
Subject: Re: [Ltru] Great Script Debate "the Next Generation"...
In-Reply-To: <20061014174906.GG2033@ccil.org>
MIME-Version: 1.0
References: <452555BA.2040601@yahoo-inc.com> <20061014174906.GG2033@ccil.org>
Cc: ltru@ietf.org

I'm very much in agreement with this approach.

The text needs a few fixes. A language tag can be used to tag content (and
not just documents, as you have below), or it can be used to query content
(in which case slightly different conditions apply). The content may also
not be written, and some script subtags distinguish unwritten content.
"Normally used" is too weak: someone might interpret that as >= 50% of the
time, which would mean not using "Hans", for example, since Hans occurs
more frequently than Hant. And sometimes using "customarily" and sometimes
"normally" might lead a reader to think that some distinction is meant
between them.

So here is a suggested rewrite:

5.  There MUST be at most one script subtag in a language tag.

6.  A script subtag SHOULD be used when tagging content if the content
    is in a script other than that customarily used for the language
    (e.g. zh-Latn, en-Brai).

7.  A script subtag SHOULD be used when tagging content if the language
    of the content is customarily written in more than one script
    (e.g. sr-Cyrl, sr-Latn).

8.  A script subtag SHOULD NOT be used if the content being tagged is
    written in the script customarily used for the vast majority of
    content in that language (e.g. en-Latn SHOULD NOT be used).
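
As a rough sketch of how a tagging tool might apply rules 6-8 when
tagging content: the CUSTOMARY_SCRIPTS table and both function names
below are made up for illustration, and the table entries are my
assumptions, not registry data. It only covers tagging, not querying,
and it ignores unwritten content.

```python
# Hypothetical table of the scripts customarily used for each language.
# The entries are illustrative assumptions, not taken from any registry.
CUSTOMARY_SCRIPTS = {
    "en": {"Latn"},
    "zh": {"Hans", "Hant"},
    "sr": {"Cyrl", "Latn"},
}


def should_use_script_subtag(language: str, script: str) -> bool:
    """Decide whether a script subtag is warranted when tagging content."""
    customary = CUSTOMARY_SCRIPTS[language]  # KeyError if language unknown
    if script not in customary:
        return True   # rule 6: script other than the customary one
    if len(customary) > 1:
        return True   # rule 7: language customarily written in several scripts
    return False      # rule 8: the single customary script, so omit it


def tag_content(language: str, script: str) -> str:
    """Build a language tag with at most one script subtag (rule 5)."""
    if should_use_script_subtag(language, script):
        return f"{language}-{script}"
    return language
```

With that table, tag_content("en", "Latn") gives "en", while
tag_content("en", "Brai") gives "en-Brai" and tag_content("sr", "Cyrl")
gives "sr-Cyrl".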

Mark

On 10/14/06, John Cowan <cowan@ccil.org> wrote:
>
> Addison Phillips scripsit:
>
> > Reactions?
>
> <span lang="sco">It's jist awfu complicated.</span>
>
> On reflection I think we should remove all traces of Suppress-Script
> from 4646bis and 4645bis as well, and go with text in 2.2.3 like this:
>
> 5.  There MUST be at most one script subtag in a language tag.
>
> 6.  A script subtag SHOULD be used if the document is written
>     in a script other than that normally used for the language
>     (i.e. zh-Latn, en-Brai).
>
> 7.  A script subtag SHOULD be used if the language of the document
>     is customarily written in more than one script (e.g. sr-Cyrl,
>     sr-Latn).
>
> 8.  A script subtag SHOULD NOT be used if the document being tagged is
>     written in the script normally used for the language (e.g. en-Latn
>     SHOULD NOT be used).
>
> And leave it at that.
>
> After all, script is one of the most observable properties of a document
> (people may debate which language a document is written in, but scarcely
> which script it is written in, and even computers can tell which if the
> document is Unicoded), so we don't really have to worry about bad
> script tags, and this advice will warn people off using them
> unnecessarily.
>
> --
> One Word to write them all,             John Cowan <cowan@ccil.org>
>   One Access to find them,              http://www.ccil.org/~cowan
> One Excel to count them all,
>   And thus to Windows bind them.                --Mike Champion
>
> _______________________________________________
> Ltru mailing list
> Ltru@ietf.org
> https://www1.ietf.org/mailman/listinfo/ltru
>