RE: [Ltru] RE: duel(ing) tags

"Debbie Garside" <debbie@ictmarketing.co.uk> Wed, 10 October 2007 23:45 UTC

From: Debbie Garside <debbie@ictmarketing.co.uk>
To: addison@yahoo-inc.com
Subject: RE: [Ltru] RE: duel(ing) tags
Date: Thu, 11 Oct 2007 00:44:03 +0100
Cc: ltru@ietf.org

Hi Addison

Comments in line...


> > why not include a way to catch the "illegal" tags that
> > is easy for applications to implement?
>
> Ur... that's what the registry does. I can catch the fact
> that "cmn-HK" is not valid and it's really easy to implement.
> What you mean is that I should be able to infer that the user
> really means "zh-cmn-HK", which I might be able to do in my
> software (but which I am free *not* to do at present).

That is exactly what I mean.

>
> After all, plenty of content is labeled with the language tag
> "english". Implementations are free to infer that this really
> means "en". However, they are not obliged to do so.
>
> > I see more use than harm in using
> > extended language subtags and deprecating the individual
> > 639-3 codes to facilitate matching.  This can be
> > documented within 4646bis.
>
> I sense perhaps some confusion here?

I don't think I am confused at all... Having read your email :-)

> We are not talking about deprecating any ISO 639-3 codes. We
> are talking about how ISO 639-3 codes are incorporated into
> the registry.

That is exactly what I am talking about.

>'cmn' is a perfectly valid ISO 639-3 code.

Agreed

> We
> have several choices for how to include it in the registry:
>
> - as a primary language subtag (making tags like "cmn-Hant-HK" valid)
> - as an extlang of 'zh' (making tags like "zh-cmn-Hant-HK" valid)
>    // note: notice that I chose an un-grandfathered tag there
> - as a primary language subtag that is deprecated (making
> tags like "cmn-Hant-HK" valid, but deprecated)
> - not at all (none of the tags in this example would be valid then)

Indeed.  I understood this perfectly well.  I was attempting to show how cmn
could be incorporated within the registry as an extlang of 'zh' whilst also
covering the back door by registering 'cmn' and then immediately deprecating
'cmn' in favour of 'zh-cmn'.  You already have several deprecated subtags;
what is the problem with a few more to deal with the macrolanguage
scenario?
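
To make the idea concrete, here is a minimal sketch in Python of what "register 'cmn', then deprecate it in favour of 'zh-cmn'" could look like from an application's side.  The table name, the function name and the single entry are my own assumptions for illustration; nothing here is actual registry content or wording from 4646bis.

```python
# Hypothetical table: deprecated primary language subtag -> (macrolanguage, extlang).
# The single entry is illustrative only; the real registry content is undecided.
DEPRECATED_TO_EXTLANG = {
    "cmn": ("zh", "cmn"),
}


def canonicalize(tag: str) -> str:
    """Rewrite e.g. 'cmn-Hant-HK' as 'zh-cmn-Hant-HK'; leave other tags alone."""
    subtags = tag.split("-")
    primary = subtags[0].lower()
    if primary in DEPRECATED_TO_EXTLANG:
        macro, extlang = DEPRECATED_TO_EXTLANG[primary]
        # Replace the deprecated primary subtag with macrolanguage + extlang,
        # keeping every later subtag (script, region, ...) untouched.
        return "-".join([macro, extlang] + subtags[1:])
    return tag
```

With this table, both "cmn-Hant-HK" and "zh-cmn-Hant-HK" end up as "zh-cmn-Hant-HK", and a tag such as "en-GB" passes through unchanged.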


> Let me note that "special case" handling and mapping in
> matching algorithms is incredibly painful to build and
> maintain. If you choose to canonicalize tags and ranges
> before processing, this isn't so bad. But if you are a "good
> citizen" and preserve the original form of the tags/ranges
> for later consumption, one must implement quite complex logic
> for doing what's supposed to be quite simple matching.

What I have proposed is fairly simple logic for matching purposes.
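
As a rough illustration of that logic, the sketch below does a case-insensitive prefix match on whole subtags (in the style of basic filtering in RFC 4647) after mapping a deprecated primary subtag to its preferred prefix.  The mapping table and the function names are hypothetical, and the one entry is an example only.  The stored tags and ranges are never modified; only temporary normalized copies are compared.

```python
# Hypothetical mapping from a deprecated primary subtag to its preferred
# prefix. Illustrative only; not actual registry data.
PREFERRED_PREFIX = {"cmn": "zh-cmn"}


def normalize(tag: str) -> str:
    """Lower-case the tag and swap a deprecated primary subtag for its prefix."""
    head, sep, rest = tag.lower().partition("-")
    return PREFERRED_PREFIX.get(head, head) + sep + rest


def matches(language_range: str, tag: str) -> bool:
    """Prefix match on whole subtags, after normalizing copies of both sides."""
    r, t = normalize(language_range), normalize(tag)
    return r == "*" or t == r or t.startswith(r + "-")
```

Under these assumptions, the range "zh" matches content tagged "cmn-Hant-HK", and the range "cmn" matches content tagged "zh-cmn-Hant-HK", while the original strings remain available for later consumption.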

> Furthermore, this requires, absolutely, access to information
> above and beyond that embodied in the tags/ranges themselves.
> That is a wholesale departure from current practice in
> matching. While my own implementations might very well use
> such information (and more) to "do the right thing", that is
> *my* application and you might not care for my particular
> implementation choices (they may not be right for your
> application). To the extent that we can describe behaviors
> that do not require registry access, I am content with either
> primary language subtags or extlangs (but NOT both for the
> same language).

I am not proposing both for the same language as such.  I am proposing to
deprecate one in favour of the other.

>
> Remember: a deprecated subtag is still a valid subtag.

Yes, I know.  Perhaps you would like to call it something else?  Retired as
part of macrolanguage?  Replaced by?

The whole point is to have both within the registry: one current, one not
current, but both valid for matching.  By deprecating cmn in favour of
zh-cmn, matching is facilitated.  Text within RFC4646bis can be quite simply
written to explain this.

Now I am going to shut up about it as I am obviously banging my head against
a brick wall.  But I really cannot see what the problem is, other than that
people will tag with both zh-cmn and cmn whether we like it or not, and I
think it is far better to deal with that scenario within the registry rather
than leave it to application designers who may not be as conversant with
RFC4646bis as you (and maybe me) :-)

Best regards

Debbie



> Addison
>
> PS> Note that the word "valid" in the foregoing refers explicitly
> to the terminology defined in Section 2.2.9 "Classes of Conformance"
> in the current draft of RFC 4646bis (and by intention to the
> same section in RFC 4646).
>
> Debbie Garside wrote:
> > Shawn wrote:
> >
> >> It's just my experience
> >> that people don't really always pay that much attention, so they
> >> might use both.  I was merely suggesting that robust applications
> >> might want to recognize both forms, even if one was "illegal".
> >
> > But if we already know that is what people will do (which is the case)
> > and if we are to use extended language subtags with the inclusion of
> > 639-3 within the registry why not include a way to catch the "illegal"
> > tags that is easy for applications to implement?   I see more use than
> > harm in using extended language subtags and deprecating the individual
> > 639-3 codes to facilitate matching.  This can be documented within 4646bis.
> >
> > Best regards
> >
> > Debbie
> >
> >
> >> -----Original Message-----
> >> From: Shawn Steele [mailto:Shawn.Steele@microsoft.com]
> >> Sent: 10 October 2007 21:30
> >> To: debbie@ictmarketing.co.uk; addison@yahoo-inc.com
> >> Cc: ltru@ietf.org
> >> Subject: RE: [Ltru] RE: duel(ing) tags
> >>
> >>>> I definitely am against providing both tagging options.
> >>> But where is the harm in including cmn in the registry and then
> >>> deprecating in favour of zh-cmn?  As stated previously, this would
> >>> catch those who unwittingly tag with any 639-3 code and are unaware of
> >>> macrolanguages and 4646bis.
> >> I wasn't suggesting registering both.  It's just my experience that
> >> people don't really always pay that much attention, so they might use
> >> both.  I was merely suggesting that robust applications might want to
> >> recognize both forms, even if one was "illegal".
> >>
> >> - Shawn
> >>
> >>
> >>
> >
> >
> >
> >
>
> --
> Addison Phillips
> Globalization Architect -- Yahoo! Inc.
> Chair -- W3C Internationalization Core WG
>
> Internationalization is an architecture.
> It is not a feature.
>
>
>





