Re: [Ltru] Does 'de' really mean "only standard German"?

Peter Constable <petercon@microsoft.com> Wed, 28 May 2008 23:13 UTC

From: Peter Constable <petercon@microsoft.com>
To: LTRU Working Group <ltru@ietf.org>
Date: Wed, 28 May 2008 16:13:12 -0700

> From: Texin, Tex [mailto:Tex.Texin@netapp.com]

> Actually, I have the research book of Berlin and Kay. I thought I might
> have been the only one in the world to have read it.

I have to admit, while I've known about their work for a long time, I've never seen or read the book; I was relying on other references (for this discussion, a book by George Lakoff discussing cognitive models by which we define concepts).


> But you make quite a leap in the last para when you equate the focal
> point to the tag.
> Making the ideal equivalent to the tag, then makes everything non-ideal
> excluded, when in fact until now it is a superset. (Which is what this
> discussion revolves around.)

What I meant was that the best way to document the denotation of a code element like "de" is in terms of the best example rather than by specifying a range. In applying that code element, though, we would allow it to be used over some range of varieties (even though the limits of that range haven't been defined).

If you think about it, that's just the same thing we do in Unicode for character coding.


> Rather, I would think you would make the ideal a subtag. In fact
> standardizing the subtag for the most representative dialect would then
> help define the more general case while not limiting it. So if N1 is
> used to represent number 1 or best case, then you can identify best
> cases for each language and still have primary tags be more inclusive;
> de-N1 Also people can then specify if they want the most general or the
> most representative languages for langs like zh.

Every tag must allow for some range of variation. For instance, even if we're considering some ideal, conventional variety, two native speakers of that language do not express things in exactly the same way. There will always be some degree of variation for any language category we define. So, it's not a matter of picking one specific value versus allowing some range of variation; rather, it's a matter of deciding what order of range of variation is scoped by a given category.

In evaluating ISO 639-2 to decide how the denotation of existing code elements should be characterized with a greater degree of precision, there were a few hundred choices that had to be made regarding scope. When faced with something like "ar" or "de", the approach that seemed to make most sense, considering common usage wrt tagged content and resources, was to scope at the individual-language level, which in those cases meant the standard varieties. We went as far as to come up with some operational definitions to apply that approach consistently across the board, but then we encountered some division in approach to usage in the case of some languages. Chinese was a particularly obvious case because of the pre-existing tags registered under RFC 1766. So, we ended up adopting a different principle for those cases, leading to the macrolanguage concept. Should we have taken the different approach for even more cases, such as "de"? Well, I had no information pointing to "de" being used for anything other than the standard variety: A/V content was not yet a major consideration in application of language tags, and wrt the most likely non-standard variety that could potentially have been lumped into "de", Swiss German (i.e. gsw), I kept hearing that written usage was not significant. If "de" had been analyzed the same way as "zh", then we'd be facing all the same problems for "de" as we are facing now for "zh".
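
To make the scoping distinction concrete, here is a rough sketch in Python. The table is a small hand-picked illustration, not a copy of the ISO 639-3 code tables, and the names SCOPE, MEMBERS and covers() are hypothetical helpers made up for this example. It only shows the consequence of the two choices: a macrolanguage code takes in its member languages, while an individual-language code takes in only itself.

```python
# Hypothetical, hand-picked scope table for illustration only.
SCOPE = {
    "zh": "macrolanguage",
    "de": "individual",
    "gsw": "individual",
}

# Assumed subset of the languages grouped under "zh"; not exhaustive.
MEMBERS = {
    "zh": {"cmn", "yue", "wuu", "hak", "nan"},
}


def covers(code, other):
    """Return True if `other` falls within the scope of `code`."""
    if code == other:
        return True
    if SCOPE.get(code) == "macrolanguage":
        return other in MEMBERS.get(code, set())
    # An individual-language code covers nothing but itself.
    return False


print(covers("zh", "yue"))
print(covers("de", "gsw"))
```

With this table, "zh" covers "yue", but "de" does not cover "gsw", which is the situation described above. Had "de" been analyzed the same way as "zh", "gsw" would simply appear in a MEMBERS entry for "de".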

So, while it isn't all perfect, I think what was done was reasonable and workable. And I think there can be appropriate cases in which tags of the form de-variantx might be used to describe some specific varieties that could reasonably be considered as within the scope for the standard variety. (But don't ask me which are or are not appropriate cases for "de".)
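
As a rough illustration of how a tag of the form de-variantx stays within reach of a request for "de", here is a minimal sketch of prefix matching in the style of RFC 4647 basic filtering. The function name basic_filter_match is my own, and the sample tags are just examples; "variantx" itself is only a placeholder, not a registered subtag.

```python
def basic_filter_match(language_range, tag):
    """Prefix match in the style of RFC 4647 basic filtering.

    A range matches a tag if the two are equal, or if the range is a
    prefix of the tag and the next character in the tag is "-".
    Comparison is case-insensitive.
    """
    r = language_range.lower()
    t = tag.lower()
    if r == "*":
        return True
    return t == r or t.startswith(r + "-")


for tag in ["de", "de-1996", "de-CH", "gsw", "den"]:
    print(tag, basic_filter_match("de", tag))
```

Under this kind of matching, any tag that begins with "de-" is returned for the range "de", while "gsw" is not. That is why it matters whether a given variety is tagged as a variant under "de" or with a separate primary subtag.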



Peter