Re: [apps-discuss] Internationalization Terminlogy

John C Klensin <> Fri, 20 May 2011 15:59 UTC

Date: Fri, 20 May 2011 11:59:30 -0400
From: John C Klensin <>
To: "J-F C. Morfin" <>
Message-ID: <1D8C60E204C582795721FBE7@PST.JCK.COM>
Subject: Re: [apps-discuss] Internationalization Terminlogy
List-Id: General discussion of application-layer protocols <>

--On Monday, May 16, 2011 01:27 +0200 "J-F C. Morfin" wrote:

> John,
> Good work. I have just searched the text for some words.

Thanks.  And thanks for the review.  Some of the omissions that
you note below are deliberate, so let me comment on those.

> I note that:
> - "globalization" is missing which is embarrassing as Unicode
> uses to say (if I am correct) globalization =
> internationalization  + localization + language tagging.

As we have explained to others (mostly off-list), our criterion,
as indicated by the title, is terms used, and found useful, in
the IETF.  The definition or use of a term in Unicode (or
elsewhere) does not justify inclusion.  When we've concluded
that a term should be included, we have tried to either make our
definition consistent with the definition elsewhere (often by
copying) or we have tried to comment on the differences between
IETF usage and usage elsewhere along with any issues those
differences raise.

As I am sure you know, "globalization" is used heavily in
international political and economic contexts with meanings
quite different from the one you describe above.  Inconsistent
definitions of a term are probably a good reason to avoid
recommending it.  The term is not, as far as I know, used
significantly in the IETF.  And, as far as I can tell after a
quick inspection, Unicode does not, in fact, define it (it at
least doesn't appear in their index, which I've found to be
pretty good).  Any of "no use in the IETF", "ambiguity of
meaning", and "no strong requirement in the IETF" would be
grounds for excluding it; all three seem to apply in this case.

> - "langtag" term is missing, so is their IANA table (largest
> by far IANA file). RFC 5646 is named. RFC 4647 is  quoted, but
> not explained, so  "language filtering" is not alluded to.

I've done a quick search, and I don't see "langtag" as a common
term in the IETF.  "Language tag" is usually spelled out
instead.  As with our decision to send readers directly to RFC
5890 for IDNA terminology, I think that anyone who really needs
to understand the terminology for language tagging and
associated issues and actions is better off looking at the
relevant documents (which are referenced for a reason).  If you
have suggested text for clarifying that, or a good selection of
counterexamples in which "langtag" is used in IETF documents,
please send them along. 

> - "linguistic diversity" is missing. Not an IETF word but the
> IETF targets its support?

It seems to me that this is a term about which many people have
good (but not necessarily consistent) intuitions but for which
there is no generally-accepted definition that would stand any
sort of precise test.  How would you recommend defining it using
plain English?  

> - "majuscules" are named, but uncorrectly explained: in latin
> languages, at least, an upper case may be a majuscule, but a
> majuscule may not be printed as an upper case, or stays a
> majuscule even if incorrectly printed as a lower case.

The definition given is attributed to Unicode and accurately
copied.  It is certainly a correct definition in a typography
context and probably a correct one in character set coding or
internationalization ones.  It seems to me that what you are
after is a definition appropriate to some localization contexts.
That has, so far, been out of the scope of this document
(although certainly within scope of some of your other work).
I'd rather keep it out of scope for this document, if only
because I think the definition you propose would be very
controversial, even for Latin-based writing systems in general.

> - "plurilingual" is not quoted which is different from
> multilingual?
> - "linguistic independence" is not alluded to - ex. using
> digital codes.
> - "orthotypograpy" is missing in IETF

As far as I know, neither of those two terms has been used in
any IETF document or discussion except by you and your group(s).