Re: [Ltru] [apps-discuss] Fwd: Defining a CBOR tag for RFC 5646 Language Tags

Carsten Bormann <cabo@tzi.org> Wed, 14 May 2014 08:26 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <CAKHUCzyFAyLciHD0gzGM_5eaEqXdUFbyK8cJ_gVsQjmc+0fWEA@mail.gmail.com>
Date: Wed, 14 May 2014 10:26:29 +0200
Message-Id: <0C126A09-1909-449E-B0B4-9F41677710E2@tzi.org>
References: <18971982.1399873468367.JavaMail.root@mswamui-cedar.atl.sa.earthlink.net> <9BE5D3F7FAEE4CAB8FD3326ED8F1ED75@PeterPC> <CAKHUCzyFAyLciHD0gzGM_5eaEqXdUFbyK8cJ_gVsQjmc+0fWEA@mail.gmail.com>
To: Dave Cridland <dave@cridland.net>
X-Mailer: Apple Mail (2.1874)
Archived-At: http://mailarchive.ietf.org/arch/msg/ltru/a_GLOEexETDLor5_Yl8oKzyBHGM
X-Mailman-Approved-At: Wed, 14 May 2014 09:42:08 -0700
Cc: Randy Presuhn <randy_presuhn@mindspring.com>, LTRU Working Group <ltru@ietf.org>, "apps-discuss@ietf.org" <apps-discuss@ietf.org>
Subject: Re: [Ltru] [apps-discuss] Fwd: Defining a CBOR tag for RFC 5646 Language Tags

On 14 May 2014, at 09:18, Dave Cridland <dave@cridland.net> wrote:

> embedding language tags in invalid UTF-8

In 2010, RFC 6082 deprecated that (and moved RFC 2482 to Historic), saying:

   It is an idea whose
   time never quite came.  It has been superseded by whole-transaction
   language identification such as the MIME Content-language header
   [RFC3282] and more general markup mechanisms such as those provided
   by XML.  The Unicode Consortium has deprecated the language tag
   character facility and strongly recommends against its use.

The application that motivated the new CBOR tag we are talking about happens not to require multi-language strings.
(It seems their usage is mostly about internationalization, i.e., pairing a string with its locale.)

If such a facility were needed, it could be modeled as (shown in CBOR diagnostic notation):

somenewtag([["en", "She said '"], ["fr", "Bonjour"], ["en", "', and then '"], ["zh", "你好"], ["en", "'."]])

or even, maximizing compatibility by using the proposed tag as well:

othernewtag([38(["en", "She said '"]), 38(["fr", "Bonjour"]), 38(["en", "', and then '"]), 38(["zh", "你好"]), 38(["en", "'."])])

(There are many other ways this could be modeled, too.)
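
As a rough illustration, the second model could be built and serialized by hand in Python along the following lines. This is a sketch only: the Tagged class and the head, encode, and plain_text helpers are hypothetical names, the encoder covers just text strings, arrays, and tags, and the outer othernewtag is left out because no number is given for it above.

```python
import struct


class Tagged:
    """A CBOR tag number wrapped around a value (hypothetical helper)."""

    def __init__(self, tag, value):
        self.tag = tag
        self.value = value


def head(major, n):
    """Encode the initial byte and argument for a CBOR major type."""
    if n < 24:
        return bytes([(major << 5) | n])
    for info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if n < (1 << (8 * struct.calcsize(fmt))):
            return bytes([(major << 5) | info]) + struct.pack(fmt, n)
    raise ValueError("argument too large for CBOR")


def encode(item):
    """Encode text strings, arrays, and tagged items; nothing else."""
    if isinstance(item, str):
        raw = item.encode("utf-8")
        return head(3, len(raw)) + raw          # major type 3: text string
    if isinstance(item, list):
        return head(4, len(item)) + b"".join(encode(x) for x in item)
    if isinstance(item, Tagged):
        return head(6, item.tag) + encode(item.value)  # major type 6: tag
    raise TypeError(f"unsupported type: {type(item).__name__}")


def plain_text(segments):
    """Drop the language tags and join the text parts."""
    return "".join(text for _lang, text in segments)


# The multi-language string from the example, as (language, text) pairs.
segments = [
    ("en", "She said '"),
    ("fr", "Bonjour"),
    ("en", "', and then '"),
    ("zh", "你好"),
    ("en", "'."),
]

# Each segment wrapped in tag 38, as in the second model above.
tagged_segments = [Tagged(38, [lang, text]) for lang, text in segments]
encoded = encode(tagged_segments)
```

A receiver that does not care about language could recover the plain string with plain_text(segments); one that does can process each 38-tagged pair on its own.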

Grüße, Carsten