Re: [Ltru] Fwd: [apps-discuss] Defining a CBOR tag for RFC 5646 Language Tags

John Cowan <cowan@mercury.ccil.org> Mon, 12 May 2014 05:37 UTC

Date: Mon, 12 May 2014 01:37:03 -0400
From: John Cowan <cowan@mercury.ccil.org>
To: Ira McDonald <blueroofmusic@gmail.com>
Message-ID: <20140512053703.GS17946@mercury.ccil.org>
References: <9C6F4F37-39C8-498C-8FDA-C894C3A7BF29@tzi.org> <CAN40gStpNZWn7r+JHt45xjGmk87kDw89mu6P=T-OE6auRV5O5Q@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAN40gStpNZWn7r+JHt45xjGmk87kDw89mu6P=T-OE6auRV5O5Q@mail.gmail.com>
User-Agent: Mutt/1.5.20 (2009-06-14)
Sender: John Cowan <cowan@ccil.org>
Archived-At: http://mailarchive.ietf.org/arch/msg/ltru/hf9A2f0A9w5nyBTzZNknk5x78QY
Cc: Peter Occil <poccil14@gmail.com>, Carsten Bormann <cabo@tzi.org>, LTRU Working Group <ltru@ietf.org>, apps-discuss@ietf.org
Subject: Re: [Ltru] Fwd: [apps-discuss] Defining a CBOR tag for RFC 5646 Language Tags

Carsten Bormann scripsit:

> The proposal is almost trivially obvious (pair a language tag with an
> UTF-8 string in a two-element array) and looks right to me.  But I'm
> not an expert in Language Tags, and silly mistakes are being made by
> non-experts all the time.

Looks good to me.  But I would recommend requiring the encoder to do the
case folding rather than leaving it to the decoder.  This is a form of
early uniform normalization, which is generally a Good Thing if you can
get it.
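
To illustrate encoder-side folding, here is a minimal sketch in Python.
The function names are hypothetical, and lowercase is only an assumed
target form; the proposal itself would have to say which case the encoder
must emit. The point is that the decoder can then compare tags byte for
byte without doing any folding of its own.

```python
def fold_language_tag(tag: str) -> str:
    """Fold a language tag to one case before it is encoded.

    Assumption: lowercase is the agreed form. Language tags are
    ASCII-only, so plain ASCII lowercasing is enough.
    """
    if not tag.isascii():
        raise ValueError("language tags must be ASCII")
    return tag.lower()


def make_tagged_string(tag: str, text: str) -> list:
    """Build the two-element array [language tag, string] from the proposal.

    The folding happens here, on the encoder side; the text itself
    is left untouched.
    """
    return [fold_language_tag(tag), text]


pair = make_tagged_string("EN-us", "Hello")
```

With this in place, "EN-us", "en-US" and "en-us" all leave the encoder as
the same string.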

The main mistake people make is trying to make the language tag fixed
length, which you have already avoided.
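
As a rough sketch of why variable length costs nothing here: a CBOR text
string carries its own length prefix, so "en" and "zh-hant-tw" both fit
without padding or truncation. The helper names below are hypothetical,
the code only handles strings under 256 bytes, and the enclosing CBOR tag
is left out because its number is not given in this thread.

```python
def encode_text(s: str) -> bytes:
    """Encode a CBOR text string (major type 3) with its length prefix."""
    data = s.encode("utf-8")
    n = len(data)
    if n < 24:
        head = bytes([0x60 | n])   # length fits in the initial byte
    elif n < 256:
        head = bytes([0x78, n])    # one-byte length follows
    else:
        raise ValueError("sketch only handles strings under 256 bytes")
    return head + data


def encode_pair(tag: str, text: str) -> bytes:
    """Encode [tag, text] as a two-element CBOR array (0x82)."""
    return bytes([0x82]) + encode_text(tag) + encode_text(text)


short = encode_pair("en", "Hi")
longer = encode_pair("zh-hant-tw", "Hi")
```

Each encoding is exactly as long as its tag requires; nothing in the
format forces a fixed width.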

-- 
John Cowan          http://www.ccil.org/~cowan        cowan@ccil.org
A: "Spiro conjectures Ex-Lax."
Q: "What does Pat Nixon frost her cakes with?"
  --"Jeopardy" for generative semanticists