Re: [Ltru] Fwd: [apps-discuss] Defining a CBOR tag for RFC 5646 Language Tags

"Peter Occil" <poccil14@gmail.com> Wed, 14 May 2014 03:33 UTC

Message-ID: <9BE5D3F7FAEE4CAB8FD3326ED8F1ED75@PeterPC>
From: "Peter Occil" <poccil14@gmail.com>
To: "Randy Presuhn" <randy_presuhn@mindspring.com>, "LTRU Working Group" <ltru@ietf.org>
References: <18971982.1399873468367.JavaMail.root@mswamui-cedar.atl.sa.earthlink.net>
In-Reply-To: <18971982.1399873468367.JavaMail.root@mswamui-cedar.atl.sa.earthlink.net>
Date: Tue, 13 May 2014 23:33:11 -0400
Archived-At: http://mailarchive.ietf.org/arch/msg/ltru/LIm4pu6zQsapcU03uf_GaCgpSIA
Cc: Carsten Bormann <cabo@tzi.org>, apps-discuss@ietf.org
Subject: Re: [Ltru] Fwd: [apps-discuss] Defining a CBOR tag for RFC 5646 Language Tags

I'm not aware of any use case where having multiple language tags for the 
same plain-text string is useful.  For instance, RDF supports only one 
language tag for each string.  And HTML5 doesn't support multiple languages 
in the Content-Language header field or META tag; instead, for multilingual 
documents, it relies on markup to set the language used for each section. 
But plain-text strings cannot carry HTML-like markup without some additional 
mechanism.

Moreover, having multiple language tags for plain text leads to the 
additional problem of determining which parts of the text each language tag 
applies to, which is not so easy in the case of your three-language example.
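
To make the ambiguity concrete, here is a minimal sketch in Python. It is 
purely illustrative and not part of any proposal in this thread: the 
`Segment` type, the helper names, and the choice of the tags "en", "fr" and 
"it" are all assumptions. It contrasts a flat list of tags, which says nothing 
about where each language applies, with explicit segments that each carry 
exactly one tag.

```python
from typing import List, Tuple

# Hypothetical representation: one (language tag, text) pair per segment.
Segment = Tuple[str, str]

# The three-language example, split by hand into single-language segments.
segments: List[Segment] = [
    ("en", "She said '"),
    ("fr", "Bonjour"),
    ("en", "', and then '"),
    ("it", "Ciao"),
    ("en", "'."),
]


def plain_text(segs: List[Segment]) -> str:
    """Join the segment texts back into the original plain-text string."""
    return "".join(text for _, text in segs)


def languages(segs: List[Segment]) -> List[str]:
    """Return the distinct language tags in order of first appearance."""
    seen: List[str] = []
    for lang, _ in segs:
        if lang not in seen:
            seen.append(lang)
    return seen


# A flat tag list loses the segment boundaries; the segment list keeps them.
print(languages(segments))
print(plain_text(segments))
```

Given only the output of `languages`, a reader cannot tell which words are 
French or Italian; that information lives solely in the segment boundaries.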

--Peter

-----Original Message----- 
From: Randy Presuhn
Sent: Monday, May 12, 2014 1:44 AM
To: Ira McDonald ; LTRU Working Group ; Carsten Bormann ; Peter Occil ; Ira 
McDonald
Subject: Re: [Ltru] Fwd: [apps-discuss] Defining a CBOR tag for RFC 5646 
Language Tags

Hi -

Is representation of multi-lingual strings a concern?

E.g.  "She said 'Bonjour', and then 'Ciao'."

Randy

>From: Ira McDonald <blueroofmusic@gmail.com>
>Sent: May 11, 2014 4:50 PM
>To: LTRU Working Group <ltru@ietf.org>, Carsten Bormann <cabo@tzi.org>, 
>Peter Occil <poccil14@gmail.com>, Ira McDonald <blueroofmusic@gmail.com>
>Subject: [Ltru] Fwd: [apps-discuss] Defining a CBOR tag for RFC 5646 
>Language Tags
>
>Hi,
>
><oops - retrying with the correct LTRU WG address this time, I hope...>
>
>Forwarding this note to the LTRU list for language tag experts to review.
>Please copy your reply to IETF Apps Discuss list (see below).
>
>Cheers,
>- Ira McDonald
>
>
>---------- Forwarded message ----------
>From: Carsten Bormann <cabo@tzi.org>
>Date: Sun, May 11, 2014 at 6:48 PM
>Subject: [apps-discuss] Defining a CBOR tag for RFC 5646 Language Tags
>To: IETF Apps Discuss <apps-discuss@ietf.org>
>Cc: Peter Occil <poccil14@gmail.com>
>
>
>If you care about language tags, I hope the subject got your
>attention.  If you don't, please ignore this request for assistance.
>
>Concise Binary Object Representation (CBOR, RFC 7049) is a binary
>format for structured objects.  CBOR has a number of pre-defined data
>types and allows additional data types to be registered as "tags".
>Among the pre-defined data types is a text string (Unicode characters,
>encoded in UTF-8).  No facility is pre-defined for associating a
>Language Tag with such a string.
>
>A new CBOR tag is being proposed for combining a CBOR text string with
>a Language Tag.  The (single-page) proposal is in:
>
>http://peteroupc.github.io/CBOR/langtags.html
>
>The proposal is almost trivially obvious (pair a language tag with a
>UTF-8 string in a two-element array) and looks right to me.  But I'm
>not an expert in Language Tags, and silly mistakes are being made by
>non-experts all the time.
>
>If you are an expert in Language Tags:
>-- Is anything missing or could anything be done in a better way?
>-- Or does this really simply look right?
>
>Responses to me privately or to the list are appreciated.
>
>Grüße, Carsten
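
The pairing described in the forwarded message (a language tag and a UTF-8 
string in a two-element array, wrapped in a CBOR tag) can be sketched by hand 
in Python without any CBOR library. This is a rough illustration under stated 
assumptions, not the proposal's reference encoding: the thread does not give 
a tag number, so 38 is assumed here (it is the number later listed for 
language-tagged strings in the IANA CBOR tags registry), and the helper names 
are hypothetical.

```python
import struct


def cbor_head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus its argument (RFC 7049 major types)."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 0x100:
        return bytes([(major << 5) | 24, value])
    if value < 0x10000:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 0x100000000:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)


def cbor_text(s: str) -> bytes:
    """Encode a text string (major type 3): UTF-8 byte length, then bytes."""
    data = s.encode("utf-8")
    return cbor_head(3, len(data)) + data


# Assumption: the thread names no tag number; 38 is used for illustration.
LANG_TAG_NUMBER = 38


def lang_tagged_string(lang: str, text: str) -> bytes:
    """Tag (major type 6) wrapping a two-element array (major type 4)."""
    return (
        cbor_head(6, LANG_TAG_NUMBER)
        + cbor_head(4, 2)
        + cbor_text(lang)
        + cbor_text(text)
    )


encoded = lang_tagged_string("fr", "Bonjour")
print(encoded.hex())  # d8268262667267426f6e6a6f7572
```

Read left to right, the output is `d8 26` (tag 38), `82` (array of two 
items), `62 66 72` (the text "fr"), and `67` followed by the seven UTF-8 bytes 
of "Bonjour". Note that the structure holds exactly one language tag per 
string, which is the design point debated above.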