Re: [Json] Encoding detection (Was: Re: JSON: remove gap between Ecma-404 and IETF draft)

"Joe Hildebrand (jhildebr)" <jhildebr@cisco.com> Thu, 14 November 2013 15:00 UTC

From: "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>
To: Pete Cordell <petejson@codalogic.com>, Paul Hoffman <paul.hoffman@vpnc.org>
Date: Thu, 14 Nov 2013 14:59:57 +0000
Message-ID: <CEAA3067.2D132%jhildebr@cisco.com>
In-Reply-To: <8413609C8A86497F856897AF2AA24960@codalogic>
Cc: "www-tag@w3.org" <www-tag@w3.org>, JSON WG <json@ietf.org>
Subject: Re: [Json] Encoding detection (Was: Re: JSON: remove gap between Ecma-404 and IETF draft)

On 11/14/13 5:04 AM, "Pete Cordell" <petejson@codalogic.com> wrote:

> In http://www.ietf.org/mail-archive/web/json/current/msg00565.html I
> mentioned that we also need to allow for characters such as U+2c00 to be
> the first character in a quoted string.

Ah, yes.  Sorry, I quoted from the wrong part of the conversation.  I
completely agree.

>That can be reduced a bit if we use "--" to indicate "not-tested":
>
>   00 00 -- --  UTF-32BE
>   00 xx -- --  UTF-16BE
>   xx 00 00 00  UTF-32LE
>   xx 00 00 xx  UTF-16LE
>   xx 00 xx --  UTF-16LE
>   xx xx -- --  UTF-8

+1 to this table.  It's clear, correct, and implementable.
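
To illustrate "implementable", here is a minimal Python sketch of the table,
checked row by row.  It is only a sketch under stated assumptions: the name
detect_encoding is hypothetical, the input is assumed to carry no BOM, and
the UTF-8 fallback for texts shorter than four octets is my own assumption
rather than something the table specifies.

```python
def detect_encoding(data: bytes) -> str:
    """Guess the encoding of a JSON text from its first four octets.

    Follows the table row by row; "--" columns are simply not inspected.
    Assumes no BOM is present. Inputs shorter than four octets fall back
    to UTF-8 (an assumption; the table does not cover that case).
    """
    if len(data) < 4:
        return "UTF-8"  # assumption: too short to apply the table

    b0, b1, b2, b3 = data[0], data[1], data[2], data[3]

    if b0 == 0:
        # 00 00 -- --  UTF-32BE
        # 00 xx -- --  UTF-16BE
        return "UTF-32BE" if b1 == 0 else "UTF-16BE"

    if b1 == 0:
        if b2 == 0:
            # xx 00 00 00  UTF-32LE
            # xx 00 00 xx  UTF-16LE (e.g. '"' followed by U+2C00)
            return "UTF-32LE" if b3 == 0 else "UTF-16LE"
        # xx 00 xx --  UTF-16LE
        return "UTF-16LE"

    # xx xx -- --  UTF-8
    return "UTF-8"
```

For example, a string whose first character is U+2c00 encodes in UTF-16LE
as 22 00 00 2C, which hits the "xx 00 00 xx" row rather than being mistaken
for UTF-32LE.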

-- 
Joe Hildebrand