Re: [Json] Proposed change: update the Unicode version

"Matt Miller (mamille2)" <mamille2@cisco.com> Wed, 05 June 2013 17:55 UTC

From: "Matt Miller (mamille2)" <mamille2@cisco.com>
To: John Cowan <cowan@mercury.ccil.org>
Cc: Carsten Bormann <cabo@tzi.org>, Tim Bray <tbray@textuality.com>, "json@ietf.org" <json@ietf.org>

On Jun 5, 2013, at 11:00 AM, John Cowan <cowan@mercury.ccil.org> wrote:

> Tim Bray scripsit:
> 
>> Hm? The first paragraph says “A Unicode string data type is simply an
>> ordered sequence of code units. Thus a Unicode 8-bit string is an ordered
>> sequence of 8-bit code units, a Unicode 16-bit string is an ordered
>> sequence of 16-bit code units, and a Unicode 32-bit string is an ordered
>> sequence of 32-bit code units.”
> 
> Yes, but is that what you actually want?  If so, you probably want the
> "sequence of 16-bit code units" language.  This matches JavaScript and
> at least some implementations of JSON.
> 
> However, I want a JSON string to be an ordered sequence of codepoints/
> characters, not of code units.


/me doffs hat

When a JSON string is actually inside the JavaScript interpreter, a "sequence of 16-bit code units" is most likely correct.  Over the wire or on the filesystem, that isn't necessarily true; there, it is often a "sequence of 8-bit code units".

I think here we need to concentrate on what's over the wire/on the filesystem, not what's in the interpreter/processor/etc.
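
To make the distinction concrete, here is a minimal Python sketch (an illustration, not something from the thread). It assumes the 8-bit form is UTF-8 and the 16-bit form is UTF-16; the helper name count_units and the sample string are made up for the example. It counts the same text three ways: as code points, as 16-bit code units, and as 8-bit code units.

```python
def count_units(text: str) -> dict:
    """Count one string as code points, UTF-16 code units, and UTF-8 code units."""
    return {
        # Python's str is a sequence of code points, so len() counts those.
        "code_points": len(text),
        # UTF-16 code units are 2 bytes each; "-le" avoids a byte-order mark.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # UTF-8 code units are single bytes.
        "utf8_code_units": len(text.encode("utf-8")),
    }


# 'a' (U+0061), 'e' with acute accent (U+00E9), and a character outside
# the Basic Multilingual Plane (U+1D11E, MUSICAL SYMBOL G CLEF).
sample = "a\u00e9\U0001D11E"
counts = count_units(sample)
print(counts)
```

The three counts differ for this sample because U+00E9 takes two 8-bit code units and U+1D11E takes two 16-bit code units (a surrogate pair) or four 8-bit code units, which is why the choice of wording in the spec matters.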


- m&m

Matt Miller < mamille2@cisco.com >
Cisco Systems, Inc.