Re: [Json] Unpaired surrogates in JSON strings

Douglas Crockford <douglas@crockford.com> Thu, 06 June 2013 11:03 UTC

Message-ID: <51B06C80.50708@crockford.com>
Date: Thu, 06 Jun 2013 04:03:28 -0700
From: Douglas Crockford <douglas@crockford.com>
User-Agent: Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20130509 Thunderbird/17.0.6
MIME-Version: 1.0
To: Tim Bray <tbray@textuality.com>
References: <20130605162246.GG3680@mercury.ccil.org> <51AF7988.6040009@crockford.com> <20130605184702.GB6999@mercury.ccil.org> <51AF8A09.50806@crockford.com> <AE081E5F-82AB-416F-A690-E8373C0369B0@vpnc.org> <CAHBU6is9NBuicPm=mNSTLRUvXjrAt8BA5KH=A4pSeCNJy=vTNQ@mail.gmail.com> <20130606010945.GA1362@mercury.ccil.org> <CAHBU6isarPqHv0Xteg1c1xKNbZ7N8TE-9qh7N2uwEHU3uQubNA@mail.gmail.com>
In-Reply-To: <CAHBU6isarPqHv0Xteg1c1xKNbZ7N8TE-9qh7N2uwEHU3uQubNA@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Cc: John Cowan <cowan@mercury.ccil.org>, Paul Hoffman <paul.hoffman@vpnc.org>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Unpaired surrogates in JSON strings
List-Id: "JavaScript Object Notation \(JSON\) WG mailing list" <json.ietf.org>
List-Archive: <http://www.ietf.org/mail-archive/web/json>

On 6/5/2013 9:48 PM, Tim Bray wrote:
> On Wed, Jun 5, 2013 at 6:09 PM, John Cowan <cowan@mercury.ccil.org> wrote:
>
>     > I think anyone who’s delivering those codepoints is already in
>     > violation of 4627, and I don’t think we should retroactively forgive
>     > those sins.
>
>     It's already been stated that ECMA can't swallow this change.
>
>
> I thought ECMA’s indigestion was over dupe keys.
>
> It seems to me that if unpaired surrogates are to be allowed, it’s not 
> OK for the spec to assert that strings are made of Unicode characters, 
> because both of these things can’t be correct.  -T

I agree. The critical thing that Unicode contributes to JSON is the set
of numbers used in \u forms. It is unfortunate that ECMAScript and
Unicode use the word "character" so differently. So I think JSON should
talk exclusively about Unicode code points, because that provides the
least ambiguity, and not use the word "character" at all.
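
A minimal sketch of the distinction, assuming Python 3 and its standard
json module (the helper name is_utf8_encodable is hypothetical, chosen
only for illustration). A \u form naming an unpaired surrogate decodes
to a string holding that single code point, even though it is not a
character that can be encoded as UTF-8; a properly paired escape decodes
to one supplementary code point.

```python
import json

# A JSON text whose string holds one \u form for an unpaired high surrogate.
lone = json.loads('"\\ud800"')

# A JSON text whose string holds a properly paired surrogate escape.
paired = json.loads('"\\ud83d\\ude00"')


def is_utf8_encodable(s: str) -> bool:
    """Return True if every code point in s can be encoded as UTF-8."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        # Raised for surrogate code points (U+D800..U+DFFF).
        return False


# The lone surrogate survives parsing as the code point U+D800.
print(hex(ord(lone)), is_utf8_encodable(lone))
# The pair is combined into the single code point U+1F600.
print(hex(ord(paired)), is_utf8_encodable(paired))
```

Other parsers may behave differently on the first input (rejecting it
or substituting U+FFFD), which is the interoperability concern in this
thread.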