Re: [Json] Unpaired surrogates in JSON strings

"Joe Hildebrand (jhildebr)" <jhildebr@cisco.com> Wed, 05 June 2013 23:36 UTC

From: "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>
To: Douglas Crockford <douglas@crockford.com>, John Cowan <cowan@mercury.ccil.org>
Date: Wed, 05 Jun 2013 23:35:17 +0000
Message-ID: <A723FC6ECC552A4D8C8249D9E07425A70FC2C12D@xmb-rcd-x10.cisco.com>
In-Reply-To: <51AF8A09.50806@crockford.com>
Cc: "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Unpaired surrogates in JSON strings

I don't mind allowing surrogate pairs in \u notation.  But I think we
should specify what happens when a generator sends just half of a pair.
For example, by adding this after section 2.5, paragraph 4:

Escape sequences between \uD800 and \uDFFF SHOULD be generated only as
valid UTF-16 surrogate pairs (this is a SHOULD rather than a MUST only to
allow backward compatibility).  When encountering an invalid surrogate
pair (such as "foo\uD834bar" or "\uDD1E\uD834"), parsers MAY either throw
an error (taking the risk of some backward incompatibility with old
generators) or ignore the sequence.
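
To make the two parser options concrete, here is a rough Python sketch of
how a parser might detect unpaired surrogate escapes and apply either
policy.  The function names and the "error"/"ignore" policy strings are
hypothetical, and reading "ignore the sequence" as "drop the escape" is
an assumption, not something the proposed text pins down.

```python
import re

# Matches one JSON escape: a four-hex-digit unicode escape, or a backslash
# followed by any single character (so an escaped backslash is consumed whole).
_ESCAPE = re.compile(r'\\(?:u([0-9A-Fa-f]{4})|.)', re.DOTALL)

def find_unpaired_surrogates(raw):
    """Return offsets of surrogate escapes in `raw` that are not part of a
    high-then-low pair. `raw` is the still-escaped text between the quotes."""
    unpaired = []
    pending_high = None  # (start, end) of a high surrogate awaiting its low half
    for m in _ESCAPE.finditer(raw):
        code = int(m.group(1), 16) if m.group(1) else None
        is_high = code is not None and 0xD800 <= code <= 0xDBFF
        is_low = code is not None and 0xDC00 <= code <= 0xDFFF
        adjacent = pending_high is not None and pending_high[1] == m.start()
        if is_low and adjacent:
            pending_high = None  # valid pair, nothing to report
            continue
        if pending_high is not None:
            unpaired.append(pending_high[0])  # high half never got its low half
            pending_high = None
        if is_high:
            pending_high = (m.start(), m.end())
        elif is_low:
            unpaired.append(m.start())  # low half with no preceding high half
    if pending_high is not None:
        unpaired.append(pending_high[0])
    return unpaired

def apply_policy(raw, policy="error"):
    """Apply one of the two MAY behaviours (hypothetical policy names)."""
    offsets = find_unpaired_surrogates(raw)
    if not offsets:
        return raw
    if policy == "error":
        raise ValueError(f"unpaired surrogate escape at offset {offsets[0]}")
    # Assumed meaning of "ignore": drop each 6-character unpaired escape.
    out, last = [], 0
    for off in offsets:
        out.append(raw[last:off])
        last = off + 6
    out.append(raw[last:])
    return "".join(out)
```

With this sketch, both examples from the proposed text are flagged, while
a well-formed pair such as \uD834\uDD1E passes through unchanged.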


On 6/5/13 12:57 PM, "Douglas Crockford" <douglas@crockford.com> wrote:

>On 6/5/2013 11:47 AM, John Cowan wrote:
>> Douglas Crockford scripsit:
>>
>>> Such a requirement will be breaking. Breaking changes are out of scope.
>> How is it a breaking change to limit what documents are allowed to be
>> *generated*?
>>
>Because JSON is currently being used in applications that deliver those
>codepoints.
>I believe that ECMA cannot accept if this is changed.


-- 
Joe Hildebrand