Re: [Json] Unpaired surrogates in JSON strings

Nico Williams <nico@cryptonector.com> Fri, 07 June 2013 18:09 UTC

On Fri, Jun 7, 2013 at 5:42 AM, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:
> Actually there are many good reasons for having unpaired surrogates in
> JSON documents. A simple example would be a test suite for string APIs.

Or what the heck, if ECMAScript allows any 16-bit value, 0x0000..0xFFFF,
to be used (escaped as \uXXXX if necessary), then one very useful
application of that is encoding binary data: when parsing, you know you
have binary data when you see any 16-bit code units that don't make any
sense in Unicode text.  Not that I'm advocating this... but if we did
allow this then it wouldn't preclude us from saying that a string of
text must not include unpaired surrogates.
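
To make the "you know when you see it" part concrete, here is a minimal
sketch in Python, not a proposal. It assumes a parser that passes lone
\uXXXX surrogate escapes through unchanged, which Python's json module
does; the helper name has_unpaired_surrogate is made up for illustration.

```python
import json


def has_unpaired_surrogate(s: str) -> bool:
    """Return True if s contains a surrogate code point (U+D800..U+DFFF).

    json.loads combines a well-formed \\uD83D\\uDE00 style pair into one
    code point above U+FFFF, so any surrogate still present after parsing
    was unpaired in the JSON text.
    """
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


# A lone high surrogate survives parsing and is flagged as "not text".
lone = json.loads('"\\ud800"')
# A proper surrogate pair decodes to a single supplementary character.
paired = json.loads('"\\ud83d\\ude00"')

print(has_unpaired_surrogate(lone))
print(has_unpaired_surrogate(paired))
```

A consumer could use a check like this to reject such strings where text
is required, while still letting them through where arbitrary 16-bit
values are acceptable.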

Nico