Re: [Json] Unpaired surrogates in JSON strings

Nico Williams <nico@cryptonector.com> Fri, 07 June 2013 17:52 UTC

In-Reply-To: <A723FC6ECC552A4D8C8249D9E07425A70FC2E753@xmb-rcd-x10.cisco.com>
References: <20130606042921.GC1362@mercury.ccil.org> <A723FC6ECC552A4D8C8249D9E07425A70FC2E753@xmb-rcd-x10.cisco.com>
Date: Fri, 07 Jun 2013 12:52:04 -0500
Message-ID: <CAK3OfOjPfk7tY5bh1pHbksK_yRtw8MN_iMtKKh635vXdqus-MQ@mail.gmail.com>
From: Nico Williams <nico@cryptonector.com>
To: "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>
Content-Type: text/plain; charset="UTF-8"
Cc: Carsten Bormann <cabo@tzi.org>, John Cowan <cowan@mercury.ccil.org>, Paul Hoffman <paul.hoffman@vpnc.org>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Unpaired surrogates in JSON strings

On Thu, Jun 6, 2013 at 1:59 AM, Joe Hildebrand (jhildebr)
<jhildebr@cisco.com> wrote:
> On 6/5/13 10:29 PM, "John Cowan" <cowan@mercury.ccil.org> wrote:
>
>>Carsten Bormann scripsit:
>>
>>> Code points can refer to those of the characters or those of the code
>>> units (byte for UTF-8, etc.).
>>
>>Code points are (mathematical) integers corresponding to Unicode
>>characters, though not all of them are assigned to characters.
>
> The intro to the Unicode standard makes this pretty clear:
>
> http://www.unicode.org/versions/Unicode6.2.0/ch01.pdf
>
>
> This is why I wanted to decouple from a particular version of Unicode.  If
> the reference remained at version 4, for example, the word "character"
> means that any code point not in that version of Unicode is not
> technically legal JSON (although we know it will interop just fine in
> practice, which is why it's pretty safe to do the update).

We already know to allow the use of unassigned code points.  We can't
avoid having a normative reference to a Unicode version, if that was
your goal.
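
As a rough illustration of the point about unassigned code points (a
sketch, not something from the thread): the snippet below assumes
Python's standard json and unicodedata modules, and assumes that
U+50000 is unassigned in the Unicode data shipped with the interpreter.
The helper names is_unassigned and roundtrip are hypothetical.

```python
import json
import unicodedata


def is_unassigned(cp: int) -> bool:
    """Return True if this interpreter's Unicode data lists cp as category Cn."""
    return unicodedata.category(chr(cp)) == "Cn"


def roundtrip(cp: int) -> str:
    """Serialize a one-code-point string to JSON text and parse it back."""
    encoded = json.dumps(chr(cp))  # ASCII-only output, e.g. "\ud900\udc00"
    return json.loads(encoded)


# Assumption: U+50000 has no character assigned to it.
cp = 0x50000
print(is_unassigned(cp))
print(roundtrip(cp) == chr(cp))
```

The parser never consults the table of assigned characters here; it only
deals in code points, which is why a string like this interoperates
whichever Unicode version a specification happens to cite.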

Nico