Re: [Json] On characters and code points
Tim Bray <tbray@textuality.com> Fri, 07 June 2013 16:02 UTC
From: Tim Bray <tbray@textuality.com>
To: Paul Hoffman <paul.hoffman@vpnc.org>
Cc: "json@ietf.org" <json@ietf.org>

On Fri, Jun 7, 2013 at 8:56 AM, Paul Hoffman <paul.hoffman@vpnc.org> wrote:

> This may be a part of the spec where some people have to hold their noses.
> The Unicode definition of "character" does not include non-characters, and
> the code points for some of those non-characters make sense in JSON strings
> when those strings. Bjoern has pointed out a good one: strings used for
> test cases of other code. The issue is not just unpaired surrogates. Do we
> *really* want to prohibit:
>    { "End of data marker": "\uFFFF" }

Yes, I *really* want to prohibit that. The one corner case it buys you is
outweighed, by a factor of a thousand or so, by the cost of no longer being
able to use general-purpose string processing software to deal with JSON
payloads.

BTW, a huge amount of deployed software out there ALREADY processes JSON
text fields using general-purpose string processing libraries, and will
explode unpredictably and in hard-to-debug ways if this starts happening.

Also, consider the lovely consequences when unpaired surrogates start
showing up in key fields and are fed to hash functions in every programming
language in the world, which expect to receive Unicode characters.

 -T

> Proposal:
>
> Remove the word "character" from the spec except in an explanatory
> paragraph in Section 2.5 that says:
>    All code points, even those that represent non-characters in the
>    Unicode specification [UNICODE], are allowed in JSON strings.
>
> --Paul Hoffman
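
For readers who want to see the failure mode concretely, here is a minimal sketch in Python. It is not from the thread, and the helper names (problem_code_points, can_encode_utf8) are made up for illustration. It assumes CPython's standard json module, which accepts an escaped unpaired surrogate and hands back a string that cannot be encoded as UTF-8, while the noncharacter U+FFFF passes through the same encoder without complaint.

```python
import json


def problem_code_points(text):
    """Return the code points in text that are surrogates or noncharacters.

    Hypothetical helper, for illustration only. A surrogate that survives
    in a Python str is necessarily unpaired, because the json module
    combines well-formed escaped pairs into a single code point.
    """
    found = []
    for ch in text:
        cp = ord(ch)
        is_surrogate = 0xD800 <= cp <= 0xDFFF
        # Noncharacters: U+FDD0..U+FDEF plus the last two code points
        # of every plane (U+FFFE, U+FFFF, U+1FFFE, ..., U+10FFFF).
        is_nonchar = 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE
        if is_surrogate or is_nonchar:
            found.append(cp)
    return found


def can_encode_utf8(text):
    """Report whether text survives a strict UTF-8 encode."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# The example from the quoted message, plus an unpaired surrogate.
doc = json.loads('{"End of data marker": "\\uFFFF", "key": "\\uD800"}')

marker = doc["End of data marker"]
lone = doc["key"]

print([hex(cp) for cp in problem_code_points(marker)])
print([hex(cp) for cp in problem_code_points(lone)])
print(can_encode_utf8(marker), can_encode_utf8(lone))
```

A receiver that wants to reject such strings up front could run a check like problem_code_points over every key and value before passing them to other libraries. Whether that is the right policy is exactly what this thread is arguing about.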