Re: [Json] Unpaired surrogates in JSON strings
Tim Bray <tbray@textuality.com> Thu, 06 June 2013 14:57 UTC
In-Reply-To: <51B06F38.8050707@crockford.com>
References: <A723FC6ECC552A4D8C8249D9E07425A70FC2E7E1@xmb-rcd-x10.cisco.com> <51B06F38.8050707@crockford.com>
Date: Thu, 06 Jun 2013 07:57:47 -0700
Message-ID: <CAHBU6iuFBuW-RfgBLQF5q4BnUOzs088QXW3uOQG1OjBFjZttkw@mail.gmail.com>
From: Tim Bray <tbray@textuality.com>
To: Douglas Crockford <douglas@crockford.com>
Cc: Paul Hoffman <paul.hoffman@vpnc.org>, "Joe Hildebrand (jhildebr)" <jhildebr@cisco.com>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] Unpaired surrogates in JSON strings

F0, 90, 8D, 86

On Thu, Jun 6, 2013 at 4:15 AM, Douglas Crockford <douglas@crockford.com> wrote:

> What then is the standard name for a 16-bit element of text? When
> JavaScript was created, that word was character. What is the word now?

The only somewhat-standardized term would be “UTF-16 codepoint”. But that’s not really a “unit of text” any more than the 2nd byte of a character encoded in 3 bytes with UTF-8 is.

I’m fairly shocked. I have always believed that JSON encodes what its introduction (and section 2.5 "Strings") say it encodes, Unicode characters.

If it is a requirement to accommodate the class of bug where languages that use UTF-16 (Java, JavaScript, C#) can emit unpaired UTF-16 surrogates, the spec needs to be clear that the INTENT is actually to support Unicode characters, and that unpaired surrogates are always evidence of a bug, and there can be no expectation that any software receiving such buggy data will be able to do anything useful with it, or even avoid crashing in a hard-to-debug way down in the bowels of a library routine.

-T
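
For readers who want to see the failure mode concretely, here is a minimal sketch in Python (not from the original message, and not any particular library's recommended handling). It assumes CPython's standard `json` module, which accepts a lone `\ud800` escape and hands back a string that cannot be encoded as UTF-8. The helper name `contains_surrogate` is hypothetical. The paired case uses the escapes `\ud800\udf46`, which decode to U+10346, whose UTF-8 form is the bytes F0, 90, 8D, 86 quoted at the top of the message.

```python
import json


def contains_surrogate(text: str) -> bool:
    """Hypothetical helper: True if any code point lies in U+D800..U+DFFF."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in text)


# A correctly paired escape sequence decodes to one Unicode character.
paired = json.loads('"\\ud800\\udf46"')
paired_utf8 = paired.encode("utf-8")

# A lone high surrogate is accepted by this parser, but the result is
# not a valid sequence of Unicode characters.
lone = json.loads('"\\ud800"')

try:
    lone.encode("utf-8")
    lone_encodes = True
except UnicodeEncodeError:
    # The strict UTF-8 codec refuses to encode surrogate code points.
    lone_encodes = False

print(hex(ord(paired)), paired_utf8.hex(" "))
print(contains_surrogate(paired), contains_surrogate(lone), lone_encodes)
```

A receiver that wants to reject such input early could run a check like `contains_surrogate` on every decoded string rather than waiting for an encoder deeper in the stack to fail.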