Re: [Json] Proposed minimal change for strings

Bjoern Hoehrmann <derhoermi@gmx.net> Sat, 06 July 2013 14:47 UTC

From: Bjoern Hoehrmann <derhoermi@gmx.net>
To: "Manger, James H" <James.H.Manger@team.telstra.com>
Date: Sat, 06 Jul 2013 16:46:55 +0200
Cc: Paul Hoffman <paul.hoffman@vpnc.org>, "json@ietf.org WG" <json@ietf.org>
Subject: Re: [Json] Proposed minimal change for strings

* Manger, James H wrote:
>It is only "unnecessary dataloss" when you are certain a value will only 
>be handled by ECMAScript -- in which case relying on ECMAScript's 
>(reasonable) extension beyond JSON is ok.

This is not an ECMAScript-specific problem. Environments that can encode
unpaired surrogates in the native string type are the norm; Perl can

  % perl -MJSON -e "print JSON->new->ascii->encode([chr(0xd800)])"
  ["\ud800"]

and as far as I am aware so can Java, C#, C++, Python, and many others,
and for many of them it is easy to create lone surrogates by accident.
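
For illustration, here is roughly the same experiment in Python 3, using
only the standard json module. This is a sketch of what CPython's
default settings appear to do, not a statement about every JSON library;
the variable names are made up for the example.

```python
import json

# A lone (unpaired) high surrogate held in the native string type.
lone = chr(0xD800)

# With the default ensure_ascii=True the encoder emits a \u escape,
# much like the Perl one-liner above.
encoded = json.dumps([lone])
print(encoded)

# The decoder accepts the unpaired escape and hands the surrogate back.
decoded = json.loads(encoded)
roundtrips = decoded[0] == lone

# Trouble only starts once the string has to leave the process as UTF-8.
try:
    lone.encode("utf-8")
    utf8_ok = True
except UnicodeEncodeError:
    utf8_ok = False
```

So the value survives a JSON roundtrip inside the program and fails
later, at the point where it is written out as UTF-8.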

>I think your previous paragraph provides the argument. If different 
>decoders can legitimately roundtrip, substitute, or fail on parsing an 
>unpaired surrogate escape then senders need to be warned (in big 
>flashing lights) of this non-interoperable behaviour. The best way to do 
>that is to exclude unpaired surrogate escapes from the specification of 
>JSON.

But receivers need to be warned that they will likely have to deal with
JSON documents with unpaired surrogate escapes in strings, and the best
way to do that is to include them in the specification...
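
As a rough sketch of what "dealing with" them could mean for a receiver,
the three behaviours mentioned in the quoted text (roundtrip, substitute,
fail) can be written as explicit policies. The function and policy names
below are hypothetical and exist only for this example; it assumes
Python 3, where the json module joins well-formed surrogate pairs, so any
surrogate code point left in a decoded string is unpaired.

```python
import json


def is_surrogate(ch):
    # After decoding, a code point in this range is an unpaired surrogate.
    return 0xD800 <= ord(ch) <= 0xDFFF


def decode_json_string(text, policy="roundtrip"):
    """Decode a JSON string value under a hypothetical surrogate policy."""
    value = json.loads(text)
    if not isinstance(value, str):
        raise TypeError("expected a JSON string")
    if policy == "roundtrip":
        # Keep lone surrogates exactly as received.
        return value
    if policy == "substitute":
        # Replace each lone surrogate with U+FFFD REPLACEMENT CHARACTER.
        return "".join("\ufffd" if is_surrogate(c) else c for c in value)
    if policy == "fail":
        # Reject the document outright.
        if any(is_surrogate(c) for c in value):
            raise ValueError("unpaired surrogate escape in string")
        return value
    raise ValueError("unknown policy: " + policy)
```

Whichever policy a receiver picks, it has to pick one, and it can only
do that if the specification tells it that such input exists.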
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/