Re: [Json] On characters and code points

Stefan Drees <stefan@drees.name> Fri, 07 June 2013 16:18 UTC

Message-ID: <51B207E2.1060403@drees.name>
Date: Fri, 07 Jun 2013 18:18:42 +0200
From: Stefan Drees <stefan@drees.name>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:17.0) Gecko/20130509 Thunderbird/17.0.6
MIME-Version: 1.0
To: stefan@drees.name
References: <A723FC6ECC552A4D8C8249D9E07425A70FC2E7E1@xmb-rcd-x10.cisco.com> <51B06F38.8050707@crockford.com> <CAHBU6iuFBuW-RfgBLQF5q4BnUOzs088QXW3uOQG1OjBFjZttkw@mail.gmail.com> <51B1B4E7.8090101@it.aoyama.ac.jp> <9ld3r8pc0tufif18dohb2fmi0ijna1vs4n@hive.bjoern.hoehrmann.de> <56A163E9-E7CD-46B3-9984-8F009EBFF500@vpnc.org> <CAHBU6ivG=ONc8roT7W=LdpKYNMqRH_d5BobZ=pHnk=mVaKZKaA@mail.gmail.com> <51B20731.3040300@drees.name>
In-Reply-To: <51B20731.3040300@drees.name>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: Tim Bray <tbray@textuality.com>, Paul Hoffman <paul.hoffman@vpnc.org>, "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] On characters and code points
X-BeenThere: json@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
Reply-To: stefan@drees.name
List-Id: "JavaScript Object Notation \(JSON\) WG mailing list" <json.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/json>, <mailto:json-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/json>
List-Post: <mailto:json@ietf.org>
List-Help: <mailto:json-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/json>, <mailto:json-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 07 Jun 2013 16:18:53 -0000

Sorry, I missed one slash in my text, which may confuse the reader.
See the correction below.
On 07.06.13 18:15, Stefan Drees wrote:
> On 2013-06-07 18:01, Tim Bray wrote:
>> On Fri, Jun 7, 2013 at 8:56 AM, Paul Hoffman ... wrote:
>>
>>     This may be a part of the spec where some people have to hold their
>>     noses. The Unicode definition of "character" does not include
>>     non-characters, and the code points for some of those non-characters
>>     make sense in JSON strings when those strings. Bjoern has pointed
>>     out a good one: strings used for test cases of other code. The issue
>>     is not just unpaired surrogates. Do we *really* want to prohibit:
>>         { "End of data marker": "\uFFFF" }
>>
>>
>> Yes, I *really* want to prohibit that. The one corner case it buys you
>> is outweighed by a factor of a thousand or so in not being able to use
>> general-purpose string processing software to deal with JSON payloads.
>> BTW, a huge amount of deployed software out there ALREADY processes JSON
>> text fields using general-purpose string processing libraries, and will
>> explode unpredictably and in hard-to-debug ways if this starts happening.
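
To make the disputed example concrete, here is a minimal sketch, assuming Python 3 and its standard json module (neither is mentioned in this thread, and other parsers may behave differently). The helper is_noncharacter is a hypothetical name introduced for illustration only; it encodes the Unicode noncharacter ranges.

```python
import json

# The document from the quoted example; "\\uFFFF" keeps the JSON escape intact.
doc = '{ "End of data marker": "\\uFFFF" }'
value = json.loads(doc)["End of data marker"]


def is_noncharacter(cp: int) -> bool:
    """Hypothetical helper: True for the 66 Unicode noncharacters.

    These are U+FDD0..U+FDEF plus the last two code points of each plane
    (U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ..., U+10FFFE, U+10FFFF).
    """
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


# This parser accepts the escape and yields a one-code-point string.
print(hex(ord(value)))              # 0xffff
print(is_noncharacter(ord(value)))  # True
# Unlike a lone surrogate, U+FFFF still has a well-formed UTF-8 encoding.
print(value.encode("utf-8"))        # b'\xef\xbf\xbf'
```

Whether downstream string-processing code copes with such a value is exactly the open question; the sketch only shows that this one parser lets it through.
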
>
> And what about { "Decorate my slash": "\/" } and "general-purpose string
> processing software"? Isn't this also a case where you need a
> "pre-conditioner" that replaces the JSON specific escape sequence "\"

The last line above should of course read:

"pre-conditioner" that replaces the JSON specific escape sequence "\/"

> with "/" before feeding it into "general-purpose string processing
> software" :-?)
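
A small sketch of the "\/" case, assuming Python 3 and its standard json module (an assumption for illustration, not something named in this thread). It suggests that the JSON parser itself plays the role of the "pre-conditioner": the escape is resolved during parsing, before any general-purpose string code sees the value.

```python
import json

# Raw string so that the backslash reaches the JSON parser unchanged.
doc = r'{ "Decorate my slash": "\/" }'
decoded = json.loads(doc)["Decorate my slash"]

# The escape sequence "\/" has been replaced by a plain solidus.
print(decoded)    # /

# Serializing again does not reintroduce the escape in this implementation.
reencoded = json.dumps({"Decorate my slash": decoded})
print(reencoded)  # {"Decorate my slash": "/"}
```

The trouble only arises when JSON text is handed to string-processing software without being parsed as JSON first.
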
>
>
>> Also, consider the lovely consequences when unpaired surrogates start
>> showing up in key fields and are fed to hash functions in every
>> programming language in the world, which expect to receive Unicode
>> characters.
>>   -T
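
The surrogate concern can be sketched as follows, again assuming Python 3 and its standard json module (my assumption; behavior in other languages will differ). The helper has_unpaired_surrogate is a hypothetical name used only for this illustration.

```python
import json

# An object whose key is a lone high surrogate, written as a JSON escape.
doc = '{ "\\uD800": 1 }'
key = next(iter(json.loads(doc)))

# This parser accepts the escape and keeps the surrogate as one code point.
print(hex(ord(key)))  # 0xd800

# Any step that needs well-formed UTF-8 (for example a byte-oriented hash)
# fails, because a lone surrogate has no valid UTF-8 encoding.
try:
    key.encode("utf-8")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)      # False


def has_unpaired_surrogate(s: str) -> bool:
    """Hypothetical helper for strings returned by json.loads.

    The parser joins valid surrogate pairs into a single code point,
    so any surrogate code point still present must be unpaired.
    """
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


print(has_unpaired_surrogate(key))  # True
```

Python's built-in hash() happens to accept such a string, so here the failure shows up only at the encoding boundary, which is arguably the harder place to debug.
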
>>
>
> For today I'd better not imagine all these languages and implementations
> blindly stuffing some json text transformed into their own memory
> structures ... maybe later during the weekend
>
>
>>     ...
>
> Stefan.
>
>
>
> _______________________________________________
> json mailing list
> json@ietf.org
> https://www.ietf.org/mailman/listinfo/json