Re: [Json] JSON for Internet messages

John Cowan <cowan@mercury.ccil.org> Wed, 03 July 2013 17:15 UTC

Tim Bray scripsit:

> So I care a lot about JSON, but I don’t care in the slightest about usages
> where the JSON isn’t being used for application-level message protocol
> payloads (are there such usages?  I'm curious).

I suppose it depends on what you mean by "application-level".  There are
several NoSQL databases that speak JSON, and I wouldn't call them
applications per se.

> Also, I’ve never encountered a scenario where the messages were of
> sufficient size that anyone gave a rat’s ass about streaming.

Well, at least eight programmers have taken the trouble to write
such parsers, and I don't think we can assume that they are
all works of supererogation.  Rob Gonzalez says that he wrote
his PHP streaming parser because he needed to be able to react
to JSON input on the fly.  I quote the first few paragraphs of
<http://www.salsify.com/blog/json-streaming-parser-for-php>:

    Recently I looked around for a JSON streaming parser written in
    PHP and couldn’t find one. So I wrote my own and am making it
    available to anyone who wants to use it.

    For those unfamiliar with streaming parsers I’ll give a brief
    intro. In short, they’re useful if you have a very large
    document and don’t want to have to load the whole thing in
    memory at once.

    Most (non-streaming) parsers—including, for example, the stock
    JSON parsers that come with PHP—will load an entire document in
    memory and then give you access to the whole thing at once. The
    advantage to this whole-document-at-once approach is that you
    get random access to every object in the document. Also the
    programmatic interface they provide can be really convenient. In
    PHP’s standard library you can get a handle on a native PHP
    array for the entire JSON document, which is really easy to
    work with.

    The downside is, as mentioned, the whole document must be
    loaded into memory. For most web servers or web services this
    is a serious constraint. Furthermore, this means that your
    application can’t start actually doing anything useful with
    the data until the whole thing is loaded in memory. A streaming
    parser, in contrast, gives your application access to data almost
    as soon as the data is read by PHP so can be much faster.

    As I was writing a plugin for Magento, I needed a JSON streaming
    parser written in PHP. I looked at StackOverflow, PHP.net, and
    finally Google. It’s pretty crazy to me that no one seems to
    have had this need before given the popularity of both JSON and
    PHP, but there you go!
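
To make the contrast concrete, here is a minimal sketch in Python (not
Gonzalez's PHP code, and not taken from his parser's design) of reacting
to input on the fly. The function name iter_array_items is hypothetical.
It assumes the document is a single top-level array arriving in text
chunks, and it hands each element to the caller as soon as that element
is complete instead of waiting for the closing bracket.

```python
import json
from typing import Any, Iterable, Iterator

JSON_WS = " \t\r\n"  # the four whitespace characters JSON allows


def iter_array_items(chunks: Iterable[str]) -> Iterator[Any]:
    """Yield each element of a top-level JSON array as soon as it is complete.

    Sketch only: it is lenient about stray commas, re-parses an unfinished
    element whenever a new chunk arrives, and reports malformed input only
    once the chunks run out.
    """
    decoder = json.JSONDecoder()
    buf = ""
    started = False
    for chunk in chunks:
        buf += chunk
        while True:
            buf = buf.lstrip(JSON_WS)
            if not started:
                if not buf:
                    break  # nothing but whitespace so far
                if buf[0] != "[":
                    raise ValueError("expected a top-level array")
                started, buf = True, buf[1:]
                continue
            if buf[:1] == ",":  # separator between elements
                buf = buf[1:]
                continue
            if buf[:1] == "]":  # end of the array
                return
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break  # element is incomplete; wait for more input
            rest = buf[end:].lstrip(JSON_WS)
            if rest[:1] not in (",", "]"):
                break  # e.g. a number split across two chunks
            yield value
            buf = rest
    raise ValueError("input ended before the array was closed")


if __name__ == "__main__":
    for item in iter_array_items(['[1, {"a": ', '"x"}, 12', '3, "hi"]']):
        print(item)
```

The caller sees the first element before the last chunk has arrived,
which is the property the quoted post is after. A production streaming
parser would more likely emit events per token than re-scan a buffer.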

> 
> - All JSON messages MUST be encoded in valid UTF-8.
> - All numbers MUST be of precision < 2**53 (use strings for your big crypto
> numbers)

Better: All numbers MUST be representable as IEEE 754-2008 binary
64-bit floats.
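
The two formulations are not the same test. A rough Python sketch of the
difference, for integers only, follows. The helper name fits_binary64 is
hypothetical, and it reads "representable" as "converts to a binary64
float and back without loss", which is an assumption on my part; decimal
fractions such as 0.1 raise a separate question that is not handled here.

```python
def fits_binary64(n: int) -> bool:
    """Return True if the integer n survives a round trip through a 64-bit float."""
    try:
        return int(float(n)) == n
    except OverflowError:
        # Magnitude exceeds the largest finite binary64 value.
        return False


if __name__ == "__main__":
    for n in (2**53 - 1, 2**53 + 1, 2**60, 10**400):
        print(n.bit_length(), fits_binary64(n))
```

Under this reading 2**60 passes, although it is far above 2**53, because
it is a power of two, while 2**53 + 1 fails because it rounds to 2**53.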

> - All JSON objects MUST have unique keys
> - All JSON messages MUST be either arrays or objects
> - Software receiving something violating any of these MUSTs has encountered
> conclusive evidence of serious upstream breakage and MUST NOT trust the
> contents nor act upon them.

The trouble with all this is that They Break Data.  All of a sudden, what
was valid JSON before is not valid JSON now.  That's serious, especially
as people *do* keep around their JSON rather than using it only transiently.
See above remarks about databases: it doesn't matter if the JSON is stored
as JSON or regenerated on the fly, because if I put in JSON today I expect
to get back JSON tomorrow, and not something that no longer counts as JSON.
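
As an illustration of the breakage, here is a sketch in Python of a
receiver that enforces three of the proposed MUSTs (UTF-8, unique keys,
array-or-object at the top level). The names strict_loads and
reject_duplicates are hypothetical, and the number rule is left out.
RFC 4627 says only that names within an object SHOULD be unique, so a
stored document with a repeated key is acceptable to an ordinary parser
and refused by this one.

```python
import json
from typing import Any, List, Tuple


def reject_duplicates(pairs: List[Tuple[str, Any]]) -> dict:
    """object_pairs_hook that refuses objects with repeated keys."""
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate object key")
    return dict(pairs)


def strict_loads(data: bytes) -> Any:
    """Parse data under the stricter rules; raise ValueError on any violation."""
    # Strict decoding: UnicodeDecodeError is a subclass of ValueError.
    text = data.decode("utf-8")
    value = json.loads(text, object_pairs_hook=reject_duplicates)
    if not isinstance(value, (dict, list)):
        raise ValueError("top level must be an array or an object")
    return value


if __name__ == "__main__":
    stored = b'{"a": 1, "a": 2}'
    print(json.loads(stored))  # the stock parser keeps the last value
    try:
        strict_loads(stored)
    except ValueError as err:
        print("rejected:", err)
```

The same bytes that a database accepted yesterday are rejected today,
without a single byte having changed.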

-- 
All Gaul is divided into three parts: the part          John Cowan
that cooks with lard and goose fat, the part            http://ccil.org/~cowan
that cooks with olive oil, and the part that            cowan@ccil.org
cooks with butter. --David Chessler