Re: [Json] Regarding JSON text sequence ambiguities (Re: serializing sequences of JSON values)

Matthew Morley <> Tue, 11 March 2014 15:31 UTC

From: Matthew Morley <>
To: Nico Williams <>
Cc: Paul Hoffman <>, "" <>
Subject: Re: [Json] Regarding JSON text sequence ambiguities (Re: serializing sequences of JSON values)

This issue came up with JSON-RPC structures and streaming JSON. That case is
not directly related, since we expect to encounter only objects at the top
level, but having guidance on handling top-level streaming is important for

On Mon, Mar 10, 2014 at 6:11 PM, Nico Williams <> wrote:

> Consider jq as an example.
> jq can and will read a sequence of JSON texts from stdin.  That
> sequence isn't an array.  jq applies its program to each input.  jq
> produces as its output all the values -encoded as JSON texts- output
> by the jq program.  This is exceedingly convenient, including for log
> files (for the reason that you give: log files are an unending,
> append-only sequence of indeterminate length).
> jq handles the ambiguity by requiring top-level values to be
> "terminated" by any of: unambiguous top-level values, whitespace, or
> EOF.  Thus the sequence of null, true, false, 1, and 2, requires
> whitespace between every value.  While the sequence of true, "foo",
> false, "bar", 1, "foobar", and 2 doesn't.  An input that looks like
> "truefalsenull12" elicits a parse error.
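
To make the termination rule above concrete, here is a minimal sketch in Python. It is not jq's code; the function name split_json_texts is hypothetical, and it assumes the whole input is already in memory as a string. It leans on json.JSONDecoder.raw_decode from the standard library and then applies the rule as described: a number or literal must be followed by whitespace, EOF, or the start of a string, array, or object.

```python
import json

_WS = " \t\r\n"
_SELF_DELIMITING = '"[{'  # values starting with these mark their own end


def split_json_texts(text):
    """Split a string holding a sequence of JSON texts into Python values.

    Hypothetical helper approximating the rule described above; it is
    not taken from jq.
    """
    decoder = json.JSONDecoder()
    values = []
    pos, n = 0, len(text)
    while True:
        # Skip whitespace between top-level values.
        while pos < n and text[pos] in _WS:
            pos += 1
        if pos == n:
            return values
        start = pos
        value, pos = decoder.raw_decode(text, pos)
        # Numbers and true/false/null do not mark their own end, so the
        # next character has to be whitespace or an unambiguous opener.
        if text[start] not in _SELF_DELIMITING and pos < n:
            nxt = text[pos]
            if nxt not in _WS and nxt not in _SELF_DELIMITING:
                raise ValueError(
                    "ambiguous boundary after scalar at offset %d" % pos
                )
        values.append(value)
```

With this sketch, 'true"foo"false"bar"1"foobar"2' splits into seven values without any whitespace, while "truefalsenull12" raises ValueError. Note that Python's decoder also accepts NaN and Infinity, which strict JSON does not.
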
> Note that the jq parser does NOT in fact parse multiple top-level
> values; it parses at most one top-level value.  Instead the program
> using the parser feeds input bytes to the parser until the parser
> finds the end of a top-level value and outputs it.  Remaining unparsed
> bytes are buffered, of course, so that when the program adds more input
> bytes the parser can be restarted to parse the next top-level value.
> This does mean that the jq parser will not output null when fed 'null'
> until one more byte _or_ EOF is fed to it.  But if fed '[1,2]' then
> the parser emits the parsed array value immediately when the closing
> bracket is parsed, without waiting for further inputs.
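
The feed-and-restart behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how jq is implemented: the class JsonSequenceReader and its feed/close methods are hypothetical names, it works on str chunks rather than bytes, and it simply re-runs Python's json decoder over the buffer on every call, so it cannot tell malformed input from incomplete input until EOF.

```python
import json


class JsonSequenceReader:
    """Hypothetical incremental reader for a sequence of JSON texts."""

    _WS = " \t\r\n"
    _OPENERS = '"[{'  # strings, arrays and objects mark their own end

    def __init__(self):
        self._decoder = json.JSONDecoder()
        self._buf = ""

    def feed(self, chunk):
        """Add input and return the values that are now complete."""
        self._buf += chunk
        return self._drain(at_eof=False)

    def close(self):
        """Signal EOF, returning any final value or raising on leftovers."""
        values = self._drain(at_eof=True)
        if self._buf:
            raise ValueError("unparsed trailing input: %r" % self._buf)
        return values

    def _drain(self, at_eof):
        out = []
        while True:
            self._buf = self._buf.lstrip(self._WS)
            if not self._buf:
                return out
            try:
                value, end = self._decoder.raw_decode(self._buf)
            except json.JSONDecodeError:
                return out  # incomplete (or malformed): wait for more input
            if self._buf[0] not in self._OPENERS:
                # A number or literal is only finished once we have seen
                # what follows it: whitespace, an opener, or EOF.
                if end == len(self._buf):
                    if not at_eof:
                        return out
                elif (self._buf[end] not in self._WS
                      and self._buf[end] not in self._OPENERS):
                    return out  # e.g. "1." or "truefalse": not terminated
            out.append(value)
            self._buf = self._buf[end:]
```

Under these assumptions, feed("null") returns an empty list and a following feed(" ") returns [None], whereas feed("[1,2]") returns [[1, 2]] straight away. Re-parsing the buffer on each call is quadratic in the worst case; a real streaming parser would keep its state between calls instead.
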
> jq always outputs a newline (though it could be space or tab) after
> outputting any top-level value's JSON text encoding.  It does so
> precisely to avoid these ambiguities.
> (This is also why jq must continue to output a newline or other
> whitespace after every output text, except, perhaps, when in raw
> output mode, in which case the outputs aren't JSON texts.)
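
On the writer side the same idea is small enough to show directly. The sketch below is an assumption-laden illustration rather than jq's output code; write_json_texts is a hypothetical name, and it uses json.dumps from the Python standard library with its default compact, single-line output.

```python
import io
import json


def write_json_texts(values, out, separator="\n"):
    """Write each value as a JSON text followed by a whitespace separator.

    Hypothetical helper; the separator could equally be a space or tab.
    """
    for value in values:
        out.write(json.dumps(value))
        out.write(separator)


def demo():
    """Return the separated output next to the ambiguous unseparated one."""
    values = [None, True, False, 1, 2]
    buf = io.StringIO()
    write_json_texts(values, buf)
    unseparated = "".join(json.dumps(v) for v in values)
    return buf.getvalue(), unseparated
```

Here demo() returns "null\ntrue\nfalse\n1\n2\n" for the separated form and "nulltruefalse12" for the unseparated one, which is exactly the input that cannot be split back into the original five values.
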
> Nico

Matthew P. C. Morley