Re: [Json] serializing sequences of JSON values

Nico Williams <> Fri, 14 March 2014 22:50 UTC

Date: Fri, 14 Mar 2014 17:50:34 -0500
From: Nico Williams <>
To: Jacob Davies <>
Cc: "" <>, Tatu Saloranta <>, Matt Miller <>
Subject: Re: [Json] serializing sequences of JSON values

On Fri, Mar 14, 2014 at 5:38 PM, Jacob Davies <> wrote:
> Requiring that the separator be newline and that the JSON be
> serialized without embedded newlines is convenient but seems like an
> awkward way to handle splittable sequences when there are other
> characters that are not permitted to appear in JSON texts that could
> be used with unmodified existing JSON emitters. For instance, form
> feed (ASCII 12) or record separator (ASCII 30). Yes, newline has the
> advantage of letting you use an existing read-line facility, but
> having to explain "A JSON sequence is a sequence of JSON texts - but
> they're special JSON texts with no newlines! - separated by newlines"
> seems pretty awkward.
> I can see how it would be convenient in some cases, but I think it's
> better for higher-level structures built on lower-level ones not to
> make special requirements of the latter. JSON has always allowed
> newlines.

If we can assume incremental JSON parsers then there's no need to say
that the separator may not appear in the JSON texts.

However, we probably shouldn't assume incremental parsers.  And,
conveniently, every encoder I've ever seen has an option for emitting
compact texts, which includes not emitting newlines.

Awkward it may be, but it's handy in two ways: 1) even the dumbest
parsers can handle it with no efficiency penalty, and 2) sending
compact texts saves bandwidth.
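
A minimal sketch of the newline-separated approach, assuming Python's
standard json module.  The helper names write_sequence and
read_sequence are illustrative only and do not come from this thread.
The reader needs nothing more than a read-line loop and a
non-incremental parser:

```python
import io
import json

def write_sequence(values, out):
    """Write each value as one compact JSON text followed by a newline."""
    for value in values:
        # With the default indent=None, json.dumps emits no raw newlines;
        # newlines inside strings are escaped as \n.
        out.write(json.dumps(value, separators=(",", ":")) + "\n")

def read_sequence(inp):
    """Yield one value per non-blank line using a non-incremental parser."""
    for line in inp:
        if line.strip():
            yield json.loads(line)

values = [{"a": 1, "note": "line1\nline2"}, [], "x"]
buf = io.StringIO()
write_sequence(values, buf)
buf.seek(0)
decoded = list(read_sequence(buf))
```

Note that the embedded newline in the "note" string survives the round
trip, because the encoder escapes it rather than emitting it raw.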

jq, for example, has an incremental parser, so it can handle texts
that have embedded newlines just fine.  And it has an option to emit
compact texts.  It has no option for alternative text separators on
the encoder side, but it does accept any JSON whitespace as text
separators on the parser side and does not require a separator when
it's not required to disambiguate texts (i.e., it accepts '[][]' as a
sequence of two empty arrays).
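
The parser-side behavior described above can be approximated as
follows.  This is a sketch using Python's json.JSONDecoder.raw_decode,
not jq's implementation, and it works on a complete string rather than
a true byte stream; the function name iter_texts is hypothetical.

```python
import json

JSON_WHITESPACE = " \t\n\r"

def iter_texts(s):
    """Yield successive JSON texts from s, separated by optional whitespace."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # Skip any JSON whitespace between texts.
        while pos < len(s) and s[pos] in JSON_WHITESPACE:
            pos += 1
        if pos >= len(s):
            return
        # raw_decode parses one text starting at pos and returns the
        # value together with the index just past its end.
        value, pos = decoder.raw_decode(s, pos)
        yield value

texts = list(iter_texts('[][] {"a":\n 1}\n\n"x" 2'))
```

Here '[][]' is read as two empty arrays with no separator, and the
object containing an embedded newline is read as a single text.  A
separator is still needed where two texts would otherwise run together,
as between two numbers.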

I think allowing implementation with the dumbest parsers is a
debatable goal.  If we drop it and instead require incremental (or
better) parsers, then we don't need to forbid newlines in texts, we
can say that any whitespace works as a text separator, and we can say
that the separator may be omitted in some cases.

Let's debate that possible goal: do we need to permit implementation
of JSON text sequence parsing with non-incremental JSON parsers?

I don't have a strong opinion on that.  It'd certainly be most
convenient to require at least incremental JSON text parsing
capabilities, and it'd be less awkward.