Re: [Json] serializing sequences of JSON values

Nico Williams <nico@cryptonector.com> Tue, 11 March 2014 00:41 UTC

In-Reply-To: <CAO1wJ5SLFoUSGoyM4WZa+r2Sf1A_-9e1DmUtRQqfx0UT77VXTA@mail.gmail.com>
References: <c19534113ff9489abcc4402fae3c1f62@BL2PR02MB307.namprd02.prod.outlook.com> <CAK3OfOigDS2CizGAtdRaQWbSgiJqw0Ogi-TPkWv35GjPB=CQGw@mail.gmail.com> <CAO1wJ5SLFoUSGoyM4WZa+r2Sf1A_-9e1DmUtRQqfx0UT77VXTA@mail.gmail.com>
Date: Mon, 10 Mar 2014 19:41:38 -0500
Message-ID: <CAK3OfOieeE=rF0+OW7MB2BFvt-KAXHQ5zsY6-jY3sew0NzDsSQ@mail.gmail.com>
From: Nico Williams <nico@cryptonector.com>
To: Jacob Davies <jacob@well.com>
Content-Type: text/plain; charset=UTF-8
Archived-At: http://mailarchive.ietf.org/arch/msg/json/Frc5TojxOrZ0AVXOJhn56Y_61HM
Cc: "json@ietf.org" <json@ietf.org>
Subject: Re: [Json] serializing sequences of JSON values

On Mon, Mar 10, 2014 at 1:14 PM, Jacob Davies <jacob@well.com> wrote:
> Does either the old or new specification say anything about multiple
> values in sequence? I'm pretty darn sure the old version at least said
> that application/json documents contain one and only one value, and
> says nothing about serializing JSON in other formats, so encapsulation
> is not addressed. I also don't see how any process that expects to
> read sequences of self-delimited JSON objects and arrays is going to
> blithely accept numbers and strings and whatnot. I find the idea
> unlikely that there is some running code that 1. expects sequences of

This exists:

http://stedolan.github.io/jq

and existed prior to the new RFC, and it happily consumed (parsed),
processed, and produced (encoded) sequences of top-level values of any
JSON value type.

Code exists that uses jq in that way.  jq happens to always emit a
newline after emitting a JSON text.

> JSON values with no delimiters, 2. uses a parser that precisely
> followed the JSON specification and rejected anything but JSON objects
> and arrays prior to this change, 3. has that parser change under it to

Every JSON parser I've ever used or read the docs for (probably half a
dozen) has an option to parse things other than arrays/objects at the
top level.

If the sequenced texts are separated by newlines (or, really, any
whitespace) and the parser can be fed a chunk of octets at a time
until a complete text is read, then it's trivial to build an
application that consumes sequences of texts.  There are other parser
features that can be used to build such an application, but since
numbers must be delimited somehow ("on the right"), the simple rule to
follow is to just emit a newline after any JSON text.
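
A minimal sketch of such a consumer in Python, assuming the texts are
separated by whitespace and arrive in arbitrary string chunks.  The
name iter_json_texts is made up for illustration; json.JSONDecoder
and its raw_decode method are from the standard library.  A value is
only accepted once whitespace follows it, which is what delimits
numbers "on the right".

```python
import json


def iter_json_texts(chunks):
    """Yield each top-level JSON value from an iterable of string chunks.

    Assumption: texts are separated by whitespace (e.g. a newline after each).
    """
    decoder = json.JSONDecoder()
    buf = ""
    for chunk in chunks:
        buf += chunk
        while True:
            buf = buf.lstrip()  # raw_decode does not skip leading whitespace
            if not buf:
                break
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break  # incomplete text so far; wait for more input
            if end == len(buf) or not buf[end].isspace():
                # No delimiter seen yet: "12" could still become "123" or
                # "12.5", so wait for more input before accepting it.
                break
            yield value
            buf = buf[end:]
    # End of input: whatever is left must be exactly one complete text.
    buf = buf.strip()
    if buf:
        yield json.loads(buf)
```

Re-parsing the buffer on every chunk is wasteful for large texts; a
parser with a real incremental interface avoids that, but the
delimiting rule stays the same.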

> accept all kinds of values and 4. is so indifferent to the contents of
> what it is processing that it doesn't notice that it's getting numbers
> and strings instead of objects or arrays.

Your imagination fails you:

jq 'if type == "number" then ... elif type == "string" then ... elif
type == "array" then ... else ... end'

You can do the same in Python, JavaScript, Ruby, C, and so on and on.
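
For instance, a rough Python equivalent of that dispatch, assuming
each value has already been parsed with the standard json module.
The function name json_type is hypothetical; the returned strings
mirror the names jq's type builtin uses.

```python
import json


def json_type(value):
    """Return the JSON type name of a value parsed by the json module."""
    if value is None:
        return "null"
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError("not a JSON value: %r" % (value,))


# One JSON text per line, any top-level type.
texts = ['1', '"two"', '[3]', '{"four": 4}', 'true', 'null']
types = [json_type(json.loads(t)) for t in texts]
```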

You might argue that this is silly, but I'd say once more that your
imagination fails you.  Suppose I'm grepping for specific values and I
want to output the paths to them in a stream of texts whose form I
know little about -- being new to some project with lousy docs I might
be trying to understand some data.  Why not?
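
A sketch of that kind of search in Python, assuming values already
parsed by the json module.  The helper name paths_to is made up for
illustration.  It works the same whether the top-level value is an
object, an array, or a bare scalar.

```python
def paths_to(value, target, path=()):
    """Yield the path (a tuple of keys and indexes) to each match of target."""
    # Compare types as well as values so that 1 does not match True.
    if type(value) is type(target) and value == target:
        yield path
    if isinstance(value, dict):
        for key, item in value.items():
            yield from paths_to(item, target, path + (key,))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            yield from paths_to(item, target, path + (index,))


doc = {"a": [1, {"b": 42}], "c": 42}
found = list(paths_to(doc, 42))
```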

For me the main issue here is that there's an ambiguity (see separate
thread) that is best resolved by always emitting a newline after each
text in a sequence.
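
To make the ambiguity concrete, here is a small Python sketch using
only the standard json module; the helper name dump_sequence is
hypothetical.  Two numbers written back to back read as one number,
while a newline after each text keeps them apart.

```python
import json


def dump_sequence(values):
    """Encode values as a sequence of JSON texts, each followed by a newline."""
    return "".join(json.dumps(v) + "\n" for v in values)


# Without a delimiter, the texts 1 and 2 run together into "12".
undelimited = "".join(json.dumps(v) for v in [1, 2])
misread = json.loads(undelimited)

# With a newline after each text, the sequence can be split back apart.
delimited = dump_sequence([1, 2])
recovered = [json.loads(line) for line in delimited.splitlines()]
```

Splitting on lines works here only because json.dumps never emits a
raw newline by default; a consumer that must accept pretty-printed
texts needs a parser that can find the end of each text itself.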

Nico
--