Re: [Json] Using a non-whitespace separator (Re: Working Group Last Call on draft-ietf-json-text-sequence)

Nico Williams <nico@cryptonector.com> Wed, 04 June 2014 20:23 UTC

Date: Wed, 04 Jun 2014 15:23:51 -0500
Message-ID: <CAK3OfOhjPZUXK6C0qSsQQZvOgR3Sv3SWpyH=qTuihuDC9uvXrA@mail.gmail.com>
From: Nico Williams <nico@cryptonector.com>
To: Phillip Hallam-Baker <ietf@hallambaker.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/json/mOnkg1DtTE1g_mUHUWueWcG1_yY
Cc: Tim Bray <tbray@textuality.com>, Paul Hoffman <paul.hoffman@vpnc.org>, Joe Hildebrand Hildebrand <jhildebr@cisco.com>, IETF JSON WG <json@ietf.org>

On Wed, Jun 4, 2014 at 1:01 PM, Phillip Hallam-Baker
<ietf@hallambaker.com> wrote:
> On Wed, Jun 4, 2014 at 1:54 PM, Tim Bray <tbray@textuality.com> wrote:
>> Hah, I hadn’t realized that RS (U+001E, INFORMATION SEPARATOR TWO) was
>> excluded.  OK, so the ABNF for JSON-sequence becomes one of these two:

Someone (Carsten?) proposed it earlier.  RS is, indeed, perfect for this.

>>
>> JSON-sequence = JSON-text *( %x1E JSON-text )
>> JSON-sequence = *( ws %x1E JSON-text )
>>
>> Depending on whether you see the RS as an initiator or a separator.  I think
>> I very slightly prefer the second.

It has to be an initiator for append-write logfiles, as it then marks
the end of a possibly incompletely-written text.  Otherwise you might
lose the first text following an incompletely-written text.

RS could also follow a text, but I prefer LF for this because it's
friendly to line-oriented text tools.  It's harmless to have a
trailing LF (since the JSON-text ABNF allows extra trailing whitespace
anyways).
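To illustrate the initiator argument, here is a minimal Python sketch of an
append-only writer that emits RS, then the text, then LF. The helper name
encode_record and the simulated crash are assumptions made up for this
example; they are not from the draft.

```python
import json

RS = "\x1e"  # ASCII Record Separator (U+001E)
LF = "\n"

def encode_record(value):
    """Return one sequence element: RS, the JSON text, then LF."""
    return RS + json.dumps(value) + LF

# Simulate an append-only log whose second write was cut short.
log = encode_record({"id": 1})
log += encode_record({"id": 2})[:5]   # incomplete write: RS plus 4 characters
log += encode_record({"id": 3})

# Every element starts with RS, so the truncated text ends at the next RS
# and the text that follows it is still delimited on its own.
chunks = log.split(RS)[1:]            # drop the empty piece before the first RS
```

With RS used only as a separator or terminator, the truncated second text
would run straight into the third one, and a reader would have no marker
showing where the third text begins.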

> +1
>
> I prefer the 'strict writer, loose reader' approach here. And that is
> not my usual stance.

Is that in reference to my (2) or Tim's point?  Since you top-posted,
and since Tim didn't propose a loose reader, I can't tell :)

> The reason that I think readers need to be tolerant is that they
> should be able to read log files after they have been 'damaged' by
> tools that strip out the RS characters.

Yes.  If you see something like:

<text> <text> <text>

it should parse, even though it should have been:

RS<text>RS<text>RS<text>

because, why not parse it?

Thus my (2).

To repeat myself, jq doesn't insist on any separator at parse time,
except where the separator is needed to disambiguate.  E.g., if you
have two texts consisting of numbers, or booleans, or null, then the
parser can't parse them without a separator of some sort.
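A rough sketch of such a liberal reader, in Python rather than jq, using the
standard library's json.JSONDecoder.raw_decode. The function name
iter_json_texts is hypothetical, and this is not how jq itself is
implemented; it only shows the idea of not insisting on a separator.

```python
import json

def iter_json_texts(data):
    """Yield successive JSON texts from a string, with or without separators."""
    decoder = json.JSONDecoder()
    pos = 0
    n = len(data)
    while True:
        # Skip optional separators: JSON whitespace and RS.
        while pos < n and data[pos] in " \t\r\n\x1e":
            pos += 1
        if pos >= n:
            return
        # raw_decode parses one value starting at pos and returns the
        # value together with the index just past it.
        value, pos = decoder.raw_decode(data, pos)
        yield value

no_separators = list(iter_json_texts('{"a": 1}[2]"x"'))
with_rs = list(iter_json_texts('\x1e{"a": 1}\n\x1e[2]\n'))
separated_numbers = list(iter_json_texts("1 2"))
adjacent_numbers = list(iter_json_texts("12"))
```

Objects, arrays, and strings are self-delimiting, so the first input parses
as three texts even with no separator. The numbers 1 and 2 written back to
back are indistinguishable from the single number 12, which is the case
where a separator is required.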

> For example, let's say that I have some program that records every
> transaction to a logfile and it is discovered that one of the
> transactions was wrong and is corrupting the database. The simplest
> solution is usually to take the log file, find the broken transaction,
> edit it out and rebuild the database.

Or leave it in and just skip past it when you parse.

> Given the quality of editing tools available on many machines, I don't
> trust them to preserve non printing ASCII characters. Heck, the editor
> on Ubuntu can't even start in a root account without writing garbage
> to the terminal.
>
> So readers should be tolerant.

Right, which is one reason that I want the ABNF for writers to be:

    sequence = RS JSON-text LF

but parsers should be more liberal.

I'll grant that if there's an RS present then one can use any kind of
JSON parser to parse the sequence whereas otherwise only incremental
and streaming JSON parsers can be used.  This is the one reason to
require that RS always be written.
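As a sketch of that point: when RS is always written, a reader can split on
RS and hand each piece to an ordinary whole-document parser such as Python's
json.loads, skipping pieces that fail to parse. The name parse_rs_sequence
and the sample data are made up for illustration.

```python
import json

RS = "\x1e"  # ASCII Record Separator (U+001E)

def parse_rs_sequence(data):
    """Split on RS and parse each piece with a non-streaming JSON parser."""
    values = []
    for chunk in data.split(RS):
        if not chunk.strip():
            continue          # empty piece, e.g. before the first RS
        try:
            values.append(json.loads(chunk))
        except json.JSONDecodeError:
            continue          # damaged or incompletely written text: skip it
    return values

# Three elements; the middle one was truncated before it was fully written.
sample = '\x1e{"id": 1}\n\x1e{"id\x1e{"id": 3}\n'
parsed = parse_rs_sequence(sample)
```

Note that a truncated top-level number could still parse as a shorter valid
number, so a stricter reader might also want to check for the trailing LF.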

Nico