Re: [Json] JSON and int64s - any change in current best practice since I-JSON

Joe Hildebrand <hildjj@cursive.net> Fri, 19 January 2024 01:57 UTC

From: Joe Hildebrand <hildjj@cursive.net>
Date: Thu, 18 Jan 2024 20:57:00 -0500
Cc: "json@ietf.org" <json@ietf.org>, cbor@ietf.org, Tim Bray <tbray@textuality.com>
To: Carsten Bormann <cabo@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/JIMpOfyX63KgLdJbB-Uci6lqFgI>

> On Jan 18, 2024, at 8:56 AM, Carsten Bormann <cabo@tzi.org> wrote:
> 
> Hi Joe,
> 
> thanks for the list!
> I’m CCing cbor@, as we are just finishing the extended diagnostic notation (EDN) document, and it is worth considering these points.

Context for folks there: on the JSON list, there has been some loose talk about a text-based format that would perhaps be a superset of JSON, would have defined numeric processing (perhaps including big integers), might adopt some of the features from JSON5 (https://json5.org/), might make commas optional (including trailing commas), etc.

>> We would need to rework that grammar to make it suitable for interchange:
> 
> [context: as a JSON extension]

Yes.

>> - Remove encoding indicators as mentioned
> 
> Right.  When you say “remove”, you mean “do not include in JSON subset”?
> (ABNF doesn’t have the “.feature” feature of CDDL…)

I think by "JSON subset" you mean "subset of EDN used to define a text format".  That confused me at first, because I was thinking "a text format that is a superset of JSON, perhaps influenced by EDN".  As such, "subset" and "superset" refer to more or less the same thing.  I'm fine using your wording for the moment, with the understanding that the format we're talking about is not a subset of JSON, but a subset of EDN.

I'm not tied to ABNF, but I don't think CDDL would work to describe the syntax of the notation itself.  It could be used to describe information encoded in the notation, though.

>> - Reference IEEE754 for the 0x1.2p3 format, or remove it
> 
> Good point.  Section 5.12.3 of IEEE Std 754-2019?

I think so?  I can't find my copy of 754.
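
For readers who have not met the `0x1.2p3` notation: it is a hexadecimal significand followed by a power-of-two exponent. A minimal sketch in Python, whose built-in `float.fromhex` accepts this shape of literal (this only illustrates the notation and is not a statement about what the EDN grammar accepts):

```python
# 0x1.2p3 reads as: hex significand 1.2 (= 1 + 2/16), times 2 to the power 3.
value = float.fromhex("0x1.2p3")
print(value)  # 9.0

# The same value survives a round trip through Python's own hex rendering.
assert float.fromhex(value.hex()) == value
```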

>> - Remove embedded notation
> 
> (I.e., not in JSON subset.)

Yes.  This is only needed for expository text, not for interchange.

>> - Remove ellipsis processing
> 
> (I.e., not in JSON subset.)

Yes.  This is only needed for expository text, not for interchange.

>> - Discuss whether results should always be an array, as sequence is currently top-level
> 
> Good question.  I gather the pickup of JSON sequences (RFC 7464) is not that great?  EDN is not entirely compatible with that as we don’t do the RS character, but do an explicit comma.
> The JSON subset could simply use “item” as the entry point.

Maybe I'm misunderstanding the ABNF in the EDN doc, but it looks like the top-level rule is `seq`, which is comma-separated.  I'm open to designing a streaming form of this notation at the same time, but that's a topic for discussion.
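
To make the difference between the two top-level shapes concrete, here is a rough Python sketch. `parse_json_seq` and `parse_comma_seq` are hypothetical helper names; the second simply wraps the text in brackets and reuses the JSON array parser, which is only a stand-in for a comma-separated `seq` and works for plain JSON items only.

```python
import json

RS = "\x1e"  # ASCII Record Separator, the delimiter RFC 7464 uses


def parse_json_seq(text):
    """Parse RS-delimited JSON texts (each item is RS + JSON text + LF)."""
    return [json.loads(chunk) for chunk in text.split(RS) if chunk.strip()]


def parse_comma_seq(text):
    """Stand-in for a comma-separated top-level sequence (assumption:
    items are plain JSON, so the JSON array parser can be reused)."""
    return json.loads("[" + text + "]")


rs_form = '\x1e{"a": 1}\n\x1e[2, 3]\n'
comma_form = '{"a": 1}, [2, 3]'

# Both spellings carry the same two items.
assert parse_json_seq(rs_form) == parse_comma_seq(comma_form)
```

With the comma form, a reader cannot know an item list is complete until end of input, which is part of why a streaming form would need its own discussion.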

>> - Discuss adding other JSON5 features
> 
> Do you have some examples for what you would like to add?
> What are we missing out for EDN?

I like basically everything that JSON5 adds, except maybe escaping newlines; I think I'd prefer newlines to just be valid in strings without needing to be escaped.  I don't care about an explicit `+` before numbers, but it doesn't bother me.  The JSON5 feature list is reproduced here with comments:

> Objects
>     Object keys may be an ECMAScript 5.1 IdentifierName. 

Strong yes.  No need for quoting most keys in a world that has /\p{ID_Start}\p{ID_Continue}*/u
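
A sketch of how a serializer might decide whether a key can go unquoted. It is written in Python, whose `str.isidentifier()` is based on XID_Start/XID_Continue plus `_`; that is close to, but not the same as, the ID_Start/ID_Continue regex above, and unlike ECMAScript it rejects `$`. The helper name `needs_quotes` is made up for illustration.

```python
def needs_quotes(key: str) -> bool:
    """Return True when the key is not identifier-like and must be quoted.

    Approximation: Python's identifier rule (XID_Start / XID_Continue
    plus "_") stands in for the ID_Start / ID_Continue pattern.
    """
    return not key.isidentifier()


for key in ["name", "naïve", "content-type", "1st", ""]:
    print(repr(key), "quoted" if needs_quotes(key) else "bare")
```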

>     Objects may have a single trailing comma.

Strong yes.  I want to have the same coding standard in this format as I do for other languages.  I'm fine with all commas being optional for those who feel differently.

> Arrays
>     Arrays may have a single trailing comma.

Yes.  See above.
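
For reference, this is what today's JSON does with trailing commas, using Python's standard `json` module as a representative strict parser (`is_strict_json` is a hypothetical helper name):

```python
import json


def is_strict_json(text: str) -> bool:
    """Return True if the standard-library JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_strict_json("[1, 2]"))     # True
print(is_strict_json("[1, 2,]"))    # False: trailing comma in an array
print(is_strict_json('{"a": 1,}'))  # False: trailing comma in an object
```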

> Strings
>     Strings may be single quoted.

Same coding standard argument, plus there are times when I want to use " or ' without having to backslash-escape them.

>     Strings may span multiple lines by escaping new line characters.

This is for:

```json5
"foo\
bar"
```

I'd prefer:

```eson
"foo
bar"
```
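
The second form is not legal JSON today, since raw control characters are forbidden inside strings. As a small illustration, Python's `json` module rejects it by default, and its `strict=False` option shows what the relaxed behavior would look like:

```python
import json

raw = '"foo\nbar"'  # a literal newline between the quotes

try:
    json.loads(raw)
    accepted_by_default = True
except json.JSONDecodeError:
    accepted_by_default = False

# strict=False lets control characters, including newlines, appear in strings.
relaxed = json.loads(raw, strict=False)

print(accepted_by_default)  # False
print(repr(relaxed))        # 'foo\nbar'
```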

>     Strings may include character escapes.

This is for "\u{1F4A9}" in addition to "\uD83D\uDCA9", which is what you need to do in JSON.
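
The two spellings name the same code point, U+1F4A9. A quick check in Python, where the `json` module joins the surrogate pair on input and, with its default `ensure_ascii=True`, splits it again on output:

```python
import json

# JSON can only spell a non-BMP character as a UTF-16 surrogate pair.
from_json = json.loads(r'"\uD83D\uDCA9"')

print(from_json == "\U0001F4A9")  # True
print(len(from_json))             # 1 (one code point, not two)

# Serializing with the default ensure_ascii=True brings the pair back.
print(json.dumps(from_json))      # "\ud83d\udca9"
```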

> Numbers
>     Numbers may be hexadecimal.

Fine.  Particularly nice for config files.  I don't think octal or binary is needed, but if someone else feels strongly, it's fine.

>     Numbers may have a leading or trailing decimal point.

Sure.  As long as we fix everything about numbers at least as well as I-JSON.
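
The int64 problem in the subject line comes down to this: a receiver that stores every number as an IEEE 754 double cannot hold all 64-bit integers exactly, and I-JSON (RFC 7493) warns about integers outside [-(2**53)+1, (2**53)-1]. A sketch in Python, where `parse_int=float` simulates such a receiver and `is_interoperable_int` is a hypothetical helper:

```python
import json

MAX_SAFE = 2**53 - 1  # largest magnitude I-JSON treats as exactly interoperable


def is_interoperable_int(n: int) -> bool:
    """True if n fits the I-JSON exact-integer range."""
    return -MAX_SAFE <= n <= MAX_SAFE


int64_max = 2**63 - 1

# Simulate a receiver that parses every number into a double.
as_double = json.loads(str(int64_max), parse_int=float)

print(is_interoperable_int(int64_max))  # False
print(int(as_double) == int64_max)      # False: the value was rounded
```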

>     Numbers may be IEEE 754 positive infinity, negative infinity, and NaN.

Yes.
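
Some implementations already emit these values even though JSON itself has no syntax for them. Python's `json` module is one example: by default it writes and reads `Infinity`, `-Infinity`, and `NaN`, and only refuses when `allow_nan=False` is passed.

```python
import json
import math

print(json.dumps(float("inf")))   # Infinity
print(json.dumps(float("-inf")))  # -Infinity
print(json.dumps(float("nan")))   # NaN

# The same module reads the tokens back.
parsed = json.loads("[Infinity, -Infinity, NaN]")
print(math.isinf(parsed[0]), math.isinf(parsed[1]), math.isnan(parsed[2]))

# Asking for strictly compliant output turns the values into an error.
try:
    json.dumps(float("nan"), allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False
print(strict_ok)  # False
```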

>     Numbers may begin with an explicit plus sign.

Shrug.

> Comments
>     Single and multi-line comments are allowed.

Strong yes, and possibly the reason to do the work in the first place.  Absolutely required for config files.
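
Until a format has comments natively, config tooling tends to strip them before handing the text to a JSON parser. A rough sketch of that workaround in Python follows; `strip_comments` is a hypothetical helper, it only understands double-quoted strings, and it is meant to show why a naive regex is not enough (comment markers can appear inside strings).

```python
import json


def strip_comments(text: str) -> str:
    """Remove // and /* */ comments that occur outside double-quoted strings."""
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character so an escaped quote stays inside.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line, leaving the newline in place.
            end = text.find("\n", i)
            i = n if end == -1 else end
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated comment")
            out.append(" ")  # keep tokens on either side separated
            i = end + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


source = """{
  // listen port
  "port": 8080, /* default */
  "url": "http://example.com/*x*/"
}"""

config = json.loads(strip_comments(source))
print(config["port"], config["url"])
```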

> White Space
>     Additional white space characters are allowed.

Not important to me, but I don't speak any of the languages whose whitespace is now legal.  May as well include everything in Zs for future-proofing.
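
For anyone wanting to see what "everything in Zs" covers, Python's `unicodedata` module exposes the general category, and its `json` module shows that strict JSON only skips space, tab, LF, and CR. A small sketch (`is_space_separator` is a made-up helper name):

```python
import json
import unicodedata


def is_space_separator(ch: str) -> bool:
    """True if the character is in Unicode general category Zs."""
    return unicodedata.category(ch) == "Zs"


nbsp = "\u00a0"               # NO-BREAK SPACE
ideographic_space = "\u3000"  # IDEOGRAPHIC SPACE

print(is_space_separator(nbsp), is_space_separator(ideographic_space))  # True True
print(is_space_separator("\t"))  # False: tab is a control character (Cc)

# Strict JSON does not treat Zs characters other than U+0020 as whitespace.
try:
    json.loads(nbsp + "[1]")
    nbsp_accepted = True
except json.JSONDecodeError:
    nbsp_accepted = False
print(nbsp_accepted)  # False
```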

>> We wouldn't get Tim's desired property of one character look-ahead.
> 
> On constrained systems, you could always use CBOR…

Tim can make his own argument here.  I think I'm on record as believing that CBOR is a good thing.

>> Other than that, it's as good a place to start as anywhere else.
> 
> Indeed.
> I could imagine we finish the EDN-literal document for its CBOR target audience, and then write another document about the JSON subset (which would be less of a delta and more of a free-standing specification that is just anchored in the full version for CBOR).

I think "inspired by" is as far as I'd commit to at the moment, until we have some agreement on requirements.

> I just noticed that we do have a required comma between items in arrays and sequences.  I just confused this with CDDL (which I’m prone to do), where that comma is not required.  Leaving off the comma conflicts with implicit string concatenation (which is one way to address the 2D string problem).  So maybe there is a bit more work to do...

I don't think this format needs string concat.
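
The conflict Carsten describes is easy to reproduce in Python, used here purely as an analogy because it has implicit concatenation of adjacent string literals: once that feature exists, a missing comma silently changes the number of items instead of being an error or a separator.

```python
with_comma = ["foo", "bar"]
without_comma = ["foo" "bar"]  # adjacent literals are concatenated

print(len(with_comma))  # 2
print(without_comma)    # ['foobar']
```

A notation with optional commas and no string concatenation would read the second line as two items, which is the combination argued for above.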

-- 
Joe Hildebrand
