Re: [Json] JSON and int64s - any change in current best practice since I-JSON

Carsten Bormann <cabo@tzi.org> Wed, 17 January 2024 17:22 UTC

Cc: Tim Bray <tbray@textuality.com>, Pete Cordell <petejson@codalogic.com>, "json@ietf.org" <json@ietf.org>
To: Joe Hildebrand <hildjj@cursive.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/ygYV52-zKsn89nIw2qYE927wdes>

On 2024-01-17, at 18:06, Joe Hildebrand <hildjj@cursive.net> wrote:
> 
> It may be an implementation detail which type gets generated from parsing. 

Once there is an application that gives different semantics to 1n in a specific place than to 1, it is no longer an implementation detail at all.

There is a large swath of JSON implementations that bin JSON numbers without dots or exponents into an integer bin and those with at least one of them into a floating point bin.  I’ll call this the “syntactic boundary”.

10 and 10.0/1e1/0.1e2 are different with these implementations.
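
As a small illustration, Python's standard json module is one implementation on the syntactic side. The sketch below only shows what that one library does with the four spellings above; it is not a statement about what JSON requires.

```python
import json

# Python's json module draws a syntactic boundary: literals without a '.'
# or an exponent become int, every other number literal becomes float.
literals = ["10", "10.0", "1e1", "0.1e2"]
parsed = [json.loads(text) for text in literals]
kinds = [type(value).__name__ for value in parsed]

# The distinction survives re-serialization, so it is visible on the wire.
reserialized = [json.dumps(value) for value in parsed]

for text, value, kind in zip(literals, parsed, kinds):
    print(f"{text:>6} -> {value!r} ({kind})")
```

All four values compare equal numerically, yet the first is an int and re-serializes as 10, while the other three are floats and re-serialize as 10.0.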

JavaScript gives all these numbers the exact same meaning; it doesn’t have a boundary at all (until you start using | and similar operators, which then mogrify their operands into 32-bit integers, IIRC).
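
To make the contrast concrete, here is a rough Python emulation of the JavaScript view. The helper names js_number and to_int32 are made up for this sketch; to_int32 follows my reading of the ToInt32 conversion that bitwise operators apply to their operands, so treat it as an approximation rather than a reference implementation.

```python
import json
import math


def js_number(text: str) -> float:
    """Read one JSON number as a double, regardless of how it is spelled."""
    return json.loads(text, parse_int=float)


def to_int32(x: float) -> int:
    """Approximate the 32-bit conversion applied by | and similar operators."""
    if math.isnan(x) or math.isinf(x):
        return 0
    n = int(x) & 0xFFFFFFFF  # truncate toward zero, then wrap modulo 2**32
    return n - 0x1_0000_0000 if n >= 0x8000_0000 else n


# All four spellings collapse to one and the same double.
same = {js_number(text) for text in ["10", "10.0", "1e1", "0.1e2"]}

# The 32-bit wraparound only shows up once a bitwise operator is involved.
wrapped = to_int32(js_number("2147483648"))  # 2**31
```

Here same contains the single value 10.0, and wrapped comes out negative because 2**31 does not fit into a signed 32-bit integer.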

(JSON doesn’t give any meaning to numbers, but it seems to imply that they are somehow standing for the mathematical concept of a number without mentioning the computer concept of separate integer and floating point types.
So one might think that JSON favors the JavaScript interpretation.
But then Section 6 of RFC 8259 says:

   The representation of numbers is similar to that used in most
   programming languages.

… where most programming languages have a rather thick firewall between integers and floating point numbers.)

>  The questions for interop are:
> 
> - are long integers without prefixes allowed or cause a parse error?

JSON certainly allows them, so if ESON wants to be a superset, they need to be allowed.
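
Allowed does not mean preserved, though. A minimal sketch in Python, using parse_int=float as a stand-in for a reader that only has doubles (that stand-in is an assumption of this example, not any particular implementation):

```python
import json

text = "9007199254740993"  # 2**53 + 1, comfortably inside the int64 range

as_int = json.loads(text)                      # arbitrary-precision int
as_double = json.loads(text, parse_int=float)  # stand-in for a double-only reader

print(as_int)          # 9007199254740993
print(int(as_double))  # 9007199254740992
```

Both readers accept the literal without a parse error; they just disagree about the value by one.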

> - if they are allowed, are they expected to have (somewhat unspecified) JSON behavior, or are they expected to behave in int-like fashion?

There is no JSON behavior.  They are numbers, “similar to that used in most programming languages”.

There is the common practice of binning non-dot, non-exponent numbers into the int-like bin (syntactic boundary).  There also is the JavaScript behavior.  One could imagine a semantic boundary, where 10/1e1/0.1e2/10.0 are integers because they are integral numbers, and only non-integral numbers are floating point; see the dCBOR discussion over at the CBOR mailing list for one such model of applications.
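
A semantic boundary of this kind could be sketched in Python as a parse_float hook. The function name semantic_number is invented for this example, and the sketch is not taken from the dCBOR documents; it only shows the idea of classifying by value instead of by spelling.

```python
import json
from decimal import Decimal


def semantic_number(text: str):
    """Hypothetical hook: classify a JSON number by its value, not its spelling."""
    numerator, denominator = Decimal(text).as_integer_ratio()
    if denominator == 1:
        return numerator  # integral value -> int (note: -0.0 becomes plain 0)
    return float(text)    # non-integral value -> float


# Literals with a '.' or an exponent are routed through the hook;
# plain integer literals such as 10 still take the default int path.
values = json.loads("[10, 10.0, 1e1, 0.1e2, 10.5]", parse_float=semantic_number)
kinds = [type(value).__name__ for value in values]
```

With this hook the first four elements all arrive as the int 10 and only 10.5 stays a float. The loss of negative zero noted in the comment is one of the details a real design would have to decide on.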

> - are long numbers with decimals or exponents allowed, or cause a parse error?

They are allowed in JSON.

> - same as ints, what is their expected behavior?

That depends on where ESON puts the integer/float boundary, for instance on syntactic grounds (like many non-JavaScript implementations).
This gets interesting again with 1e1000 (a common way to say “Infinity” in JSON, which doesn’t have Infinity).
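
For what it is worth, this is how the 1e1000 case plays out in Python's standard json module, again as an illustration of one implementation only:

```python
import json
import math
from decimal import Decimal

overflowed = json.loads("1e1000")                  # too large for a binary64 float
exact = json.loads("1e1000", parse_float=Decimal)  # keeps the value as written

print(math.isinf(overflowed))  # True
print(json.dumps(overflowed))  # Infinity, which is not a valid JSON text

# With allow_nan=False the serializer refuses to emit the non-JSON token.
try:
    json.dumps(overflowed, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False
```

Under a purely semantic boundary the same literal would be an integer with 1001 digits, so the placement of the boundary decides whether 1e1000 means infinity, a very large integer, or an error.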

> Part of this depends on how much you want to be backward-compatible with JSON.

Indeed.  I think trying to be backwards compatible with all (syntactically valid) JSON documents creates an interesting decision space.  Or maybe JSON/ECMA-404 (syntax) compatibility is less important than semantic compatibility.

Grüße, Carsten