Re: [Json] Adding integers to JSON (Re: JSON Schema Language)

Carsten Bormann <cabo@tzi.org> Wed, 08 May 2019 19:30 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <20190508183130.GV21049@localhost>
Date: Wed, 08 May 2019 21:30:36 +0200
Cc: JSON WG <json@ietf.org>
Message-Id: <52F0F8D4-277F-4E7E-A5E0-7444C83DEF57@tzi.org>
References: <20190507154201.GP21049@localhost> <CEF72901-5077-4305-BA68-60624DCE952D@bzfx.net> <69ea0c99-e983-5972-c0aa-824ddeecb7c4@dret.net> <CAMm+LwjyVjnJuWE4+a9Ea=_X1uuEGuK+O4KojzN3uVQ+s+HqUQ@mail.gmail.com> <058f58a3-dd27-998e-5f54-4874aff5f2f0@dret.net> <20190507221726.GR21049@localhost> <CAJK=1Rj7PBD-bbwvsqgjQQzp4Aoidb-W2q5Lj6asMHHDHaTVYQ@mail.gmail.com> <702ee54b-9465-7ca8-b521-2a88c1a47785@gmail.com> <20190508160740.GU21049@localhost> <ACD9A0A2-A75E-4B6E-9E9B-165DC222781B@tzi.org> <20190508183130.GV21049@localhost>
To: Nico Williams <nico@cryptonector.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/eW43okKHEhP2MQhWXs6cfJcYHms>
Subject: Re: [Json] Adding integers to JSON (Re: JSON Schema Language)

Hi Nico,

> On May 8, 2019, at 20:31, Nico Williams <nico@cryptonector.com> wrote:
> 
> On Wed, May 08, 2019 at 06:59:54PM +0200, Carsten Bormann wrote:
>> On May 8, 2019, at 18:07, Nico Williams <nico@cryptonector.com> wrote:
>>> That's an interop bug.  They should fix it.
>> 
>> It also nicely demonstrates the law of inevitable extensibility.
>> 
>> JSON was designed not to provide extensibility.  Ever.
>> Radical, but a well motivated idea at the time.
> 
> The spec allows extensions, actually, but indeed, it doesn't provide for
> extensibility.
> 
> (For example, jq will parse Inf and Infinity, but it will not output
> such words, nor will it output NaN.)

Huh?  RFC 8259 is quite explicit here:

   Numeric values that cannot be represented in the grammar below (such
   as Infinity and NaN) are not permitted.

So you have a superset of JSON as your jq input grammar, which is fine by me (it just isn’t JSON anymore).  Maybe you mean the sentence in the section about parsers:

   A JSON parser MAY accept non-JSON forms or extensions.

Yep, non-JSON.  JSON has no extensibility points.
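
As an illustration of a parser exercising that MAY, here is a minimal sketch using Python's standard json module, which by default accepts and emits NaN and Infinity.  The helper names (reject_constant, strict_loads, strict_dumps) are made up for this example; parse_constant and allow_nan are the module's documented switches.

```python
import json
import math


def reject_constant(name):
    """Hypothetical hook: refuse the non-JSON tokens NaN, Infinity, -Infinity."""
    raise ValueError(f"{name} is not permitted by the JSON grammar")


def strict_loads(text):
    # parse_constant is invoked only for 'NaN', 'Infinity' and '-Infinity'.
    return json.loads(text, parse_constant=reject_constant)


def strict_dumps(value):
    # allow_nan=False makes the encoder raise ValueError for NaN/Infinity
    # instead of writing a token that is outside the JSON grammar.
    return json.dumps(value, allow_nan=False)


# Default behaviour: the decoder accepts a superset of JSON.
lenient = json.loads("[NaN, Infinity]")
assert math.isnan(lenient[0]) and math.isinf(lenient[1])

# Strict behaviour: the same text is rejected.
try:
    strict_loads("[NaN, Infinity]")
except ValueError as err:
    print("rejected:", err)
```

With the defaults, the input grammar is a superset of JSON; the two strict helpers restrict decoding and encoding to what the RFC 8259 number grammar can represent.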

>> The law of inevitable extensibility says:
>> 
>> # If you don’t provide an extension point, somebody will.
>> # They will shove their extension into any crevice available.
>> # (And your up to now rock-solid design breaks.)
> 
> The particular “extension” we're talking about isn't an extension.

Extending the data model so that it can express two different kinds of numbers that are identical in numeric value (i.e., in JSON semantics) but differ in the data model defined by the extension (distinguishing integer 3 from floating-point 3) definitely is an extension.  The fact that the only instances communicated are ones that are also valid under (by definition unextended) JSON semantics doesn’t change this.
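
To make the distinction concrete, here is a small sketch using Python's standard json module as one example of a decoder whose data model already separates integers from floats.  The list of spellings of 3 is taken from earlier in this thread; the names SPELLINGS and decode_all are made up for the example, and other decoders may behave differently.

```python
import json

# Different JSON texts that all denote the number 3.
SPELLINGS = ["3", "3.0", "3e0", "0.03e2", "300e-2"]


def decode_all(spellings):
    """Decode each JSON text with the default (generic) decoder."""
    return [json.loads(s) for s in spellings]


values = decode_all(SPELLINGS)

# Under JSON semantics these are all the same number ...
same_number = all(v == 3 for v in values)

# ... but this decoder maps texts without fraction or exponent to int
# and all others to float, so the lexical form leaks into the data model.
python_types = [type(v).__name__ for v in values]

for text, value, kind in zip(SPELLINGS, values, python_types):
    print(f"{text:>7} -> {value!r} ({kind})")
```

An application that branches on the decoded type treats "3" and "3.0" differently, even though JSON itself assigns them the same value.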

>> The crevice here was that there are many ways to express the number 3:
>> 3, 3.0, 3e0, 0.03e2, 300e-2 etc.  We can give some of these
>> expressions different semantics than the other.  Voila.  Extensibility
>> achieved.  Brittleness added.
> 
> If anything one would expect this to bring up canonicalization...
> (oh no, I've mentioned that.  please no one take that as a serious
> invitation to discuss canonical JSON!)

C14n is a different thing: It reduces the number of different representations allowed for a specific instance in the data model.  There are no new semantics.  A generic decoder will have no problem reading a c14nized serialization.  A generic encoder won’t always emit a c14nized instance (and we don’t expect it to), but a step can be added that converts the JSON instance into the equivalent c14nized instance without breaking that generic JSON encoder.  This is all fairly innocuous.

(It is also hard, and the specific proposal pushed here has neither fully demonstrated that it has nailed the hard parts, nor does it show reasonable taste, in the sense of the principle of least surprise, in some other places.  Convert to UTF-16-BE before sorting?  Oof.  But sure, sorting Hittite characters between Hangul Jamo and private use can be made interoperable, if that floats your boat.)

More importantly, C14n ensures that there is no appetite for interpreting two lexically different instances that have the same semantics in JSON as different: at least one of them couldn’t be C14nal!
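
The UTF-16 sorting remark can be checked with a few lines of Python.  This is only a sketch of the ordering effect, not of any particular canonicalization proposal; the three sample characters are my own picks (U+14400 from the Anatolian Hieroglyphs block stands in for "Hittite characters"), and the function names are made up.

```python
# One character from each region involved in the ordering surprise.
HANGUL_JAMO_EXT_B = "\ud7b0"      # U+D7B0, just below the surrogate range
ANATOLIAN = "\U00014400"          # U+14400, encoded in UTF-16 as D811 DC00
PRIVATE_USE = "\ue000"            # U+E000, just above the surrogate range


def sort_by_code_point(keys):
    """Python compares str values by Unicode code point."""
    return sorted(keys)


def sort_by_utf16(keys):
    """Compare by UTF-16 code units; big-endian bytes preserve unit order."""
    return sorted(keys, key=lambda k: k.encode("utf-16-be"))


keys = [PRIVATE_USE, ANATOLIAN, HANGUL_JAMO_EXT_B]

for label, ordered in (("code point", sort_by_code_point(keys)),
                       ("UTF-16", sort_by_utf16(keys))):
    print(label, [f"U+{ord(k):04X}" for k in ordered])
```

By code point the supplementary-plane character sorts last; by UTF-16 code unit its leading surrogate (D811) places it between the Hangul Jamo Extended-B and private-use characters.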

> […]

Grüße, Carsten