Re: [Json] Adding integers to JSON (Re: JSON Schema Language)

Austin Wright <aaa@bzfx.net> Wed, 08 May 2019 22:49 UTC

From: Austin Wright <aaa@bzfx.net>
In-Reply-To: <ACD9A0A2-A75E-4B6E-9E9B-165DC222781B@tzi.org>
Date: Wed, 08 May 2019 15:49:20 -0700
Cc: Nico Williams <nico@cryptonector.com>, JSON WG <json@ietf.org>
Message-Id: <02755092-1682-45E8-AB6C-0EDA7D35703A@bzfx.net>
References: <CAHBU6itE8kub1qtdRoW8BqxaOmzMv=vUo1aDeuAr3HX141NUGg@mail.gmail.com> <77994bdb-a400-be90-5893-b846a8e13899@gmail.com> <20190507154201.GP21049@localhost> <CEF72901-5077-4305-BA68-60624DCE952D@bzfx.net> <69ea0c99-e983-5972-c0aa-824ddeecb7c4@dret.net> <CAMm+LwjyVjnJuWE4+a9Ea=_X1uuEGuK+O4KojzN3uVQ+s+HqUQ@mail.gmail.com> <058f58a3-dd27-998e-5f54-4874aff5f2f0@dret.net> <20190507221726.GR21049@localhost> <CAJK=1Rj7PBD-bbwvsqgjQQzp4Aoidb-W2q5Lj6asMHHDHaTVYQ@mail.gmail.com> <702ee54b-9465-7ca8-b521-2a88c1a47785@gmail.com> <20190508160740.GU21049@localhost> <ACD9A0A2-A75E-4B6E-9E9B-165DC222781B@tzi.org>
To: Carsten Bormann <cabo@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/yjgjgqxpm1y7esyVDTBzZ6jnVws>
Subject: Re: [Json] Adding integers to JSON (Re: JSON Schema Language)


> On May 8, 2019, at 09:59, Carsten Bormann <cabo@tzi.org> wrote:
> 
> On May 8, 2019, at 18:07, Nico Williams <nico@cryptonector.com> wrote:
>> 
>> That's an interop bug.  They should fix it.
> 
> It also nicely demonstrates the law of inevitable extensibility.
> 
> JSON was designed not to provide extensibility.  Ever.
> Radical, but a well motivated idea at the time.
> 
> Now people did not want to accept that.
> They wanted to extend the JSON data model with the distinction between two number types.

Who’s “they”? There are only two uses of “integer” in relation to JSON that I’m aware of:

1. Implementations that parse JSON documents into a pre-defined structure and raise an error if the structure cannot store the lexical data (for example, putting a null into an int64_t, or an object into a char*).

2. JSON Schema, which defines integer as a strict subset of “number”. (It’s not even a data model type; it’s a special case for the “type” keyword that means the same as `{"type": "number", "multipleOf": 1}`.)

Neither of these “extends” JSON, because JSON does not impose any requirements on how implementations parse numbers. My reading of RFC 8259 is that it allows implementations to differentiate between `3`, `3.0`, and `3.00`.
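
A minimal sketch of that kind of differentiation, assuming Python's standard `json` module as the implementation (the helper name `lexical_forms` is made up for illustration):

```python
import json
from decimal import Decimal

# Python's json module already tells `3` apart from `3.0`: int vs. float.
print(type(json.loads("3")).__name__)    # int
print(type(json.loads("3.0")).__name__)  # float

def lexical_forms(texts):
    # Hypothetical helper: with parse_float=Decimal the digits as written
    # are preserved, so `3.0` and `3.00` stay distinguishable too.
    return [str(json.loads(t, parse_float=Decimal)) for t in texts]

print(lexical_forms(["3", "3.0", "3.00"]))  # ['3', '3.0', '3.00']
```

Nothing here goes beyond what a conforming parser may do; it only shows that the distinction is available to an implementation that wants it.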

Nonetheless, JSON Schema specifies that numbers are mathematical, so `10.0` would still be considered an integer. Likewise, I think parsers should ignore excess precision if it doesn’t change the mathematical value. But this is entirely a choice based on how most implementations work, rather than on what JSON seems to allow. For example, scientific software might store significant figures such that `10.0` is considered less precise than `10.000`. In my reading of RFC 8259 and ECMA-404, this is perfectly legal, but JSON Schema makes no distinction for validation purposes (i.e. there would be no way to assert “number must be at least this precise”).
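
To make the two views concrete, here is a rough sketch in Python. It is an illustration only, not how any JSON Schema validator is actually written; `parse_number`, `is_schema_integer`, and `decimal_places` are hypothetical names.

```python
import json
from decimal import Decimal

def parse_number(text):
    # Route both integer and fractional literals through Decimal so the
    # written precision survives parsing.
    return json.loads(text, parse_int=Decimal, parse_float=Decimal)

def is_schema_integer(value):
    # Mathematical view: a number is an integer when it has no fractional
    # part, i.e. when it is a multiple of 1.
    return value == value.to_integral_value()

def decimal_places(value):
    # Lexical view (assumed precision measure): digits written after the
    # decimal point.
    return max(0, -value.as_tuple().exponent)

a, b = parse_number("10.0"), parse_number("10.000")
print(is_schema_integer(a), is_schema_integer(b))  # True True
print(a == b)                                      # True
print(decimal_places(a), decimal_places(b))        # 1 3
```

Both values pass the mathematical integer test and compare equal, yet an application that cares about significant figures could still tell them apart.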

> 
> The law of inevitable extensibility says:
> 
> # If you don’t provide an extension point, somebody will.
> # They will shove their extension into any crevice available.
> # (And your up to now rock-solid design breaks.)
> 
> The crevice here was that there are many ways to express the number 3: 3, 3.0, 3e0, 0.03e2, 300e-2 etc.  We can give some of these expressions different semantics than the other.  Voila.  Extensibility achieved.  Brittleness added.
> 
> I have seen the law of inevitable extensibility applied to:
> 
> — duplicate map keys for adding comments to JSON.  Some people believe:
> 
>  { "postcode": "This is a comment about the post code 4711",
>    "postcode": 4711 }
> 
>  is a valid way to extend JSON with comments.  (Because their implementation happens to ignore map keys that are invalidly used again later, which is not a given.)

Certainly, this is not interoperable. But it is legal: RFC 8259 merely says the names within an object “SHOULD be unique”, and allows different implementations to handle repeats differently. ECMA-404 specifically says there are no restrictions on the strings used as names, not even that they be unique. (ECMA-404 is the document referenced by ECMAScript, though ECMAScript itself specifies an algorithm that ignores all but the last occurrence of a repeated key.)

For this reason, JSON Schema does not have any features dealing with repeated keys or their order (specifically, the data model specifies that objects are treated as an unordered string-to-value mapping). This is another artistic license that favors uniformity over supporting the complete lexical space of JSON.
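
As an illustration of how much is left to the implementation, here is a small Python sketch. It assumes the standard `json` module; `reject_duplicates` and `strict_loads` are hypothetical helpers, and the document text is adapted from the example quoted above.

```python
import json

DOC = '{"postcode": "a comment about the post code", "postcode": 4711}'

# Python's json module silently keeps the last value for a repeated name.
print(json.loads(DOC))  # {'postcode': 4711}

def reject_duplicates(pairs):
    # object_pairs_hook receives every name/value pair in document order,
    # so repeated names are still visible at this point.
    seen = {}
    for name, value in pairs:
        if name in seen:
            raise ValueError(f"duplicate name: {name!r}")
        seen[name] = value
    return seen

def strict_loads(text):
    # A stricter parser that refuses documents with repeated names.
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

Calling `strict_loads(DOC)` raises `ValueError`, while the default parser accepts the same document. Both behaviours are permitted, which is exactly why the "comment" trick is not interoperable.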

> 
> - unnecessary escaping for ascribing some special semantics.  Say, if a string starts with “\u0073” and not with the equivalent “s”, this is supposed to mean the string has some different semantics (e.g., the rest is a base64url encoded byte string as opposed to what would normally be a text string, or a type tag, or…).

While JSON strongly implies these two forms must be considered the same thing, note this isn’t true for all formats. For example, in a URI, %2F is not the same as a forward slash; the two are to be treated differently. (There have been many security holes caused by people treating `%2F..%2F` the same as `/../` when it is not.)
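
The contrast can be sketched in Python, assuming the standard `json` and `urllib.parse` modules; the URL and the `path_segments` helper are made up for illustration.

```python
import json
from urllib.parse import urlsplit, unquote

# JSON: the escaped and unescaped forms decode to the same string.
print(json.loads('"\\u0073"') == json.loads('"s"'))  # True

def path_segments(url, decode_first):
    # Hypothetical helper: split a URI path into segments, optionally
    # percent-decoding before splitting (the risky order).
    path = urlsplit(url).path
    if decode_first:
        path = unquote(path)
    return path.split("/")

URL = "http://example.com/a%2F..%2Fb"  # example URL, not from the thread

print(path_segments(URL, decode_first=False))  # ['', 'a%2F..%2Fb']
print(path_segments(URL, decode_first=True))   # ['', 'a', '..', 'b']
```

Split as written, the path has a single segment with no `..` in it. Decoded first, the same path suddenly contains a parent-directory segment, which is the mistake behind the traversal bugs mentioned above.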

> 
> — millions of ways of interpreting whitespace or the lack thereof.  I’ve lost count.

Could you provide one or two examples please?

> 
> 
> I do not think any longer that anything that is meant to see significant usage can be designed without designing in extensibility points from the start.  At least if it is supposed to last.
> 
> Grüße, Carsten