Re: [Json] JSON Schema Language

Austin Wright <aaa@bzfx.net> Sun, 05 May 2019 06:17 UTC

From: Austin Wright <aaa@bzfx.net>
In-Reply-To: <40f80ea0-d130-3f3b-39fa-2c84e802ed55@gmail.com>
Date: Sat, 04 May 2019 23:16:42 -0700
Cc: Carsten Bormann <cabo@tzi.org>, json@ietf.org, Ulysse Carion <ulysse@segment.com>
Message-Id: <35E2623E-753D-4918-8AF4-BF0BC5DE4868@bzfx.net>
References: <CAJK=1RjV1uv0eOdtFZ8cKn-FfCwCiGP5r2hOz1UamiM6YV4H1A@mail.gmail.com> <39682ec8-f993-a44c-d3e2-1638d2c1608f@gmail.com> <29CAE1CE-D6CB-4796-B2F2-2095BE921385@tzi.org> <AD5ABD9C-F5F2-477D-B862-529C890D5472@bzfx.net> <DA1767B8-22D6-4EA9-8112-4B36B79E9039@tzi.org> <D21B379B-23CC-48B3-BE10-D2777308E2E0@bzfx.net> <40f80ea0-d130-3f3b-39fa-2c84e802ed55@gmail.com>
To: Anders Rundgren <anders.rundgren.net@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/hUC6xbmMyxbb5T7JzmbfmQ0xuO0>
Subject: Re: [Json] JSON Schema Language


> On May 4, 2019, at 22:07, Anders Rundgren <anders.rundgren.net@gmail.com> wrote:
> 
> On 2019-05-05 04:51, Austin Wright wrote:
>>> On May 4, 2019, at 02:42, Carsten Bormann <cabo@tzi.org> wrote:
>>> 
>>> Curious:
>>> 
>>> On that web page, it also says “For consistency, integer JSON numbers SHOULD NOT be encoded with a fractional part.”
>>> 
>>> What does that mean?
>> It’s a non-normative suggestion with the aim of enhancing performance: Many programming languages distinguish between integers and IEEE floats by the presence of a decimal point. While JSON makes no such distinction (all numbers are arbitrary precision decimal), some parsers do make that distinction, and it’s slightly easier to determine if an int32_t is an integer than if a double is an integer.
> 
> In the C# example I provided before, this is (by default) not non-normative, since it threw an exception. Here is a little more detail:
> 
> class MyObject {
>  int Counter;
>   .
>   .
> }
> 
> Deserializing JSON into this type (which BTW works as a "schema") REQUIRES "Counter" data to adhere to normal integer notation, including value span.
> 
> A schema language should align with the actual use and interpretation of JSON.  This involves alternative serialization formats as well.  As an example, monetary values are (probably without exception) expressed as JSON Strings, since floating point is unsuited for decimal arithmetic.
> 
> Based on the RFC, one might come to the conclusion that JSON is an inferior information transfer format, but aided by external mapping it actually works extremely well, albeit being slightly verbose :)

Going by what you provided, the Newtonsoft.Json parser behavior is incompatible with JSON Schema, which treats 10.0 as an integer. But I can’t imagine it’s much of an issue: if the encoder knows the value will always be an integer, why add a fractional part?

That behavior still seems arbitrary to me, though. Instead of erroring on the first period, all you have to do is error on the first nonzero digit after the period. Scientific notation is another matter, but presumably it allows 1.9e4 (that is, 19000) as a valid integer? (I’m not sure, off-hand.)
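
To make the value-based check concrete, here is a minimal Python sketch. It assumes the token has already been lexed as a syntactically valid JSON number, and the helper name parse_json_integer is hypothetical. It says nothing about what Newtonsoft.Json actually does; it only illustrates accepting any number whose value is integral, whatever its spelling.

```python
from decimal import Decimal, InvalidOperation


def parse_json_integer(token: str) -> int:
    """Return the integer value of a JSON number token, or raise ValueError.

    Hypothetical helper: assumes `token` was already lexed as a valid
    JSON number. Decimal is used so the check is exact, with no float rounding.
    """
    try:
        value = Decimal(token)
    except InvalidOperation:
        raise ValueError(f"not a number: {token!r}")
    # Reject NaN/Infinity (not JSON anyway) and anything with a nonzero fraction.
    if not value.is_finite() or value != value.to_integral_value():
        raise ValueError(f"not an integer: {token!r}")
    return int(value)


if __name__ == "__main__":
    for tok in ("10", "10.0", "1.9e4", "10.5"):
        try:
            print(tok, "->", parse_json_integer(tok))
        except ValueError as err:
            print(tok, "-> error:", err)
```

With this rule 10, 10.0 and 1.9e4 are all accepted as integers, and 10.5 is rejected.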

Probably a better way to approach JSON parsers is to raise an error whenever the parser can’t preserve all the information in the source. For example, you can preserve 1e10, 0.5, and even 0.1 (if you permit a small error, such that if you converted the float back to decimal, 0.1 would still be the closest decimal representation). But you can’t preserve 9007199254740993.5 as a double: it has too many significant figures, and above 2^53 (9007199254740992) adjacent doubles are 2 apart, so not even every integer is representable. In this case, you would throw.
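
A rough Python sketch of that "throw if information is lost" rule follows. The function name parse_double_lossless is hypothetical, and the acceptance test is one possible reading of the criterion above: the shortest decimal string that round-trips to the same double must equal the source value. It relies on CPython's repr() of a float producing that shortest round-trip string.

```python
import math
from decimal import Decimal


def parse_double_lossless(token: str) -> float:
    """Parse a JSON number token into a double, or raise ValueError.

    Hypothetical helper: the token is accepted only if converting the
    resulting double back to its shortest decimal form yields the same
    numeric value as the source text.
    """
    exact = Decimal(token)   # exact value of the source text
    approx = float(token)    # nearest IEEE 754 double
    if math.isinf(approx) or math.isnan(approx):
        raise ValueError(f"out of range for a double: {token!r}")
    # repr() gives the shortest string that round-trips to `approx`.
    if Decimal(repr(approx)) != exact:
        raise ValueError(f"cannot be preserved as a double: {token!r}")
    return approx


if __name__ == "__main__":
    for tok in ("1e10", "0.5", "0.1", "9007199254740993.5"):
        try:
            print(tok, "->", parse_double_lossless(tok))
        except ValueError as err:
            print(tok, "-> error:", err)
```

Under this reading 1e10, 0.5 and 0.1 are accepted, while 9007199254740993.5 is rejected because the nearest double is 9007199254740994.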

Parsers that ensure the preservation of the encoded data like this might encourage better use of the JSON types, like for monetary values, even if you are reading into an IEEE floating point.

Still, now that we bring it up, that SHOULD NOT seems suspicious. It’s stating the obvious.

Austin.

> 
> Cheers,
> Anders
> 
>> Cheers,
>> Austin.
>>> 
>>> Grüße, Carsten
>>> 
>>> 
>>>> On May 4, 2019, at 11:36, Austin Wright <aaa@bzfx.net> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On May 4, 2019, at 00:58, Carsten Bormann <cabo@tzi.org> wrote:
>>>>> 
>>>>> On May 4, 2019, at 06:47, Anders Rundgren <anders.rundgren.net@gmail.com> wrote:
>>>>>> 
>>>>>> Example: although 10.0 is a valid JSON Number, in a system where you expect
>>>>>> an integer, this should be flagged as a syntax error.
>>>>> 
>>>>> 10.0 is an integer number.
>>>>> 
>>>>> “Schema Languages” operate at the data model level.  In the JSON data model, there is only one kind of number.
>>>>> Of course, the JSON data model is not actually defined in a standard, which is one of the major shortcomings of JSON.
>>>>> 
>>>> 
>>>> JSON Schema handles this exactly the same way, defining a data model [1]. The lexical representation is surjective onto the data model: as you point out, 10.0 is the same as 10, which is an integer.
>>>> 
>>>> The one case where this might fall apart is if significant digits are important, such that 10.0 is different from 10.00 (i.e., 10.00 is more precise by an order of magnitude). However, I’m not aware of any JSON parsers that keep track of the precision of numbers, even ones that support arbitrary precision. I imagine scientific applications would want to store an explicit precision, since precisions are not always powers of 10 (e.g. {"value": 32.0, "precision": 0.5}).
>>>> 
>>>> [1] http://json-schema.org/latest/json-schema-core.html#rfc.section.4.2.1 "4.2.1. Instance Data Model"
>>>> 
>>>> Austin.
>>> 
>