Re: [Json] JSON Schema Language

Nico Williams <nico@cryptonector.com> Mon, 06 May 2019 21:15 UTC

Date: Mon, 06 May 2019 16:15:11 -0500
From: Nico Williams <nico@cryptonector.com>
To: Austin Wright <aaa@bzfx.net>
Cc: Carsten Bormann <cabo@tzi.org>, json@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/MrwfYVC1D5ZRjf1mxR127D3OGq0>

On Mon, May 06, 2019 at 02:00:54PM -0700, Austin Wright wrote:
> > On May 6, 2019, at 12:24, Nico Williams <nico@cryptonector.com> wrote:
> > Because some of us have generic parsers with associated DSLs and we
> > don't want to be left out.
> 
> Left out of what, exactly? See below.

Out of interoperating with other applications using this schema thing.

> > [...]
> 
> I’m not suggesting a variance from the JSON semantics.

If you would have your parser reject 10.0 because the schema says to
expect an integer, then you are forking JSON, because you are precluding
interoperability with some JSON encoders.  It gets worse if you then
also insist on a fractional part when a schema element expects a real
number, so that the parser rejects 10: then encoders would be damned if
they do and damned if they don't.

I explained how this would preclude jq, or any other XSLT/XPath-alike
for JSON, from interoperating with applications using such a schema
language.

> My suggestion is an alternative to parsers that lose data when they
> parse JSON documents, for example, ECMAScript's JSON.parse number
> parsing. You can, of course, always parse a JSON document according to
> the generic semantics.

The key word there is "alternative".  You're forking JSON.

> The issue here is how does a program parse a JSON document with a
> wider range of values than what the program has room for? Either a
> string that’s too long, an array with too many items, or a number with
> too many significant figures?

I explained this earlier.

First, as to fractional parts of numeric values, a) zero fractional
parts MUST be tolerated, b) non-zero fractional parts can be truncated,
rounded, or rejected according to the schema authors' choice.

This is a case where being liberal in what you accept is a good thing.
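
The rule above can be sketched in a few lines.  This is only an
illustration, assuming a hypothetical helper named to_integer and a
hypothetical per-schema policy argument on_fraction; neither comes from
any existing schema language.  It takes the number as it appears in the
JSON text and maps it to an integer.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN


def to_integer(number_text, on_fraction="reject"):
    """Map a JSON number token to an int.

    on_fraction is a hypothetical schema-level policy for NON-ZERO
    fractional parts: "truncate", "round", or "reject".
    """
    value = Decimal(number_text)
    truncated = value.to_integral_value(rounding=ROUND_DOWN)

    # Zero fractional part: 10, 10.0 and 1e1 are all the same number,
    # so all of them are accepted regardless of policy.
    if value == truncated:
        return int(truncated)

    # Non-zero fractional part: the schema author chooses.
    if on_fraction == "truncate":
        return int(truncated)
    if on_fraction == "round":
        return int(value.to_integral_value(rounding=ROUND_HALF_EVEN))
    raise ValueError(f"{number_text} has a non-zero fractional part")
```

With this, to_integer("10.0") and to_integer("10") give the same
result, and only a value such as 10.7 is subject to the schema's
policy.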

> Normally you would have to write your own parser, or use a tokenizer

Why not use an off-the-shelf one?

> that preserves the lexical values without casting them to native
> types. Then you perform your own validation in code, and decide how to
> convert a lexical JSON number into a native number, depending on
> context.

That's one way, and it has to be possible, because that's essentially
what you'd do with jq or ECMAScript.
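
As one example of doing this with an off-the-shelf parser, Python's
standard json module lets the caller supply parse_int and parse_float
hooks, which receive the number's text as it appeared in the document.
The sketch below uses them to keep the lexical form; the JsonNumber
class and the parse_preserving function are hypothetical names
introduced only for this illustration.

```python
import json


class JsonNumber(str):
    """Hypothetical wrapper: a JSON number kept as its original text."""


def parse_preserving(text):
    # The hooks are called with the lexical number token, so nothing is
    # lost to a native float or int conversion at parse time.
    return json.loads(text, parse_int=JsonNumber, parse_float=JsonNumber)


doc = parse_preserving('{"a": 10.0, "b": 10, "c": "10"}')
# doc["a"] and doc["b"] are JsonNumber instances holding "10.0" and
# "10"; doc["c"] is an ordinary str, so numbers and strings can still
# be told apart.  The application decides later, in context, how to
# convert each JsonNumber.
```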

You could also use a streaming parser (which jq also has) to immediately
handle each value as it is parsed.

You could also write a combined parser and validator.

And, of course, you could generate a combined parser and validator from
the schema.

I'm not rejecting any of those choices.

I'm insisting ONLY that the first choice (first parse, then validate)
remain valid and workable.  If you make this choice no longer feasible,
then you've forked JSON.  Please don't.
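
To make "first parse, then validate" concrete, here is a minimal
sketch.  It assumes a toy schema format (a dict mapping member names to
"integer" or "number") and a hypothetical validate function; it is not
any real schema language.  The point is that the generic parser has
already turned 10.0 into a native number before validation runs, so
the validator can only look at the value, never at how it was spelled.

```python
import json


def is_integer_value(v):
    """True for any numeric value with a zero fractional part."""
    if isinstance(v, bool):      # bool is a subclass of int in Python
        return False
    if isinstance(v, int):
        return True
    return isinstance(v, float) and v.is_integer()


def is_number_value(v):
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def validate(doc, schema):
    """Check an already-parsed document against the toy schema.

    Returns a list of error strings; an empty list means valid.
    """
    checks = {"integer": is_integer_value, "number": is_number_value}
    errors = []
    for name, kind in schema.items():
        if name not in doc:
            errors.append(f"{name}: missing")
        elif not checks[kind](doc[name]):
            errors.append(f"{name}: expected {kind}")
    return errors


schema = {"count": "integer", "ratio": "number"}

# Step 1: generic parse.  Step 2: validate the parsed values.
ok = validate(json.loads('{"count": 10.0, "ratio": 10}'), schema)
bad = validate(json.loads('{"count": 10.5, "ratio": 10}'), schema)
```

Here ok is empty, because 10.0 is accepted where an integer is expected
and 10 is accepted where a number is expected, while bad reports the
count member.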

Nico
--