[Json] A minimal examplotron-style JSON validation language.

John Cowan <cowan@ccil.org> Wed, 29 May 2019 19:14 UTC

From: John Cowan <cowan@ccil.org>
Date: Wed, 29 May 2019 15:14:43 -0400
Message-ID: <CAD2gp_QELt-3=wqA1gRafNim8Y6fsxZ6hcQmTsoOxCSxU8eM1Q@mail.gmail.com>
To: Nico Williams <nico@cryptonector.com>
Cc: Tim Bray <tbray@textuality.com>, JSON WG <json@ietf.org>, Carsten Bormann <cabo@tzi.org>, Ulysse Carion <ulysse@segment.com>, Rob Sayre <sayrer@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/okXiqlmi79yGmntbLI_Vj5_g8hk>
Subject: [Json] A minimal examplotron-style JSON validation language.

The trouble is that "validation" is a very vague term.  I would be happy
with a language that can only express simple type validation, minimally as
follows:

0) JSON values are divided into types, namely null, numbers, strings,
booleans, object types, and arrays of any of these.

1) An object type specifies what attributes the corresponding object has
and what the types of the corresponding values are, or else specifies that
it may have arbitrary keys and values.  It may have a name in the schema
language.

2) When an array is expected, the validator will check that all its
contents have the same specified type.

3) There is a way to specify for a given object type whether it must
contain exactly the specified keys, may contain a superset of the specified
keys where the additional keys are not checked, or may contain a subset of
the specified keys.
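
Point 3 amounts to three set relations between the keys a type specifies
and the keys an object actually carries.  A minimal sketch in Python,
assuming the mode names "exact", "superset", and "subset" that are used
later in this message; the function name keys_allowed is illustrative only:

```python
def keys_allowed(specified, actual, mode="exact"):
    """Check an object's keys against the keys its type specifies."""
    specified, actual = set(specified), set(actual)
    if mode == "exact":
        # Exactly the specified keys, no more and no fewer.
        return actual == specified
    if mode == "superset":
        # All specified keys present; extra keys are tolerated.
        return specified <= actual
    if mode == "subset":
        # Only specified keys present; some may be missing.
        return actual <= specified
    raise ValueError(f"unknown strictness mode: {mode!r}")
```

Only the key sets are compared here; the values of any extra keys admitted
in "superset" mode are left unchecked, as point 3 describes.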

This language leaves validation of the values of strings and numbers to
post-validation code.  It just makes sure that, for a statically typed
language, a proper combination of records/structs/classes that can be
generated in advance can represent any JSON value that is valid against the
schema.
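
As a rough illustration of that goal, these are the kind of records a code
generator might emit for the json.org menu document used below.  The class
names and the root_from_json helper are hypothetical and written by hand in
Python; they are not the output of any existing tool:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MenuItem:
    value: str
    onclick: str


@dataclass
class Popup:
    menuitem: List[MenuItem]


@dataclass
class Menu:
    id: str
    value: str
    popup: Popup


@dataclass
class Root:
    menu: Menu


def root_from_json(data: dict) -> Root:
    """Build the records from parsed JSON that has already been validated."""
    menu = data["menu"]
    items = [MenuItem(i["value"], i["onclick"])
             for i in menu["popup"]["menuitem"]]
    return Root(Menu(menu["id"], menu["value"], Popup(items)))
```

Because validation has already guaranteed the shape, the conversion needs no
type checks of its own.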

Here is a very simple examplotron language:

Sample document from json.org:

{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}

Corresponding schema (constructed by hand, may have errors):

{
  "root": {"menu": "menu"},
  "menu": {
    "id": "",
    "value": "",
    "popup": "popup"
  },
  "popup":  {"menuitem": ["menuitem"]},
  "menuitem": {
    "value": "",
    "onclick": ""
  },
  "__whole__": "root",
  "__strictness__": {"menu": "superset"}
}

A schema is a JSON object whose keys are the names of object types, and
whose values are the corresponding definitions.  Object type names are
meaningful only within the schema.  In addition, the value of the mandatory
key "__whole__" designates the type of the whole document.  If the key
"__strictness__" is present, its value is an object whose keys are object
type names and whose values are each one of the strings "exact", "subset",
or "superset".  Object types without names, or those not mentioned in
"__strictness__", are always exact.
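
The layout just described can be taken apart mechanically.  A minimal
sketch in Python; the names split_schema and mode_for are made up for
illustration, and rejecting a "__strictness__" entry that names an undefined
type is an assumption, not something stated above:

```python
RESERVED = ("__whole__", "__strictness__")
MODES = ("exact", "subset", "superset")


def split_schema(schema):
    """Return (types, whole, strictness) from a parsed schema object."""
    if "__whole__" not in schema:
        raise ValueError('schema lacks the mandatory "__whole__" key')
    strictness = schema.get("__strictness__", {})
    # Every other top-level key is the name of an object type.
    types = {k: v for k, v in schema.items() if k not in RESERVED}
    for name, mode in strictness.items():
        if name not in types:  # assumption: unknown names are an error
            raise ValueError(f"strictness given for unknown type {name!r}")
        if mode not in MODES:
            raise ValueError(f"unknown strictness mode {mode!r}")
    return types, schema["__whole__"], strictness


def mode_for(strictness, name=None):
    """Unnamed types and types not listed default to "exact"."""
    return "exact" if name is None else strictness.get(name, "exact")
```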

An object type definition is an object whose keys are the keys of objects
that are valid against it and whose values are the corresponding types.
Alternatively, an empty object designates an object with arbitrary keys (in
effect, a dictionary).

An object type's designator can be its name (as a string), or an in-place
object type definition.  A null type's designator is the null value.
Designators for a number, string, or boolean type are specified by any
number, the empty string "", or any boolean respectively. An array type's
designator is an array with one value, a designator for the type of the
elements of the array.
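
Putting the designator rules together, a validator can be a single recursive
function.  The following Python is a sketch under stated assumptions, not a
reference implementation: the names check and validate are invented here,
a non-empty string is always read as an object type name, and the sample
schema sets "__whole__" to the name "root", which is one reading of the
intent:

```python
def is_number(v):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def check(value, designator, schema, mode="exact"):
    """Return True if value matches the type that designator stands for."""
    if designator is None:
        return value is None
    if isinstance(designator, bool):
        return isinstance(value, bool)
    if is_number(designator):
        return is_number(value)
    if isinstance(designator, str):
        if designator == "":
            return isinstance(value, str)
        # Assumption: any non-empty string names an object type.
        if designator.startswith("__") or designator not in schema:
            raise ValueError(f"unknown object type {designator!r}")
        named_mode = schema.get("__strictness__", {}).get(designator, "exact")
        return check(value, schema[designator], schema, named_mode)
    if isinstance(designator, list):
        if len(designator) != 1:
            raise ValueError("array designator must hold exactly one value")
        return isinstance(value, list) and all(
            check(v, designator[0], schema) for v in value)
    if isinstance(designator, dict):
        if not isinstance(value, dict):
            return False
        if not designator:
            return True  # empty definition: arbitrary keys and values
        expected, actual = set(designator), set(value)
        if mode == "exact" and actual != expected:
            return False
        if mode == "superset" and not expected <= actual:
            return False
        if mode == "subset" and not actual <= expected:
            return False
        # Keys outside the definition are never checked.
        return all(check(value[k], designator[k], schema)
                   for k in expected & actual)
    raise ValueError(f"not a designator: {designator!r}")


def validate(document, schema):
    return check(document, schema["__whole__"], schema)


SCHEMA = {
    "root": {"menu": "menu"},
    "menu": {"id": "", "value": "", "popup": "popup"},
    "popup": {"menuitem": ["menuitem"]},
    "menuitem": {"value": "", "onclick": ""},
    "__whole__": "root",
    "__strictness__": {"menu": "superset"},
}
DOC = {"menu": {"id": "file", "value": "File", "popup": {"menuitem": [
    {"value": "New", "onclick": "CreateNewDoc()"},
    {"value": "Open", "onclick": "OpenDoc()"},
    {"value": "Close", "onclick": "CloseDoc()"},
]}}}
```

Calling validate(DOC, SCHEMA) accepts the sample document.  Because only
"menu" is marked "superset", an extra key inside the menu object is
tolerated, while an extra key inside a menuitem object is rejected.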

The schema for schemas is not very useful: it is simply {"__whole__": {}},
because the top-level keys have unpredictable names.


On Wed, May 29, 2019 at 10:59 AM Nico Williams <nico@cryptonector.com>
wrote:

> On Tue, May 28, 2019 at 08:47:16PM -0700, Tim Bray wrote:
> > TBH, what I want from a schema system is (a) useful error messages and (b)
> > ability to drive code generation, classes and serializer/deserializers and
> > so on.
>
> This is all I want as well.  Validation and codegen.