[T2TRG] Syntax evolution for SDF (OneDM Simple Definition Format)

Carsten Bormann <cabo@tzi.org> Mon, 04 May 2020 20:47 UTC


The SDF (Simple Definition Format, [1] [2]) developed by the One Data Model liaison group is represented in JSON.

The syntax of this format is currently defined by a json-schema.org-style schema [3].  This schema has been written in such a way that future extensions to the format are accepted (the "ignore unknown" rule).
For his SDF file linter [4], Ari Keränen has developed an “alternative schema” that is stricter: after all, in a linter you want to catch misspellings, as opposed to silently accepting them as if they were future extensions.

I think that is a symptom of a problem we very often have with data definitions (“schemas”).  On the one hand, we want a precise representation of what a currently valid instance looks like, one that the majority of tools can process and that reflects the current consensus of the spec evolution.  On the other hand, we also want a more loosely defined data definition that outlines the boundaries of where we think extensions might put their data.  The latter would be the basis for “validating” an instance, while the former would be used for “linting” it.
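
To make the distinction concrete, here is a minimal sketch in Python.  The key names ("info", "objects") are made up for illustration and are not taken from the actual SDF schema; the two functions are hypothetical stand-ins for a loose validator and a strict linter.

```python
# Hypothetical top-level keys and the Python type each value should have.
# These names are illustrative only, not the real SDF vocabulary.
KNOWN_KEYS = {"info": dict, "objects": dict}

def validation_errors(instance):
    """Loose check: known keys must be well-typed, unknown keys are ignored."""
    return [
        f"{key}: expected {expected.__name__}"
        for key, expected in KNOWN_KEYS.items()
        if key in instance and not isinstance(instance[key], expected)
    ]

def lint_errors(instance):
    """Strict check: everything the loose check reports, plus unknown keys."""
    unknown = [f"{key}: unknown key" for key in instance if key not in KNOWN_KEYS]
    return validation_errors(instance) + unknown

# "objcets" is a misspelling of "objects".
doc = {"info": {}, "objcets": {}}
print(validation_errors(doc))  # the loose check does not notice the typo
print(lint_errors(doc))        # the strict check flags it
```

The loose check treats the misspelled key as a possible future extension, which is exactly why it cannot serve as a linter.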

For SDF, I’ll call the first one (for linting) the “current detailed” syntax.  The second would be the “framework” syntax, with an intent to keep this invariant until we have extensions that don’t just exercise known extension points, but actually represent larger changes.

The “alt-schema” [5] is trying to be that “current detailed” syntax.
The definition in the language definition repository is more on the side of a “framework” syntax.

Now this is a snapshot.  When we evolve the language, the “current detailed” syntax morphs into the “next detailed” syntax, where additional features have been added, with the intent of that becoming the “current detailed” with the next minor revision of the standard.  The “framework” syntax should be a superset of either specification.  But the framework will also need to change at some point, requiring both a “current framework” syntax and drafts for a “next framework” for a major revision.
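
The relationships in this paragraph can be written down as simple checks.  This is only a sketch under strong assumptions: each detailed syntax is reduced to a set of allowed keys, the framework is reduced to a pattern for admissible key names, and all names are hypothetical.

```python
import re

# Assumption: the framework admits any key that looks like a plain identifier.
FRAMEWORK_KEY = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# Made-up key sets standing in for the two detailed syntaxes.
CURRENT_DETAILED = {"info", "objects"}
NEXT_DETAILED = CURRENT_DETAILED | {"events"}  # a feature added in the draft

def within_framework(detailed_keys):
    """True if every key the detailed syntax allows also fits the framework."""
    return all(FRAMEWORK_KEY.fullmatch(key) for key in detailed_keys)

def only_adds_features(current, nxt):
    """True if the next detailed syntax keeps everything the current one allows."""
    return current <= nxt

print(within_framework(CURRENT_DETAILED), within_framework(NEXT_DETAILED))
print(only_adds_features(CURRENT_DETAILED, NEXT_DETAILED))
# A key that falls outside the pattern would need a "next framework".
print(within_framework({"x-ext"}))
```

A check like this could run whenever a "next detailed" draft changes, to signal early that a change no longer fits the current framework.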

Below is a graphic; if you don’t have SVG, here is an ASCII-art version of it.


.-----------------.              +-----------------+
|                 |              |                 |
|   Current       |              |  Next           |
|   Framework     ===============>  Framework      |
|                 |              |                 |
|                 |              |                 |
|_________________|              +-----------------+
         | \
         |  \                ==>  Evolution
         |   \               -->  Subset
         |    \
         |     \
         |      \
         |       \
         |        .------------------+
         |                            \
         v                             v
.--------+--------.              +------+----------+
|                 |              |                 |
|   Current       |              |  Next           |
|   Detailed      ===============>  Detailed       |
|                 |              |                 |
|                 |              |                 |
|_________________|              +-----------------+


One question that Ari brought up is how to maintain these syntax specs, preferably in such a way that a single input file can produce several of these outputs.
Obviously, this can be ifdefed, or one can use m4, etc.
One can also use a program to generate the definition JSON, which probably is a reasonable thing anyway for quirky, hard-to-get-right definition formats.
Best would, of course, be a data definition language that directly caters toward evolution.
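
As a rough sketch of the "use a program to generate the definition JSON" option: the Python below derives both a loose and a strict json-schema.org-style schema from one property table.  The property names and the make_schema helper are hypothetical; only the "additionalProperties" keyword is standard JSON Schema.

```python
import json

# Single source of truth; property names are made up for illustration.
PROPERTIES = {
    "info": {"type": "object"},
    "objects": {"type": "object"},
}

def make_schema(strict):
    """Build a schema; strict=True rejects keys not listed in PROPERTIES."""
    return {
        "type": "object",
        "properties": PROPERTIES,
        "additionalProperties": not strict,
    }

framework_schema = make_schema(strict=False)  # accepts unknown keys
detailed_schema = make_schema(strict=True)    # rejects unknown keys

print(json.dumps(detailed_schema, indent=2))
```

A real generator would have to apply the same switch at every nesting level, not just at the top, which is where a hand-maintained pair of schemas tends to drift apart.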

Any suggestions on how we should handle this issue?

Grüße, Carsten


[1]: https://github.com/one-data-model/language
[2]: https://www.youtube.com/watch?v=sTrqa5jYVKo (video of a brief tutorial)
[3]: https://github.com/one-data-model/language/blob/master/sdf-schema.json
[4]: https://github.com/EricssonResearch/ipso-odm/blob/master/odmlint/odmlint.js
[5]: https://github.com/EricssonResearch/ipso-odm/blob/master/sdf-alt-schema.json