Re: [Json] Nudging the English-language vs. formalisms discussion forward

"Pete Cordell" <petejson@codalogic.com> Wed, 19 February 2014 22:29 UTC

Message-ID: <357740A8AA0F4316BE630917321FAB4D@codalogic>
From: "Pete Cordell" <petejson@codalogic.com>
To: "Carsten Bormann" <cabo@tzi.org>, "JSON WG" <json@ietf.org>
References: <C87F9B96-E028-4F0E-A950-B39D3F68FFE7@vpnc.org> <CAMm+LwhUh_yN-hzaoDWfrO_H2iGvYvj99BCE4EcYmgqCPqXoVQ@mail.gmail.com> <CAHBU6itpttXBfVQGKw=u==k_XSdrht81+m_YDNZP6RM+=9CNow@mail.gmail.com> <CAK3OfOjHkBFOzJSx=bhhoQJ8Z2bWyEXK52dNyYGWVb9FAj99ow@mail.gmail.com> <CAHBU6itzQ0rzU3EUYUqzm2qhx03qk1mpx2sehS_zeiw1ypcEgw@mail.gmail.com> <CAK3OfOhfjkbq6eREkt=MBVL1C9ubh-6My3Lvg-mnOxD0+cpN1Q@mail.gmail.com> <CAHBU6isZbew8O1HJ+XcFsMCR42iDoO_uemPXVwa3=vM5A=MngA@mail.gmail.com> <CAK3OfOgmVsNJqrqCfsD7h37axssOoaX3DGHqO=bTn5bWrA+MFA@mail.gmail.com> <A4B53816-6FBF-4A37-8BC9-F0A9D0867BCD@tzi.org>
Date: Wed, 19 Feb 2014 22:29:32 -0000
Archived-At: http://mailarchive.ietf.org/arch/msg/json/63buFQWqMAsVI-06dCIbZvMnuy4
Subject: Re: [Json] Nudging the English-language vs. formalisms discussion forward

I agree with Carsten's suggestion of looking for something to do the 80%. 
The JSON group's biggest blessing might be that it dissolves before it can 
add sufficient esoterica to the schema to make it useless.

Things I've learnt while working with schemas:

- A schema needs to be explicit about which construct (or constructs) serves 
as the top-level construct,

- Having a 'schema' processor miraculously import stuff over the network is 
hard,

- Being able to extend a schema via a third-party document is important 
(especially for the IETF).

Also, cardinality, integer ranges and string lengths are 80% of the 
constraints a schema needs on types.
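
To make those three constraint kinds concrete, here is a minimal Python sketch of what a checker for each might look like. The function names and parameters are hypothetical, chosen only for illustration; they are not part of any proposal on the list.

```python
def check_string_length(value, min_len, max_len):
    """Return True if value is a string of min_len..max_len characters."""
    return isinstance(value, str) and min_len <= len(value) <= max_len


def check_int_range(value, low, high, low_exclusive=False, high_exclusive=False):
    """Return True if value is an integer inside the given range."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    low_ok = value > low if low_exclusive else value >= low
    high_ok = value < high if high_exclusive else value <= high
    return low_ok and high_ok


def check_cardinality(count, min_count, max_count=None):
    """Return True if count is within min_count..max_count.

    max_count=None is taken to mean "unbounded" (an assumption of this sketch).
    """
    if count < min_count:
        return False
    return max_count is None or count <= max_count
```

A validator would call the first two on each value and the third on the number of occurrences of a member.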

I thought I'd throw down a gauntlet as a baseline for something to be beaten. 
So I started with something C-like, stole stuff from other languages and 
came up with the following:

import "something.jsc"
    // import just tells the schema processor to issue a warning if
    // the specified filename does not appear on the command-line.
    // It's not a directive to find the specified file.

namespace com.ietf.person;
    // Namespaces are handy for composing vocabularies across
    // multiple domains.  To switch to another namespace, output
    // a new namespace directive.

start struct com.ietf.person
    // The 'start' keyword marks one or more root constructs
    // struct maps to an object of unordered members
{
    string<1..255> name;
        // Constraints on a type are in angled brackets.  In the case
        // of a string it is the string length, here 1 to 255 characters.
    string<1..255>[?] alias;
        // Cardinality is expressed in square brackets.  The Kleene
        // operators are used for the common cases.  More complex
        // cases might be [5], [1..2], [1..*]
    ShortString[?] maidenName;
    int<!0..!65536>[1..10] scores;
        // Constraints on integers are the number range.  Inclusive
        // range might be <0..65535>, exclusive ranges <!0..!10>
    bool isEligible;
    Car[1..*] cars;
    House[*] houses;
    Sport[*] sports;
    com.ietf.other.job job;
        // Referring to a type in another namespace
};

string<1..255> ShortString;
    // Defining a simple type

struct Car
    // Defining a compound type
{
    string make;
    string model;
    int<1900..9999> year;
};

struct House
{
    string name;
    string street;
    string town;
    string country;
};

union Sport
{
    null track;
    null racket;
    null water;
    null motor;
};

// Allow extensions that can plug components into
// types defined elsewhere
plug into Car, House
{
    float[?] cost;
};

plug into Sport
{
    null insane;
};

// If required to return to the global namespace:
namespace ;
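
As a rough illustration of how little machinery the member syntax above needs, here is a Python sketch that parses a single member declaration into its type, range constraint, cardinality and name. It is only a sketch under assumptions of mine: the names DECL, parse_cardinality and parse_member are made up, a missing cardinality is assumed to mean exactly one, and "+" is assumed to be accepted alongside the "?" and "*" shown above.

```python
import re

# One member line: type, optional <lo..hi> constraint, optional [cardinality], name.
DECL = re.compile(
    r"^\s*(?P<type>[A-Za-z_][\w.]*)"
    r"(?:<(?P<lo_ex>!?)(?P<lo>-?\d+)\.\.(?P<hi_ex>!?)(?P<hi>-?\d+)>)?"
    r"(?:\[(?P<card>[^\]]+)\])?"
    r"\s+(?P<name>[A-Za-z_]\w*)\s*;\s*$"
)


def parse_cardinality(text):
    """Map a cardinality expression to (min, max); max=None means unbounded."""
    if text is None:
        return (1, 1)  # assumption: no brackets means exactly one
    shorthand = {"?": (0, 1), "*": (0, None), "+": (1, None)}
    if text in shorthand:
        return shorthand[text]
    if ".." in text:
        low, high = text.split("..", 1)
        return (int(low), None if high == "*" else int(high))
    count = int(text)
    return (count, count)


def parse_member(line):
    """Parse a line such as 'int<!0..!65536>[1..10] scores;' into a dict."""
    match = DECL.match(line)
    if match is None:
        raise ValueError(f"not a member declaration: {line!r}")
    constraint = None
    if match.group("lo") is not None:
        constraint = {
            "low": int(match.group("lo")),
            "low_exclusive": match.group("lo_ex") == "!",
            "high": int(match.group("hi")),
            "high_exclusive": match.group("hi_ex") == "!",
        }
    return {
        "type": match.group("type"),
        "constraint": constraint,
        "cardinality": parse_cardinality(match.group("card")),
        "name": match.group("name"),
    }
```

For example, parse_member("Car[1..*] cars;") gives type "Car", no constraint, and cardinality (1, None). Struct, union and plug blocks would need a real parser on top of this; the regex only covers single member lines.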

In many respects I think the problem we will have is not that the problem is 
too hard, but that it is too easy and we all have an opinion on it, making 
it hard to come to agreement!

Pete Cordell
Codalogic Ltd
C++ tools for C++ programmers, http://codalogic.com
Read & write XML in C++, http://www.xml2cpp.com
----- Original Message ----- 
From: "Carsten Bormann" <cabo@tzi.org>
To: "JSON WG" <json@ietf.org>
Sent: Wednesday, February 19, 2014 8:02 PM
Subject: Re: [Json] Nudging the English-language vs. formalisms discussion 
forward


At the danger of repeating myself and others here, I’ll try to summarize:

— We have quite good experience with using a single, standard (!) ABNF in 
IETF protocols.
  ABNF is a production system with well-understood semantics.
  It is somewhat idiosyncratic in the world of EBNF, but that has caused 
*zero* problems in practice.
  (People have just written ABNF parsers.  You still have the second half of 
the afternoon to do something else after that.)

— A production system that generates JSON (at the data model level) is easy 
to do.
  (But we have to find people who have the background and can commit the 
time.)

— Relax NG compact is a nice existence proof that a production system like 
this can work and be highly usable.
  A JSON version could be even simpler.

— Previous attempts at trying to express XML or JSON data model syntax in 
the formalism of XML or JSON itself provide incontrovertible proof that this 
approach does not work.
  Don’t do that then.
  (It is so much easier to spend half the afternoon writing the parser for 
the workable syntax.
   We can even spec it in ABNF!)

— If you want to cover 100 % of the “syntax” of an XML document, you need to 
add Schematron to Relax NG compact.
  a) Don’t do that then for JSON — stay with an 80 % solution like Relax NG 
(which is more like 90 %) or maybe even simpler.
  b) Choose some form of “Jpath” to complement the production system with 
assertions.

(Note that choice b can always be added later, on a separate timeline from 
doing a production system.)

My proposals following from this little exercise in fact finding:

— I would propose that we try to find energy for doing the production 
system.
(In addition to the Web people, we may want to tap the YANG people for some 
recent experience in doing this kind of work.)

— I would propose that we stay open to adding a Jpath/Schematron approach 
(choice “b”), *if* somebody brings a credible one to the table.

— I also would propose that we collect a small number of benchmarks that we 
use for demonstrating that proposals for the production system are useful.
  My first suggestion: RFC 7071.  But we need a couple of different ones. 
RFC 7095?  Maybe too ambitious.

Grüße, Carsten
