Re: [Json] Another problematic JSON Schema use-case

Phillip Hallam-Baker <ietf@hallambaker.com> Thu, 26 May 2016 17:56 UTC

In-Reply-To: <20160526173719.GA19074@mercury.ccil.org>
References: <CAMm+Lwg2rWh0_gjXnSAEAvWtsMO1U3UiA8jsBzc+rRR6fiKcJg@mail.gmail.com> <20160526173719.GA19074@mercury.ccil.org>
Date: Thu, 26 May 2016 13:56:36 -0400
X-Google-Sender-Auth: Ic9C6oFjIrvgdRMxDO1gzcqYui0
Message-ID: <CAMm+Lwgzdw5gBdBvmA6aHQ55mcHXnQzabrfdA4DObtyReauuWA@mail.gmail.com>
From: Phillip Hallam-Baker <ietf@hallambaker.com>
To: John Cowan <cowan@mercury.ccil.org>
Content-Type: multipart/alternative; boundary="94eb2c05cebcecbff90533c2827c"
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/ph-KesVrbG3O8h2NYnbYvK_g6vo>
Cc: Tim Bray <tbray@textuality.com>, "json@ietf.org" <json@ietf.org>, Austin William Wright <aaa@bzfx.net>, Andrew Newton <andy@hxr.us>
Subject: Re: [Json] Another problematic JSON Schema use-case

On Thu, May 26, 2016 at 1:37 PM, John Cowan <cowan@mercury.ccil.org> wrote:

> Phillip Hallam-Baker scripsit:
>
> > I have some very strong opinions on the advantages of strong typing.
>
> Even if you accept the principle of strong typing, that is not conclusive
> as to how many atomic types you have.  Most programming languages have
> only one string type, or at most have a family of types varying only by
> length or maximum length.  But it would be possible to declare the type
> of a string using a regular expression, and Cobol approximates this with
> its PICTURE types.
>
> On the other hand, statically typed languages typically have multiple
> types of numbers: C has 14, Java has 7, Pascal has 2, and Algol 68 has
> a denumerably infinite number (short int, short short int, short short
> short int, etc.) all statically distinct, though normally only a finite
> number of distinct run-time representations.
>
> If we stick to JSON's mutually exclusive atomic types of null, boolean,
> number, string, we can devise a very simple schema language for static
> typing only.  Most of the complexity comes from wanting to provide more
> complex static types such as integer vs. non-integer, exact vs. inexact,
> or highly constrained strings.
>

I don't really consider different types of integer to be different types.
They are different representations of whole numbers; they are all
intrinsically the same type.

There is a similar error in Pascal, which in the ANSI standard has the
ludicrous insistence that int [2] is a different type to int [3]. And there
is no cast operator. Small wonder people used to refer to it as being
'Wirthless'.

Yes, real numbers and integers are probably things you want to be able to
treat differently in code because it is very very rare to need or want a
real64 number in a protocol.

I have no problem with limiting the intrinsic types to the set given in
JSON. What I mean by strong typing is that when I convert a JSON data
stream I convert it to a data model that has classes and structures that
are in the application data model. So I write code of the form:

Profile.Applications.Add (MailApplication);

Rather than something like

Tree.Append (Profile, "Profile.Applications", MailApplication.Tree());


Yes, if you use a fully dynamically bound language, you can write code that
looks like the first example. But what you don't get in that situation is
static type checking on your code. My code will throw a compiler error if I
try to add a DeviceProfile to a slot typed as a list of ApplicationProfile.
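
As a rough illustration of the contrast above, here is a minimal TypeScript
sketch. The type names echo the ones in this message, but their fields, the
decodeProfile and appendToTree helpers, and the sample JSON are all
assumptions made up for the example; this is not the code generator or data
model referred to elsewhere in this thread.

```typescript
// Hypothetical data model: the type names echo the message above, but the
// fields are assumptions made for illustration.
interface ApplicationProfile {
  kind: "application";
  name: string;
}

interface DeviceProfile {
  kind: "device";
  deviceId: string;
}

interface Profile {
  applications: ApplicationProfile[];
}

// Convert a JSON text into the typed model, rejecting input that does not fit.
function decodeProfile(text: string): Profile {
  const raw: unknown = JSON.parse(text);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("profile must be a JSON object");
  }
  const list = (raw as { applications?: unknown }).applications;
  if (!Array.isArray(list)) {
    throw new Error("applications must be a JSON array");
  }
  const applications = list.map((item: unknown): ApplicationProfile => {
    if (typeof item !== "object" || item === null) {
      throw new Error("each application must be a JSON object");
    }
    const label = (item as { name?: unknown }).name;
    if (typeof label !== "string") {
      throw new Error("application name must be a string");
    }
    return { kind: "application", name: label };
  });
  return { applications: applications };
}

// Untyped alternative: a generic tree addressed by a string path.
type Tree = { [path: string]: unknown[] };

function appendToTree(tree: Tree, path: string, node: unknown): void {
  const nodes = tree[path] ?? [];
  nodes.push(node);
  tree[path] = nodes;
}

const profile = decodeProfile('{"applications": [{"name": "calendar"}]}');
const mail: ApplicationProfile = { kind: "application", name: "mail" };
const device: DeviceProfile = { kind: "device", deviceId: "phone-1" };

// Typed model: the compiler knows what belongs in this list.
profile.applications.push(mail);
// profile.applications.push(device);
// The line above, if uncommented, should be rejected at compile time,
// since a DeviceProfile is not assignable to ApplicationProfile.

// Generic tree: the same mistake compiles, because every node is `unknown`.
const tree: Tree = {};
appendToTree(tree, "Profile.Applications", device);
```

The design point is that the check moves from run time to compile time: the
tree version can only discover the misplaced DeviceProfile when some later
code inspects the node.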



> ===========
>
> Here's a sketch of a simple schema-by-example language:
>

Mostly agree as a starting point. But I think that you absolutely want to
be able to put binary blobs of data in a stream and distinguish them from
strings.

I have a spec, running code, and apps written in it.


>
> Open questions:
>
> 1) What about a JSON object with fixed keys that also allows arbitrary
> additional keys?
>

That is an interesting one because it comes down to how you want
extensibility to work. In particular, when do you want adding things to be
backwards compatible and when do you want to cause things to halt rather
than have an application try to act on data it did not fully understand?
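
To make the two behaviours concrete, here is a small TypeScript sketch of a
receiver that either skips unrecognised keys or halts on them. The policy
names "ignore" and "reject", the checkKeys helper, and the sample message are
hypothetical; nothing here is taken from a schema proposal in this thread.

```typescript
// Hypothetical policy names; neither comes from the thread.
type UnknownKeyPolicy = "ignore" | "reject";

// Return the keys the receiver does not recognise, or throw under "reject".
function checkKeys(
  obj: { [key: string]: unknown },
  knownKeys: string[],
  policy: UnknownKeyPolicy
): string[] {
  const extras = Object.keys(obj).filter((key) => knownKeys.indexOf(key) === -1);
  if (extras.length > 0 && policy === "reject") {
    throw new Error("unrecognised keys: " + extras.join(", "));
  }
  return extras;
}

const KNOWN_KEYS = ["name", "port"];
const incoming: { [key: string]: unknown } = JSON.parse(
  '{"name": "mail", "port": 25, "priority": 3}'
);

// Lenient receiver: the extra key is skipped, so an older implementation
// keeps working when a sender adds fields.
const skipped = checkKeys(incoming, KNOWN_KEYS, "ignore"); // ["priority"]

// Strict receiver: the same message halts processing instead.
let halted = false;
try {
  checkKeys(incoming, KNOWN_KEYS, "reject");
} catch (err) {
  halted = true;
}
```

Which policy applies is a per-object (or per-key) protocol decision, which is
why a schema language probably needs a way to state it.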



> 2) What about null?  It can be treated as the sole inhabitant of a unique
> atomic type encoded by a JSON null, or as something not mentioned in
> schemas that is always allowed in place of any JSON value (like SQL null),
> or as a replacement for arrays and objects only.
>

This is actually a little problematic, because in JSON any slot can hold
null, while many languages don't allow integers or booleans to be null.

C# has nullable types for integers and booleans. But lots of languages
don't. I am very wary of writing protocol logic that would depend on such a
distinction.
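
One way to avoid depending on that distinction, sketched in TypeScript under
the assumption that a JSON null and a missing key are both read as "no
value". The readOptionalInteger helper and the "port" field are hypothetical,
and this convention is only one of the options John lists, not a settled
answer.

```typescript
// Read an optional integer field. Under this (assumed) convention a JSON
// null and a missing key both mean "no value".
function readOptionalInteger(
  obj: { [key: string]: unknown },
  key: string
): number | undefined {
  const value = obj[key];
  if (value === undefined || value === null) {
    return undefined;
  }
  if (typeof value !== "number" || !isFinite(value) || Math.floor(value) !== value) {
    throw new Error(key + " must be an integer");
  }
  return value;
}

const withNull: { [key: string]: unknown } = JSON.parse('{"port": null}');
const withoutKey: { [key: string]: unknown } = JSON.parse("{}");
const withValue: { [key: string]: unknown } = JSON.parse('{"port": 25}');

const portFromNull = readOptionalInteger(withNull, "port"); // undefined
const portFromMissing = readOptionalInteger(withoutKey, "port"); // undefined
const portFromValue = readOptionalInteger(withValue, "port"); // 25
```

Because null and absent collapse to the same result, the protocol logic never
needs a nullable integer in the target language.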