Re: [Json] Nudging the English-language vs. formalisms discussion forward

Nico Williams <> Wed, 19 February 2014 23:15 UTC

Date: Wed, 19 Feb 2014 17:15:08 -0600
Message-ID: <>
From: Nico Williams <>
To: Andrew Newton <>
Content-Type: text/plain; charset=UTF-8
Cc: Phillip Hallam-Baker <>, Tim Bray <>, Paul Hoffman <>, JSON WG <>
Subject: Re: [Json] Nudging the English-language vs. formalisms discussion forward

On Wed, Feb 19, 2014 at 4:23 PM, Andrew Newton <> wrote:
> On Wed, Feb 19, 2014 at 12:30 PM, Tim Bray <> wrote:
>> I think clear English prose is *essential*, the one thing a specification
>> must have. Thus, schemas can be actively harmful if arguing over them
>> distracts attention from crafting the prose properly.  This is particularly
>> the case when the schema language is a flawed tool, which so many of them
>> are.
> I agree, clear English prose is essential. At the moment, I am
> evaluating two competing security protocols, both open standards
> specified with XML Schema of which one is the product of an IETF
> working group. The IETF standard is much clearer to understand because
> it offers prose on top of the XSD. I cannot help but think that the
> reason the IETF standard stands out is simply because it is an IETF
> standard; along the way to RFC it was reviewed and reviewed and people
> simply would not have let it pass had it been only an XSD. Therefore I
> do not think we need to worry about IETF specifications being harmed
> by schemas.

So you agree that prose is needed (no one here yet disagrees), but you
don't think schemas are harmful because we generally require prose
(which we do) and that's good.  Good!

We do need prose-mostly descriptions of protocols.  We need formal
languages to avoid accidents and to convey concisely and precisely
things that are difficult to express in prose (in any natural language).
Prose is needed for semantics -- formal alternatives for that (e.g.,
SDL) so far haven't worked well.

What I want to avoid:

 - TLS-style tool-less inconsistent ad-hoc syntaxes
 - SSHv2-style tool-less inconsistent ad-hoc syntaxes
 - 100% prose-only (e.g., SASL)

I also do not want an ASN.1 where the only ways to implement are either
a) to spend years building tools first, or b) to code manually and ad hoc
from a syntax that's full of nuances (which leads to accidents).

Whatever schema(s) we go with has to be simple enough that ad-hoc
manual coding off of the syntax is less likely to cause accidents than
ad-hoc manual coding off of prose, while also being usable with
automatic tooling.

Since we're talking about JSON we don't need to worry about silly
warts like DER/BER/CER tagging.  No need to mention such nastiness :)

Things I'd consider, and maybe propose:

 - just pattern-matching validation rules (e.g., using jq or another
similar language that can do pattern matching);

 - a schema with describe-by-example-mostly metaschema, with special
names denoting "types" defined separately, something like:

    "message": { "sender": "_sender_type", "receiver": "_receiver_type",
                 "payload": "_json_string" },
    "sender": { ... },

 - a schema like compact RelaxNG that can be parsed and converted into
one of the above;

or anything else that's relatively simple and from which code can be
generated (or which can be interpreted) to at least do validation (for
testing, not necessarily at run-time in production), and preferably
more (e.g., RPC-like stubs, programming language types, ...).
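
To make the describe-by-example option concrete, here is a minimal Python
sketch of what validation against such a schema might look like.  Nearly
everything in it is an assumption made for illustration, not part of any
proposal in this thread: the convention that a string starting with "_"
names a type, the field layouts given for _sender_type and _receiver_type,
the _json_number built-in, the policy of ignoring unknown members, and the
validate() function itself are all hypothetical.

```python
import json

# Hypothetical schema in the describe-by-example style: a template is a JSON
# value whose shape mirrors the instance.  A string starting with "_" is
# assumed to name a type that is either built in or defined in the schema.
SCHEMA = {
    "message": {
        "sender": "_sender_type",
        "receiver": "_receiver_type",
        "payload": "_json_string",
    },
    # The field layouts below are invented for the example.
    "_sender_type": {"name": "_json_string"},
    "_receiver_type": {"name": "_json_string"},
}

# Built-in type names mapped to predicates (assumed names, not standardized).
BUILTINS = {
    "_json_string": lambda v: isinstance(v, str),
    "_json_number": lambda v: isinstance(v, (int, float))
    and not isinstance(v, bool),
}


def validate(value, template, schema=SCHEMA, path="$"):
    """Return a list of error strings; an empty list means the value matches."""
    if isinstance(template, str) and template.startswith("_"):
        if template in BUILTINS:
            ok = BUILTINS[template](value)
            return [] if ok else [f"{path}: expected {template}"]
        if template in schema:
            # Named type: validate against its own template.
            return validate(value, schema[template], schema, path)
        return [f"{path}: unknown type {template}"]
    if isinstance(template, dict):
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        errors = []
        for key, sub in template.items():
            if key not in value:
                errors.append(f"{path}.{key}: missing")
            else:
                errors.extend(validate(value[key], sub, schema, f"{path}.{key}"))
        # Members not named in the template are ignored in this sketch.
        return errors
    # Any other template value is treated as a literal that must match exactly.
    return [] if value == template else [f"{path}: expected {template!r}"]


good = json.loads(
    '{"sender": {"name": "alice"}, "receiver": {"name": "bob"}, "payload": "hi"}'
)
bad = json.loads('{"sender": {"name": 7}, "payload": "hi"}')

print(validate(good, SCHEMA["message"]))  # []
print(validate(bad, SCHEMA["message"]))
```

The point of the sketch is only that a schema of this shape is simple
enough to interpret in a few dozen lines, which is the property asked for
above; generating stubs or language types from it would be a separate
exercise.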