Re: [apps-discuss] JSON Schema considered harmful

Nico Williams <nico@cryptonector.com> Thu, 20 September 2012 16:29 UTC

On Thu, Sep 20, 2012 at 8:52 AM, Phillip Hallam-Baker <hallam@gmail.com> wrote:
> I found using a schema helpful in writing the SAML spec but I didn't use XML
> schema. I used a tool and generated the XML Schema and the text from my own
> schema language. That addressed the problem of the spec and schema being out
> of sync which was a real bear as the SAML TC made major changes to the
> schema, adding and dropping items at each con-call

OK, we're getting to the meat of the complaint.  It's not about how or
who but about failures of the end product (XML Schema, JSON Schema).

> What I think we need for the IETF is a tool to help document protocols. Now
> that might be a schema done right or it might be a protocol description
> tool.

I agree.  I don't know that this is a realistic expectation (but see
below), and in particular I'm skeptical that enough code can be
generated from such a tool (as compared to merely data
encoding/decoding) that it would be anywhere near as useful as data
description languages have been.  But I'd like to be proven wrong!
(Or even to prove myself wrong.)

The ITU-T would say that SDL should be it, no doubt :), but SDL is
weirdly incompatible with ASN.1 (SDL is case-insensitive(!) while
ASN.1 is case-sensitive), and anyway, I don't think I care for SDL.

That said, and as has been demonstrated before, there's a large degree
of duality between the various data description languages that we
have.  Thus we have XER (XML Encoding Rules for ASN.1), and
Fast Infoset (PER encoding applied to XML) and so on.  And I think the
proposition that XDR is a subset of ASN.1/PER with 4-octet alignment
and non-packed encoding of optional/defaulted fields is defensible.
Therefore I think at least a unified data description language is
possible.
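
To make the duality concrete, here is a minimal Python sketch.  It
assumes a made-up field-list description (RECORD, encode_json and
encode_xdr are names invented for illustration, not any existing
tool).  One description drives both a JSON encoding and an XDR-style
encoding (4-octet big-endian integers, length-prefixed strings padded
to 4 octets, and a 4-octet presence flag before each optional field).

```python
import json
import struct

# Hypothetical abstract description: (field name, type, optional?)
RECORD = [
    ("id", "int", False),
    ("name", "string", False),
    ("note", "string", True),
]

def _check(value, schema):
    # Required fields must be present in every encoding.
    for name, _kind, optional in schema:
        if value.get(name) is None and not optional:
            raise ValueError(f"missing required field: {name}")

def encode_json(value, schema=RECORD):
    """Encode as JSON text, omitting absent optional fields."""
    _check(value, schema)
    out = {n: value[n] for n, _k, _o in schema if value.get(n) is not None}
    return json.dumps(out, sort_keys=True)

def _xdr_item(kind, item):
    if kind == "int":
        return struct.pack(">i", item)           # 4-octet big-endian
    if kind == "string":
        data = item.encode("utf-8")
        pad = (-len(data)) % 4                   # pad to a 4-octet boundary
        return struct.pack(">I", len(data)) + data + b"\x00" * pad
    raise ValueError(f"unsupported type: {kind}")

def encode_xdr(value, schema=RECORD):
    """Encode XDR-style: optional fields get a 4-octet presence flag."""
    _check(value, schema)
    out = b""
    for name, kind, optional in schema:
        item = value.get(name)
        if optional:
            out += struct.pack(">I", 0 if item is None else 1)
            if item is None:
                continue
        out += _xdr_item(kind, item)
    return out
```

The point is only that the description is written once and the
encoding rules are swapped underneath it; a real tool would also need
decoders and a much richer type system.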

What is difficult to manage is protocol flows, particularly when there
is cryptography involved.  We might be able to use any of various
high-level programming languages (e.g., Haskell, Python, Scheme, ...)
in combination with a) a generic data description language and b) some
functions left formally undefined (specified in English-language prose
rather than in a formal language).  (I would prefer to use a LISP or Scheme with
a powerful macro language so as to make it easier to generate code
from the specification by just defining a suitable set of macros.)

Maybe we could experiment with this?

> Extensibility is certainly problematic. I have seen pretty much every WG
> using XML end up in rat hole after rat hole as people try to consider what v
> 1.1 of the spec might look like. XML Schema looks like it should answer that
> question but it does not. And not many other tools do so either.
>
> This problem crops up in a JSON protocol as well. Should unknown tags be
> ignored or cause an error? Should there be a mechanism that allows
> intentional breaking of backwards compatibility in cases where doing so may
> cause applications to do the wrong thing?

This shows up in ASN.1 as well.  ASN.1 lets you say that extensions
"go here" and are to be ignored by decoders that don't expect them,
but, so what, it's not enough.  We always end up having to signal or
negotiate the use of extensions, and I don't find that to be
particularly problematic.  Perhaps XML Schema is particularly
disastrous w.r.t. extensibility?  But I doubt it.  Instead I suspect
that some wish that extensibility were simpler and more automatic, but
I'm afraid that it simply cannot be.

> Even the terminology can be disastrous. There is a feature in X.509v3 that
> is designed to allow a certificate to say 'if you don't understand and
> process this extension then reject this certificate'. The original intention
> was to allow extensions to specify new revocation/status checking schemes.
> RPs that did not understand the scheme could not use the certificate.
>
> Unfortunately that feature was called 'critical' which also means
> 'important'. And so people started using 'critical' to mean 'I think this is
> important' and not 'I think this is so important that you should break
> backwards compatibility'. This is why I called the same feature 'Conditions'
> in the SAML specification and disguised it so that it didn't look like a
> criticality flag. But Conditions and Criticality are the exact same thing.
> The SAML assertion structure began life as part of a design to re-do X.509
> in XML which is why it had to have some equivalent of criticality.

Criticality is a big deal.  We don't want to render extant
implementations vulnerable when we extend a protocol, but we don't
want to break interop with them either, and sometimes you just cannot
have both of those.  It mostly behooves us to get things like PKI
right from day zero, and yet that also is wishful thinking.  I wish I
had an actual, better answer here; I don't.
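
For reference, the rule being discussed is small enough to sketch.
This is a minimal Python illustration of the reject-if-critical-and-
unknown behaviour described in the quoted text; the Extension class,
the KNOWN set and the accept() function are hypothetical names, not
taken from any PKI library.

```python
from dataclasses import dataclass

@dataclass
class Extension:
    ext_id: str
    critical: bool
    value: bytes = b""

# Hypothetical set of extensions this relying party understands.
KNOWN = {"keyUsage", "basicConstraints"}

def accept(extensions, known=KNOWN):
    """Return (ok, reason) for a list of extensions."""
    for ext in extensions:
        if ext.ext_id in known:
            continue  # understood: real code would process it here
        if ext.critical:
            # Not understood and flagged critical: must reject.
            return False, f"unrecognized critical extension: {ext.ext_id}"
        # Not understood and not critical: safe to ignore.
    return True, "ok"
```

Note that the flag only says what an old implementation must do; it
does nothing to stop an issuer from setting it merely because an
extension seems "important", which is the misuse described above.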

> What we need in my view is a way to identify the places where additional
> items can be added into protocol messages and a default that additional
> items are prohibited everywhere else. So in my Simple JSON Schema I would
> represent a SAML-assertion-like thing something like:

OK, like ASN.1 extensibility markers then.  (Really, we only keep
re-inventing the wheel :)  Maybe we need JER -- JSON Encoding Rules
for ASN.1 :)
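
A minimal Python sketch of that default-closed rule, assuming a
made-up schema shape in which each object lists its known members and
an "extensible" flag marks the places where additions are allowed.
ASSERTION and validate() are hypothetical, this is not the Simple JSON
Schema from the quoted message, and required members are deliberately
not checked.

```python
# Hypothetical schema: known members plus an explicit extension marker.
ASSERTION = {
    "extensible": False,                 # additions prohibited by default
    "fields": {
        "issuer": str,
        "subject": str,
        "conditions": {
            "extensible": True,          # "extensions go here"
            "fields": {"notAfter": str},
        },
    },
}

def validate(obj, schema, path="$"):
    """Return a list of error strings; an empty list means acceptable."""
    if not isinstance(obj, dict):
        return [f"{path}: expected object"]
    errors = []
    for key, val in obj.items():
        spec = schema["fields"].get(key)
        if spec is None:
            # Unknown member: allowed only at a marked extension point.
            if not schema.get("extensible", False):
                errors.append(f"{path}.{key}: unknown member not allowed here")
            continue
        if isinstance(spec, dict):
            errors.extend(validate(val, spec, f"{path}.{key}"))
        elif not isinstance(val, spec):
            errors.append(f"{path}.{key}: expected {spec.__name__}")
    return errors
```

With this, an unknown member inside "conditions" is silently accepted,
while the same member at the top level is an error.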

> The Any type is an intrinsic type that refers to an object that is tagged
> with the object type. So an authentication assertion would be tagged with
> "authentication", an authorization assertion with "authorization", and so on.

So, like typed holes in ASN.1 (including the Information Object Set, a
very difficult-to-use syntax extension to ASN.1 that adds very useful
formalism).
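
A typed hole of that kind can be sketched in a few lines of Python.
The wire shape here is an assumption (a JSON object with exactly one
member whose name is the type tag), as are the register() decorator,
the DECODERS table and the member names "method", "resource" and
"action".

```python
import json

# Hypothetical registry mapping a type tag to its decoder.
DECODERS = {}

def register(tag):
    def wrap(fn):
        DECODERS[tag] = fn
        return fn
    return wrap

@register("authentication")
def decode_authentication(body):
    return ("authentication", body["method"])

@register("authorization")
def decode_authorization(body):
    return ("authorization", body["resource"], body["action"])

def decode_any(text):
    """Decode a value of the Any type: {"<type tag>": {...body...}}."""
    obj = json.loads(text)
    if not isinstance(obj, dict) or len(obj) != 1:
        raise ValueError("expected an object with exactly one type tag")
    tag, body = next(iter(obj.items()))
    if tag not in DECODERS:
        raise ValueError(f"unknown type tag: {tag}")
    return DECODERS[tag](body)
```

The registry plays the role that an Information Object Set plays in
ASN.1: it is the list of types allowed to fill the hole.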

Nico
--