Re: If not JSON, what then ?

Carsten Bormann <cabo@tzi.org> Tue, 02 August 2016 13:56 UTC

Message-ID: <57A0A585.4060402@tzi.org>
Date: Tue, 02 Aug 2016 15:52:05 +0200
From: Carsten Bormann <cabo@tzi.org>
User-Agent: Postbox 4.0.8 (Macintosh/20151105)
MIME-Version: 1.0
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
CC: HTTP Working Group <ietf-http-wg@w3.org>, draft-greevenbosch-appsawg-cbor-cddl@ietf.org
References: <77778.1470037414@critter.freebsd.dk>
In-Reply-To: <77778.1470037414@critter.freebsd.dk>
Subject: Re: If not JSON, what then ?
Archived-At: <http://www.w3.org/mid/57A0A585.4060402@tzi.org>

I don't know when Poul-Henning will catch a train the next time...

But I was interested in whether CDDL can be used as a specification
language here, because it is always good to look at more use cases.
So let's see (mostly untested at this point because I have no idea
whether anybody wants to use CDDL in this context, but examples should
work with the CDDL tool today):

> Schemas
> =======
> 
> There needs to be an "ABNF"-parallel to specify what is mandatory and
> allowed for these headers in "common structure".

Indeed, CDDL is essentially ABNF ported to tree grammars.

The top-level data model of the proposed format could be expressed as:

header-value = [* dict-element]
dict-element = [name, value-map]
name = text                          ; possibly restricted
value-map = {* name => value}        ; empty by default
value = text / bytes / number / time / value-map
                                     ; add as needed

> Ideally this should be in machine-readable format, so that
> validation tools and parser-code can be produced without
> (too much) human intervention.  I'm tempted to say we should
> make the schemas JSON, but then we need to write JSON schemas
> for our schemas :-/

For -09 of the CDDL draft, we are discussing adding a separate
machine-readable (JSON) encoding for use by tools, in addition to the
human-readable format meant for humans.  (There is no intention to make
the two the same; that would be a classic mistake.)

> Since schemas basically restrict what you are allowed to
> express, we need to examine and think about what restrictions
> we want to be able to impose, before we design the schema.
> 
> This is the least thought about part of this document, since
> the train is now in Lund:

OK, let's see what restrictions CDDL offers today (they are called
"annotations" there, a not-so-bright name that is due to be changed):

> Unicode strings:
> ----------------
> 
> * Limit by (UTF-8) encoded length.
> 	Ie: a resource restriction, not a typographical restriction.

That would be .size:

dns-label = text .size (1..63)

> * Limit by codepoints
> 	Example: Allow only "0-9" and "a-f"
> 	The specification of code-points should be list of codepoint
> 	ranges.  (Ascii strings could be defined this way)

Today this is generally done via regexps.
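
For the "0-9" and "a-f" example, a rough and untested sketch, assuming
the regexp has to match the whole string (hex-lower is a made-up name):

```cddl
hex-lower = text .regexp "[0-9a-f]*"   ; only "0-9" and "a-f", may be empty
```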

> * Limit by allowed strings
> 	ie: Allow only "North", "South", "East" and "West"

Those are typically done constructively:

direction = "North" / "South" / "East" / "West"

Of course, regexps can do that, too, if needed.
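
A sketch of the regexp variant, untested and again assuming the regexp
has to match the whole string (direction2 is a made-up name):

```cddl
direction2 = text .regexp "North|South|East|West"
```

The constructive form above is probably easier to read and to extend.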

> Tokens
> ------
> 
> * Limit by codepoints
> 	Example: Allow only "A-Z"

token1 = text .regexp "[A-Z]+"       ; one or more of "A-Z"

> * Limit by length
> 	Example: Max 7 characters

Right now, CDDL can only count characters (as opposed to bytes) by
employing regexps.  Another extension may be needed if there is indeed
a good use case for counting characters.
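
For the "Max 7 characters" example, a regexp quantifier can do the
counting.  Untested sketch, assuming the token is restricted to "A-Z"
as in the previous example and that the regexp has to match the whole
string (token7 is a made-up name):

```cddl
token7 = text .regexp "[A-Z]{1,7}"   ; 1 to 7 characters, not bytes
```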

> * Limit by pattern
> 	Example: "A-Z" "a-z" "-" "0-9" "0-9"
> 	(use ABNF to specify ?)

(Regexps, again)
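
Reading the example as one upper-case letter, one lower-case letter, a
hyphen and two digits, an untested sketch could look like this, assuming
the regexp has to match the whole string (token2 is a made-up name):

```cddl
token2 = text .regexp "[A-Z][a-z]-[0-9][0-9]"   ; e.g., "Ab-12"
```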

> * Limit by well known set
> 	Example: Token must be ISO3166-1 country code
> 	Example: Token must be in IANA FooBar registry

Interesting.  There currently is no formal interface to IANA (or ISO)
registries, and we don't have an informal escape like the one ABNF has
in prose-val.  "Annotations" could be added as needed, and they come
close.

> Qualified Tokens
> ----------------
> 
> * Limit each of the two component tokens as above.
> 	
> Binary Blob
> -----------
> 
> * Limit by length in bytes
> 	Example: 128 bytes
> 	Example: 16-64 or 80 bytes

blob1 = bytes .size (16..64 / 80)

> Number
> ------
> 
> * Limit resolution
> 	Example: exactly 3 decimal digits

That would need a new (so far hypothetical) CDDL "annotation", say:

q = number .decimals 3

> * Limit range
> 	Example: [2.716 ... 3.1415]

etopi = 2.718..3.1415

> Integer
> -------
> 
> * Limit range
> 	Example [0 ... 65535]

ex1 = 0..65535
ex2 = uint .size 2

> Timestamp
> ---------
> 
> (I can't think of usable restrictions here)

ts = uint   ; (or whatever type of timestamp you want;
            ;  `time` as a POSIX time and RFC3339 dates are built in)
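
Spelled out with the built-in names, this might look as follows.  This
is an untested sketch: I am assuming `tdate` is the prelude name for the
RFC3339 variant, and ts-posix and ts-text are made-up names:

```cddl
ts-posix = time    ; POSIX time, as a number
ts-text = tdate    ; RFC3339 date/time, as a text string
```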

Grüße, Carsten