Re: [Cbor] Validity checking and tags

Carsten Bormann <cabo@tzi.org> Sun, 03 November 2019 17:18 UTC

To: Laurence Lundblade <lgl@island-resort.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/TJ-c-XdgyITNZrtratDG6504QPs>

> Some CBOR protocols will require use of a tag to indicate a particular data type (on a per instance basis) and others will explicitly prohibit use of a tag (on a per instance basis).

Normally, you don’t “prohibit a tag”, you say what you want.  “This is a number.”

As in (RFC 8392 in one slide):

cwt = {
  ? iss => text
  ? sub => text
  ? aud => text
  ? exp => ~time
  ? nbf => ~time
  ? iat => ~time
  ? cti => bytes
  * int => any
}

iss = 1
sub = 2
aud = 3
exp = 4
nbf = 5
iat = 6
cti = 7

~time is a number, with a strong hint that this number is meant to be interpreted the way the enclosed data item of a tag with number 1 would be, i.e., as an epoch-based time.

Note that the prelude says:

  time = #6.1(number)
  number = int / float

So the three lines

  ? exp => number
  ? nbf => number
  ? iat => number

in the above would have meant exactly the same, but without the little hint about interpretation.
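
On the wire, the difference between time and ~time is just the tag head in front of the number. A minimal Python sketch of that, assuming a hand-written encoder that only covers unsigned integers and tags (the helper names encode_head, encode_uint and encode_tag are made up for this illustration and do not come from any CBOR library):

```python
import struct

def encode_head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus its argument for the given major type."""
    if value < 0:
        raise ValueError("argument must be non-negative")
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 2**8:
        return bytes([(major << 5) | 24, value])
    if value < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    if value < 2**64:
        return bytes([(major << 5) | 27]) + struct.pack(">Q", value)
    raise ValueError("argument does not fit in 64 bits")

def encode_uint(n: int) -> bytes:
    """Major type 0: unsigned integer."""
    return encode_head(0, n)

def encode_tag(tag_number: int, enclosed: bytes) -> bytes:
    """Major type 6: tag head followed by the already-encoded enclosed item."""
    return encode_head(6, tag_number) + enclosed

epoch = 1572801490                 # example value only
untagged = encode_uint(epoch)      # what "~time" (i.e. number) puts on the wire
tagged = encode_tag(1, untagged)   # what "time" (#6.1(number)) puts on the wire

print(untagged.hex())
print(tagged.hex())
```

The tagged form is the untagged form with one extra leading byte; everything after that byte is identical.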

> That implies that generic validity processing can only be performed on data items that are explicitly tagged. For example generic processing can’t check the internal structure of a decimal fraction for validity unless it is tagged. It just won’t know that it is a decimal fraction.

Generic validity processing sees an array and sees that this is valid (from a generic validity processing point of view).  Indeed, it won’t know that the application intends to process this as a decimal fraction.
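
To make the split concrete, here is a rough Python sketch of the check the application would have to do itself when the decimal fraction arrives untagged. It assumes the generic decoder has already turned the data item into plain Python values (list, int); the function names are hypothetical, and the exponent range check reflects my reading that the exponent is limited to major types 0 and 1:

```python
from decimal import Decimal

def is_cbor_int(x) -> bool:
    # bool is a subclass of int in Python, but CBOR true/false are not integers.
    return isinstance(x, int) and not isinstance(x, bool)

def is_decimal_fraction_content(item) -> bool:
    """Does this item have the shape of the enclosed data item of tag 4,
    i.e. an array of exactly two integers [exponent, mantissa]?"""
    if not (isinstance(item, list) and len(item) == 2):
        return False
    exponent, mantissa = item
    if not (is_cbor_int(exponent) and is_cbor_int(mantissa)):
        return False
    # Assumption: the exponent must fit major type 0 or 1 (no bignum).
    return -2**64 <= exponent < 2**64

def decimal_fraction_value(item) -> Decimal:
    """Interpret [exponent, mantissa] as mantissa * 10**exponent."""
    if not is_decimal_fraction_content(item):
        raise ValueError("not shaped like a decimal fraction")
    exponent, mantissa = item
    return Decimal(mantissa).scaleb(exponent)

# A generic decoder is happy with all three of these: they are valid arrays.
print(is_decimal_fraction_content([-2, 27315]))
print(is_decimal_fraction_content([-2, 27315, 0]))
print(is_decimal_fraction_content(["-2", 27315]))
```

Only the first array passes the application-level check; the other two are perfectly valid CBOR that simply is not a decimal fraction.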

> I think we expect protocols using these new / custom / registered / bespoke data types to not require instances to be tagged, so validity checking will be limited.

The checking cannot be done as “tag validity”; it needs to be done in the application.

> I think this is mostly OK, but wanted to point it out.

I think so, too.

> (We refer to these new / custom / registered / bespoke data types primarily as tagged, but then say the tag is not necessary; I think this is confusing). 

The short form is: A CBOR-based protocol can make use of a tag definition in an interesting way: not by shaping the data as the tag that has been defined, but by shaping the data as what would have been the enclosed data item of the tag. The protocol then says (in prose, at this point) that the data item is to be processed as if it were that enclosed data item of the tag.  That is a useful device to reduce the amount of variation between different protocols.  Providing direct support for this from a CBOR decoder implementation would be an interesting addition (it would become a schema-informed CBOR decoder then, of course).
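
As a very rough illustration of what “schema-informed” could look like, the Python sketch below post-processes an already decoded CWT claims map. Everything in it is an assumption made for the example: the table names, the apply_schema function, and the idea of expressing the schema as a mapping from map key to tag number. No existing decoder API is implied.

```python
from datetime import datetime, timezone

# Hypothetical handlers: how to process the enclosed data item of a tag,
# keyed by tag number. Only tag 1 (epoch-based time) is covered here.
TAG_CONTENT_HANDLERS = {
    1: lambda n: datetime.fromtimestamp(n, tz=timezone.utc),
}

# Hypothetical schema information for CWT: claim key -> tag number whose
# enclosed data item the (untagged) value is shaped like.
CWT_UNWRAPPED_TAGS = {4: 1, 5: 1, 6: 1}  # exp, nbf, iat are all ~time

def apply_schema(claims: dict, schema: dict) -> dict:
    """Process untagged values as if they were the enclosed item of a tag."""
    out = {}
    for key, value in claims.items():
        handler = TAG_CONTENT_HANDLERS.get(schema.get(key))
        if handler is None:
            out[key] = value  # no schema hint: pass through unchanged
            continue
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"claim {key}: expected a number")
        out[key] = handler(value)
    return out

claims = {1: "coap://as.example.com", 4: 1572801490}  # example data only
decoded = apply_schema(claims, CWT_UNWRAPPED_TAGS)
print(decoded)
```

The data on the wire carries no tag at all; the knowledge that key 4 holds an epoch-based time comes entirely from the schema table.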

All this discussion may be somewhat clouded by the fact that RFC 7049 speaks about “optional tags”; these have never turned out to be particularly useful, so we should no longer talk about those when writing up 7049bis.

Regards, Carsten