Re: [Cbor] CDDL parsing questions

Toerless Eckert <tte@cs.fau.de> Thu, 18 August 2022 10:43 UTC

Date: Thu, 18 Aug 2022 12:43:00 +0200
From: Toerless Eckert <tte@cs.fau.de>
To: Derek Atkins <derek@ihtfp.com>
Cc: cbor@ietf.org
Message-ID: <Yv4XtKqLUrto4f/c@faui48e.informatik.uni-erlangen.de>
References: <Yv13HuFndByI/TtZ@faui48e.informatik.uni-erlangen.de> <2d9abb4cff288213ee021bfb5d57f5a6.squirrel@mail2.ihtfp.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <2d9abb4cff288213ee021bfb5d57f5a6.squirrel@mail2.ihtfp.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/leDeR8e8z_gwGFHMXYsiJ5FfPaU>
Subject: Re: [Cbor] CDDL parsing questions

On Wed, Aug 17, 2022 at 07:52:32PM -0400, Derek Atkins wrote:
> My personal feeling is that the receiver needs to know a priori what the
> schema is in order to properly parse the object. 

Agreed. But it seems we do not agree on what to call that level of receiver
processing. I was trying to call it CDDL parsing. Before CBOR, we would
have called it protocol parsing, I guess.

> CBOR had a clear parsing
> mechanism (you have arrays and maps), but the semantics of the array
> offset and/or map keys is part of the CDDL schema and not part of the
> parser, per se.

Yes.

> In other words, you can parse the CBOR into its underlying structure
> without knowing the schema, but in order to process it, you need an (out
> of band) way to know the schema.

That sounds like two-stage parsing: first just CBOR parsing, then
CDDL/protocol parsing. That is certainly an option, but I could also
see toolchains where the protocol implementation never bothers with the
CBOR layer and just does the CDDL/protocol parsing, with the CBOR parsing
happening transparently under the hood.
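
A minimal sketch of the two stages, assuming a hypothetical rule
`reading = [sensor-id: uint, unit: tstr, value: int]` that the receiver
knows a priori. The names decode_cbor, parse_reading and receive_reading
are made up for illustration, and the decoder covers only a small subset
of CBOR (integers, text strings, definite-length arrays and maps):

```python
def decode_cbor(data, pos=0):
    """Stage 1: decode one CBOR item (small subset); return (value, next_pos)."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    # The argument is inline for info < 24, else in the next 1/2/4/8 bytes.
    if info < 24:
        arg = info
    elif info <= 27:
        size = 1 << (info - 24)
        arg = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("indefinite/reserved encodings not handled in this sketch")
    if major == 0:  # unsigned integer
        return arg, pos
    if major == 1:  # negative integer
        return -1 - arg, pos
    if major == 3:  # text string
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 4:  # array of arg items
        items = []
        for _ in range(arg):
            item, pos = decode_cbor(data, pos)
            items.append(item)
        return items, pos
    if major == 5:  # map of arg key/value pairs
        result = {}
        for _ in range(arg):
            key, pos = decode_cbor(data, pos)
            value, pos = decode_cbor(data, pos)
            result[key] = value
        return result, pos
    raise ValueError(f"major type {major} not handled in this sketch")


def parse_reading(generic):
    """Stage 2: interpret generic data as the hypothetical 'reading' rule."""
    if not (isinstance(generic, list) and len(generic) == 3):
        raise ValueError("not a reading: expected an array of 3 elements")
    sensor_id, unit, value = generic
    if not (isinstance(sensor_id, int) and sensor_id >= 0
            and isinstance(unit, str) and isinstance(value, int)):
        raise ValueError("not a reading: element types do not match")
    return {"sensor-id": sensor_id, "unit": unit, "value": value}


def receive_reading(data):
    """What a toolchain could expose: stage 1 happens under the hood."""
    generic, end = decode_cbor(data)
    if end != len(data):
        raise ValueError("trailing bytes after the CBOR item")
    return parse_reading(generic)
```

A caller of receive_reading(bytes.fromhex("8307614322")) only ever sees
the named fields; the generic array [7, "C", -3] from stage 1 stays
internal.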

In any case: it seems to me that the CDDL layer replaces much of
the human-readable (implementor) text that pre-CBOR/CDDL protocols
need to explain how protocol element fields may depend on each other
and/or how to interpret specific combinations of fields. In particular,
the "how to interpret" part is replaced by simply giving CDDL names to
those combinations.

> Tagged objects help identify what the object is, but without that, you
> pretty much have to know out-of-band.

Well, I don't think this is the only option. Like any sentence
parser for a human language or command parser for a computer language,
CDDL allows you to parse/match CDDL names based not only on specific
tag values (even though that's what we like to do traditionally), but
also on arbitrarily complex object structure, such as the
number of elements, their hierarchy, or their types. Without looking at any
element/tag values!
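
A rough sketch of such purely structural matching, assuming three
hypothetical rules that in CDDL might read `point2d = [int, int]`,
`point3d = [int, int, int]` and `label = [tstr, int]`. RULES and
match_rule are made-up names, and the input is already-decoded generic
data:

```python
# Hypothetical rules as (name, element types); no tag or element value
# is ever inspected, only the number of elements and their types.
RULES = [
    ("point2d", [int, int]),
    ("point3d", [int, int, int]),
    ("label", [str, int]),
]


def match_rule(item):
    """Return the name of the first rule whose shape fits item, else None."""
    if not isinstance(item, list):
        return None
    for name, types in RULES:
        if len(item) == len(types) and all(
            type(element) is expected
            for element, expected in zip(item, types)
        ):
            return name
    return None
```

Here match_rule([1, 2]) yields "point2d" and match_rule(["a", 5]) yields
"label", while any two-integer array is taken as a point2d whatever the
sender meant by it.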

And that is what I find both exciting and, probably even more, scary.

> Just my $0.02 (from having recently designed a CBOR data protocol,
> described by CDDL).

Is there a draft reference to look at?

Cheers
   Toerless


-- 
---
tte@cs.fau.de