Re: [Cbor] Using cddl-control to dissect byte strings

Carsten Bormann <cabo@tzi.org> Fri, 30 July 2021 13:45 UTC

On 2021-07-30, at 14:43, Christian Amsüss <christian@amsuess.com> wrote:
> 
>> You mentioned the cddl tool generator — does the validator choke on this?
[…]
> 
>    CDDL validation failure

Yeah, I just looked at the code I wrote on 2020-11-17.
The generator is fine; the validator focuses on the case where both lhs and rhs are single-value (“extractable”) types. It probably should have a better message than “?” for when that is not the case.

>    [ full output at the end of the mail ]
> 
>> (I don’t want to think about .det here.)
> 
> For purposes of parser building (which is even stricter than validation,
> for validation probably doesn't need to find a *unique* assignment),
> there are reversible and irreversible controls; det is just
> irreversible, as is `bstr .cat bstr` without further constraints.

`bstr .cat bstr` is rather easy to match :-)

In this case, there simply is ambiguity where the bytes go (and the number of solutions is exactly the number of input bytes plus one, as either part can be empty).
(It is not quite clear how to apply “preferred choice” here; greedy matching might be the equivalent.
Hope that uncertainty doesn’t stop us from going forward with .cat.)
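
To make the ambiguity concrete, here is a minimal Python sketch (not code from the cddl tool; the names `cat_splits` and `greedy_split` are made up for illustration) that enumerates every way a byte string could have come out of `bstr .cat bstr`. Reading "preferred choice" as "the left operand takes as much as it can" is only an assumption here, one possible interpretation of greedy matching.

```python
from typing import List, Tuple

def cat_splits(data: bytes) -> List[Tuple[bytes, bytes]]:
    """Return all (left, right) pairs with left + right == data.

    Either part may be empty, so every split point from 0 to len(data)
    is a candidate.
    """
    return [(data[:i], data[i:]) for i in range(len(data) + 1)]

def greedy_split(data: bytes) -> Tuple[bytes, bytes]:
    """Hypothetical 'greedy' reading: the left operand takes all bytes."""
    return cat_splits(data)[-1]

if __name__ == "__main__":
    for left, right in cat_splits(b"\x01\x02\x03"):
        print(left.hex() or "(empty)", "+", right.hex() or "(empty)")
```

Without a further constraint on one of the operands, every listed pair matches equally well, so a parser has no basis for picking one other than some tie-breaking rule.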

Regexp implementations that parse (as opposed to just match) have greedy and non-greedy occurrence indicators to disambiguate the various use cases of this.
But the complexity of regexp code is exactly what I wouldn’t want in CDDL…

So I’d stick with the observation that you can write CDDL that tools will have trouble with.  I’d like to be able to cover at least

(bstr .size nn) .cat bstr
and
bstr .cat (bstr .size nn)

in a future version of the tool, but as soon as nn is a multi-value type (e.g., a range), we’d need to invoke preferred choice again.
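
A rough Python sketch of why the two sized forms are tractable (again not the tool's code; `match_sized_prefix` and `match_sized_suffix` are hypothetical helpers, and passing nn as a collection of allowed sizes is an assumption made for illustration):

```python
from typing import Iterable, List, Tuple

def match_sized_prefix(data: bytes, sizes: Iterable[int]) -> List[Tuple[bytes, bytes]]:
    """Candidate splits for `(bstr .size nn) .cat bstr`.

    `sizes` holds the values nn may take; sizes that do not fit are skipped.
    """
    return [(data[:n], data[n:]) for n in sizes if 0 <= n <= len(data)]

def match_sized_suffix(data: bytes, sizes: Iterable[int]) -> List[Tuple[bytes, bytes]]:
    """Candidate splits for `bstr .cat (bstr .size nn)`."""
    return [
        (data[:len(data) - n], data[len(data) - n:])
        for n in sizes
        if 0 <= n <= len(data)
    ]

if __name__ == "__main__":
    data = bytes.fromhex("0102030405")
    print(match_sized_prefix(data, [2]))          # single-value nn: one candidate
    print(match_sized_suffix(data, [2]))          # single-value nn: one candidate
    print(match_sized_prefix(data, range(1, 3)))  # range nn: several candidates
```

With a single-value nn each function returns at most one split, so the match is unique. With a range, more than one candidate can survive, which is where a preferred-choice rule would have to come in.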

Grüße, Carsten