Re: [Cbor] CDDL parsing questions

Brian E Carpenter <brian.e.carpenter@gmail.com> Thu, 18 August 2022 01:46 UTC

Message-ID: <57a7223d-62d8-1d51-1db3-9c5e37d07117@gmail.com>
Date: Thu, 18 Aug 2022 13:46:41 +1200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.10.0
Content-Language: en-US
To: Derek Atkins <derek@ihtfp.com>
Cc: cbor@ietf.org, Toerless Eckert <tte@cs.fau.de>
References: <Yv13HuFndByI/TtZ@faui48e.informatik.uni-erlangen.de> <2d9abb4cff288213ee021bfb5d57f5a6.squirrel@mail2.ihtfp.org>
From: Brian E Carpenter <brian.e.carpenter@gmail.com>
In-Reply-To: <2d9abb4cff288213ee021bfb5d57f5a6.squirrel@mail2.ihtfp.org>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/S1jPmo3i29vgh_D9RQDxHZZkEvg>
Subject: Re: [Cbor] CDDL parsing questions

A supplementary question below:

On 18-Aug-22 11:52, Derek Atkins wrote:
> Hi,
> 
> On Wed, August 17, 2022 7:17 pm, Toerless Eckert wrote:
>> Sorry, drive-by question (as in: I am not following the mailing list
>> thoroughly, so if this was already discussed, I will happily take a
>> pointer and apologize for the duplication).
>>
>> When defining data structures via CDDL for protocol headers, parsing
>> such protocol headers includes not only parsing the CBOR or
>> JSON data structure, but also matching/parsing that data structure
>> into the right CDDL names.
>>
>> CDDL makes it easy to define names that can require quite complex
>> "lookahead" parsing. This is not uncommon in natural or computer
>> languages, but to my memory this is usually painstakingly avoided
>> in "classical" (pre-CBOR) packet header structure designs (at least
>> at network/transport layer). Now, with the abstraction of CDDL,
>> new designers may not even think about the resulting parsing complexity.
>>
>> I could not find anything about this in RFC 8610, so I was
>> wondering if the WG ever discussed this and/or if members
>> have thoughts about this. I, for example, just ran into this
>> with one of our CDDL-defined protocols. I wonder if it exists
>> in other CDDL-defined protocols too, and if so, with what
>> degree of complexity.
> 
> My personal feeling is that the receiver needs to know a priori what the
> schema is in order to properly parse the object.  CBOR has a clear parsing
> mechanism (you have arrays and maps), but the semantics of the array
> offsets and/or map keys are part of the CDDL schema and not part of the
> parser, per se.
> 
> In other words, you can parse the CBOR into its underlying structure
> without knowing the schema, but in order to process it, you need an (out
> of band) way to know the schema.
> 
> Tagged objects help identify what the object is, but without that, you
> pretty much have to know out-of-band.

And does that also apply if you have some sort of compiler-like parser
driven by the CDDL? Or do you assume that the parser is hand-coded
in order to implement the given CDDL and insert the out-of-band semantic
knowledge?
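
To make the two steps under discussion concrete, here is a rough sketch in
Python. It is not taken from any CDDL tool: the decoder covers only a small
definite-length subset of CBOR, and the header layout (HEADER_FIELDS) and
the name_fields helper are hypothetical, invented for illustration. The
first function needs no schema at all; the second is where the out-of-band
knowledge of what each array offset means has to enter, whether hand-coded
as here or generated from the CDDL.

```python
def decode(data, pos=0):
    """Decode one CBOR item (definite-length subset); return (value, next_pos)."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    # Read the argument that follows the initial byte.
    if info < 24:
        arg = info
    elif info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4 or 8 bytes, big-endian
        arg = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("indefinite lengths are not handled in this sketch")
    if major == 0:  # unsigned integer
        return arg, pos
    if major == 1:  # negative integer
        return -1 - arg, pos
    if major in (2, 3):  # byte string / text string
        raw = data[pos:pos + arg]
        pos += arg
        return (raw if major == 2 else raw.decode("utf-8")), pos
    if major == 4:  # array
        items = []
        for _ in range(arg):
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos
    if major == 5:  # map
        result = {}
        for _ in range(arg):
            key, pos = decode(data, pos)
            value, pos = decode(data, pos)
            result[key] = value
        return result, pos
    raise ValueError("tags and simple values are not handled in this sketch")


# Hypothetical schema knowledge, roughly: header = [version: uint, kind: tstr, payload: bstr]
HEADER_FIELDS = ("version", "kind", "payload")


def name_fields(items):
    """Attach schema names to array offsets; this is the out-of-band part."""
    if len(items) != len(HEADER_FIELDS):
        raise ValueError("array length does not match the assumed schema")
    return dict(zip(HEADER_FIELDS, items))


# Array of three items: 1, "foo", h'0102'
encoded = bytes.fromhex("830163666f6f420102")
generic, _ = decode(encoded)   # schema-free structure
header = name_fields(generic)  # schema-dependent naming
```

A schema with choices or optional members would make name_fields
considerably more involved than a fixed tuple of names, which is the
lookahead concern raised earlier in the thread.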

    Brian

> 
> Just my $0.02 (from having recently designed a CBOR data protocol,
> described by CDDL).
> 
>> Cheers
>>      Toerless
> 
> -derek
>