Re: [Cbor] 7049bis: The concept of "optional tagging" is not really used in practice #126

Henk Birkholz <henk.birkholz@sit.fraunhofer.de> Sun, 03 November 2019 17:05 UTC

To: Laurence Lundblade <lgl@island-resort.com>, Christophe Lohr <christophe.lohr@imt-atlantique.fr>
CC: cbor@ietf.org
References: <92400DAA-A713-4905-A721-34B138E25192@tzi.org> <ed45e995-1858-3169-1be6-0cce5ce37ce3@imt-atlantique.fr> <87889E65-0152-455A-A6B7-C5F336DC97A4@island-resort.com>
From: Henk Birkholz <henk.birkholz@sit.fraunhofer.de>
Message-ID: <88ea442d-8fc0-31fa-d00b-7e9c0c047323@sit.fraunhofer.de>
Date: Sun, 03 Nov 2019 18:04:47 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0
MIME-Version: 1.0
In-Reply-To: <87889E65-0152-455A-A6B7-C5F336DC97A4@island-resort.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
X-Originating-IP: [79.234.112.245]
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/eJ9V6MobQMTeynPOSVFl-xEiMT4>
Subject: Re: [Cbor] 7049bis: The concept of "optional tagging" is not really used in practice #126

Hi Laurence,

please see below.

On 03.11.19 17:45, Laurence Lundblade wrote:
> I’m not really a data structure scientist or such, but I think I can see
> Christophe’s point.
> 
> Maybe CBOR-based (and JSON-based) protocols don’t have a formal schema 
> language, but these protocols rely on ordering and such. For example in 
> a COSE_Sign1 it is expected that the first data item is the protected 
> headers, the second the unprotected headers, the third the payload and 
> the fourth the signature. I don’t think you can call them self-describing.
> 
> It seems like CBOR and JSON say “no schema” to distance themselves from
> the horror of XML schemas, but in reality CDDL and prose protocol specs
> are schemas in spirit.
> 
> Maybe a key question here is whether you can say in CDDL “this next item 
> must always be interpreted as a date even though it will never have a 
> date tag”.

I would not use the term "interpretation" here. CDDL is a language, and it
is unambiguous in that regard. But I think it can do exactly what you
mean :-) It describes the structure of CBOR/JSON messages/documents. If
it is precise enough, CBOR tags are not needed, but to ensure that
precision you need a dedicated authority, I think.
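
As a minimal sketch of that idea (not taken from any particular library): once the description of a message says which position or map key carries a date, a plain integer can be turned into a date without any tag on the wire. The names `CLAIM_SCHEMA` and `apply_schema` are hypothetical; the only outside fact assumed is that CWT registers claim key 4 for "exp".

```python
from datetime import datetime, timezone

def to_epoch_date(value):
    """Turn an untagged integer into a UTC date because the schema says so."""
    return datetime.fromtimestamp(value, tz=timezone.utc)

def identity(value):
    """Leave values without a schema entry untouched."""
    return value

# Hypothetical stand-in for a schema: claim key -> converter.
# Key 4 is the CWT "exp" claim, an epoch date carried without tag 1.
CLAIM_SCHEMA = {4: to_epoch_date}

def apply_schema(claims):
    """Convert an already-decoded claims map according to CLAIM_SCHEMA."""
    return {key: CLAIM_SCHEMA.get(key, identity)(value)
            for key, value in claims.items()}

if __name__ == "__main__":
    print(apply_schema({1: "issuer", 4: 1572800687}))
```

The type information lives entirely in the schema table here, which is why whoever maintains that table has to keep it precise.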

> If CDDL doesn’t have that, then you can’t describe some
> CBOR-protocols with it. CWT would be one of those protocols as it 
> forbids adding the tag to dates.
> 
> To summarize what I understand about tagging:
> 
>     The designer of a new CBOR data item type like a date format will
>     generally register a tag for it. These new data types can be really
>     simple, like epoch dates or really complex like COSE_Sign1.
> 
>     The designer of a protocol using a new data type will indicate in
>     their protocol for each occurrence of it whether the tag must be
>     present or not (never saying the tag may or may not be present). The
>     designer will typically require the tag only when necessary to
>     disambiguate the type of the data item.
> 
>     The implementor of a general purpose library to generate one of
>     these new data item types must give the caller the option to include
>     or not include the tag. Maybe this is just by never automatically
>     outputting the tag and having a distinct output tag function.
> 
>     The implementor of a general purpose library to decode one of these
>     new data types must allow the caller to say that the next data item
>     should be decoded as this new data type whether or not it is tagged.
>     Maybe it even errors out if it is tagged for the cases where the
>     protocol document says no tag should be used.
> 
> What I don’t know is whether CDDL can describe all this desired behavior.
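
The decoder behaviour summarized above could look roughly like the following sketch, in which the caller states whether the protocol requires or forbids the tag. It is hand-rolled and handles only unsigned-integer epoch dates; `read_head` and `decode_epoch_date` are illustrative names, not functions of an existing CBOR library.

```python
from datetime import datetime, timezone

EPOCH_TAG = 1  # CBOR tag 1: epoch-based date/time

def read_head(buf, pos):
    """Read one CBOR head; return (major type, argument, next position)."""
    initial = buf[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:                       # argument stored in the initial byte
        return major, info, pos
    if info > 27:
        raise ValueError("reserved or indefinite encoding not handled here")
    size = 1 << (info - 24)             # 1, 2, 4 or 8 argument bytes follow
    return major, int.from_bytes(buf[pos:pos + size], "big"), pos + size

def decode_epoch_date(buf, tag_required):
    """Decode an epoch date; the caller says whether tag 1 must be present."""
    major, arg, pos = read_head(buf, 0)
    if major == 6:                      # a tag precedes the data item
        if not tag_required:
            raise ValueError("protocol forbids a tag here")
        if arg != EPOCH_TAG:
            raise ValueError("unexpected tag %d" % arg)
        major, arg, pos = read_head(buf, pos)
    elif tag_required:
        raise ValueError("protocol requires tag 1 here")
    if major != 0:
        raise ValueError("this sketch handles unsigned integers only")
    return datetime.fromtimestamp(arg, tz=timezone.utc)

if __name__ == "__main__":
    print(decode_epoch_date(bytes.fromhex("c11a5dbf08af"), tag_required=True))
    print(decode_epoch_date(bytes.fromhex("1a5dbf08af"), tag_required=False))
```

A CWT-style caller would pass `tag_required=False` and get an error if a sender added the tag anyway.
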
> 
> LL
> 
> 
> 
> 
>> On Oct 24, 2019, at 1:50 AM, Christophe Lohr
>> <christophe.lohr@imt-atlantique.fr> wrote:
>>
>> On 23/10/2019 13:38, Carsten Bormann wrote:
>>> Section 3.4 talks about "optional tagging" as a secondary purpose of 
>>> tags. But in today's CBOR protocols, tags are rarely "optional" in 
>>> the sense that they can simply be left out without a change in 
>>> semantics, as 3.4 para 3 implies.
>>>
>>> This concept comes up again in 4.2.2, where "optional tagging" is 
>>> outlawed in deterministic encoding (but then the text goes on to 
>>> explain that protocols might choose to retain tags, but doesn't say why).
>>
>> To be honest, I don't really understand how optional tags are.
>>
>> A CDDL rule with tags matches CBOR items with tags and rejects CBOR items
>> without tags. Tags are not optional from the data-model point of view.
>>
>>
>> Moreover, please consider this CBOR objective:
>> (https://tools.ietf.org/html/rfc7049#section-1.1)
>>
>>    3.  Data must be able to be decoded without a schema description.
>>        *  Similar to JSON, encoded data should be self-describing so
>>           that a generic decoder can be written.
>>
>>
>> Well, how can this be done without putting tags everywhere for everything?
>> (Or I need more explanation of what "self-describing" and "schema
>> description" mean.)
>>
>> Let's say I receive data. How may I know that this number is a temperature
>> and not a distance, and that this byte string is a UUID and not a small
>> picture?
>>
>> The first way is to have a schema (written or not): that is to say, some
>> prior knowledge of the expected data, which tells me that the number at
>> this place, or associated with this map key, is a temperature.
>> The second way is to decorate data with tags, all data.
>> A third way is a compromise between the first two: I have a certain
>> level of prior knowledge of what the data are (a kind of schema
>> description), with possibly some missing parts that are filled in by tags.
>>
>> But the only way to decode data _without_ a schema description is to
>> have tags everywhere for everything.
>> Surprisingly, JSON has no tags and is claimed to be self-describing. Is
>> it really? I'm lost.
>>
>> My feeling is that this CBOR objective should not be so demanding.
>>
>> Best regards,
>> Christophe
>>
>> _______________________________________________
>> CBOR mailing list
>> CBOR@ietf.org
>> https://www.ietf.org/mailman/listinfo/cbor