Re: [Cbor] Invalid decimal fraction / big float?

Carsten Bormann <cabo@tzi.org> Wed, 28 August 2019 05:11 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/fmDg2eVnvMgb3rxgi9vt6UJ7adQ>

Hi Laurence,

good point.
However, tagging is tagging; there can be no tagging without tagging.
So “implicit tagging” is an oxymoron to me.

What we could say:

The data types that are defined in the form of CBOR tag definitions may also be useful to data definition languages. Such a language may provide a way to employ the type definitions at specific locations identified by their context, without a representation of the actual tag being exchanged. For instance, the “unwrap” operator defined for CDDL in Section 3.7 of RFC 8610 can be used for this purpose: `~time` could stand for a number representing a POSIX time without actually encoding Tag 1.
A CBOR-based protocol definition based on such a data definition language may still want to employ tags to enable automatic processing of tags in generic decoders, and to provide a distinguishing semantics where needed (e.g., to distinguish a time from another use of numbers possible in the same place).
Generic decoders may want to provide their tag data processing capabilities (e.g., converting a number into a time) in an unbundled form to the application in order to enable the processing of such data types identified by context, and not by an explicit tag.
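
To illustrate what "unbundled" could mean here, the following is a minimal Python sketch, not taken from any existing decoder. The names `epoch_to_datetime`, `TAG_HANDLERS` and `process_tagged` are hypothetical. The same conversion is reachable both through an explicit Tag 1 and directly by an application that knows from context (e.g., `~time` in CDDL) that a number is a time.

```python
from datetime import datetime, timezone

def epoch_to_datetime(value):
    """Apply the Tag 1 semantics to a bare number: seconds since the epoch -> UTC datetime."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("epoch-based time must be an integer or a float")
    return datetime.fromtimestamp(value, tz=timezone.utc)

# Hypothetical registry that a generic decoder might consult when it meets a tag.
TAG_HANDLERS = {1: epoch_to_datetime}

def process_tagged(tag_number, content):
    """Process a tag the decoder knows; hand anything else over unprocessed."""
    handler = TAG_HANDLERS.get(tag_number)
    if handler is None:
        return (tag_number, content)
    return handler(content)

# Explicit tag: the decoder saw 1(1566969090) on the wire.
explicit = process_tagged(1, 1566969090)

# Identified by context: the application calls the same converter itself.
by_context = epoch_to_datetime(1566969090)

assert explicit == by_context
```

Because the converter is exposed on its own, the application does not have to reimplement the tag's semantics when the tag itself is absent.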

(Unrelated:)
A CBOR-based protocol definition will typically define exactly where tags are and are not to be used, just as with other containers such as arrays and maps (giving rise to a “structural” view of application level validity).
Alternatively, it could describe its protocol in terms of the data types that are conceptually “returned” by a tag, e.g., it could treat all arrays the same, independently of whether they are represented as classical CBOR arrays or as typed arrays created by specific tags [I-D.ietf-cbor-array-tags]; we could call this a “semantic” view of application level validity.  Note that the present specification does not define a “semantic” type system that could be employed for this; such definitions are left to further work.


The above needs a lot more wordsmithing (and may benefit from creating contexts where it is easier to say these things), but you get the idea of where I think this should be heading.

Generic decoders will always need to be able to hand (e.g., unknown) tags to the application in an unprocessed manner.  Where processing *is* performed, it may be necessary to indicate to the application that a tag was employed (to keep the distinguishing semantics), except where we explicitly exclude this (e.g., bignums should never be used with distinguishing semantics).
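
As a rough sketch of that behaviour (in Python, covering only a small CBOR subset, and with the `Tagged` class and `decode` function being hypothetical names), a decoder might fold bignums into plain integers while handing every other tag to the application together with its tag number:

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Tagged:
    """A tag the decoder did not consume: the tag number plus the decoded content."""
    tag: int
    content: Any

def decode(data: bytes, pos: int = 0):
    """Decode one data item and return (value, next_position).

    Sketch only: handles major types 0 (unsigned integer),
    2 (byte string) and 6 (tag), with definite lengths.
    """
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        arg = info
    elif info <= 27:
        size = 1 << (info - 24)          # 1, 2, 4 or 8 bytes follow
        arg = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("additional information not covered by this sketch")
    if major == 0:
        return arg, pos
    if major == 2:
        return data[pos:pos + arg], pos + arg
    if major == 6:
        content, pos = decode(data, pos)
        if arg == 2 and isinstance(content, bytes):
            # Bignum: no distinguishing semantics, so return a plain integer.
            return int.from_bytes(content, "big"), pos
        # Any other tag (including unknown ones) stays visible to the application.
        return Tagged(arg, content), pos
    raise ValueError("major type not covered by this sketch")

value, _ = decode(bytes.fromhex("c11a5d660d02"))   # 1(1566969090)
unknown, _ = decode(bytes.fromhex("d904d205"))     # 1234(5)
```

Here `value` is a `Tagged` item with tag 1, so the application can still tell a time from another number in the same place, and `unknown` reaches the application unprocessed.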

Grüße, Carsten


> On Aug 28, 2019, at 06:30, Laurence Lundblade <lgl@island-resort.com> wrote:
> 
> Here's text I think might be useful to add to the Creating CBOR-based Protocols section:
> 
> Protocols using data types defined in the Tagging of Items section can use them with explicit tags (type 6 enclosing data items) or with no tag and their type implied (no type 6 enclosing data item). If there is no explicit tag, the protocol design should make sure the data type can be unambiguously recognized in all use cases. This is often accomplished by saying a member of a map with a particular label is always of a particular type.
> 
> Explicit tagging should only be used when it is actually necessary to clearly distinguish the type.
> 
> Some protocol designs may forbid explicit tagging of particular data items.
> 
> Protocol designs should directly state whether explicit tagging is required, disallowed or optional for each use of these data types.
> 
> One of the reasons I’m bringing this up is that, in a way, section 3.4 Tagging of Items has the cart before the horse. The definition of the new data types seems more important than the tagging, as the tagging is the optional part. Alternative titles for the section might be Additional Data Types or Compound Data Types. The somewhat unclear (to me) characterization of the optionality of tagging is why I asked about tags in decimal fractions.
> 
> 
> I think it is useful and practical for generic decoders to directly support tagged data types rather than just passing them on to the caller with the tag.
> 
> In some cases the tagged data types may have natural representations in the language or the platform (e.g., time formats). In some cases the tagged data types are complex enough that some CBOR expertise is helpful (e.g., array tags) when implementing. The decoder author might also want to encapsulate more of CBOR so the caller doesn’t have to know as much about CBOR.
> 
> Obviously, it can’t be for tags that aren’t invented yet, some tagged data types have a use that is too narrow, and some may be poorly designed in the opinion of the author of the decoder. Generic decoders should also support the caller implementing tagged data types on their own and a mode to pass tags through.
> 
> That’s all just my view and no particular issue with the text. I mention it in part to say that generic decoders supporting tagged data types may wish to have features in their API to handle explicit / implicit tagging.
> 
> LL
> 
> 
> 
> 
>> On Aug 25, 2019, at 2:32 PM, Carsten Bormann <cabo@tzi.org> wrote:
>> 
>> 
>>> CWT aside, by my understanding, it is allowed for CBOR protocols to make use of the data types defined in “Optional Tagging of Items” section of 7049 without explicitly adding a tag. 
>> 
>> Absolutely.  Making that easier to do was one reason we introduced ~ (unwrap) in CDDL.
>> 
>> 
>>> 
>>> For example, I could say in the EAT definition that the claim labeled “teetime”  is a tag 1 epoch-based date as defined in 7049. EAT could (should?) then say which of a), b) or c) is allowed for use of the tag 1. Right?
>> 
>> Well, EAT should simply say what the claim is.  There is no need for a/b/c here.
>> Either you use the tag or just its definition of the semantics applied to the enclosed data item; I can’t imagine a case where anything is gained by allowing both.
>> 
>>> When CWT says “MUST NOT be prefixed with any CBOR tag”, it sounds semantically like “forbidden” to me.
>> 
>> Yes.  But it’s not that CWT forbids tags, it is that it defines the first seven claims for CWT in a way that does not make use of tags.  It would be as misleading to say that “COSE is forbidden in CWT” because the first seven claims don’t use COSE.
>> 
>>>>> c) optional — (I don’t know of an example)
>>>> 
>>>> Any protocol that uses CDDL `unsigned` has “optional” Tags:
>>>> Numbers below 2**64 do not use Tags, numbers equal to or greater than 2**64 do.
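
A small Python sketch of that encoding rule, assuming a hypothetical helper `_head` for the initial byte and argument: values below 2**64 are emitted as major type 0 with no tag, and larger values as a Tag 2 bignum wrapping a byte string.

```python
def _head(major: int, arg: int) -> bytes:
    """Encode a major type and its argument in the shortest form."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")

def encode_unsigned(n: int) -> bytes:
    """Encode a non-negative integer the way CDDL `unsigned` allows."""
    if n < 0:
        raise ValueError("unsigned values must be non-negative")
    if n < 1 << 64:
        return _head(0, n)                       # plain uint, no tag
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return _head(6, 2) + _head(2, len(payload)) + payload  # 2(h'...')

small = encode_unsigned(2**64 - 1)   # 1b ff ff ff ff ff ff ff ff
large = encode_unsigned(2**64)       # c2 49 01 00 00 00 00 00 00 00 00
```

The tag appears only when the value requires it, which is the sense in which it is “optional” for the protocol.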
>>>> 
>>>>> A good generic decoder will handle all three, probably with some feature in the API to say which of the above scenarios to use.
>>>> 
>>>> Since the generic decoder hands the application the decoded data item, the application can handle the Tags and their being required/optional/forbidden.
>>> 
>>> Generic decoders can handle tagged types inside the decoder itself.
>> 
>> They never can for all tags, because all tags haven’t been invented yet.
>> 
>>> Mine does. That means those types of generic decoders have to know about case a), b) and c).
>> 
>> I still don’t think so.  That would only be the case if the decoder doesn’t “handle” the tag, but “hides” it.  That would indeed lead to problems.
>> 
>>> Despite my nitpicking, I really do like CBOR a lot.
>> 
>> Thanks.  We need to have these discussions to ensure we don’t have language in the spec that turns out to be misleading.  I just pushed a pull request for CBORbis that massively rearranges the language around tags, please have a look:
>> 
>> https://github.com/cbor-wg/CBORbis/pull/109
>> 
>> Grüße, Carsten
>> 