Re: [Cbor] A CBOR tag for alternatives/unions, request for comments

Michael Peyton Jones <michael.peyton-jones@iohk.io> Thu, 24 February 2022 14:13 UTC

References: <CAKoRMYH3MTMi_tX5KHF-O-DTKzopiGqe3fi6XjkPaGCM4823OQ@mail.gmail.com> <7dfd62ccb6c089af90c90f26a8945f23232ecbc1.camel@well-typed.com> <CAKoRMYEOo1Gqfc4W4k3NOLKpFa97Q9YzLCm3r0PJ13V2HJPf3A@mail.gmail.com> <2BBF6463-FDB2-4A8A-B20D-7A1AD976A90D@tzi.org> <CAKoRMYFi8uo2GfHA9s1n+-rMO8Ja9=2qMMzjS9Z=F9r3LFozRQ@mail.gmail.com> <8EA89504-C176-4850-9BB8-C7E7206374FF@tzi.org> <CAKoRMYGmOa0hzEFsJh8kpz0bU5x56Yc9P=DBK-ghU83gXxPv7A@mail.gmail.com> <CAKoRMYGUvmxufQUVyvX2mciq5LCmV0Nz-uE2MJn54GDBB+9DRw@mail.gmail.com> <CAKoRMYF_19V6mu4S9GVqfiNzyQVvvOzX6eYwHp_DtZQoG0xTKg@mail.gmail.com> <4B47F4D7-ADE3-4A22-8A5B-97F4E5FCD933@tzi.org> <Yhd3/bwVUOLJLzWu@hephaistos.amsuess.com> <B6FC521C-1C28-4B11-90F2-DE62308B7168@tzi.org>
In-Reply-To: <B6FC521C-1C28-4B11-90F2-DE62308B7168@tzi.org>
From: Michael Peyton Jones <michael.peyton-jones@iohk.io>
Date: Thu, 24 Feb 2022 14:12:43 +0000
Message-ID: <CAKoRMYGVii8eQAPsGWJn-H8+kJ81QkOGWpybDK44b5wsJbjxtw@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Christian Amsüss <christian@amsuess.com>, Duncan Coutts <duncan@well-typed.com>, cbor@ietf.org, Jared Corduan <jared.corduan@iohk.io>, Alexander Byaly <alexander.byaly@iohk.io>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/iwDe2riuvolv53hYuevUhxcy7UA>
Subject: Re: [Cbor] A CBOR tag for alternatives/unions, request for comments

> Does it, for the applications expecting to use this, make a noticeable
> difference whether a 1+1 or a 1+2 tag is used?

This is a great question.

> The difference is likely most pronounced on alternatives without actual
> payload

More generally, the tag overhead is important in cases where the space used
for tags is significant relative to the space used for the payloads. The
case where there is no payload is one example.

Another example that's interesting is where there is a lot of *structure*
to the data. If you'll permit me a toy example, suppose we have a balanced
binary tree of numbers. In Haskell, the data type definition would look
like this:

data Tree = Leaf Int | Node Tree Tree

If we encode such a structure using the proposed alternative tags ('0' for
'Leaf' and '1' for 'Node') then a tree with n entries has 2n-1 nodes (n
leaves plus n-1 internal nodes), and hence that many tags. If the entries
themselves are small, then the proportion of the total space devoted to the
tags can become significant! We have encountered this problem in real life.
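
To put rough numbers on this, here is a minimal sketch that counts bytes
for the toy tree. It rests on assumptions that are mine for illustration
and are not fixed by the proposal: each constructor is encoded as one tag
wrapping a short array of its fields, entries are small non-negative
integers, and the helper names (build, tagCount, payloadBytes, totalBytes)
are hypothetical. The head sizes follow RFC 8949: a 1+1 tag head takes 2
bytes and a 1+2 tag head takes 3 bytes.

```haskell
-- Sketch only: the encoding layout and the helper names are assumptions.
data Tree = Leaf Int | Node Tree Tree

-- Perfectly balanced tree of the given depth: 2^depth leaves, each holding 0.
build :: Int -> Tree
build 0 = Leaf 0
build d = Node sub sub
  where sub = build (d - 1)

-- One alternative tag per constructor, so this counts every node.
tagCount :: Tree -> Int
tagCount (Leaf _)   = 1
tagCount (Node l r) = 1 + tagCount l + tagCount r

-- Size of a CBOR unsigned-integer head (RFC 8949); non-negative input assumed.
uintBytes :: Int -> Int
uintBytes n
  | n < 24         = 1
  | n < 256        = 2
  | n < 65536      = 3
  | n < 4294967296 = 5
  | otherwise      = 9

-- Bytes excluding tag heads, assuming each constructor's fields sit in a
-- short array with a 1-byte array head.
payloadBytes :: Tree -> Int
payloadBytes (Leaf k)   = 1 + uintBytes k
payloadBytes (Node l r) = 1 + payloadBytes l + payloadBytes r

-- Total size for a given tag-head size: 2 for a 1+1 tag, 3 for a 1+2 tag.
totalBytes :: Int -> Tree -> Int
totalBytes tagHead t = tagHead * tagCount t + payloadBytes t

main :: IO ()
main = do
  let t = build 3                          -- 8 leaves
  print (tagCount t)                       -- 15
  print (totalBytes 2 t, totalBytes 3 t)   -- (53,68)
```

Under these assumptions, for 8 small entries the tags take 30 of 53 bytes
with 1+1 tags and 45 of 68 bytes with 1+2 tags, so the tag head size
dominates the encoding in both cases.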

Putting it another way: some tags are useful, but occur infrequently. There
is little point in optimizing their space usage. However, we expect that
the users who *do* use these tags will use them a *lot*, and so optimizing
them provides a decent benefit.

Best wishes,
Michael

On Thu, 24 Feb 2022 at 13:48, Carsten Bormann <cabo@tzi.org> wrote:

> > […] this is not a case coming
> > from the constrained or high-volume applications that CBOR was designed
> > for, […]
>
> I’m not so sure about the constrained space:
> Rust has a strong ascent as the language of choice in the constrained
> space.
> Rust also has a strong enum system.
>
> (Rust is also of interest in the high-volume space.)
>
> Grüße, Carsten
>
>

-- 

*Michael Peyton Jones*
Software Engineering Lead | London, UK

Website: www.iohk.io <http://iohk.io>
Skype: michael.s.pj
Twitter: @mpeytonjones
PGP Key ID: 29F64616


