Re: [Cbor] Reminder and call for agenda: CBOR WG Virtual Meeting on 2022-06-01

Christian Amsüss <christian@amsuess.com> Mon, 27 June 2022 10:49 UTC

Date: Mon, 27 Jun 2022 12:41:27 +0200
From: Christian Amsüss <christian@amsuess.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: cbor@ietf.org
Message-ID: <YrmJV7OwrbOI/zKe@hephaistos.amsuess.com>
In-Reply-To: <B9E21E1E-164D-4306-88D7-A88DC76080A9@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/LGbalt7OT846g2AgxlWt4ozoOgA>
Subject: Re: [Cbor] Reminder and call for agenda: CBOR WG Virtual Meeting on 2022-06-01

Hello Carsten,

thanks for the text; it adds important functionality.

> When reconstructing the original data item, such a reference is
> replaced by a data item constructed from the argument data item found
> in the table (argument, which might need to be recursively unpacked
> first) and the rump data item (rump, again possibly recursively
> unpacked).

This reads a bit as if the rump, too, would be recursively unpacked first.
Is that the intention? (I think not -- with my understanding of
unpacking so far, the rump would be unpacked at the time it is
encountered in the reconstructed item, allowing the argument to alter
the dictionary).

> [...]; a type-0 reference is either a prefix reference or a
> type-0 function reference, while a type-1 reference is either a
> suffix reference or a type-1 function reference.

I find the type-0/type-1/"dominating tag" concept rather hard to
understand. It's neat how it allows symmetry between reference and
argument (which is especially pronounced when prefix/suffix becomes
concatenation), but I'm unconvinced that this symmetry is worth the
cognitive / implementation load of doing it that way, or that it is even
generally present. (One can always swap the arguments of a two-argument
function, but it is not the case for every function that either argument
makes sense in a dictionary.)

Alternatives I'd like to suggest for consideration:

* Maybe the mechanism described in the branch can be explained in an
  easier-to-digest way. It may or may not become simpler by either of

  * giving names to the two arguments of the function (arg1, arg2?
    A, B? tagged, other?)
  * calling prefix/suffix just "concatenation" (arg1 | arg2) and not
    treating it as an extra case but as the default

* Use a few more separate tags where there is a real need for having
  both directions. (No type-0/type-1 distinction, and the
  tag-that-indicates-function is always in the dictionary. For
  string/array suffixes, that would need a tag already at dictionary
  setup.)

* Decouple the functions a bit more from the compression, and use a
  more general mechanism to fill them.

  There might not be much point in having `109(["example.com",
  ["https://", "/foo.html"]])` around explicitly in any document, but
  in a generalization the tag could mean "string-join arg2 with arg1",
  and all of a sudden `109([",", ["1", "2", "3"]])` or even `109([",",
  64('123')])` could be usable more generally even outside packing.

  In packing, they'd be used as

  ```
  113([[],
    [109(["packed.example", ARG])],
    [6(["https://", "/foo.html"]),
     ...
    ]
  ])
  ```

  or (in the other example's style)

  ```
  113([["packed.example"],
    [],
    [109([6(0), ["https://", "/foo.html"]]),
     ...
    ]
  ])
  ```

  (The first example here does raise the question of how the 'put the
  argument in here' placeholder ARG is best phrased; I don't have a
  definite answer yet, but maybe it makes sense to shift up the index
  space inside the argument-item space by 1 and make 6(0) the argument
  -- although that might be as hard to teach as type-0/type-1.)
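
To make the "string-join arg2 with arg1" reading of tag 109 from the third
alternative concrete, here is a minimal sketch in Python. The function name
`join_items` is hypothetical, and the semantics are only my assumption of
what such a tag could mean; nothing here is defined by the draft. It covers
only the plain string cases, not the `64('123')` variant.

```python
from typing import Sequence, Union

Text = Union[str, bytes]


def join_items(arg1: Text, arg2: Sequence[Text]) -> Text:
    """Hypothetical model of 109([arg1, arg2]): join the items of arg2,
    placing arg1 between each adjacent pair.

    Assumption: arg1 and all items of arg2 are of the same string type
    (all text strings or all byte strings).
    """
    if not isinstance(arg1, (str, bytes)):
        raise TypeError("arg1 must be a text string or a byte string")
    for item in arg2:
        if type(item) is not type(arg1):
            raise TypeError("all items of arg2 must have the same type as arg1")
    # str.join and bytes.join both insert the separator between the items.
    return arg1.join(arg2)


# The two examples from the text, modelled as plain function calls:
url = join_items("example.com", ["https://", "/foo.html"])
csv = join_items(",", ["1", "2", "3"])
```

With this reading, the packing example and the more general comma-join use
the same function; only the choice of which argument sits in the dictionary
differs.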

BR
Christian

who is wondering whether indefinite-length strings / bytestrings really
needed to be in the original CBOR spec, or would not have been better
served with a "concatenate" tag around an indefinite-length array.

-- 
You don't use science to show that you're right, you use science to
become right.
  -- Randall Munroe