[Cbor] WGLC comments on cbor-packed

Christian Amsüss <christian@amsuess.com> Wed, 18 May 2022 12:50 UTC

Date: Wed, 18 May 2022 14:49:50 +0200
From: Christian Amsüss <christian@amsuess.com>
To: cbor@ietf.org
Message-ID: <YoTrbqq0Or2dfMdO@hephaistos.amsuess.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/qhBa-nC3q9S_GJgG5tAQu1xGMSw>

Hello -packed authors,
hello CBOR group,

With my chair hat off, I've had a look at this document; here are a few
notes:

* "are rendered as a hyphen": I share the sentiment, but there might be
  processes where that comment is better placed. (But maybe it can stay
  in here for a while...).

* The explicit separation of the prefix and the suffix table makes it
  hard to later extend to other affixes, e.g. the one we're considering
  for the use with CRIs where the tagged infix gets pushed several
  layers down into the circumfix from the table.

  One way to keep that open is for the various circumfix forms to gain
  their own tags (probably none around the 216-223 / 225-255 range, but
  from a later range). An alternative design would be to have just a
  single list of affix tags and decide whether an entry is a prefix or a
  suffix by the kind of item in the affix list.

* "A maliciously crafted Packed CBOR data item might contain a reference
  loop" in section 2.4: That this is possible only follows when one sees
  the ways the tables can be set up in section 3 (because it can only
  happen with setups that set the Current Set already during item
  definition).
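
  To illustrate the loop condition, here is a minimal sketch, not taken
  from the draft. It assumes a single table whose entries are already
  interpreted against the table that contains them (the setup in which a
  loop can arise at all). `Ref`, `ReferenceLoop` and `unpack` are
  hypothetical names, and plain Python values stand in for CBOR items.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a shared item reference into the table."""
    index: int


class ReferenceLoop(ValueError):
    """Raised when expanding a reference would revisit an index in progress."""


def unpack(item, table, in_progress=()):
    """Expand every Ref inside item, refusing to follow a reference loop.

    in_progress holds the table indices currently being expanded on this
    path; meeting one of them again means the table refers back to itself.
    """
    if isinstance(item, Ref):
        if item.index in in_progress:
            raise ReferenceLoop(f"loop through table index {item.index}")
        return unpack(table[item.index], table, in_progress + (item.index,))
    if isinstance(item, list):
        return [unpack(x, table, in_progress) for x in item]
    if isinstance(item, dict):
        # Only map values are expanded here; keys are left untouched.
        return {k: unpack(v, table, in_progress) for k, v in item.items()}
    return item


# Entry 1 uses entry 0 twice; that is sharing, not a loop.
benign = ["hello", [Ref(0), Ref(0)]]
expanded = unpack(Ref(1), benign)

# Entry 0 points at entry 1, which points back at entry 0.
malicious = [Ref(1), [Ref(0)]]
```

  With these tables, `unpack(Ref(1), benign)` expands normally, while
  `unpack(Ref(0), malicious)` raises `ReferenceLoop` instead of
  recursing forever.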

* "Packed item references in the newly constructed (low-numbered) parts
  of the table need to be interpreted": Given how long it took me to
  find that paragraph, it'd help if it used the term "Current Set".

  Suggestion:

  > The Current Set relevant for the newly constructed (low-numbered)
  > parts of the table is the new table (which includes the, now
  > higher-numbered, inherited parts). If any existing, inherited
  > (higher-numbered) parts contain packed references, their Current Set
  > stays unmodified; these references still go to the inherited (more
  > limited) number space.

  There's one odd detail about shifting the Current Set early, and I'm
  not sure whether it will impact zero-copy embedded devices: By the
  time the Current Set is first used (i.e. when seeing the shared
  items), the amount by which the old set was shifted is not yet known
  (because the prefix and suffix lengths have not been seen yet). It's
  probably OK because the items aren't evaluated yet anyway, but I ask
  you to think this through as well.
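
  The scoping rule in the suggested text could be sketched as follows.
  This is only my reading of it, under simplifying assumptions: a single
  table instead of separate shared/prefix/suffix tables, no loop
  protection, and hypothetical names (`Ref`, `Entry`, `extend_table`,
  `resolve`). Each entry remembers the number space it was defined in,
  so inherited entries keep resolving against the inherited table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a packed reference by table index."""
    index: int


@dataclass(eq=False)
class Entry:
    """A table entry together with the Current Set it is interpreted in."""
    item: object
    current_set: list


def extend_table(new_items, inherited):
    """Build a new table: new entries first, inherited entries shifted up.

    New entries get the whole new table as their Current Set; inherited
    entries are reused unchanged and keep their old Current Set.
    """
    table = []
    table.extend(Entry(item, table) for item in new_items)
    table.extend(inherited)
    return table


def resolve(item, current_set):
    """Expand references in item, switching Current Set per entry."""
    if isinstance(item, Ref):
        entry = current_set[item.index]
        return resolve(entry.item, entry.current_set)
    if isinstance(item, list):
        return [resolve(x, current_set) for x in item]
    return item


# Outer table: entry 1 refers to outer index 0 ("a").
outer = extend_table(["a", [Ref(0), "!"]], [])
# Inner table prepends three entries; the outer ones move to indices 3 and 4.
inner = extend_table(["b", Ref(0), Ref(3)], outer)
```

  Here `resolve(Ref(1), inner)` gives "b" (new entry, new number space),
  `resolve(Ref(3), inner)` gives "a", and `resolve(Ref(4), inner)` gives
  `["a", "!"]`: the inherited entry's `Ref(0)` still means the old index
  0, not the new one.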

* "By the application environment, e.g., a media type".

  Does this also happen implicitly when using a tag from file-magic?

  (If so, should a media type also describe what happens to existing
  entries inside the table?)

* ad dictionary referencing: I think that a good way to do this would
  also be to use its own registered tag (any discussions that pop up on
  tag evolvability will certainly help there). Referencing the
  dictionary via URI is probably too impractical to go into this
  document, but dictionary setup by custom tag might deserve a hint.

* This is currently informational. While that's enough to exercise the
  extension points (tag 6 is in 'standards action range', i.e. BCP is
  enough), if other documents are to use this in any official manner
  (CoRAL will likely say "are compressed using", "the dictionary is
  defined in terms of", and not "using any additional mechanisms such
  as" when referring to packed, to ensure good interoperability and not
  require another SDO that picks the right components), it'd be helpful
  if this were standards track from the outset.

  The datatracker "intended status" already reflects that.


BR & see you all soon
c

-- 
To use raw power is to make yourself infinitely vulnerable to greater powers.
  -- Bene Gesserit axiom