Re: [Cbor] Unusual map labels, dCBOR and interop

Carsten Bormann <cabo@tzi.org> Thu, 28 March 2024 06:21 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <CAN8C-_JJZ6uS5mBj_gNozC7H3+RG7ULBJ6gO55R9B7=gv=D_RA@mail.gmail.com>
Date: Thu, 28 Mar 2024 07:20:53 +0100
Cc: Wolf McNally <wolf@wolfmcnally.com>, "lgl island-resort.com" <lgl@island-resort.com>, cbor@ietf.org, Christopher Allen <christophera@lifewithalacrity.com>, Shannon Appelcline <shannon.appelcline@gmail.com>
Message-Id: <921D34F5-9E86-48B8-9D5F-F923F8B93A8F@tzi.org>
References: <8C245824-1990-4616-AB70-FFD4FACB1AE9@island-resort.com> <11E8A8A5-D891-49FF-AF16-697C06F463B3@tzi.org> <9A0CE364-C141-4EBE-9703-292C416D12F5@island-resort.com> <3D62C4F0-D570-4EE4-AF6A-163C708AA6BE@tzi.org> <58BA8F8C-0C63-4534-9BF7-255C32D02C16@island-resort.com> <5F1E1133-4565-4D0A-98EE-A13C6F5F67AA@wolfmcnally.com> <CAN8C-_+72_H=mk6xGuSk72rZWVg9Ff0d_b_o8Rz+kRWn1FruCQ@mail.gmail.com> <E585B8F9-BA13-4018-8D50-3C7560183BC4@wolfmcnally.com> <CAN8C-_JJZ6uS5mBj_gNozC7H3+RG7ULBJ6gO55R9B7=gv=D_RA@mail.gmail.com>
To: Orie Steele <orie@transmute.industries>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/dDgz05z6TPQ9Mb4EUrLiYJ71oDk>
Subject: Re: [Cbor] Unusual map labels, dCBOR and interop

Hi Orie,

An application profile specifies some aspects that are common to each member of a set of application protocols.
(By that definition, each application protocol also is an application profile, so much of the reasoning for application profiles needs to apply directly to application protocols.  Clearly, to be useful, an application profile should be applicable to a set of application protocols with much more than one member.)

draft-mcnally-deterministic-cbor-07 does not define a media type for dCBOR [1].
It could, but it would not be very useful, as you’d really want to know the specific application protocol (such as application/envelope+cbor in [2]).
If we did define one, we would undoubtedly call it application/dcbor+cbor, as dCBOR instances are CBOR instances.
A Structured Syntax Suffix [3] could be defined for dCBOR itself, but then we’d run into the question of whether that should be stacked on top of +cbor, with all the unwieldiness that entails (application/envelope+dcbor+cbor?) (***).

[1]: https://www.ietf.org/archive/id/draft-mcnally-deterministic-cbor-07.html#name-iana-considerations
[2]: https://www.ietf.org/archive/id/draft-mcnally-envelope-06.html
[3]: https://www.rfc-editor.org/rfc/rfc6839.html

> If undefined were encountered when deserializing dcbor to JavaScript, an error should be thrown?

(Or to any implementation platform.) Yes.
(Asking the same question for a CBOR generic decoder, the answer is always “no”, independent of application protocol or profile.  Of course, a CBOR decoder can be implemented with limitations that fit well with the application protocol(s) to be used, so CBOR decoders can be written that reject “undefined” or any other simple value (or 4711, if you like) right during the deserialization.)

> SCITT Receipts and CWTs are built on the assumption that no error would be thrown.

SCITT Receipts and CWTs are application protocols that have not been designed with the application profile dCBOR in mind.  (E.g., in CWT, “undefined” can occur once a Claim is defined that can have that inside its member value.  But then dCBOR would already reject receiving the floating-point value 1711604736.0 in an iat/exp/nbf, since numeric reduction requires the integer 1711604736 there.)
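
To illustrate the iat/exp/nbf case, here is a minimal Python sketch of a numeric-reduction check on a single encoded item.  The function name violates_numeric_reduction and the integer range used are assumptions made for illustration; the draft is the authority on the exact rule.

```python
import struct

# Initial bytes of CBOR half-, single- and double-precision floats,
# mapped to the matching struct format (big-endian).
FLOAT_FORMATS = {0xF9: ">e", 0xFA: ">f", 0xFB: ">d"}

def violates_numeric_reduction(item: bytes) -> bool:
    """Return True if `item` is an encoded float whose value is integral.

    Assumption: integral values from -2**63 up to 2**64 - 1 are expected
    to be encoded as integers; NaN and infinities are not integral.
    """
    fmt = FLOAT_FORMATS.get(item[0])
    if fmt is None:
        return False  # not a floating-point item at all
    (value,) = struct.unpack(fmt, item[1:])
    return value.is_integer() and -(2**63) <= value < 2**64

# The timestamp from the text, once as a float64 and once as a 32-bit uint.
as_float = b"\xfb" + struct.pack(">d", 1711604736.0)
as_uint = b"\x1a" + struct.pack(">I", 1711604736)
```

With these two encodings, the check flags as_float and accepts as_uint; a non-integral float such as 1.5 is also accepted.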

> I must have not understood the comment made regarding map labels, because I thought you were saying that dcbor serializes maps differently from vanilla CDE... If it doesn't, why the need for any new serialization?

(I think Wolf covered this.
Random information: My PoC implementation derives the dCBOR-serialized map keys for use as the sorting key in sorting the map members immediately before their serialization — which it would need to do anyway to actually encode them, but the actual serialization is then done separately inside a plain generic encoder(*).  The assumption here is that the map keys are small and structurally simple enough that the runtime hit is not a big problem.)
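
A minimal Python sketch of the approach just described (not the PoC itself): each map key is encoded first, and the encoded bytes serve as the sort key.  The helper names encode_head, encode_key and sort_map_for_encoding are hypothetical, and only integer and text-string keys are handled.

```python
import struct

def encode_head(major: int, n: int) -> bytes:
    """Encode a CBOR head using the shortest possible argument form."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 1 << 8:
        return bytes([(major << 5) | 24, n])
    if n < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", n)

def encode_key(key) -> bytes:
    """Encode a map key; this sketch supports integers and text strings only."""
    if isinstance(key, bool):
        raise TypeError("bool keys are not handled in this sketch")
    if isinstance(key, int):
        return encode_head(0, key) if key >= 0 else encode_head(1, -1 - key)
    if isinstance(key, str):
        data = key.encode("utf-8")
        return encode_head(3, len(data)) + data
    raise TypeError(f"unsupported key type: {type(key).__name__}")

def sort_map_for_encoding(m: dict) -> dict:
    """Return a new dict ordered by the bytewise order of the encoded keys."""
    return dict(sorted(m.items(), key=lambda kv: encode_key(kv[0])))

ordered = sort_map_for_encoding({"b": 1, 10: 2, "aa": 3, -1: 4, 100: 5, "a": 6})
```

Python dicts keep insertion order, so the result can be handed to a generic encoder that emits map members in iteration order.  Note that "b" sorts before "aa" here, because the length is part of the encoded key.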

> I don't think dcbor can be called application/cbor any more than json without booleans can be called application/json.

[1, 2, 3] is an instance of application/json.  There is no boolean in there.  Likewise, the use of application/cbor does not require that any CBOR instance is a valid instance of the application protocol.

> Processors not expecting Booleans in JSON would explode, and processors of dcbor expecting it to be meaningfully different than application/cbor will also explode right?

Not sure I can parse this.

> Feeding unexpected content to a parser is the normal expectation for security formats.

Absolutely, and a checking dCBOR decoder is a good tool to factor out detection of certain format violations to a common, reusable, well-tested library.
(**)

> We've seen real problems arise from JSON-LD, being valid json, but not quite valid "JSON-LD"... It can lead to ambiguous processing, where some middle boxes throw errors and others don't.

JSON-LD is RDF serialized in JSON.
RDF provides an application data model that is completely different from that of JSON.
JSON-LD documents "look like" JSON, and hand-made examples you can find in documents are typically made up in such a way that this illusion is fed well.
But interpreting JSON-LD documents as PoJ (plain old JSON) is, in general, a mistake.

The dCBOR application data model is a proper subset of the CBOR generic application data model, with identical semantics; no special processing is required after CBOR-decoding to do further interpretation by the application.
(Insert lengthy discussion whether numeric reduction leads to different semantics — mathematically, it does not, and dCBOR lends itself to application protocols that do not make that distinction either.)

> Distinguishing dcbor from cbor seems a prerequisite to making any progress on its application in a security context.

Not sure if I’m getting what you are trying to say, but you can’t distinguish a specific dCBOR instance from a CBOR instance, because any dCBOR instance is both.

Grüße, Carsten

(***) Of course, we have +der and +ber, but I don’t think we want to emulate that here.

(*) The generic encoder on my implementation platform luckily preserves the order of map elements set by the sorting; this relies on the platform’s maps keeping insertion order.

(**) My 65-LOC PoC dCBOR implementation doesn’t even come with a decoder, so the user of that library has to do the checking of incoming dCBOR.
This checking can be implemented trivially by dCBOR-encoding the result of the decode with a strict encoder and ensuring the bytes of the encoded representation are the same as the original encoded input.  (Insert runtime argument here.)  I probably should add those 5 LOC for a dCBOR decode function to the code, once I’ve made the encoder strict…
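
The re-encode-and-compare check can be sketched in Python as follows.  This is an illustration under stated assumptions, not the PoC: it covers only integers, text strings, arrays and maps with integer or string keys, and it leaves out other dCBOR rules such as numeric reduction of floats.  All function names are hypothetical.

```python
import struct

def head(major: int, n: int) -> bytes:
    """Encode a CBOR head with the shortest argument form."""
    if n < 24:
        return bytes([major << 5 | n])
    for ai, fmt, limit in ((24, ">B", 1 << 8), (25, ">H", 1 << 16),
                           (26, ">I", 1 << 32), (27, ">Q", 1 << 64)):
        if n < limit:
            return bytes([major << 5 | ai]) + struct.pack(fmt, n)
    raise ValueError("integer out of range")

def encode(item) -> bytes:
    """Strict encoder for the subset: shortest heads, maps sorted by encoded key."""
    if isinstance(item, bool):
        raise TypeError("bool is outside this sketch")
    if isinstance(item, int):
        return head(0, item) if item >= 0 else head(1, -1 - item)
    if isinstance(item, str):
        data = item.encode("utf-8")
        return head(3, len(data)) + data
    if isinstance(item, list):
        return head(4, len(item)) + b"".join(encode(x) for x in item)
    if isinstance(item, dict):
        pairs = sorted((encode(k), encode(v)) for k, v in item.items())
        return head(5, len(pairs)) + b"".join(k + v for k, v in pairs)
    raise TypeError(f"unsupported type: {type(item).__name__}")

def decode_item(data: bytes, pos: int = 0):
    """Lenient decoder for the same subset; returns (item, next position)."""
    major, ai = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    if ai < 24:
        n = ai
    elif ai <= 27:
        size = 1 << (ai - 24)
        if pos + size > len(data):
            raise ValueError("truncated head")
        n = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("indefinite length or reserved value")
    if major == 0:
        return n, pos
    if major == 1:
        return -1 - n, pos
    if major == 3:
        if pos + n > len(data):
            raise ValueError("truncated string")
        return data[pos:pos + n].decode("utf-8"), pos + n
    if major == 4:
        items = []
        for _ in range(n):
            x, pos = decode_item(data, pos)
            items.append(x)
        return items, pos
    if major == 5:
        m = {}
        for _ in range(n):
            k, pos = decode_item(data, pos)
            v, pos = decode_item(data, pos)
            if not isinstance(k, (int, str)) or k in m:
                raise ValueError("unsupported or duplicate map key")
            m[k] = v
        return m, pos
    raise ValueError("major type outside this sketch")

def is_deterministic(data: bytes) -> bool:
    """Decode, re-encode strictly, and compare with the original bytes."""
    try:
        item, end = decode_item(data)
    except (ValueError, IndexError):
        return False
    return end == len(data) and encode(item) == data
```

For example, is_deterministic(bytes.fromhex("a2016161026162")) accepts the map {1: "a", 2: "b"}, while the same map with its members swapped, or the integer 1 encoded as 18 01, is rejected.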