[Cbor] Chunks with tags inside indefinite-length string (major type 2 and 3)

Faye Amacker <faye.github@gmail.com> Thu, 05 December 2019 15:27 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/D-mmEm5bnWlYYTxf44J1WBcTqLs>

While implementing support for tags, I ran into a scenario that might
warrant an erratum, or at least some clarification, in RFC 7049bis.

Section 3.2.3 states:

> Indefinite-length strings are represented by a byte containing the major
> type and additional information value of 31, followed by a series of zero
> or more byte or text strings (“chunks”) that have definite lengths,
> followed by the “break” stop code (Section 3.2.1). The data item
> represented by the indefinite-length string is the concatenation of the
> chunks (i.e., the empty byte or text string, respectively, if no chunk is
> present).
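
To make the concatenation rule concrete, here is a minimal Python sketch
(mine, not from the spec). The function name concat_chunks is hypothetical,
and the sketch assumes well-formed input whose chunk lengths fit in the
initial byte (0 to 23). The example bytes are the (_ h'0102', h'030405')
item from the RFC 7049 examples.

```python
def concat_chunks(data: bytes) -> bytes:
    """Concatenate the chunks of one indefinite-length byte or text string.

    Assumptions for this sketch: the input is well-formed, and every chunk
    length is 0..23 so that it is encoded directly in the chunk's initial byte.
    """
    major, info = data[0] >> 5, data[0] & 0x1F
    if major not in (2, 3) or info != 31:
        raise ValueError("not an indefinite-length string")
    out = bytearray()
    pos = 1
    while data[pos] != 0xFF:          # 0xFF is the "break" stop code
        length = data[pos] & 0x1F     # low 5 bits hold the chunk length
        if length > 23:
            raise ValueError("longer length encodings are out of scope here")
        out += data[pos + 1 : pos + 1 + length]
        pos += 1 + length
    return bytes(out)


# 5f = start of indefinite-length byte string, 42 0102 and 43 030405 = chunks,
# ff = break
example = bytes.fromhex("5f42010243030405ff")
print(concat_chunks(example).hex())   # 0102030405
```

With no chunks at all (bytes 5f ff), the same function returns the empty
byte string, matching the last clause of the quoted text.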

And the pseudocode in Appendix C allows tags for chunks inside
indefinite-length strings.

Chunks are simply fragments to be concatenated, so applying tags to chunks
doesn't seem intuitive.  Unlike array elements, chunks are not intended to
be accessed independently.  Some tags (like #2 and #3) transform a byte
string into a bignum, which makes sense for array elements but not for
chunks.
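
As a small illustration of why per-chunk tags are awkward, here is a Python
sketch of the bignum interpretation: tag 2 reads the byte string as an
unsigned big-endian integer n, and tag 3 yields -1 - n. The function name
bignum_from_tag and the chunk split below are my own assumptions, chosen
only for the example.

```python
def bignum_from_tag(tag: int, payload: bytes) -> int:
    """Interpret a byte string as a bignum according to tag 2 or tag 3."""
    n = int.from_bytes(payload, "big")
    if tag == 2:          # positive bignum
        return n
    if tag == 3:          # negative bignum
        return -1 - n
    raise ValueError("only tags 2 and 3 are handled in this sketch")


whole = bytes.fromhex("010000000000000000")
print(bignum_from_tag(2, whole))    # 18446744073709551616
print(bignum_from_tag(3, whole))    # -18446744073709551617

# Hypothetical split of the same bytes into two chunks, each tagged with 2:
chunks = [bytes.fromhex("01"), bytes.fromhex("0000000000000000")]
print([bignum_from_tag(2, c) for c in chunks])   # [1, 0]
```

Interpreting each chunk separately gives 1 and 0, and there is no obvious
way to combine those into the value of the concatenated string.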

If tags must be applied to each chunk individually, the text should say so
explicitly, because library authors might otherwise assume that a tagged
chunk should be rejected, or that the tag applies to the concatenated
string rather than to the chunk.

However, if the tag for a chunk applies to the entire concatenated string,
then what happens when there are multiple chunks with different tags? Which
tag wins?

A simpler way forward is to treat tags on chunks as malformed.  If this is
the way to go, then the pseudocode in Appendix C needs to be updated and an
example could be added to Appendix G.
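
A rough Python sketch of what "treat tags on chunks as malformed" could look
like in a decoder follows. It is not the Appendix C pseudocode and not the
code in my library; MalformedCBOR, read_length, and
decode_indefinite_string_strict are hypothetical names used only here.

```python
class MalformedCBOR(ValueError):
    """Hypothetical error type for this sketch."""


def read_length(data: bytes, pos: int, info: int):
    """Return (length, next_pos) for a definite-length chunk head."""
    if info < 24:
        return info, pos
    if 24 <= info <= 27:              # 1-, 2-, 4-, or 8-byte length follows
        size = 1 << (info - 24)
        if pos + size > len(data):
            raise MalformedCBOR("truncated chunk length")
        return int.from_bytes(data[pos:pos + size], "big"), pos + size
    raise MalformedCBOR("chunk length is reserved or indefinite")


def decode_indefinite_string_strict(data: bytes) -> bytes:
    """Decode one indefinite-length string, rejecting tagged chunks."""
    if not data:
        raise MalformedCBOR("empty input")
    major, info = data[0] >> 5, data[0] & 0x1F
    if major not in (2, 3) or info != 31:
        raise MalformedCBOR("not an indefinite-length string")
    out = bytearray()
    pos = 1
    while True:
        if pos >= len(data):
            raise MalformedCBOR("missing break stop code")
        initial = data[pos]
        pos += 1
        if initial == 0xFF:           # "break" ends the string
            return bytes(out)
        chunk_major, chunk_info = initial >> 5, initial & 0x1F
        if chunk_major == 6:          # major type 6 is a tag
            raise MalformedCBOR("tag on a chunk inside an indefinite-length string")
        if chunk_major != major:
            raise MalformedCBOR("chunk major type differs from enclosing string")
        length, pos = read_length(data, pos, chunk_info)
        if pos + length > len(data):
            raise MalformedCBOR("truncated chunk")
        out += data[pos:pos + length]
        pos += length


print(decode_indefinite_string_strict(bytes.fromhex("5f42010243030405ff")).hex())
# 0102030405

try:
    # 5f, then tag 2 (c2) wrapping the chunk 41 01, then break
    decode_indefinite_string_strict(bytes.fromhex("5fc24101ff"))
except MalformedCBOR as err:
    print("rejected:", err)
```

Rejecting at the first tag also sidesteps the "which tag wins?" question,
since an input such as 5f c2 41 01 c3 41 02 ff never reaches the second
chunk.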

GitHub issue for RFC 7049bis: https://github.com/cbor-wg/CBORbis/issues/148
GitHub issue for my CBOR library:
https://github.com/fxamacker/cbor/issues/44