Re: [Cbor] RFC7049bis processing of unknown tags

Jeffrey Yasskin <jyasskin@chromium.org> Thu, 07 May 2020 22:50 UTC

References: <17300.1588779159@localhost> <38BB6FFF-737F-4C11-AD7A-DA3F28A9F570@tzi.org>
In-Reply-To: <38BB6FFF-737F-4C11-AD7A-DA3F28A9F570@tzi.org>
From: Jeffrey Yasskin <jyasskin@chromium.org>
Date: Thu, 07 May 2020 15:49:50 -0700
Message-ID: <CANh-dXkdjMyO=WFUxrF06OfP+RE9v11unKJXL8P3UtEe+prV1w@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Michael Richardson <mcr+ietf@sandelman.ca>, cbor@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/ThngELPRL04VSqFZw_ReU70qaC0>
Subject: Re: [Cbor] RFC7049bis processing of unknown tags

I think at least some of this change was my doing, so I'll send three
opinions.

1) I should have changed the text in 7.1, "Implementations receiving an
unknown tag number can choose to simply ignore it (process just the
enclosed tag content)", and I must have just missed it.
2) I don't think it makes sense for any parser to see a tag and give the
application just the tag's content instead of the tagged item. That said, this
is a change that does make some RFC7049-compliant parsers non-compliant with
RFC7049-bis, which is a downside. On the third hand, those parsers will still
exist, and applications using them don't need to upgrade if upgrading would
break their protocol.
3) While RFC7049-bis is already somewhat overstepping the bounds of a
standard in specifying error-handling options, it'd be even farther out of
bounds to specify parser API options. :)
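
To make point 2 concrete, here is a minimal Python sketch of what an
application loses when a parser drops a tag. It is not taken from RFC7049 or
from any real library: `Tagged` and `strip_tags` are hypothetical names
invented for illustration. Tag 2 is the bignum tag from RFC7049, and the byte
string is the 9-byte big-endian encoding of 2**64.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Tagged:
    """Hypothetical wrapper: a tag number plus its enclosed content."""
    number: int
    content: Any


def strip_tags(item: Any) -> Any:
    """Model a parser that ignores tags and hands over only the content."""
    while isinstance(item, Tagged):
        item = item.content
    return item


# Tag 2 (bignum) wrapping the big-endian bytes of 2**64.
bignum = Tagged(2, bytes.fromhex("010000000000000000"))
# An ordinary byte string that happens to hold the same bytes.
plain_bytes = bytes.fromhex("010000000000000000")

# With the tag kept, the application can tell the two items apart
# and can interpret the bignum itself.
assert bignum != plain_bytes
assert int.from_bytes(bignum.content, "big") == 2**64

# With the tag stripped, both items look identical to the application.
assert strip_tags(bignum) == strip_tags(plain_bytes)
```

Once the tag is gone, the application has no way to recover whether the
sender meant an integer or raw bytes.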

Jeffrey

On Wed, May 6, 2020 at 9:07 AM Carsten Bormann <cabo@tzi.org> wrote:

> Hi Michael,
>
> Thank you for your interjection at the CBOR interim.
> I think this is a good issue to roll up.
>
> On 2020-05-06, at 17:32, Michael Richardson <mcr+ietf@sandelman.ca> wrote:
> >
> >
> > After discussion about #176/#181 I submitted #182:
> >
> > https://github.com/cbor-wg/CBORbis/issues/182
> >
> > RFC7049 specified that CBOR tags which were not recognized should be
> > ignored.
>
> That is not exactly what it says:
>
> (3.5)
>    A decoder that comes across a tag (Section 2.4) that it does not
>    recognize, such as a tag that was added to the IANA registry after
>    the decoder was deployed or a tag that the decoder chose not to
>    implement, might issue a warning, might stop processing altogether,
>    might handle the error and present the unknown tag value together
>    with the contained data item to the application (as is expected of
>    generic decoders), might ignore the tag and simply present the
>    contained data item only to the application, or take some other type
>    of action.
>
> So there always was a choice.
> Note that there always was a preference to present the unknown tag to the
> application, for “generic decoders”; i.e., the choice is more for
> application-specific decoders.
>
> > RFC7049bis wishes to change this behaviour such that unknown tags would not
> > be ignored, but would at least, be presented to the application for further
> > determination. This is a change that would render existing CBOR parsers
> > instantly invalid.
>
> A change that would remove the choice is not what was intended so far,
> just increased emphasis on the options:
>
> — 1 might issue a warning,
> — 2 might stop processing altogether,
> — 3 might handle the error and present the unknown tag value together
>    with the contained data item to the application (as is expected of
>    generic decoders),
>
> as opposed to
>
> — 4 might ignore the tag and simply present the
>    contained data item only to the application, or
> — 5 take some other type
>    of action.
>
> Generally, it is a good idea if users of a library know what they will
> get, so the behavior to be expected needs to be documented.  A generic
> decoder that only does 3 (as is “expected”) will be the most interoperable
> one.
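
A rough sketch of how a library might expose options 1 to 4 above as a
documented, caller-selected policy. All names here (`UnknownTagPolicy`,
`handle_unknown_tag`, `UnknownTagError`, `Tagged`) are hypothetical and do not
come from either RFC or from an existing parser. Two assumptions are baked in:
the "warn" option still presents the tag to the application, and option 5
("some other type of action") is left out.

```python
import warnings
from dataclasses import dataclass
from enum import Enum
from typing import Any


@dataclass(frozen=True)
class Tagged:
    """Hypothetical wrapper: a tag number plus its enclosed content."""
    number: int
    content: Any


class UnknownTagPolicy(Enum):
    WARN = 1     # option 1: issue a warning (assumed: then present the tag)
    STOP = 2     # option 2: stop processing altogether
    PRESENT = 3  # option 3: present tag number and content together
    IGNORE = 4   # option 4: present only the contained data item


class UnknownTagError(ValueError):
    """Raised when the policy is STOP and an unknown tag is seen."""


def handle_unknown_tag(number: int, content: Any,
                       policy: UnknownTagPolicy = UnknownTagPolicy.PRESENT) -> Any:
    """Apply the chosen policy to a tag the decoder does not recognize."""
    if policy is UnknownTagPolicy.STOP:
        raise UnknownTagError(f"unknown tag number {number}")
    if policy is UnknownTagPolicy.IGNORE:
        return content
    if policy is UnknownTagPolicy.WARN:
        warnings.warn(f"unknown tag number {number}")
    # WARN and PRESENT both hand the tag number and content to the caller.
    return Tagged(number, content)
```

Defaulting to `PRESENT` mirrors the "generic decoder" expectation. A caller
that wants the tag dropped has to ask for `IGNORE` explicitly, which keeps the
behaviour visible in the calling code.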
>
> There are two pieces of text in 7049bis that may not entirely be aligned:
>
> (3.4:)
>    Decoders do not need to understand tags of every tag number, and tags
>    may be of little value in applications where the implementation
>    creating a particular CBOR data item and the implementation decoding
>    that stream know the semantic meaning of each item in the data flow.
>    Their primary purpose in this specification is to define common data
>    types such as dates.  A secondary purpose is to provide conversion
>    hints when it is foreseen that the CBOR data item needs to be
>    translated into a different format, requiring hints about the content
>    of items.  Understanding the semantics of tags is optional for a
>    decoder; it can simply present both the tag number and the tag
>    content to the application, without interpreting the additional
>    semantics of the tag.
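
The last sentence of 3.4 can be illustrated with a tiny decoder that never
interprets tag semantics and always presents the tag number together with the
tag content. This is a sketch under stated assumptions, not a conforming CBOR
implementation: it handles only unsigned integers, byte strings, text strings
and tags, does no bounds checking on truncated input, and `Tagged`, `decode`
and `_read_argument` are names made up for this example.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Tagged:
    """Hypothetical wrapper: a tag number plus its enclosed content."""
    number: int
    content: Any


def _read_argument(data: bytes, pos: int) -> Tuple[int, int]:
    """Read the argument of the initial byte at pos; return (value, next_pos)."""
    info = data[pos] & 0x1F          # low 5 bits: additional information
    pos += 1
    if info < 24:                    # argument is stored in the initial byte
        return info, pos
    if info in (24, 25, 26, 27):     # argument follows in 1, 2, 4 or 8 bytes
        size = 1 << (info - 24)
        return int.from_bytes(data[pos:pos + size], "big"), pos + size
    raise ValueError(f"additional information {info} not handled in this sketch")


def decode(data: bytes, pos: int = 0) -> Tuple[Any, int]:
    """Decode one data item starting at pos; return (item, next_pos)."""
    major = data[pos] >> 5           # high 3 bits: major type
    arg, pos = _read_argument(data, pos)
    if major == 0:                   # unsigned integer
        return arg, pos
    if major == 2:                   # byte string of length arg
        return data[pos:pos + arg], pos + arg
    if major == 3:                   # text string of length arg
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 6:                   # tag: keep the number, decode the content
        content, pos = decode(data, pos)
        return Tagged(arg, content), pos
    raise ValueError(f"major type {major} not handled in this sketch")


# Tag 1 (epoch-based date/time) wrapping an unsigned integer.
item, _ = decode(bytes.fromhex("c11a514b67b0"))
assert item == Tagged(1, 0x514B67B0)

# A tag number picked arbitrarily for this example, wrapping the text "a".
item, _ = decode(bytes.fromhex("d904d26161"))
assert item == Tagged(1234, "a")
```

The decoder treats every tag the same way, so whether a tag number is "known"
becomes a question for the application rather than for the parser.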
>
> But also:
>
> (7.1:)
>    CBOR has three major extension points:
>
>    […]
>
>    *  the "tag" space (values in major type 6).  Again, only a small
>       part of the codepoint space has been allocated, and the space is
>       abundant (although the early numbers are more efficient than the
>       later ones).  Implementations receiving an unknown tag number can
>       choose to simply ignore it (process just the enclosed tag content)
>       or to process it as an unknown tag number wrapping the tag
>       content.  The IANA registry in Section 9.2 is the appropriate way
>       to address the extensibility of this codepoint space.
>
> > The suggestion is that parsers should be in RFC7049 mode by default,
>
> But there is no RFC 7049 mode — there is a choice.
>
> > and
> > applications that want RFC7049bis behaviour should initialize the parser with
> > an option that enables it [or use a new parser with awareness].
>
> It definitely is a good idea to either always get behavior 3, or to
> provide flags that control the behavior.
>
> > Applications that want to make use of tags defined in RFC7049bis need to put
> > the parser in RFC7049bis mode.
>
> RFC7049bis does not define any new tags (so far).
>
> > I think that Carsten does not agree with my suggested solution, but I'm not
> > attached to it.
>
> I hope I have explained how the situation is a bit more nuanced than might
> have come over in the meeting.
>
> Grüße, Carsten