[Cbor] Glitch in cddl-control AUTH48

Carsten Bormann <cabo@tzi.org> Tue, 07 December 2021 21:12 UTC

From: Carsten Bormann <cabo@tzi.org>
Date: Tue, 7 Dec 2021 22:12:33 +0100
Message-Id: <BD71DE08-ABB4-445E-9215-AB8144F5A087@tzi.org>
References: <077EDC59-74CE-42B4-B651-BA8846719CD9@tzi.org>
To: cbor@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/dNulH3PiCf4AMVhX9nmtVKqiYLI>

Hi,

In the last step (“AUTH48”) before a document like cddl-control becomes an RFC, the author(s) are supposed to do a full reread to capture any problems that remained hidden during the incremental development of the I-D.

Normally, we find a few editorial problems (which we did; they have since been dealt with).
This time, however, there is also a technical contradiction between an illustration (Figure 4) and the normative text that should have been taken into account when constructing the example in that illustration.

Background: cddl-control deliberately allows byte string literals in places where text string literals would really make more sense.  The reason is that the content of these literals often is specification text (ABNF) with embedded newlines, which are much easier to write as byte strings (where embedded newlines are allowed in RFC 8610) than as text strings (which inherit the JSON syntax that requires replacing every newline with a linear \n, creating essentially unreadable source text).  This is demonstrated in Figure 2:

>    c = "foo" .cat '
>      bar
>      baz
>    '
>    ; on a system where the newline is \n, is the same string as:
>    b = "foo\n  bar\n  baz\n"
> 
>        Figure 2: An Example of Concatenation of Text and Byte Strings
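
To make the equivalence in Figure 2 concrete, here is a minimal sketch in Python. It is not taken from the cddl tool; the function name `cat` is hypothetical, and the rule that the result takes the kind of the target is my reading of the .cat definition in the draft. The byte string literal is written with the two-space indentation shown in the comparison line for b.

```python
from typing import Union

StringLike = Union[str, bytes]


def cat(target: StringLike, controller: StringLike) -> StringLike:
    """Sketch of .cat: concatenate two strings, keeping the target's kind.

    Assumption: a byte string joined onto a text string target is read
    as UTF-8, and a text string joined onto a byte string target is
    encoded as UTF-8.
    """
    if isinstance(target, str):
        if isinstance(controller, bytes):
            # Strict decoding raises UnicodeDecodeError on invalid UTF-8.
            controller = controller.decode("utf-8")
        return target + controller
    if isinstance(controller, str):
        controller = controller.encode("utf-8")
    return target + controller


# The byte string literal from Figure 2, with its embedded newlines.
c = cat("foo", b"\n  bar\n  baz\n")
b = "foo\n  bar\n  baz\n"
print(c == b)
```

The point of the figure is visible here: the multi-line literal only needs escaping in the text string form.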

So we handled that properly for .cat and .det, but it turns out we did not for .abnf and .abnfb themselves.  Here, bullet 1 (which mostly is about the target and its interpretation) in Section 3 tersely says:

>    The controller string MUST be a text string.

Not only is this not what the cddl tool implements (it accepts a byte string just fine), it also directly contradicts Figure 4:

>    oid = bytes .abnfb 'oid
>    oid = 1*arc
>    roid = *arc
>    arc = [nlsb] %x00-7f
>    nlsb = %x81-ff *%x80-ff
>    '
> 
>              Figure 4: Dedenting example: result of first .det

To me, this is an oversight: we simply didn’t update that sentence to the more useful behavior of also accepting byte strings, which are allowed here only because their syntax handles newlines better.  So I am suggesting we change the sentence to:

> The controller MUST be a string.  When a byte string, it MUST be valid UTF-8 and is interpreted as the text string that has the same sequence of bytes.

… which I’m suggesting as a fix in the excerpt of the AUTH48 processing mail copied below.  I apologize for not seeing this earlier.
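
For implementers, the proposed sentence amounts to a small normalization step before the ABNF is parsed. The following Python sketch illustrates that reading; it is not the cddl tool's code, and the function name `controller_as_text` and the choice of exceptions are assumptions made for illustration.

```python
from typing import Union


def controller_as_text(controller: Union[str, bytes]) -> str:
    """Return the controller of .abnf/.abnfb as a text string.

    A text string is used as is. A byte string must be valid UTF-8 and
    is interpreted as the text string with the same sequence of bytes.
    """
    if isinstance(controller, str):
        return controller
    if isinstance(controller, bytes):
        try:
            return controller.decode("utf-8", errors="strict")
        except UnicodeDecodeError as exc:
            raise ValueError("byte string controller is not valid UTF-8") from exc
    raise TypeError("controller must be a text string or a byte string")


# The first lines of the controller in Figure 4, given as a byte string.
abnf = controller_as_text(b"oid\noid = 1*arc\nroid = *arc\n")
print(abnf.splitlines()[1])
```

With this step in place, the rest of an implementation can treat both kinds of controller literal identically.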

Our AD Francesca brought to my attention that cbor@ietf.org is not copied on AUTH48 processing; I had forgotten that (other WGs do that copying habitually).  So I didn’t think to bring this up separately here, which I am now doing.

You can find the text that will be published if we do make the last-minute change, at the end of the bullet:

https://www.rfc-editor.org/authors/rfc9165.html#section-3-4.1

(Note that this text is an RFC-to-be, not the final RFC; please don’t start citing it as an RFC before it has been published.)

Does anybody in the CBOR WG see a technical problem with making this last-minute fix?  If you see a problem (with implementation or use of byte string literals not only in the controller arguments of .cat/.det, but also in those of .abnf/.abnfb), please speak up quickly; this is the last item standing in the way of publishing this RFC.

Grüße, Carsten



> Begin forwarded message:
> 
> From: Carsten Bormann <cabo@tzi.org>
> Date: 2021-12-04 at 17:00:03 CET
> To: <rfc-editor@rfc-editor.org>
> Cc: cbor-ads@ietf.org, cbor-chairs@ietf.org, Christian Amsüss <christian@amsuess.com>, francesca.palombini@ericsson.com
> Message-Id: <077EDC59-74CE-42B4-B651-BA8846719CD9@tzi.org>
> 
[…]
> (0)
> The last sentence in the first bullet in Section 3 says "The controller string MUST be a text string."
> However, the previous section’s example in Figure 4 happens to use a byte string literal, and leniently allowing both kinds of strings is also what is implemented.
> I believe loosening this up has the upside of simplifying cases such as in Figure 4, and no downside, so I would suggest:
> 
> OLD:
> The controller string MUST be a text string.
> NEW:
> The controller MUST be a string.  When a byte string, it MUST be valid UTF-8 and is interpreted as the text string that has the same sequence of bytes.
[…]