Re: [Cbor] [Technical Errata Reported] RFC8610 (6527)

Carsten Bormann <cabo@tzi.org> Tue, 13 April 2021 12:13 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <4986660B-EDCC-4D07-A74E-BBEBE698721D@tzi.org>
Date: Tue, 13 Apr 2021 14:13:22 +0200
Cc: Henk Birkholz <henk.birkholz@sit.fraunhofer.de>, christoph.vigano@uni-bremen.de, "Murray S. Kucherawy" <superuser@gmail.com>, Francesca Palombini <francesca.palombini@ericsson.com>, Barry Leiba <barryleiba@computer.org>, Christian Amsüss <christian@amsuess.com>
Message-Id: <2E410DD1-D0E2-4137-B7E7-7FB18CF71971@tzi.org>
References: <20210411161045.9648FF40799@rfc-editor.org> <4986660B-EDCC-4D07-A74E-BBEBE698721D@tzi.org>
To: cbor@ietf.org, smbarte2@illinois.edu
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/N1lb3N1tGBGbqg3puezb1L9f-xY>

On 2021-04-11, at 18:51, Carsten Bormann <cabo@tzi.org> wrote:
> 
> However, there is one omission in the ABNF: the \u syntax, which is somewhat complicated in JSON because it is followed either by 4 hex digits that are not in the range d800 to dfff or by 4 hex digits in the range d800 to dbff, another \u, and four more hex digits dc00 to dfff.
> Someone needs to sit down and write up the ABNF for that (or find some ABNF that already has done the work).

I was too lazy to look for existing ABNF, so I wrote my own.

76c76,84
< SESC = "\" (%x20-7E / %x80-10FFFD)
---
> 
> SESC = "\" ( %x22 / %x2F / %x5C / %x62 / %x66 / %x6E / %x72 / %x74 /
>              (%x75 hexchar) )
> 
> hexchar = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
>            ("D"
>             (( %x30-37 2HEXDIG ) /
>              (("8"/"9"/"A"/"B") 2HEXDIG "\" %x75 "D"
>               ("C"/"D"/"E"/"F") 2HEXDIG )))
79c87
< BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
---
> BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / "\'" / CRLF


The second change is necessary as SESC is now narrowly restricted to the escape combinations that JSON allows in text strings, and \' is not among those.
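
For illustration only, here is a minimal Python sketch of what the new SESC and hexchar rules accept. It is a hand transcription of the ABNF above into a regular expression, not code taken from cddlc; the names `HEXCHAR`, `SESC_RE`, and `is_sesc` are made up for this example. It assumes the usual ABNF reading that quoted strings such as "D" match case-insensitively, while %x values such as %x75 match only that exact character.

```python
import re

# Hypothetical transcription of the proposed hexchar rule (not from cddlc).
# "H" is a placeholder for one HEXDIG and is expanded by replace() below.
HEXCHAR = (
    r"(?:[0-9A-CEFa-cef]H{3}"               # first digit is anything but D
    r"|[Dd](?:[0-7]H{2}"                    # D000..D7FF: not a surrogate
    r"|[89ABab]H{2}\\u[Dd][C-Fc-f]H{2}))"   # high surrogate, then \u + low surrogate
).replace("H", "[0-9A-Fa-f]")

# SESC: a backslash followed by one of the JSON escape characters,
# or by a lowercase "u" (%x75) and a hexchar.
SESC_RE = re.compile(r'\\(?:["/\\bfnrt]|u' + HEXCHAR + ")")


def is_sesc(s: str) -> bool:
    """Return True if s is exactly one escape sequence allowed by the new SESC."""
    return SESC_RE.fullmatch(s) is not None


if __name__ == "__main__":
    for candidate in (r"\n", r"\u00e9", r"\uD83D\uDE00", r"\uD83D", r"\'"):
        print(candidate, is_sesc(candidate))
```

A lone surrogate such as \uD83D and the sequence \' are both rejected by this sketch, which is why BCHAR needs its own explicit "\'" alternative.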

You can play with this changed grammar in cddlc version 0.0.3 (`gem update` if needed).

> I’m not sure this update should all be put into an errata item; maybe we should pursue writing this up in a document that updates RFC 8610 that could then also add a less unwieldy syntax for non-BMP code points such as the ubiquitous \u{…}.

I’m still not entirely sure we want to handle this as a bog-standard errata item.
Opinions welcome.

Regards, Carsten