Re: [Cbor] [Technical Errata Reported] RFC8610 (6278)

Doug Ewell <doug@ewellic.org> Fri, 04 September 2020 02:06 UTC

From: Doug Ewell <doug@ewellic.org>
To: 'Carsten Bormann' <cabo@tzi.org>, 'RFC Errata System' <rfc-editor@rfc-editor.org>
Cc: 'Barry Leiba' <barryleiba@computer.org>, 'Henk Birkholz' <henk.birkholz@sit.fraunhofer.de>, 'Jim Schaad' <ietf@augustcellars.com>, eds@reric.net, cbor@ietf.org, "'Murray S. Kucherawy'" <superuser@gmail.com>, christoph.vigano@uni-bremen.de, 'Francesca Palombini' <francesca.palombini@ericsson.com>
References: <20200904001702.6E249F40781@rfc-editor.org> <ABA95628-D8A6-4E8F-A3CD-B51EF5B9ADF9@tzi.org>
In-Reply-To: <ABA95628-D8A6-4E8F-A3CD-B51EF5B9ADF9@tzi.org>
Date: Thu, 03 Sep 2020 20:06:42 -0600
Message-ID: <000001d68260$096e6770$1c4b3650$@ewellic.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/5wR8lT6Gz0t1wV7l25Kvc0j0Nhg>

> Corrected Text
> --------------
> BCHAR = %x20-26 / %x28-5B / %x5D-7E / %x80-10FFFD / SESC / CRLF

Actually, if the range is meant to end at U+10FFFD (thus excluding the noncharacters U+10FFFE and U+10FFFF), then it should also exclude the following:

1. the U+xxFFFE and U+xxFFFF noncharacters in every other plane
2. noncharacters U+FDD0 through U+FDEF
3. surrogate code points 0xD800 through 0xDFFF, which aren't even Unicode scalar values (hence not marked with "U+" notation)

Of course this goes for the definitions of PCHAR, SCHAR, and SESC as well.
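
For anyone who wants to check the three exclusions mechanically, here is a minimal Python sketch. The function names (is_surrogate, is_noncharacter, is_excluded) are mine, not from RFC 8610 or the erratum; the ranges are the ones listed above.

```python
# Sketch only: classifies code points that the list above says should be
# excluded. Function names are illustrative, not taken from any spec.

def is_surrogate(cp: int) -> bool:
    """Surrogate code points 0xD800 through 0xDFFF (not scalar values)."""
    return 0xD800 <= cp <= 0xDFFF

def is_noncharacter(cp: int) -> bool:
    """U+FDD0..U+FDEF, plus the last two code points of every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

def is_excluded(cp: int) -> bool:
    """True if cp falls into any of the three excluded groups."""
    return is_surrogate(cp) or is_noncharacter(cp)

if __name__ == "__main__":
    for cp in (0xD800, 0xFDD0, 0xFFFD, 0xFFFE, 0x1FFFF, 0x10FFFD, 0x10FFFF):
        print(f"{cp:06X} excluded={is_excluded(cp)}")
```

The mask test works because every plane ends in xxFFFE and xxFFFF, so the low 16 bits with the last bit cleared are always FFFE for exactly those two code points.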

A new symbol, NONASCII, could resolve all this:

NONASCII = %x80-D7FF / %xE000-FDCF / %xFDF0-FFFD
         / %x10000-1FFFD
         / %x20000-2FFFD
         / %x30000-3FFFD
         / %x40000-4FFFD
         / %x50000-5FFFD
         / %x60000-6FFFD
         / %x70000-7FFFD
         / %x80000-8FFFD
         / %x90000-9FFFD
         / %xA0000-AFFFD
         / %xB0000-BFFFD
         / %xC0000-CFFFD
         / %xD0000-DFFFD
         / %xE0000-EFFFD
         / %xF0000-FFFFD
         / %x100000-10FFFD
[...]
SCHAR = %x20-21 / %x23-5B / %x5D-7E / NONASCII / SESC
SESC = "\" (%x20-7E / NONASCII)
[...]
BCHAR = %x20-26 / %x28-5B / %x5D-7E / NONASCII / SESC / CRLF
[...]
PCHAR = %x20-7E / NONASCII
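
Since the NONASCII rule is long and repetitive, it may be easier to generate than to type. The following is a rough Python sketch that builds the same ranges and prints them as ABNF alternatives; nonascii_ranges, to_abnf, and in_nonascii are hypothetical helper names of my own, and the output format is only meant to match the rule as written above.

```python
# Sketch only: rebuilds the proposed NONASCII ranges programmatically.
# Helper names are illustrative and not part of RFC 8610.

def nonascii_ranges() -> list[tuple[int, int]]:
    """Return (low, high) pairs matching the proposed NONASCII rule."""
    # Plane 0: skip surrogates, U+FDD0..U+FDEF, and U+FFFE/U+FFFF.
    ranges = [(0x80, 0xD7FF), (0xE000, 0xFDCF), (0xFDF0, 0xFFFD)]
    # Planes 1..16: everything except the final two code points.
    for plane in range(1, 17):
        base = plane << 16
        ranges.append((base, base | 0xFFFD))
    return ranges

def to_abnf(ranges: list[tuple[int, int]]) -> str:
    """Format the ranges as ABNF alternatives joined by ' / '."""
    return " / ".join(f"%x{lo:X}-{hi:X}" for lo, hi in ranges)

def in_nonascii(cp: int) -> bool:
    """True if cp is matched by one of the NONASCII ranges."""
    return any(lo <= cp <= hi for lo, hi in nonascii_ranges())

if __name__ == "__main__":
    print("NONASCII = " + to_abnf(nonascii_ranges()))
```

Running it prints the rule on a single line rather than one alternative per line, which ABNF treats the same way.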

--
Doug Ewell | Thornton, CO, US | ewellic.org