Re: [Json] [Technical Errata Reported] RFC8259 (7673)

Tim Bray <tbray@textuality.com> Wed, 11 October 2023 15:16 UTC

References: <20231011065619.82BC5E6D69@rfcpa.amsl.com> <CE22DEB9-FA3A-439B-A4CD-79138DBB18A5@cursive.net> <CAHBU6is2zYR8V-BK8_OVe8iWhfxRym=+=4s1X26e1kfJzOWGdg@mail.gmail.com>
In-Reply-To: <CAHBU6is2zYR8V-BK8_OVe8iWhfxRym=+=4s1X26e1kfJzOWGdg@mail.gmail.com>
From: Tim Bray <tbray@textuality.com>
Date: Wed, 11 Oct 2023 08:16:32 -0700
Message-ID: <CAHBU6ismHz2bo6zUVeJ5cvDtz=VM=4_Ef_RSa0_oSnjFWwKUkw@mail.gmail.com>
To: Joe Hildebrand <hildjj@cursive.net>
Cc: "Murray S. Kucherawy" <superuser@gmail.com>, Francesca Palombini <francesca.palombini@ericsson.com>, linuxwolf+ietf@outer-planes.net, zachmcollier@gmail.com, json@ietf.org, RFC Editor <rfc-editor@rfc-editor.org>
Content-Type: multipart/alternative; boundary="000000000000f205cc0607724ea6"
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/ZaDMTEg66SsPCtMDh0Xf-fb8hlY>
Subject: Re: [Json] [Technical Errata Reported] RFC8259 (7673)

Oh, and one thing I didn’t say: while the erratum is reasonable, I think it
should be rejected.
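
For anyone who wants to check the interoperability claim locally, here is a
minimal sketch in JavaScript. It assumes a spec-conforming JSON.parse and
JSON.stringify; the helper name parsesRaw is made up for illustration and does
not come from any of the specs discussed in this thread.

```javascript
// Returns true if JSON.parse accepts the given code point when it appears
// raw (unescaped) inside a JSON string, and hands it back unchanged.
function parsesRaw(codePoint) {
  const ch = String.fromCodePoint(codePoint);
  try {
    return JSON.parse('"' + ch + '"') === ch;
  } catch (err) {
    // A SyntaxError here means the parser insists on an escape sequence.
    return false;
  }
}

console.log(parsesRaw(0x7f)); // DEL
console.log(parsesRaw(0x9f)); // last C1 control
console.log(parsesRaw(0x1f)); // last C0 control, which RFC 8259 says MUST be escaped

// The serializer side: quote + raw DEL + quote is three characters.
console.log(JSON.stringify("\u007f").length);
```

On a conforming engine this should print true, true, false and 3: raw DEL and
the C1 controls are accepted, only the C0 range is refused, and the serializer
leaves DEL unescaped.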

On Oct 11, 2023 at 8:14:24 AM, Tim Bray <tbray@textuality.com> wrote:

> Well, this is a strange one.  All the specs for JSON have said the same
> thing, and the thing they’ve said is kind of stupid. The requested change
> is to add U+007F DEL (along with U+0080 through U+009F) to the list of
> characters that MUST be escaped.
>
> However, I created a document with one field containing only a single
> U+007F; it is available at https://www.tbray.org/tmp/del.json, and it
> seems to be legal JSON.  So, in fact, while DEL should have been included
> in the must-escapes, the world’s software has learned to live with it not
> being escaped.
>
> Note that this file does not display correctly.
>
>  ~ % curl https://www.tbray.org/tmp/del.json | jsonlint
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100    23  100    23    0     0    224      0 --:--:-- --:--:-- --:--:--   237
> {
>   "example": ""
> }
>
> FWIW, those who care about such issues may be interested in
> https://datatracker.ietf.org/doc/draft-bray-unichars/ currently under
> discussion in the art@ and i18ndir@ mailing lists.
>
>
> On Oct 11, 2023 at 7:17:00 AM, Joe Hildebrand <hildjj@cursive.net> wrote:
>
>> Suggested resolution: reject
>>
>> RFC 8259's ABNF is quite clear that these codepoints are allowed:
>> "unescaped = %x20-21 / %x23-5B / %x5D-10FFFF"
>>
>> ECMA-404 agrees: "the control characters U+0000 to U+001F".
>>
>> json.org's wording is awkward, but still clear: "any of the Unicode code
>> points except the 32 control codes and "double quote"
>>
>> Here is some JS to prove it got implemented this way:
>>
>> ```
>> // Parses without error: the raw, unescaped U+007F is accepted in a string
>> JSON.parse('"\x7f"')
>> ```
>>
>> The approach in the erratum might have been the right one to specify, but
>> it isn't what was specified.  Even if we had wanted to make this change,
>> it was far too late by the time RFC 4627 was written.
>>
>> —
>> Joe Hildebrand
>>
>> On Oct 11, 2023, at 12:56 AM, RFC Errata System <
>> rfc-editor@rfc-editor.org> wrote:
>>
>> The following errata report has been submitted for RFC8259,
>> "The JavaScript Object Notation (JSON) Data Interchange Format".
>>
>> --------------------------------------
>> You may review the report below and at:
>> https://www.rfc-editor.org/errata/eid7673
>>
>> --------------------------------------
>> Type: Technical
>> Reported by: Zachary Collier (Zamicol) <zachmcollier@gmail.com>
>>
>> Section: 7
>>
>> Original Text
>> -------------
>> The representation of strings is similar to conventions used in the C family
>> of programming languages.  A string begins and ends with quotation marks.  All
>> Unicode characters may be placed within the quotation marks, except for the
>> characters that MUST be escaped: quotation mark, reverse solidus, and the
>> control characters (U+0000 through U+001F).
>>
>> Corrected Text
>> --------------
>> The representation of strings is similar to conventions used in the C family
>> of programming languages.  A string begins and ends with quotation marks.  All
>> Unicode characters may be placed within the quotation marks, except for the
>> characters that MUST be escaped: quotation mark, reverse solidus, and the
>> control characters (U+0000 through U+001F, U+007F, and U+0080 through
>> U+009F).
>>
>> Notes
>> -----
>> There are 33 7-bit control characters, but the JSON RFC only listed 32 by
>> omitting the last control character in the 7-bit ASCII range, DEL (U+007F).
>> However, JSON is not limited to 7-bit ASCII; it is Unicode.  Unicode has 65
>> control characters in total: the 33 above plus an additional 32 from U+0080
>> through U+009F.  The section that currently reads "U+0000 through U+001F"
>> should include these additional control characters, reading as "U+0000 through
>> U+001F, U+007F, and U+0080 through U+009F".
>>
>> Instructions:
>> -------------
>> This erratum is currently posted as "Reported". If necessary, please
>> use "Reply All" to discuss whether it should be verified or
>> rejected. When a decision is reached, the verifying party
>> can log in to change the status and edit the report, if necessary.
>>
>> --------------------------------------
>> RFC8259 (draft-ietf-jsonbis-rfc7159bis-04)
>> --------------------------------------
>> Title               : The JavaScript Object Notation (JSON) Data Interchange Format
>> Publication Date    : December 2017
>> Author(s)           : T. Bray, Ed.
>> Category            : INTERNET STANDARD
>> Source              : Javascript Object Notation Update
>> Area                : Applications and Real-Time
>> Stream              : IETF
>> Verifying Party     : IESG
>>
>> _______________________________________________
>> json mailing list
>> json@ietf.org
>> https://www.ietf.org/mailman/listinfo/json
>>
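
The gap between the grammar quoted in the thread and the text proposed in the
erratum can be sketched as two predicates over code points. This is only an
illustration: the function names are invented here, and the second predicate
is an assumption about how the rule would read if erratum 7673 were accepted.

```javascript
// RFC 8259, Section 7:  unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
function isUnescapedPerRfc8259(cp) {
  return (
    (cp >= 0x20 && cp <= 0x21) ||
    (cp >= 0x23 && cp <= 0x5b) ||
    (cp >= 0x5d && cp <= 0x10ffff)
  );
}

// Assumption: the rule as it would read under erratum 7673, where U+007F
// and U+0080 through U+009F would also have to be escaped.
function isUnescapedPerErratum7673(cp) {
  return isUnescapedPerRfc8259(cp) && !(cp >= 0x7f && cp <= 0x9f);
}

// Collect every code point on which the two rules disagree.
const affected = [];
for (let cp = 0; cp <= 0x10ffff; cp++) {
  if (isUnescapedPerRfc8259(cp) !== isUnescapedPerErratum7673(cp)) {
    affected.push(cp);
  }
}

console.log(affected.length); // 33
console.log(affected[0].toString(16), affected[affected.length - 1].toString(16)); // 7f 9f
```

The 33 code points are exactly DEL plus the 32 C1 controls, so any document
that carries one of them unescaped today, such as the del.json example above,
would stop being valid under the proposed text.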