Re: [Json] [Technical Errata Reported] RFC8259 (7673)
Tim Bray <tbray@textuality.com> Wed, 11 October 2023 15:14 UTC
References: <20231011065619.82BC5E6D69@rfcpa.amsl.com> <CE22DEB9-FA3A-439B-A4CD-79138DBB18A5@cursive.net>
In-Reply-To: <CE22DEB9-FA3A-439B-A4CD-79138DBB18A5@cursive.net>
From: Tim Bray <tbray@textuality.com>
Date: Wed, 11 Oct 2023 08:14:24 -0700
Message-ID: <CAHBU6is2zYR8V-BK8_OVe8iWhfxRym=+=4s1X26e1kfJzOWGdg@mail.gmail.com>
To: Joe Hildebrand <hildjj@cursive.net>
Cc: "Murray S. Kucherawy" <superuser@gmail.com>, Francesca Palombini <francesca.palombini@ericsson.com>, linuxwolf+ietf@outer-planes.net, zachmcollier@gmail.com, json@ietf.org, RFC Editor <rfc-editor@rfc-editor.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/duK-6y1yEPRrLTgj5t5uROiHqhM>
Subject: Re: [Json] [Technical Errata Reported] RFC8259 (7673)

Well, this is a strange one. All the specs for JSON have said the same thing, and the thing they’ve said is kind of stupid.

The requested change is to add U+7F DEL to the list of characters that MUST be escaped. However, I created a document containing one field containing only a single U+7F, it is available at https://www.tbray.org/tmp/del.json, and it seems to be legal JSON. So, in fact, while DEL should have been included in the must-escapes, the world’s software has learned to live with it not being escaped. Note that this file does not display correctly.

~ % curl https://www.tbray.org/tmp/del.json | jsonlint
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    23  100    23    0     0    224      0 --:--:-- --:--:-- --:--:--   237
{ "example": "" }

FWIW, those who care about such issues may be interested in https://datatracker.ietf.org/doc/draft-bray-unichars/ currently under discussion in the art@ and i18ndir@ mailing lists.

On Oct 11, 2023 at 7:17:00 AM, Joe Hildebrand <hildjj@cursive.net> wrote:

> Suggested resolution: reject
>
> RFC 8259's ABNF is quite clear that these codepoints are allowed:
> "unescaped = %x20-21 / %x23-5B / %x5D-10FFFF"
>
> ECMA-404 agrees: "the control characters U+0000 to U+001F".
>
> json.org's wording is awkward, but still clear: "any of the Unicode code
> points except the 32 control codes and "double quote"
>
> Here is some JS to prove it got implemented this way:
>
> ```
> JSON.parse('"\x7f"')
> ```
>
> The approach in the errata might have been the correct one to have been
> specified, but it wasn't. Even if we had wanted to make this change, it
> was far too late by the time RFC 4627 was written.
>
> --
> Joe Hildebrand
>
> On Oct 11, 2023, at 12:56 AM, RFC Errata System <rfc-editor@rfc-editor.org> wrote:
>
> > The following errata report has been submitted for RFC8259,
> > "The JavaScript Object Notation (JSON) Data Interchange Format".
> >
> > --------------------------------------
> > You may review the report below and at:
> > https://www.rfc-editor.org/errata/eid7673
> >
> > --------------------------------------
> > Type: Technical
> > Reported by: Zachary Collier (Zamicol) <zachmcollier@gmail.com>
> >
> > Section: 7
> >
> > Original Text
> > -------------
> > The representation of strings is similar to conventions used in the C family
> > of programming languages. A string begins and ends with quotation marks. All
> > Unicode characters may be placed within the quotation marks, except for the
> > characters that MUST be escaped: quotation mark, reverse solidus, and the
> > control characters (U+0000 through U+001F).
> >
> > Corrected Text
> > --------------
> > The representation of strings is similar to conventions used in the C family
> > of programming languages. A string begins and ends with quotation marks. All
> > Unicode characters may be placed within the quotation marks, except for the
> > characters that MUST be escaped: quotation mark, reverse solidus, and the
> > control characters (U+0000 through U+001F, U+007F, and U+0080 through
> > U+009F).
> >
> > Notes
> > -----
> > There are 33 7-bit control characters, but the JSON RFC only listed 32 by
> > omitting the inclusion of the last control character in the 7-bit ASCII range,
> > 'del.' However, JSON is not limited to 7-bit ASCII; it is Unicode. Unicode
> > encompasses 65 control characters from U+0080 to U+009F, totaling an additional
> > 32 characters. The section that currently reads "U+0000 through U+001F" should
> > include these additional control characters reading as "U+0000 through U+001F,
> > U+007F, and U+0080 through U+009F"
> >
> > Instructions:
> > -------------
> > This erratum is currently posted as "Reported". If necessary, please
> > use "Reply All" to discuss whether it should be verified or
> > rejected. When a decision is reached, the verifying party
> > can log in to change the status and edit the report, if necessary.
> >
> > --------------------------------------
> > RFC8259 (draft-ietf-jsonbis-rfc7159bis-04)
> > --------------------------------------
> > Title : The JavaScript Object Notation (JSON) Data Interchange Format
> > Publication Date : December 2017
> > Author(s) : T. Bray, Ed.
> > Category : INTERNET STANDARD
> > Source : Javascript Object Notation Update
> > Area : Applications and Real-Time
> > Stream : IETF
> > Verifying Party : IESG
> >
> > _______________________________________________
> > json mailing list
> > json@ietf.org
> > https://www.ietf.org/mailman/listinfo/json
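
For readers who want to check the point made in this thread themselves, here is a minimal sketch in JavaScript. It transcribes the `unescaped` ABNF rule quoted above into a range check and compares it with what the runtime's `JSON.parse` accepts for a raw (unescaped) character inside a string. The helper names `isUnescapedAllowed` and `parsesRaw` are hypothetical, introduced only for illustration; the code points tested are the ones discussed in the erratum.

```javascript
// Hypothetical helper: true if the code point may appear unescaped in a JSON
// string according to the ABNF quoted in the thread:
//   unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
function isUnescapedAllowed(codePoint) {
  return (
    (codePoint >= 0x20 && codePoint <= 0x21) ||
    (codePoint >= 0x23 && codePoint <= 0x5b) ||
    (codePoint >= 0x5d && codePoint <= 0x10ffff)
  );
}

// Hypothetical helper: true if JSON.parse accepts a one-character JSON string
// containing the raw code point and returns that same character.
function parsesRaw(codePoint) {
  const ch = String.fromCodePoint(codePoint);
  try {
    return JSON.parse('"' + ch + '"') === ch;
  } catch (e) {
    // JSON.parse throws a SyntaxError on input it considers malformed.
    return false;
  }
}

// Code points from the erratum: the last C0 control, DEL, and the C1 range ends.
for (const cp of [0x1f, 0x7f, 0x80, 0x9f]) {
  console.log(
    "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
    "| grammar allows raw:", isUnescapedAllowed(cp),
    "| JSON.parse accepts raw:", parsesRaw(cp)
  );
}
```

If the runtime follows the grammar, the two columns should agree for each code point, with U+001F being the only one of the four that is refused.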