Re: [Json] [Technical Errata Reported] RFC8259 (7603)

Tim Bray <tbray@textuality.com> Thu, 17 August 2023 15:38 UTC

From: Tim Bray <tbray@textuality.com>
Date: Thu, 17 Aug 2023 08:37:53 -0700
Message-ID: <CAHBU6itt-8NbQ2=WUWQkbekySqn2vPrU3GzyF4=E6n8ZzXCVsQ@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: RFC Errata System <rfc-editor@rfc-editor.org>, guillaume.fortin@debigare.com, json@ietf.org, superuser@gmail.com, francesca.palombini@ericsson.com, linuxwolf+ietf@outer-planes.net
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/e5qZ673uCddPn2KAPJ-xSQWDB3E>
Subject: Re: [Json] [Technical Errata Reported] RFC8259 (7603)

On Aug 16, 2023 at 2:31:30 PM, Carsten Bormann <cabo@tzi.org> wrote:

> Section 1.  Introduction
>
>   A string is a sequence of zero or more Unicode characters [UNICODE].
>
> This is the one we have been discussing (Section 1 discusses the JSON data
> model).
> From [1], D76 is the definition that works for me, not D10.
>

Yes, you’ve been clear about what you want, but obviously we can’t do that
because JSON has been clear for years that naked surrogates are allowed
(obviously this makes me unhappy) and we can’t retroactively assert that
all these formerly compliant JSON documents are no longer valid. Not
without chartering another WG.

> Section 8.2.  Unicode Characters
>
>   When all the strings represented in a JSON text are composed entirely
>   of Unicode characters [UNICODE] (however escaped), then that JSON
>   text is interoperable in the sense that all software implementations
> …
>   However, the ABNF in this specification allows member names and
>   string values to contain bit sequences that cannot encode Unicode
>   characters; for example, "\uDEAD" (a single unpaired UTF-16
>   surrogate).  Instances of this have been observed, for example, when
>
> Obviously, this whole section only works when Unicode Characters is D76.
>

I draw the opposite conclusion: since JSON clearly blesses D10, not D76,
you should be careful to provide only D76 characters, per Postel’s law.
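
To make the D10/D76 distinction concrete, here is a minimal sketch in Python,
assuming the standard-library json module's default (lenient) handling of
escapes. It shows that a parser can accept the unpaired surrogate "\uDEAD"
from Section 8.2, and how a producer might check that a string holds only
D76 values (Unicode scalar values, i.e. no surrogate code points) before
sending it. The helper name has_unpaired_surrogate is hypothetical and not
taken from RFC 8259 or from any library.

```python
import json


def has_unpaired_surrogate(s: str) -> bool:
    """Return True if s contains any surrogate code point (U+D800..U+DFFF).

    Hypothetical helper. A Python str is a sequence of code points, and the
    json decoder merges valid \\uXXXX surrogate pairs into one code point,
    so any surrogate still present after parsing is an unpaired one.
    """
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


# The RFC 8259 grammar allows this text; Python's decoder accepts it too.
lone = json.loads('"\\uDEAD"')

# A properly paired escape decodes to a single scalar value (U+1F600).
paired = json.loads('"\\uD83D\\uDE00"')

print(has_unpaired_surrogate(lone))    # True: conservative senders avoid this
print(has_unpaired_surrogate(paired))  # False: safe to send
```

A receiver that is liberal in what it accepts still has to decide what to do
with such a string later on; in Python, for instance, lone.encode("utf-8")
raises UnicodeEncodeError.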

> I stand by my observation that this errata report tries to do in the errata
> process what only can be done in a proper WG consensus process.
>

I agree that the effect you’d like to achieve would require proper process.
I however do maintain that the report might be correct in that the choice
of “Unicode character” is potentially misleading.


> Grüße, Carsten
>
>