Re: [Json] Naked surrogates already banned?

Carsten Bormann <cabo@tzi.org> Fri, 18 October 2013 06:39 UTC

On Oct 18, 2013, at 03:36, Tim Bray <tbray@textuality.com> wrote:

> It says "Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, ..." etc.

It already says in Section 1:
A string is a sequence of zero or more Unicode characters [UNICODE].

I think we had that discussion already.
Count me on the side of the people who don't think UTF-16 artifacts are, or have ever been, a part of JSON.
ECMA-404 is on the side of "any Unicode code point", but that is just one of the extensions 404 makes over JSON.
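
To make the distinction concrete, here is a minimal sketch in Python, assuming the standard `json` module's lenient handling of `\u` escapes. The helper names `has_naked_surrogate` and `can_be_utf8` are hypothetical, introduced only for illustration. It shows that a properly paired escape decodes to one Unicode character, while a naked surrogate escape leaves behind a code point that is not a character and cannot be serialized as UTF-8.

```python
import json


def has_naked_surrogate(text: str) -> bool:
    """Return True if text contains a code point in the surrogate range.

    Python combines a valid \\uD83D\\uDE00-style pair into one code point while
    decoding, so any surrogate still present in the str is unpaired.
    (Hypothetical helper, not part of any specification.)
    """
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in text)


def can_be_utf8(text: str) -> bool:
    """Return True if text can be encoded as well-formed UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# A surrogate pair written as two \u escapes: decodes to the single character U+1F600.
paired = json.loads('"\\ud83d\\ude00"')

# A naked (unpaired) surrogate escape: Python's decoder accepts it anyway.
naked = json.loads('"\\ud800"')

print(len(paired), has_naked_surrogate(paired), can_be_utf8(paired))
print(len(naked), has_naked_surrogate(naked), can_be_utf8(naked))
```

A receiver that takes "sequence of Unicode characters" literally could run a check like `has_naked_surrogate` after decoding and reject the second text.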

Now, there is a problem: the definition in RFC 4627 ties JSON to a specific version of Unicode.
(The reference is nicely confusing about which version is meant, but that is an artifact of the way Unicode versions are documented.)
I think a robust interpretation of the intent here will also admit all code points that are available to become characters in future versions of Unicode.
That is a change the WG SHOULD make, the predictable noise from the surrogate faction notwithstanding.
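
As a rough sketch of the difference between the two readings, assuming Python's `unicodedata` module as a stand-in for "a specific version of Unicode": the function names below are hypothetical, and treating every non-surrogate code point as acceptable is a simplification made for this example (it does not single out noncharacters such as U+FFFF).

```python
import unicodedata


def allowed_version_tied(cp: int) -> bool:
    """Accept only code points assigned in this interpreter's Unicode database.

    The answer can change whenever the database is updated, which is the
    version dependency described above. (Hypothetical helper.)
    """
    # "Cn" = unassigned or noncharacter, "Cs" = surrogate.
    return unicodedata.category(chr(cp)) not in ("Cn", "Cs")


def allowed_future_proof(cp: int) -> bool:
    """Accept every code point except surrogates.

    The surrogate range is fixed, so this answer does not depend on the
    Unicode version. (Hypothetical helper; simplification, see lead-in.)
    """
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


# The Unicode version the version-tied check silently depends on.
print("Unicode database version:", unicodedata.unidata_version)

for cp in (0x41, 0xD800, 0x1F600):
    print(f"U+{cp:04X}", allowed_version_tied(cp), allowed_future_proof(cp))
```

Anything the version-tied check accepts, the future-proof check accepts too; the reverse holds only once the code point has actually been assigned.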

Grüße, Carsten