[Json] Scope: Wire format or runtime format?

Norbert Lindenberg <ietf@lindenbergsoftware.com> Thu, 13 June 2013 22:47 UTC

In looking over older messages on this list, I found a message that made clear to me why we're having this endless discussion about Unicode surrogates - it's because we're not clear whether we're designing a wire format or a format that is also for use at runtime:

Some people are coming from the runtime point of view, especially ECMAScript, where it's accepted practice to use ill-formed UTF-16 or even non-text in strings. At least the ill-formed UTF-16 is legitimized by section 2.7 of the Unicode standard.

Other people are coming from the wire protocol point of view, where clean formats are expected, in particular well-formed Unicode code unit sequences according to section 3.9 of the Unicode standard.
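
To make the two viewpoints concrete, here is a minimal sketch in Python rather than ECMAScript. It assumes Python's standard json module as a stand-in for a runtime parser; the helper name is_well_formed is made up for this illustration. The parser accepts an escaped unpaired surrogate, but the resulting string cannot be encoded as well-formed UTF-8 for the wire.

```python
import json


def is_well_formed(text: str) -> bool:
    """Hypothetical helper: True if text can be encoded as well-formed UTF-8.

    A Python str holds code points, so any surrogate code point left in it
    is unpaired and is rejected by the strict UTF-8 encoder.
    """
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# Runtime view: the parser accepts an escaped lone surrogate (U+D800)
# and returns a string that contains it.
lone = json.loads('"\\ud800"')

# An escaped surrogate pair is combined into one code point (U+1F600).
paired = json.loads('"\\ud83d\\ude00"')

print(is_well_formed(lone))    # wire view rejects the lone surrogate
print(is_well_formed(paired))  # the paired form is acceptable on the wire
```

Other runtimes may behave differently; the point is only that a parser can accept input that a strict wire encoder would reject.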

So which one shall it be?

If we adopt the wire protocol point of view and require well-formed code unit sequences, then ECMAScript will have to define its own extension of JSON (which it already has, by allowing any JSON value at the top level).

If we adopt the runtime point of view and allow all code points as in RFC 4627, then there probably should be separate verbiage defining a restricted version for use over the wire.