[Json] Actual interoperability issues

Carsten Bormann <cabo@tzi.org> Wed, 26 June 2013 20:19 UTC

From: Carsten Bormann <cabo@tzi.org>
Date: Wed, 26 Jun 2013 22:18:54 +0200
To: "json@ietf.org WG" <json@ietf.org>
Message-Id: <6E98097C-59A2-428A-BF31-E6A1F9114747@tzi.org>
Subject: [Json] Actual interoperability issues

From the 1093 JSON messages in my mailbox, I find discussion of the
following issues that appear to be relevant for interoperability:

Group 1: Overall format of JSON documents

   (This group is about the sequence of characters that make up the
   actual JSON document, i.e. native Unicode characters/code points,
   not escape sequences etc.)

1.1) Ill-formed Unicode in JSON documents

     (Unpaired surrogates in UTF-16, CESU-8 surrogates, etc.)

1.2) Use of non-characters (Unicode term) in a JSON document

     (Is that an actual interoperability issue?  I haven't seen it.)

1.3) Use of not-yet-assigned characters in a JSON document

     (Is that an actual interoperability issue?  I haven't seen it.)

1.4) Usage of LS/PS (U+2028/U+2029) in JSON documents

     (I.e., native LS/PS; escapes are not a problem.  This hurts newer
     JavaScript implementations if they still want to eval things.
     Probably easy to solve: this just needs to be discouraged.)
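
     A minimal sketch of how a generator could avoid emitting native
     LS/PS, assuming Python's standard json module.  The helper name
     dumps_js_safe is hypothetical.  LS/PS can only occur inside
     strings in the serializer's output, so a plain textual
     replacement with the escaped form is enough here:

```python
import json

def dumps_js_safe(value):
    # ensure_ascii=False keeps non-ASCII characters native, so U+2028 and
    # U+2029 would otherwise appear unescaped inside string values.
    text = json.dumps(value, ensure_ascii=False)
    # Replace the native characters with their JSON escape sequences.
    return text.replace("\u2028", "\\u2028").replace("\u2029", "\\u2029")

encoded = dumps_js_safe({"note": "line\u2028break"})
decoded = json.loads(encoded)
```

     The decoded value is unchanged; only the on-the-wire form differs.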

1.5) BOMs in JSON documents

1.6) Usage of UTF-16 and UTF-32 based character encoding schemes (and
     which ones?) for JSON documents

1.7) non-BMP characters in JSON documents

     (I have heard about implementations that have trouble with this,
      maybe because of assuming CESU-8 instead of UTF-8, but have
      never been able to pin them down.  May be a non-issue, more
      study required.)
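
     To illustrate the suspected CESU-8 confusion, the following
     sketch (Python, standard library only; the sample character is an
     arbitrary choice) encodes one non-BMP character both ways.  It is
     not taken from any of the implementations alluded to above:

```python
ch = "\U0001F600"  # an arbitrary non-BMP character

# UTF-8 encodes the code point directly as one 4-byte sequence.
utf8 = ch.encode("utf-8")

# CESU-8 first splits the code point into a UTF-16 surrogate pair and
# then encodes each surrogate as its own 3-byte sequence.
offset = ord(ch) - 0x10000
high = 0xD800 + (offset >> 10)
low = 0xDC00 + (offset & 0x3FF)
cesu8 = (chr(high) + chr(low)).encode("utf-8", "surrogatepass")

# A strict UTF-8 decoder rejects the CESU-8 form.
try:
    cesu8.decode("utf-8")
    strict_decoder_accepts = True
except UnicodeDecodeError:
    strict_decoder_accepts = False
```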

1.8) Combining characters after markup

     (Another issue that was mentioned, but for which I'm not aware of
      an actual interoperability problem.)

Group 2: Syntax, overall data model

2.1) Top-level data items beyond arrays and maps (JSON "objects")

     (and the potential interaction with 1.5/1.6; with streaming; with
     detection of truncation)
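
     The truncation concern can be sketched as follows, assuming a
     parser that accepts top-level scalars (Python's standard json
     module does).  The sample values are made up for illustration:

```python
import json

# A top-level number that loses its tail is still a well-formed document,
# so the truncation goes unnoticed.
scalar_doc = "12345"
silently_wrong = json.loads(scalar_doc[:3])

# The same number wrapped in an array fails to parse when truncated,
# because the closing bracket is missing.
array_doc = "[12345]"
try:
    json.loads(array_doc[:4])
    truncation_detected = False
except json.JSONDecodeError:
    truncation_detected = True
```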

Group 3: Interpretation of data in JSON documents

3.1) Ill-formed Unicode constructed from escape sequences in strings

     (This is about escape sequences that generate unpaired surrogates.)
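
     As one data point, Python's standard json module accepts such an
     escape and hands back a string that cannot be encoded as UTF-8.
     Other parsers may reject the input or substitute U+FFFD; this
     sketch only shows the one behavior:

```python
import json

# A JSON string containing an escaped lone high surrogate.
doc = '"\\ud83d"'
value = json.loads(doc)

# The result is a one-character string holding an unpaired surrogate,
# which a strict UTF-8 encoder refuses to serialize.
try:
    value.encode("utf-8")
    encodable = True
except UnicodeEncodeError:
    encodable = False
```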

3.2) Construction of non-characters (Unicode term) from escape sequences in strings

     (Is that an actual interoperability issue?  I haven't seen it.)

3.3) Construction of not-yet-assigned characters from escape sequences in strings

     (Is that an actual interoperability issue?  I haven't seen it.)

3.4) Non-unique ("duplicate") keys in maps (JSON "objects")
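
     A short sketch of two possible receiver behaviors, assuming
     Python's standard json module: its default keeps the last value,
     while a stricter consumer can reject the document.  The hook name
     reject_duplicates is hypothetical:

```python
import json

def reject_duplicates(pairs):
    # pairs is the list of (key, value) tuples for one object, in
    # document order, before they are collapsed into a dict.
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate key in object")
    return dict(pairs)

doc = '{"a": 1, "a": 2}'

# Default behavior: the later entry silently overwrites the earlier one.
last_wins = json.loads(doc)

# Strict behavior: the document is refused.
try:
    json.loads(doc, object_pairs_hook=reject_duplicates)
    rejected = False
except ValueError:
    rejected = True
```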

3.5) Equivalence of native and (variants of) escaped representations
     of characters in map keys

     (Is that an actual interoperability issue?  I haven't seen it.)
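
     For illustration, a parser that unescapes before comparing keys
     treats the native and escaped spellings as the same key, which
     turns this case into an instance of 3.4.  The sketch assumes
     Python's standard json module:

```python
import json

# "a" written natively and as the escape sequence \u0061.
doc = '{"a": 1, "\\u0061": 2}'
obj = json.loads(doc)

# After unescaping, both spellings name the same key, so only one
# entry survives.
key_count = len(obj)
```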

3.6) Equivalence of canonical-equivalent (Unicode term) characters in
     map keys

     (Is that an actual interoperability issue?  I haven't seen it.)
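
     A sketch of what the question means in practice, assuming
     Python's standard json and unicodedata modules.  The keys are
     distinct as code point sequences but collapse under NFC
     normalization; whether a receiver should normalize is exactly
     the open point:

```python
import json
import unicodedata

# Precomposed U+00E9 versus "e" followed by combining acute U+0301.
doc = '{"\u00e9": 1, "e\u0301": 2}'
obj = json.loads(doc)

# Compared code point by code point, these are two different keys.
raw_key_count = len(obj)

# After NFC normalization they become the same key.
normalized_keys = {unicodedata.normalize("NFC", key) for key in obj}
normalized_key_count = len(normalized_keys)
```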

3.7) Assumptions on the semantics of ordering of entries in maps

     (also in conjunction with 3.4)

3.8) Assumptions about (range and) precision of numbers

     (E.g., the Twitter ID issue)
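
     A sketch of the precision problem, assuming a receiver that maps
     every JSON number to an IEEE 754 double (as JavaScript does).
     Python keeps integers exact by default, so parse_int=float is
     used here to imitate such a receiver; the ID value is made up:

```python
import json

big_id = 2**53 + 1  # first integer a double cannot represent exactly
text = json.dumps({"id": big_id})

# Receiver with arbitrary-precision integers: the value round-trips.
exact = json.loads(text)["id"]

# Receiver that stores all numbers as doubles: the value is rounded.
as_double = json.loads(text, parse_int=float)["id"]
round_trips = int(as_double) == big_id
```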

3.9) Handling of NaNs in numbers

     (I'm not quite sure what the issue was, but it was mentioned)
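
     One way the issue can show up, sketched with Python's standard
     json module: by default it serializes NaN as the bare token NaN,
     which the JSON grammar does not define, unless the caller opts
     out with allow_nan=False.  This is an illustration, not
     necessarily the case raised on the list:

```python
import json

nan = float("nan")

# Default: emits a token that is not part of the JSON grammar.
lenient_output = json.dumps(nan)

# Strict: refuses to serialize values that JSON cannot represent.
try:
    json.dumps(nan, allow_nan=False)
    strict_refuses = False
except ValueError:
    strict_refuses = True
```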

So much for the content of 1093 messages.  (Most of these are almost
trivial to resolve or clarify from what we already know, and they
should be resolved, so that each spec referencing JSON doesn't need
to supply its own 18 answers to these questions.)

Am I missing any issues?

(For now, as long as the issue is about interoperability, I don't care
about "standard" vs "implementation guide" issues; we can sort that
one out later, including whether we want to address the issue at all.)

Grüße, Carsten

PS.: There also was a feature request, for a "profile" parameter of
the media type.  Any feature requests that I'm missing?