[Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

Matthew Miller <linuxwolf+ietf@outer-planes.net> Mon, 13 March 2017 21:06 UTC

Sender: Matthew Miller <linuxwolf@outer-planes.net>
From: Matthew Miller <linuxwolf+ietf@outer-planes.net>
To: "json@ietf.org" <json@ietf.org>
Message-ID: <1fb5849e-8dbf-835d-65b7-2403686248f9@outer-planes.net>
Date: Mon, 13 Mar 2017 15:06:13 -0600
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/6z7y7HTqV4U5mDHFEuuKTc7V5xQ>
Cc: draft-ietf-jsonbis-rfc7159bis.all@ietf.org
Subject: [Json] Call for Consensus: Proposed Text for "8.1 Character Encoding"

Hello JSONbis,

The security directorate review discussion has raised the issue of
encoding detection.  The original table from RFC 4627 was removed from
RFC 7159 due to a lack of consensus.  In this latest round, a number of
comments have been made both for and against adding more guidance than
is currently present.

The chair asks for a call on the following from the working group:

1) Does the working group think adding any text on how to detect the
encoding is worthwhile?

2a) If such text is worthwhile, is the following proposed text from Nico
Williams acceptable (to be appended to Section 8.1)?

   Implementors MAY count the number of ASCII NULs in the first four
   bytes of any JSON text to detect which of UTF-8, UTF-16, or UTF-32
   the text is encoded in:

    - if the count is zero, then the text is encoded in UTF-8
    - if the count is one or two, then the text is encoded in UTF-16
    - if the count is three, then the text is encoded in UTF-32

   This results from a) JSON texts having to start with an ASCII
   character, b) no unescaped NULs being allowed in JSON strings, and c)
   any type being allowed at the top-level, thus the first character may
   be a double-quote and the second may be any permissible, unescaped
   Unicode codepoint.  An ASCII character requires a NUL-valued byte in
   UTF-16 encoding, three in UTF-32, and none in UTF-8.
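
For illustration only (this is not part of the proposed text), the
NUL-counting heuristic above could be sketched in Python roughly as
follows.  The function name guess_json_encoding is hypothetical, and the
sketch assumes the input carries no byte order mark and makes no attempt
to determine byte order, since the proposed text addresses neither.

```python
def guess_json_encoding(data: bytes) -> str:
    """Guess the encoding family of a JSON text from its first four bytes.

    Follows the proposed rule: count NUL bytes in the first four bytes.
    Assumption (not in the proposal): no BOM is present, and a count of
    four (or an empty input) is rejected as not being a JSON text.
    """
    if not data:
        raise ValueError("empty input is not a JSON text")

    # Texts shorter than four bytes are handled by slicing what exists.
    nul_count = data[:4].count(b"\x00")

    if nul_count == 0:
        return "UTF-8"
    if nul_count in (1, 2):
        return "UTF-16"
    if nul_count == 3:
        return "UTF-32"
    # Four NULs cannot start a JSON text in any of the three encodings.
    raise ValueError("first four bytes are all NUL")


if __name__ == "__main__":
    text = '{"a":1}'
    for codec in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"):
        print(codec, "->", guess_json_encoding(text.encode(codec)))
```

The explicit -le and -be codecs are used in the usage example because
Python's plain "utf-16" and "utf-32" codecs prepend a byte order mark,
which falls outside what the heuristic covers.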

2b) If such text is worthwhile but Nico's proposal is not worthwhile,
what would be acceptable?

Please respond by March 16.

Thank you in advance,

Matthew A. Miller
JSONbis chair