Re: [Json] Proposal for strings/Unicode text

Paul Hoffman <paul.hoffman@vpnc.org> Mon, 17 June 2013 13:59 UTC

On Jun 16, 2013, at 10:02 PM, "Manger, James H" <james.h.manger@team.telstra.com> wrote:

>>> The point is that if JSON is encoded in UTF-8, any surrogate code
>>> points MUST be escaped, even though the grammar does not say so.
>> 
>> What about changing the grammar to make that clear?
>> 
>> unescaped = %x20-21 / %x23-5B / %x5D-%xD7FF / %E000-%10FFFF  ; Any
>> unicode code point except control characters, QUOTATION MARK,  ;
>> REVERSE SOLIDUS, or code points reserved for UTF-16 surrogates
> 
> +1
> Unpaired surrogates cannot be interchanged reliably so they should be dropped from this ABNF. I don't mind a note saying how some implementations handle them (or their escaped form).
> 
> Fixing typos and tweaking the comment:
> 
>  unescaped = %x20-21 / %x23-5B / %x5D-D7FF / %xE000-10FFFF
>    ; any Unicode scalar value, except those that must be escaped
>    ; (control characters, quotation mark, and reverse solidus)


<no hat>

Making comments in ABNF that disagree with the ABNF itself seems like a completely terrible idea that will lead to a lack of interoperability. Complicated ABNF is better than simple ABNF that has contradictory comments.

--Paul Hoffman
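
To make the quoted proposal concrete, the corrected "unescaped" rule can be turned into a mechanical check, so that the ranges themselves (rather than the comment text) decide what is allowed. The following is only a minimal sketch in Python: the names UNESCAPED_RANGES and is_unescaped are hypothetical and are not taken from any draft or library, and the ranges are copied from the corrected ABNF quoted in this message.

```python
# Ranges copied from the corrected ABNF quoted in this message:
#   unescaped = %x20-21 / %x23-5B / %x5D-D7FF / %xE000-10FFFF
# The names below are illustrative assumptions, not part of any spec.
UNESCAPED_RANGES = (
    (0x20, 0x21),        # space and "!"
    (0x23, 0x5B),        # skips %x22, QUOTATION MARK
    (0x5D, 0xD7FF),      # skips %x5C, REVERSE SOLIDUS
    (0xE000, 0x10FFFF),  # skips %xD800-DFFF, the surrogate code points
)


def is_unescaped(code_point: int) -> bool:
    """Return True if code_point falls in one of the ABNF ranges above."""
    return any(low <= code_point <= high for low, high in UNESCAPED_RANGES)


if __name__ == "__main__":
    for cp in (0x41, 0x22, 0x5C, 0x1F, 0xD800, 0xE000):
        print(f"U+{cp:04X}: {is_unescaped(cp)}")
```

Under this sketch a surrogate code point such as U+D800 is rejected by the ranges alone, which is the property the proposed grammar change is meant to express; whether the accompanying comment describes the same set is the separate question raised in the reply.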