[Json] Ambiguous character references

Norbert Lindenberg <ietf@lindenbergsoftware.com> Tue, 18 June 2013 06:07 UTC

From: Norbert Lindenberg <ietf@lindenbergsoftware.com>
Date: Mon, 17 Jun 2013 23:07:00 -0700
Message-Id: <8FDBF4EF-B2DF-458E-834E-5707ED7B27C9@lindenbergsoftware.com>
To: Paul Hoffman <paul.hoffman@vpnc.org>
Cc: Norbert Lindenberg <ietf@lindenbergsoftware.com>, json@ietf.org

The existing text in the first through third paragraphs of section 2.5 of RFC 4627 has the same ambiguous character references as those discussed below. Do they need to be fixed?

quotation marks -> quotation marks (U+0022)

quotation mark -> quotation mark (U+0022)

reverse solidus -> reverse solidus (U+005C)

control characters (U+0000 through U+001F) -> control characters U+0000 through U+001F

lowercase letter u -> Latin small letter u (U+0075)

hexadecimal digits -> hexadecimal digits U+0030 through U+0039, U+0041 through U+0046, U+0061 through U+0066
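To make the replacements above concrete, here is a minimal sketch that pins each term to explicit code points. The names MUST_ESCAPE, HEX_DIGITS, must_escape, is_hex_digit, and is_u_escape are hypothetical and chosen only for illustration; they do not come from RFC 4627 or from any library.

```python
# Illustrative sketch only: all names here are hypothetical.

# Code points that may not appear unescaped in a JSON string:
# quotation mark (U+0022), reverse solidus (U+005C),
# and the control characters U+0000 through U+001F.
MUST_ESCAPE = {0x22, 0x5C} | set(range(0x00, 0x20))

# Hexadecimal digits: U+0030..U+0039, U+0041..U+0046, U+0061..U+0066.
HEX_DIGITS = (
    set(range(0x30, 0x3A)) | set(range(0x41, 0x47)) | set(range(0x61, 0x67))
)


def must_escape(ch: str) -> bool:
    """True if the single character ch must be escaped in a JSON string."""
    return ord(ch) in MUST_ESCAPE


def is_hex_digit(ch: str) -> bool:
    """True only for the 22 code points listed in HEX_DIGITS."""
    return ord(ch) in HEX_DIGITS


def is_u_escape(s: str) -> bool:
    """True if s is reverse solidus, Latin small letter u, four hex digits."""
    return (
        len(s) == 6
        and s[0] == "\\"
        and s[1] == "u"
        and all(is_hex_digit(c) for c in s[2:])
    )
```

With these definitions, must_escape("\u201C") is false for LEFT DOUBLE QUOTATION MARK, and is_hex_digit("\uFF11") is false for FULLWIDTH DIGIT ONE even though Python's str.isdigit() accepts that character. Those are the kinds of readings the explicit code points are meant to rule out.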


On Jun 17, 2013, at 7:43, Paul Hoffman wrote:

> On Jun 17, 2013, at 7:32 AM, John Cowan <cowan@mercury.ccil.org> wrote:
>> Paul Hoffman scripsit:
>>>> Where's the contradiction?  The prose says exactly what the ABNF says.
>>>> %22 (quotation mark), %5C (backslash), and %D800 through %DFFF (surrogate
>>>> pairs) are left out of the full Unicode range.
>>> U+0000-0019 is not the entire list of Unicode control characters? U+0022
>>> is not the only Unicode quotation mark.
>> I assume by "19" you mean "1F".  
> Yes. :-)
>> If you want to say "C0 control
>> characters", fine.  
> That would work for me. But I suspect that some people will say "no, all control characters".
>> But there is only one Unicode character named
>> "QUOTATION MARK", and it is the only one relevant to JSON.
> Then we should say that. Or, better, not have a comment that is open to misinterpretation.
>>> Did we not learn anything from the IDN WG work 13 years ago?
>> I don't see what that has to do with the price of pídàn in China.
> An excellent accidental analogy! I was referring to the fact that "everyone" understood that the "dot" between DNS labels was U+002E... except the Japanese. And we had to put a complicated exception in for them, late in the process.
>> Other quotation marks and control characters need not be escaped in JSON.
> If everyone agrees with that, then we do not need the comment at all. But I suspect that "everyone" does not.
> --Paul Hoffman
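
As a rough check of the claim that other quotation marks and control characters need not be escaped, the following sketch probes one implementation, Python's standard json module in its default strict mode. The helper name parses is hypothetical, and the results show only how this one parser reads the rule, not what every JSON implementation does.

```python
import json


def parses(text: str) -> bool:
    """Return True if json.loads accepts text (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# U+201C and U+201D are quotation marks, but not U+0022: accepted unescaped.
curly_quotes = '"\u201Cquoted\u201D"'

# U+0085 is a control character outside U+0000..U+001F: accepted unescaped.
c1_control = '"a\u0085b"'

# U+001F appears raw inside the string: rejected in strict mode.
raw_c0_control = '"a\u001Fb"'

# The same code point written as a six-character escape: accepted.
escaped_c0_control = '"a\\u001Fb"'

# A raw U+0022 inside the string ends it early: rejected.
raw_quote = '"a"b"'
```

Here parses(curly_quotes), parses(c1_control), and parses(escaped_c0_control) return True, while parses(raw_c0_control) and parses(raw_quote) return False.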