Re: [Json] Consensus call: ABNF nits

Norbert Lindenberg <ietf@lindenbergsoftware.com> Fri, 21 June 2013 18:57 UTC

On Jun 21, 2013, at 9:40 , Paul Hoffman wrote:

> There are two proposals for dealing with nits in the ABNF:
>  0) Leave the ABNF in the current draft as-is
>  1) Change the ABNF in the draft to use the values in "Proposal 1" below
> Note that later proposals might make technical changes to the ABNF itself; those proposals are not what we are discussing in this consensus call.
> 
> Please respond to this message with a list of proposals you could accept, ordered from highest to lowest. Do not list proposals you cannot live with. If you cannot accept any of the proposals, please respond and say why.

Below are comments on three issues with proposal 1. The second issue, if not addressed, would make proposal 1 unacceptable to me. Otherwise I prefer proposal 1 over the old version.

> char = unescaped / (
>    escape (
>        %x22 /           ; "    quotation mark  U+0022
>        %x5C /           ; \    reverse solidus U+005C
>        %x2F /           ; /    solidus         U+002F
>        %x62 /           ; b    backspace       U+0008
>        %x66 /           ; f    form feed       U+000C
>        %x6E /           ; n    line feed       U+000A
>        %x72 /           ; r    carriage return U+000D
>        %x74 /           ; t    tab             U+0009
>        %x75 4HEXDIG ) ) ; uXXXX                U+XXXX

As noted earlier, the last line isn't quite correct: \uXXXX can represent U+XXXX on its own, but it can also be one half of a surrogate pair that represents U+YYYYYY, where YYYYYY depends on a preceding or following \uXXXX sequence.

Proposal: Replace "U+XXXX" with "U+XXXX or part of U+YYYYYY".
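
To illustrate the point, here is a minimal Python sketch of how two \uXXXX escapes can combine into one supplementary code point. The helper name combine_surrogates and the sample strings are only illustrative and are not part of the draft; the arithmetic is the standard UTF-16 surrogate pair formula, and the behavior shown is that of Python's json module.

```python
import json


def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point.

    Illustrative helper (not from the draft); uses the standard UTF-16 formula.
    """
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# A single escape outside the surrogate range stands for U+XXXX directly.
single = json.loads('"\\u00e9"')

# Two escapes that form a surrogate pair stand for one code point, U+YYYYYY.
paired = json.loads('"\\ud834\\udd1e"')

print(f"U+{ord(single):04X}")
print(len(paired), f"U+{ord(paired):04X}")
print(f"U+{combine_surrogates(0xD834, 0xDD1E):04X}")
```

In this sketch, neither \ud834 nor \udd1e has a U+XXXX meaning of its own; only the pair does, which is what the proposed comment wording tries to capture.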

> escape = %x5C              ; \  U+005C
> quotation-mark = %x22      ; "  U+0022
> unescaped = %x20-21 / %x23-5B / %x5D-10FFFF  ; any Unicode scalar
>   ; value, except those that must be escaped (C0 control
>   ; characters, "QUOTATION MARK" [U+0022], and "REVERSE SOLIDUS"
>   ; [U+005C])

Describing "unescaped" as "any Unicode scalar value, except..." is wrong. The correct term is "Unicode code point". Unicode scalar values do not include the surrogate range U+D800 through U+DFFF, but the ABNF range %x5D-10FFFF does.
http://www.unicode.org/glossary/#unicode_scalar_value
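
A rough way to check the mismatch is to compare the "unescaped" rule as written with the definition of a scalar value. This is a sketch only: the function names in_unescaped and is_scalar_value are made up for illustration, and the ranges are copied from the ABNF quoted above and from the Unicode glossary definition.

```python
def in_unescaped(cp: int) -> bool:
    """True if cp matches: unescaped = %x20-21 / %x23-5B / %x5D-10FFFF."""
    return 0x20 <= cp <= 0x21 or 0x23 <= cp <= 0x5B or 0x5D <= cp <= 0x10FFFF


def is_scalar_value(cp: int) -> bool:
    """True if cp is a Unicode scalar value: any code point except surrogates."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


# Code points that the ABNF rule allows but that are not scalar values.
mismatches = [
    cp for cp in range(0x110000)
    if in_unescaped(cp) and not is_scalar_value(cp)
]

print(len(mismatches), f"U+{mismatches[0]:04X}", f"U+{mismatches[-1]:04X}")
```

The only mismatches are the 2048 surrogate code points, so the rule describes code points rather than scalar values.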

> HEXDIG = DIGIT / %x41-46 / %x61-66   ; 0-9, A-F, or a-f
>       ; HEXDIG equivalent to HEXDIG rule in [RFC5234]

To complete the effort of disambiguating Unicode characters in the grammar, the comment should include "U+0030 through U+0039, U+0041 through U+0046, U+0061 through U+0066". We don't want them to be confused with 0-9, A-F, or a-f.
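
As an illustration of the kind of confusion meant here, the following Python sketch tests a character against exactly the three code point ranges and compares that with a general-purpose digit test. The name is_abnf_hexdig is hypothetical; str.isdigit is the standard Python method, which also accepts digits outside the ASCII range.

```python
def is_abnf_hexdig(ch: str) -> bool:
    """True only for U+0030-U+0039, U+0041-U+0046, U+0061-U+0066.

    Illustrative helper (not from the draft); expects a single character.
    """
    cp = ord(ch)
    return 0x30 <= cp <= 0x39 or 0x41 <= cp <= 0x46 or 0x61 <= cp <= 0x66


# U+FF11 FULLWIDTH DIGIT ONE looks like a digit but is not in the HEXDIG ranges.
fullwidth_one = "\uff11"

print(is_abnf_hexdig("7"), is_abnf_hexdig("F"), is_abnf_hexdig("a"))
print(fullwidth_one.isdigit(), is_abnf_hexdig(fullwidth_one))
```

A check based on explicit code point ranges rejects the fullwidth digit, whereas the generic digit test accepts it.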

Norbert