Re: [Json] ABNF nits -- LAST CHANCE ON PROPOSALS

Stefan Drees <stefan@drees.name> Wed, 12 June 2013 15:19 UTC

On 2013-06-11 19:32 CEST, Paul Hoffman wrote:
> ...
> JSON-text = object / array
>
> begin-array     = ws %x5B ws  ; [ left square bracket
> begin-object    = ws %x7B ws  ; { left curly bracket
> end-array       = ws %x5D ws  ; ] right square bracket
> end-object      = ws %x7D ws  ; } right curly bracket
> name-separator  = ws %x3A ws  ; : colon
> value-separator = ws %x2C ws  ; , comma
>
> ws = *(
>       %x20 /              ; Space
>       %x09 /              ; Horizontal tab
>       %x0A /              ; Line feed or New line
>       %x0D                ; Carriage return
>       )
> ...

Although I am used to ABNF stanzas like the above, as a grammar it is
only "paperware" (IMO not useful as input to a parser generator), and I
would like to ask all participants in this endeavour whether it is
possible to list the allowed "white space" without defining a token as
mad as the "ws" defined above. (I think a grammar should focus clearly
on real tokens and not give a name to "nothing".)

Why now and not earlier? Well, Carsten suggested a more readable ABNF
for this part in a separate sub-thread (of this "ABNF nits last
proposal" thread), and I opposed it. I then went back, looked at the
version proposed above again, and of course disliked the "ws"s
sprinkled all over the place: they are in fact optional (zero or more),
but that fact is buried inside the ws rule.

Wouldn't it be better to state the following explicitly:

"""
JSON-text = object / array

begin-array     = *ws %x5B *ws  ; [ left square bracket
begin-object    = *ws %x7B *ws  ; { left curly bracket
end-array       = *ws %x5D *ws  ; ] right square bracket
end-object      = *ws %x7D *ws  ; } right curly bracket
name-separator  = *ws %x3A *ws  ; : colon
value-separator = *ws %x2C *ws  ; , comma

ws =  (
       %x20 /              ; Space
       %x09 /              ; Horizontal tab
       %x0A /              ; Line feed or New line
       %x0D                ; Carriage return
       )
"""

instead of the grammar part cited at the top of this mail?

Then a "ws" truly would be a single white space.
Also the optionality (zero or more) of this white space surrounding the 
punctuation tokens would clearly stand out.
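
To illustrate that the two spellings accept the same input, here is a
minimal Python sketch. The regular expressions are my own hand
translation of the begin-array rule in both versions, and all names
(WS_CHAR, same_verdict, ...) are hypothetical, not part of any proposal.

```python
import re

# Assumption: hand-written regex translations of the two ABNF spellings.
WS_CHAR = r"[\x20\x09\x0A\x0D]"   # proposed ws: exactly one white space character
WS_RUN = WS_CHAR + "*"            # cited ws: zero or more, hidden inside the rule

# begin-array as cited:    ws %x5B ws
BEGIN_ARRAY_CITED = re.compile(WS_RUN + r"\x5B" + WS_RUN)

# begin-array as proposed: *ws %x5B *ws (repetition visible at the use site)
BEGIN_ARRAY_PROPOSED = re.compile(
    "(?:" + WS_CHAR + ")*" + r"\x5B" + "(?:" + WS_CHAR + ")*"
)


def same_verdict(text):
    """Return True if both spellings agree on whether text is a begin-array."""
    cited = BEGIN_ARRAY_CITED.fullmatch(text) is not None
    proposed = BEGIN_ARRAY_PROPOSED.fullmatch(text) is not None
    return cited == proposed


SAMPLES = ["[", " [ ", "\t[\r\n", "[]", "", " ", "\x0b["]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), same_verdict(sample))
```

The accepted language does not change; only the place where the
repetition is written moves from the ws rule to each rule that uses it.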

What do you think? Based on that, would it be helpful to deliver a
complete new proposal? I thought yes, and thus integrated Carsten's
first partial proposal for better grouping on the right-hand side of
the "char" rule.

So **this** below is my attempt at a unified proposal. (I named it
proposal 2, as I did not see another complete proposal. If I overlooked
one, please excuse me and renumber the one below accordingly.)

Proposal 2
==========

JSON-text = object / array

begin-array     = *ws %x5B *ws  ; [ left square bracket
begin-object    = *ws %x7B *ws  ; { left curly bracket
end-array       = *ws %x5D *ws  ; ] right square bracket
end-object      = *ws %x7D *ws  ; } right curly bracket
name-separator  = *ws %x3A *ws  ; : colon
value-separator = *ws %x2C *ws  ; , comma

ws = (
      %x20 /              ; Space
      %x09 /              ; Horizontal tab
      %x0A /              ; Line feed or New line
      %x0D                ; Carriage return
      )

value = false / null / true / object / array / number / string
false = %x66.61.6c.73.65   ; false
null  = %x6e.75.6c.6c      ; null
true  = %x74.72.75.65      ; true

object = begin-object [ member *( value-separator member ) ]
          end-object
member = string name-separator value

array = begin-array [ value *( value-separator value ) ] end-array

number = [ minus ] int [ frac ] [ exp ]
decimal-point = %x2E       ; .
digit1-9 = %x31-39         ; 1-9
e = %x65 / %x45            ; e E
exp = e [ minus / plus ] 1*DIGIT
frac = decimal-point 1*DIGIT
int = zero / ( digit1-9 *DIGIT )
minus = %x2D               ; -
plus = %x2B                ; +
zero = %x30                ; 0

string = quotation-mark *char quotation-mark

char = unescaped / (
     escape (
         %x22 /          ; "    quotation mark  U+0022
         %x5C /          ; \    reverse solidus U+005C
         %x2F /          ; /    solidus         U+002F
         %x62 /          ; b    backspace       U+0008
         %x66 /          ; f    form feed       U+000C
         %x6E /          ; n    line feed       U+000A
         %x72 /          ; r    carriage return U+000D
         %x74 /          ; t    tab             U+0009
         (%x75 4HEXDIG) ) )   ; uXXXX           U+XXXX

escape = %x5C              ; \
quotation-mark = %x22      ; "
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

HEXDIG = DIGIT / %x41-46 / %x61-66   ; 0-9, A-F, or a-f
        ; HEXDIG equivalent to HEXDIG rule in [RFC5234]
DIGIT = %x30-39            ; 0-9
       ; DIGIT equivalent to DIGIT rule in [RFC5234]


End of proposal 2.
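
As a quick sanity check of the "number" and "string" rules, here is a
minimal Python sketch. The two regular expressions are my own hand
translation of the ABNF, not output of a parser generator, and the
names NUMBER, STRING, is_number and is_string are hypothetical.

```python
import re

# number = [ minus ] int [ frac ] [ exp ]
# [0-9] is used instead of \d, which would also match non-ASCII digits.
NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?")

# string = quotation-mark *char quotation-mark
# char   = unescaped / escape ( one of " \ / b f n r t  or  u 4HEXDIG )
STRING = re.compile(
    r'"(?:[\x20-\x21\x23-\x5B\x5D-\U0010FFFF]'   # unescaped
    r'|\\(?:["\\/bfnrt]|u[0-9A-Fa-f]{4}))*"'      # escape sequences
)


def is_number(text):
    """Return True if text matches the number rule in full."""
    return NUMBER.fullmatch(text) is not None


def is_string(text):
    """Return True if text matches the string rule in full."""
    return STRING.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["0", "-0", "10", "1.5e+10", "01", "1.", ".5", "+1", "1e"]:
        print(candidate, is_number(candidate))
    for candidate in ['"abc"', '"a\\nb"', '"\\u00e9"', '"a\nb"', '"\\x"']:
        print(repr(candidate), is_string(candidate))
```

Note that a leading zero ("01"), a bare decimal point ("1." or ".5") and
a leading plus sign are all rejected by the number rule, and that a raw
line feed inside a string is rejected because %x0A is outside the
"unescaped" range.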


Thanks a lot,
Stefan.