Re: [Json] Doofus Parameter Labels

Nico Williams <nico@cryptonector.com> Mon, 30 March 2015 20:20 UTC

Date: Mon, 30 Mar 2015 15:15:35 -0500
From: Nico Williams <nico@cryptonector.com>
To: Phillip Hallam-Baker <phill@hallambaker.com>
Message-ID: <20150330201534.GV10960@localhost>
References: <CAMm+Lwg7UBJO83BN1xBbVsbzgeXa1pbhtiTcPKmMgFX3HcvQEw@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <CAMm+Lwg7UBJO83BN1xBbVsbzgeXa1pbhtiTcPKmMgFX3HcvQEw@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Archived-At: <http://mailarchive.ietf.org/arch/msg/json/w9NVxBpFv1CDC8-SJ1JxLKAOJgU>
Cc: JSON WG <json@ietf.org>
Subject: Re: [Json] Doofus Parameter Labels

On Mon, Mar 30, 2015 at 03:38:39PM -0400, Phillip Hallam-Baker wrote:
> So, I am implementing an IETF draft which is fairly widely used and
> using a code generator to speed things along.
> 
> The code is breaking because the spec has decided to use 'protected'
> as a tag which is of course a reserved keyword in Java, C# and much
> else.
> 
> OK so there is an escape feature that I can use instead. But those
> don't exist in other languages. And then I come across a tag that has
> a non alphanumeric tag value.

This sort of question comes up a lot for jq.  I don't think you can
produce a generic rule here.  "protected" is a keyword in many
languages, but not in others.  It's not a keyword in jq, for example,
though 'reduce' is (and that is probably not a keyword elsewhere).

There is a workaround, of course.  Take an object name that is a jq
keyword, such as "then".  Instead of writing

    .then.protected

one has to write

    .["then"].protected

or

    ."then".protected

in jq in order to access the value named "protected" in the object
named "then" in the top-level object.
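
A code generator can apply the same kind of escape on its side.  The
following is a minimal Python sketch, not taken from any real tool: the
names RESERVED, to_identifier and field_map are hypothetical, and the
keyword set is a deliberately incomplete sample.  It mangles a JSON name
into a safe identifier and remembers the original name for the wire.

```python
import re

# Hypothetical, deliberately incomplete sample of target-language keywords.
RESERVED = {"protected", "class", "public", "private", "static"}


def to_identifier(json_name, reserved=RESERVED):
    """Turn a JSON member name into an identifier the target accepts."""
    # Replace anything outside [A-Za-z0-9_] with an underscore.
    ident = re.sub(r"[^A-Za-z0-9_]", "_", json_name)
    # Identifiers may not be empty or start with a digit.
    if not ident or ident[0].isdigit():
        ident = "_" + ident
    # Step around reserved words by appending an underscore.
    if ident in reserved:
        ident += "_"
    return ident


def field_map(json_names, reserved=RESERVED):
    """Map each generated identifier back to the JSON name on the wire."""
    mapping = {}
    for name in json_names:
        ident = to_identifier(name, reserved)
        while ident in mapping:  # keep identifiers unique after mangling
            ident += "_"
        mapping[ident] = name
    return mapping
```

With this sketch, field_map(["protected", "sig#1"]) yields identifiers
"protected_" and "sig_1", each still tied to its original JSON name, so
the serializer never has to guess what goes on the wire.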

What sort of rule do you have in mind for avoiding this?  Why are the
workarounds insufficient?

> This sort of thing is likely to be happening quite a bit now that
> people are using JSON and it would be a lot better if we could avoid
> it. The fact that JSON allows tags to be any valid UNICODE sequence
> does not mean that all choices are equal.

There are other considerations too.  For example, in some
implementations names shorter than 8 or 16 bytes may be treated as
immediate values, with no heap allocation (I have a half-baked patch to
do that for jq).  E.g., a NaN-coding implementation may be able to store
most 7-byte strings as immediate values.
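
To make the NaN-coding remark concrete, here is a rough Python sketch of
one possible layout.  It is an assumption for illustration only, not
jq's actual representation and not the patch mentioned above; all names
and the tag value are hypothetical.  A quiet NaN leaves 51 mantissa bits
free, and seven 7-bit ASCII characters need only 49 of them, which is
why short ASCII names can fit.

```python
import math
import struct

QNAN = 0x7FF8_0000_0000_0000  # exponent all ones + quiet bit: always a NaN
STRING_TAG = 1 << 49          # hypothetical tag bit marking "small string"
CHAR_BITS = 49                # 7 characters * 7 bits each
MAX_CHARS = 7


def pack_small_string(s):
    """Return a 64-bit NaN pattern holding s, or None if s does not fit."""
    if len(s) > MAX_CHARS:
        return None
    payload = 0
    for i, ch in enumerate(s):
        code = ord(ch)
        if code == 0 or code > 0x7F:
            return None  # NUL marks padding; non-ASCII needs more than 7 bits
        payload |= code << (7 * i)
    return QNAN | STRING_TAG | payload


def unpack_small_string(bits):
    """Recover the string stored by pack_small_string."""
    payload = bits & ((1 << CHAR_BITS) - 1)
    chars = []
    for i in range(MAX_CHARS):
        code = (payload >> (7 * i)) & 0x7F
        if code == 0:  # padding reached
            break
        chars.append(chr(code))
    return "".join(chars)


def as_double(bits):
    """Reinterpret the 64-bit pattern as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

In this layout "protect" (7 bytes) packs into a value that is still a
NaN when viewed as a double, while "protected" (9 bytes) returns None
and would need a heap-allocated string instead.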

> In general it would be good if people writing specs could stick to the
> identifier naming rules for C and avoid reserved words from Java, C#,
> etc. The languages and tools I use are quite capable of using accented
> characters. I have absolutely no clue how to generate them on the
> keyboard though.

OK, so, stick to identifier naming rules for C, avoid keywords from some
set of programming languages, and don't use non-ASCII characters.  How
do we specify such constraints?  Remember, programming language specs
evolve and tend to grow new keywords.
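
For illustration, such a constraint could be written down as a lint
check along the following lines.  This is only a sketch under stated
assumptions: check_name and KEYWORDS are hypothetical names, and the
keyword sets are a tiny hand-picked sample rather than complete lists.
Any real rule would have to pin down which languages and which versions
of their keyword lists it means.

```python
import re

# ASCII-only C-style identifier: a letter or underscore, then letters,
# digits or underscores.
C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Hypothetical snapshot; real keyword lists differ per language and version.
KEYWORDS = {
    "java": {"protected", "class", "interface", "new"},
    "jq": {"then", "reduce", "if", "else"},
}


def check_name(name, languages=("java", "jq"), keywords=KEYWORDS):
    """Return a list of reasons why name breaks the proposed rules."""
    problems = []
    if not C_IDENTIFIER.fullmatch(name):
        problems.append("not an ASCII C-style identifier")
    for lang in languages:
        if name in keywords[lang]:
            problems.append(f"reserved word in {lang}")
    return problems
```

Note that the answer for a given name depends entirely on the KEYWORDS
table passed in, so a name that passes today can start failing as soon
as one of the listed languages adds a keyword.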

-1