Re: [Json] Another json interop soft spot

Carsten Bormann <cabo@tzi.org> Thu, 23 April 2015 16:43 UTC


Tim Bray wrote:
> JSON.parse ' {"a\b": "b"}'

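You fell into a nasty Ruby trap here:
In a single-quoted string, a single backslash remains a single backslash
unless it is followed by either another backslash or a single quote.
So '\b' and '\\b' are the same two characters, and in both cases the JSON
parser sees \b, which is a valid escape (backspace).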

$ irb -rjson
>> RUBY_VERSION
=> "2.2.2"
>> JSON.parse ' {"a\b": "b"}'
=> {"a\b"=>"b"}
>> JSON.parse ' {"a\\b": "b"}'
=> {"a\b"=>"b"}
>> (JSON.parse ' {"a\\b": "b"}').keys[0].size
=> 2
>> (JSON.parse ' {"a\b": "b"}').keys[0].size
=> 2
>> (JSON.parse ' {"a\b": "b"}').keys[0].hexi
=> "6108"
>> (JSON.parse ' {"a\\b": "b"}').keys[0].hexi
=> "6108"
>> ('{"a\b": "b"}').size
=> 12
>> ('{"a\\b": "b"}').size
=> 12
>> ('{"a\\b": "b"}').hexi
=> "7b22615c62223a202262227d"
>> ('{"a\b": "b"}').hexi
=> "7b22615c62223a202262227d"
>> 'a\b'.hexi
=> "615c62"
>> 'a\\b'.hexi
=> "615c62"

(String#hexi is in my .irbrc and does the obvious
bytes.map{|x| "%02x" % x}.join thing.)
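
For anyone who wants to replay the transcript, here is a minimal sketch of such a helper. Only the method name and the bytes.map expression come from the description above; wrapping it in a String monkey patch is an assumption about how the .irbrc does it.

```ruby
# Hypothetical reconstruction of the String#hexi helper described above.
class String
  # Render each byte of the string as two lowercase hex digits.
  def hexi
    bytes.map { |x| "%02x" % x }.join
  end
end

puts 'a\b'.hexi    # single-quoted: the backslash stays, 3 bytes
puts 'a\\b'.hexi   # single-quoted: \\ collapses to one backslash, same 3 bytes
puts "a\b".hexi    # double-quoted: \b becomes a backspace, 2 bytes
```

The third line is the contrast case: only in a double-quoted Ruby string does Ruby itself interpret \b.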

Textbook example of quoting hell...

Now, the bug in the default Ruby JSON parser that you are really getting
at is illustrated by this:

>> (JSON.parse ' {"a\\c": "b"}').keys[0].hexi
=> "6163"

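Of course, \c is not allowed in JSON strings, so the parser should reject
this input instead of silently dropping the backslash and returning "ac".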
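
To make "not allowed" concrete: RFC 7159, Section 7 permits only \" \\ \/ \b \f \n \r \t and \uXXXX as escapes. The following is a rough sketch of a pre-check for anything else. The names VALID_ESCAPE and invalid_escapes are made up for illustration, and the check scans the whole text rather than tokenizing it, which is adequate only because a backslash cannot legally appear outside a JSON string anyway.

```ruby
# Escapes permitted inside JSON strings (RFC 7159, Section 7).
VALID_ESCAPE = /\A\\(?:["\\\/bfnrt]|u\h{4})\z/

# Hypothetical helper: return every backslash sequence in json_text
# that is not one of the permitted JSON escapes.
def invalid_escapes(json_text)
  # Consume a full \uXXXX when present, otherwise backslash plus one character,
  # so that an escaped backslash is never mistaken for the start of an escape.
  json_text.scan(/\\u\h{4}|\\./m).reject { |esc| esc.match?(VALID_ESCAPE) }
end

p invalid_escapes('{"a\\c": "b"}')   # text contains \c, which is reported
p invalid_escapes('{"a\\b": "b"}')   # text contains \b, nothing to report
```

Note that the Ruby literals here are single-quoted with a doubled backslash, so the text handed to the helper contains exactly one backslash.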

Fix: use a parser that rejects the invalid escape, such as Oj:

$ irb -roj
>> (Oj.load ' {"a\\c": "b"}').keys[0].hexi
Oj::ParseError: invalid escaped character at line 1, column 5 [parse.c:280]
	from (irb):3:in `load'
	from (irb):3
	from /Users/cabo/bin/irb:11:in `<main>'
>> (Oj.load ' {"a\\b": "b"}').keys[0].hexi
=> "6108"

Regards, Carsten