Re: [apps-discuss] draft-pbryan-zyp-json-pointer: name syntax for non-ASCII

Julian Reschke <julian.reschke@gmx.de> Tue, 22 November 2011 13:06 UTC

On 2011-11-22 10:33, Carsten Bormann wrote:
> On Nov 21, 2011, at 21:55, Mark Nottingham wrote:
>
>> +1 to Julian here -- there's no reason why non-ASCII chars need to be percent-encoded when they occur inside a JSON document, only when they're in a URI (or similar context).
>
> OK, folks, let's see where that leads to.
>
> JSON document:
> {"Bjørn/Carsten": "Fritz"}
>
> JSON pointer:
> "/Bjørn%2FCarsten"
>
> Multi-segment URI-encoded version of this JSON pointer:
>
> /Bj%C3%B8rn%252FCarsten
>
> (For reference: Plain URI-encoded version:
>
> %2FBj%C3%B8rn%252FCarsten
> )
>
> If this is what you want, please document it thoroughly, clearly, painfully redundantly.
> Dozens of examples, including examples that show how to do it wrong.
>
> With this kind of dual-layer percent-encoding, we are deeply in quoting hell, and implementers will need all the help we can give them (and they will still get it wrong).
>
> Grüße, Carsten

Indeed.
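
For reference, the two layers in Carsten's example can be reproduced with a small Python sketch. It assumes the escaping rule under discussion here (inside the pointer, only '%' and '/' in a name are percent-encoded, non-ASCII is left alone); the helper names `pointer_token` and `json_pointer` are made up for illustration and come from no draft or library.

```python
from urllib.parse import quote

def pointer_token(name: str) -> str:
    # Layer 1 (assumed rule): escape only '%' and '/' inside a member name.
    # '%' must go first so the '%' introduced by "%2F" is not escaped again.
    return name.replace("%", "%25").replace("/", "%2F")

def json_pointer(names) -> str:
    # Join the escaped names with '/' as the segment delimiter.
    return "".join("/" + pointer_token(n) for n in names)

pointer = json_pointer(["Bjørn/Carsten"])

# Layer 2: URI percent-encoding of the whole pointer (UTF-8 based).
multi_segment = quote(pointer, safe="/")  # keep '/' as a path delimiter
plain = quote(pointer, safe="")           # encode '/' as well

print(pointer)        # /Bjørn%2FCarsten
print(multi_segment)  # /Bj%C3%B8rn%252FCarsten
print(plain)          # %2FBj%C3%B8rn%252FCarsten
```

The "%252F" is the part that is easy to get wrong: the '%' from the first layer is itself encoded by the second.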

To add to the coverage, we should also consider whitespace and non-BMP 
characters:

JSON document:
{"Bjørn/Carsten/foo \uD834\uDD1E": "Fritz"}

JSON pointer:
"/Bjørn%2FCarsten%2Ffoo(X)(Y)"

(X) right now would be %20. Should it be? Why escape it at this layer 
already? (This applies to all characters that are currently disallowed, 
except '/' and '%'.)

(Y) IMHO should be the Unicode code point U+1D11E 
(<https://tools.ietf.org/html/rfc4627#section-2.5> has this example).
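
A rough Python sketch of that reading, assuming (X) stays a literal space and (Y) is the single code point U+1D11E inside the pointer, with percent-encoding of both deferred to the URI layer. `escape_minimal` is a hypothetical helper, not something the draft defines.

```python
import json
from urllib.parse import quote

# The JSON text carries the non-BMP character as a \u-escaped surrogate pair.
raw = r'{"Bjørn/Carsten/foo \uD834\uDD1E": "Fritz"}'
name = next(iter(json.loads(raw)))

# After JSON parsing, the pair is one code point: U+1D11E.
last_char = name[-1]

def escape_minimal(member_name: str) -> str:
    # Assumed rule: only '%' and '/' are escaped at the pointer layer.
    return member_name.replace("%", "%25").replace("/", "%2F")

pointer = "/" + escape_minimal(name)

# Space and non-ASCII characters are only encoded once the pointer
# is placed into a URI.
uri_form = quote(pointer, safe="/")

print(hex(ord(last_char)))  # 0x1d11e
print(uri_form)             # /Bj%C3%B8rn%252FCarsten%252Ffoo%20%F0%9D%84%9E
```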

So yes, the fact that a JSON name can be anything a JSON string can take 
is indeed a problem, because it doesn't leave us any characters as 
delimiters (so this is very different from XML vs XPath).
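
To illustrate the delimiter problem: in the sketch below, a document has both a member named "a/b" and a member "a" containing "b", so an unescaped "/a/b" cannot address the first one. The `resolve` function is a hypothetical minimal resolver for objects only, assuming uppercase "%2F" and "%25" as the only escapes.

```python
def resolve(doc, pointer: str):
    # Walk the document one reference token at a time.
    node = doc
    for token in pointer.split("/")[1:]:
        # Undo the assumed escaping: "%2F" first, then "%25",
        # so that "%252F" yields the literal text "%2F" and not "/".
        name = token.replace("%2F", "/").replace("%25", "%")
        node = node[name]
    return node

doc = {"a/b": 1, "a": {"b": 2}, "%2F": 3}

print(resolve(doc, "/a/b"))    # 2  (two segments: "a", then "b")
print(resolve(doc, "/a%2Fb"))  # 1  (one segment: the name "a/b")
print(resolve(doc, "/%252F"))  # 3  (one segment: the name "%2F")
```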

Best regards, Julian