[Json] "Generators SHOULD escape all Unicode whitespace characters"?
Jacob Davies <jacob@well.com> Mon, 10 June 2013 22:55 UTC
I'm curious if anyone else thinks this is worth suggesting to implementors.

There are a number of non-ASCII Unicode whitespace and control characters that are not required to be escaped right now. I think generators SHOULD escape them. Obviously parsers must continue to accept them unescaped regardless.

The set is fairly small and could be enumerated in the document (it might expand in future, but this would be a good start):

http://en.wikipedia.org/wiki/Space_(punctuation)#Spaces_in_Unicode

"Whitespace smuggling" is a mild security concern and, from experience, can be quite hard to debug if non-0x20 spaces are not escaped. There is a small overhead of a couple of characters in doing so.

Everything else in JSON's text serialization uses either printing characters, insignificant ASCII whitespace between values, or plain spaces in strings. Of course some printing Unicode characters are doppelgängers so perhaps people feel it is not worth worrying about whitespace either.
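
To make the suggestion concrete, here is a minimal sketch in Python of a generator that escapes non-ASCII whitespace while leaving other non-ASCII text as-is. The function name `dumps_escaping_whitespace` is hypothetical, and the character set is an assumption: it lists the non-ASCII code points carrying the Unicode White_Space property, not a set taken from any JSON specification.

```python
import json
import re

# Assumed set: non-ASCII code points with the Unicode White_Space property
# (NEL, NBSP, Ogham space mark, U+2000..U+200A, line/paragraph separators,
# narrow NBSP, medium mathematical space, ideographic space).
_NON_ASCII_WS = re.compile(
    "[\u0085\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000]"
)


def dumps_escaping_whitespace(value):
    """Serialize value to JSON, writing non-ASCII whitespace as \\uXXXX."""
    # ensure_ascii=False keeps ordinary non-ASCII characters unescaped.
    text = json.dumps(value, ensure_ascii=False)
    # Outside strings the serializer emits only ASCII, so any match here
    # lies inside a JSON string and can be escaped in place.
    return _NON_ASCII_WS.sub(lambda m: "\\u%04x" % ord(m.group()), text)


if __name__ == "__main__":
    # The no-break space is escaped; the accented letter is left alone.
    print(dumps_escaping_whitespace({"k": "a\u00a0b \u00e9"}))
```

Any conforming parser reads the escaped form back to the same string, so this only changes how the text looks on the wire.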