Re: JSON headers

Julian Reschke <julian.reschke@gmx.de> Mon, 11 July 2016 06:07 UTC

To: Poul-Henning Kamp <phk@phk.freebsd.dk>
References: <74180.1468000149@critter.freebsd.dk> <A17D3EFD-A935-4971-BCF6-DC9D38302CAD@oracle.com> <564a72e8-b9d3-1f9c-5982-48f2b07272e5@greenbytes.de> <3924.1468137899@critter.freebsd.dk> <683f5f58-6046-d9fb-cc75-d0ab3890ce23@greenbytes.de> <4105.1468141779@critter.freebsd.dk> <5cdf0fa8-063c-7eaa-a9e3-fb6db7417254@gmx.de> <4213.1468143913@critter.freebsd.dk> <94e4a5c2-3465-fef3-6221-d9f4fcccb5fa@gmx.de> <4324.1468145426@critter.freebsd.dk> <CAB0No9kf6gje3Tc+impphV5tUHjksCkL1PJ1YAgNjXO+tLq=XA@mail.gmail.com> <176d58df-debf-e660-edf7-7d686c926ef6@gmx.de> <5939.1468189218@critter.freebsd.dk>
Cc: Yanick Rochon <yanick.rochon@gmail.com>, Phil Hunt <phil.hunt@oracle.com>, HTTP Working Group <ietf-http-wg@w3.org>
From: Julian Reschke <julian.reschke@gmx.de>
Message-ID: <40e62f5c-9fe4-35c0-d986-c01fb63f6b4e@gmx.de>
Date: Mon, 11 Jul 2016 08:02:55 +0200
Subject: Re: JSON headers
Archived-At: <http://www.w3.org/mid/40e62f5c-9fe4-35c0-d986-c01fb63f6b4e@gmx.de>

On 2016-07-11 00:20, Poul-Henning Kamp wrote:
> --------
> In message <176d58df-debf-e660-edf7-7d686c926ef6@gmx.de>, Julian Reschke writes
> :
>
>> It seems you are confusing several issues here: multiple header field
>> instances in the HTTP message, and duplicate member names in a JSON
>> object. These are completely orthogonal issues.
>
> Uhm, no?
>
> They only become orthogonal if whoever specifies the JSON header
> takes great care to make that happen.

That's what my draft tries to enforce by requiring the data model to be 
a JSON *array*.

> Allowing split headers forces all headers to be defined as JSON
> lists, which means we're knee-capping header-designers from the get go.

I would call it: "steer them to a format that is robust with regard to
how header fields can be repeated in messages".
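
To illustrate the robustness point, here is a minimal Python sketch. It assumes that each field instance carries a comma-separated sequence of JSON values, and that the recipient joins all instances with commas and wraps the result in brackets before parsing. The function name parse_json_field and the object shape used for Accept are hypothetical, chosen only for this example.

```python
import json


def parse_json_field(instances):
    """Combine all instances of a field and parse them as one JSON array.

    Assumption: each instance is a comma-separated sequence of JSON values,
    so joining the instances with commas and adding brackets gives an array.
    """
    combined = ", ".join(instances)
    return json.loads("[" + combined + "]")


# The same content sent as one field instance ...
single = parse_json_field(
    ['{"type": "text/plain", "q": 0.5}, {"type": "text/html"}']
)

# ... and split across two field instances.
split = parse_json_field(
    ['{"type": "text/plain", "q": 0.5}', '{"type": "text/html"}']
)
```

Under these assumptions both forms parse to the same array, so it makes no difference to the recipient whether the sender or an intermediary split the field.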

> Take this RFC7231 example:
>
> 	Accept: text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c
>
> If we allow split headers for JSON, this can only be defined as JSON list,
> in order that it can also be sent as:
>
> 	Accept: <JSON for "text/plain; q=0.5">
> 	Accept: <JSON for "text/html">
> 	Accept: <JSON for "text/x-dvi; q=0.8">
> 	Accept: <JSON for "text/x-c">
>
> But that leaves it to the application writer to spot and detect and
> handle this degenerate case from a lazy sender:
>
> 	Accept: <JSON for "text/plain; q=0.5">
> 	Accept: <JSON for "text/html">
> 	Accept: <JSON for "text/x-dvi; q=0.8">
> 	Accept: <JSON for "text/x-c">
> 	Accept: <JSON for "text/plain; q=0.0">

The "degenerate" case can happen in a single header field as well.

(Yes, discussing *simplifying* certain header fields is interesting as
well, but again it is orthogonal to this discussion.)

> It's even more complicated for a proxy sender which wants to
> modify the text/plain priority:  It needs to spot both copies
> and change them both (or delete one of them).

Yes.

> (It seems to me that there is a class of smuggling attacks
> where the proxy sees q=0.5 and the server q=0.0, which
> RFC7231 does not seem to address at all?)

Only if the recipient is broken and does not process all field instances, no?
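
A rough Python sketch of that difference, using the existing (non-JSON) Accept syntax. The helper q_values is hypothetical, and the comma and semicolon splitting is deliberately simplistic: it assumes no quoted strings containing those characters.

```python
def q_values(instances, media_type):
    """Return every q value given for media_type across the field instances."""
    found = []
    for instance in instances:
        for element in instance.split(","):
            parts = [p.strip() for p in element.split(";")]
            if parts[0] != media_type:
                continue
            q = 1.0  # q defaults to 1 when the parameter is absent
            for param in parts[1:]:
                name, _, value = param.partition("=")
                if name.strip().lower() == "q":
                    q = float(value)
            found.append(q)
    return found


instances = ["text/plain; q=0.5, text/html", "text/plain; q=0.0"]

# A recipient that only looks at the first field instance.
first_only = q_values(instances[:1], "text/plain")

# A recipient that processes all field instances.
all_instances = q_values(instances, "text/plain")
```

The first recipient sees only q=0.5, while the second sees both values and can apply whatever duplicate handling it chooses. The two disagree only because the first one ignores part of the message.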

> If we instead, as I propose, require that JSON headers *never* be
> split, then it becomes both possible and rather obviously smarter
> to define this header as a JSON object, keyed by the media type:
>
> 	Accept: { 					\
> 		"text/plain": <JSON for "q=0.5">,	\
> 		"text/html": <JSON for no parameter>,	\
> 		"text/x-dvi": <JSON for "q=0.8">,	\
> 		"text/x-c": <JSON for no parameter>	\
> 	}
>
> A sender wishing to modify the priority, just sets the
> corresponding JSON object using the native language's
> JSON facility:
>
> 	req.accept["text/plain"] = <JSON for "q=0">
>
> The receiver can then simply load the JSON as JSON, and the
> application does not have to explicitly check for duplicate
> media types, but can simply look up "text/plain" in the
> JSON object.

Yes, that's a different data model for this header field, that makes 
certain operations simpler.
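
As a small illustration of the trade-off in Python: both JSON shapes below are hypothetical encodings of Accept, not something either proposal defines.

```python
import json

# Array model: one object per media range, duplicates are possible.
as_array = json.loads(
    '[{"type": "text/plain", "q": 0.5},'
    ' {"type": "text/html"},'
    ' {"type": "text/plain", "q": 0.0}]'
)

# Changing the text/plain priority means visiting every matching entry.
for entry in as_array:
    if entry["type"] == "text/plain":
        entry["q"] = 0.2

# Object model: keyed by media type, one lookup is enough.
as_object = json.loads('{"text/plain": {"q": 0.5}, "text/html": {}}')
as_object["text/plain"]["q"] = 0.2
```

The object model makes the update a single assignment, but only as long as the whole value arrives in one field instance and member names are unique.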

> So what happens if the sender sends bad JSON anyway ?
>
> 	Accept: { 					\
> 		"text/plain": <JSON for "q=0.5">,	\
> 		"text/html": <JSON for no parameter>,	\
> 		"text/x-dvi": <JSON for "q=0.8">,	\
> 		"text/x-c": <JSON for no parameter>	\
> 		"text/plain": <JSON for "q=0.0">,	\
> 	}
>
> Well, RFC7159 says:
>
> 	4. Objects
>
> 	[...]The names within an object SHOULD be unique.
>
> 	An object whose names are all unique is interoperable in
> 	the sense that all software implementations receiving that
> 	object will agree on the name-value mappings.  When the
> 	names within an object are not unique, the behavior of
> 	software that receives such an object is unpredictable.
> 	Many implementations report the last name/value pair only.
> 	Other implementations report an error or fail to parse the
> 	object, and some implementations report all of the name/value
> 	pairs, including duplicates.
>
> In other words:  Don't do that.
>
> What really happens is this:  We just opened another door to the
> exact same smuggling attack as I mentioned above.
>
> But this time we can shut them all with one single line of text:
>
> 	"Duplicate keys in JSON objects SHALL cause and be treated
> 	as connection failure."

But then we can't rely on generic JSON parsers anymore.
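
For example, Python's standard json module accepts duplicate member names by default and keeps only the last value. Rejecting them takes an explicit hook, and not every generic parser offers an equivalent. The function name reject_duplicates and the JSON shape are made up for this sketch.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises instead of silently dropping members."""
    seen = {}
    for name, value in pairs:
        if name in seen:
            raise ValueError(f"duplicate member name: {name!r}")
        seen[name] = value
    return seen


text = '{"text/plain": {"q": 0.5}, "text/html": {}, "text/plain": {"q": 0.0}}'

# Default behaviour: no error, the last duplicate wins.
lenient = json.loads(text)

# Strict behaviour needs the custom hook.
try:
    json.loads(text, object_pairs_hook=reject_duplicates)
    strict_rejected = False
except ValueError:
    strict_rejected = True
```

A requirement to treat duplicate names as a fatal error would therefore depend on every implementation having, or adding, this kind of parser-level control.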

> So all in all, split JSON headers would be a really bad idea.

After replying to your mail, I'm even more convinced that allowing them
is the right thing to do.

Best regards, Julian