Re: JSON headers

Andy Green <andy@warmcat.com> Mon, 11 July 2016 04:41 UTC

From: Andy Green <andy@warmcat.com>
To: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, Poul-Henning Kamp <phk@phk.freebsd.dk>, Julian Reschke <julian.reschke@gmx.de>
Cc: Yanick Rochon <yanick.rochon@gmail.com>, Phil Hunt <phil.hunt@oracle.com>, HTTP Working Group <ietf-http-wg@w3.org>
Date: Mon, 11 Jul 2016 12:37:19 +0800
Subject: Re: JSON headers
Archived-At: <http://www.w3.org/mid/1468211839.6746.67.camel@warmcat.com>

On Mon, 2016-07-11 at 13:05 +0900, Martin J. Dürst wrote:
> Hello Poul-Henning,
> 
> On 2016/07/11 07:20, Poul-Henning Kamp wrote:
> 
> > If we instead, as I propose, require that JSON headers *never* be
> > split, then it becomes both possible and rather obviously smarter
> > to define this header as a JSON object, keyed by the media type:
> > 
> > 	Accept: { 					\
> > 		"text/plain": <JSON for "q=0.5">,	\
> > 		"text/html": <JSON for no parameter>,	\
> > 		"text/x-dvi": <JSON for "q=0.8">,	\
> > 		"text/x-c": <JSON for no parameter>	\
> > 	}
> > 
> > A sender wishing to modify the priority just sets the
> > corresponding JSON object using the native language's
> > JSON facility:
> > 
> > 	req.accept["text/plain"] = <JSON for "q=0">
> 
> My understanding is that you are extremely concerned about the speed
> at which headers can be processed. My guess would be that
> deserializing, changing, and reserialising JSON headers takes more
> time than detecting/processing duplicate headers. But I of course
> might be wrong.

I'm a bit bemused as to why the world needs JSON headers instead of the
cool stuff for header coding in http/2, but I can give one point of view
related to duplicate headers and efficiency.

In libwebsockets we use bytewise state machines for everything,
including http/1.x header parsing.  Normally the library tries to stay
out of the way of the application code and provide events and
information to it as it becomes available, without the need for
buffering on the library side.
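
As a rough illustration of the bytewise, event-driven style described
above, here is a minimal sketch in Python. It is not libwebsockets code
(which is C); the class name `HeaderParser` and the `on_header` callback
are hypothetical, and the parser ignores folding and most error cases.

```python
class HeaderParser:
    """Simplified byte-at-a-time parser for an HTTP/1.x header block.

    Assumes the request/status line has already been consumed.
    """

    NAME, VALUE, DONE = range(3)

    def __init__(self, on_header):
        self.on_header = on_header  # hypothetical callback: (name, value)
        self.state = self.NAME
        self.name = bytearray()
        self.value = bytearray()

    def feed(self, data: bytes) -> None:
        # Input may arrive in arbitrary fragments; state persists across calls.
        for b in data:
            self._step(b)

    def _step(self, b: int) -> None:
        if self.state == self.DONE or b == 0x0D:  # skip CR; LF ends a line
            return
        if self.state == self.NAME:
            if b == 0x3A:  # ':' separates name from value
                self.state = self.VALUE
            elif b == 0x0A:  # LF with no colon seen: treat as end of headers
                self.state = self.DONE
            else:
                self.name.append(b)
        else:  # VALUE
            if b == 0x0A:
                # One header line is complete: report it and reuse the buffers.
                self.on_header(self.name.decode("latin-1").lower(),
                               self.value.decode("latin-1").strip())
                self.name.clear()
                self.value.clear()
                self.state = self.NAME
            else:
                self.value.append(b)
```

The only storage is for the header line currently being assembled, and
the callback fires as soon as that line ends, however the input was
fragmented.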

But in the case of http/1.x headers, we can't give the application any
definitive report on a header's payload until we have received the whole
lot, since any header may be appended to at any point.  Further, it
means we have to keep the whole payload of every recognized header
around in case it is appended to later.
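
To make the buffering problem concrete, here is a small sketch in Python
(the function name `merge_headers` is made up for illustration). It
combines repeated header fields with ", " in the usual http/1.x way, and
shows why nothing is final until the last header line has been seen.

```python
def merge_headers(lines):
    """Combine repeated header fields into one comma-separated value.

    `lines` is an iterable of (name, value) pairs in arrival order.
    Every value has to be retained until the input is exhausted,
    because a later line with the same name may still extend it.
    """
    merged = {}
    for name, value in lines:
        key = name.lower()  # field names are case-insensitive
        if key in merged:
            merged[key] += ", " + value
        else:
            merged[key] = value
    return merged


received = [
    ("Accept", "text/plain;q=0.5"),
    ("Host", "example.com"),
    ("Accept", "text/html"),  # arrives late and changes the Accept result
]
final = merge_headers(received)
```

After the first line alone, the Accept value would look complete, but
the third line changes it, so the application cannot be told anything
definitive before the end of the header block.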

Actually it'd be nice, and efficient, if we could assemble one header
payload in the library, pass it up to the application to copy or act
on, and then reuse the buffer, as we go through the incoming, possibly
fragmented, header content.

Parsing the JSON out of it is very cheap and fits well into the general
bytewise header-stream parser.  But having to defer a definitive result
before passing it up, as http/1.x requires now, is painful if you are
serious about memory efficiency.

In the actual scenario being asked about, suppose you can eliminate the
need to keep every header's final result pending until all the headers
have arrived (in JSON, by saying each header may appear only once and
multiple values go in an array under it).  Then you can reissue the
headers for forwarding one by one as they are parsed, without storing
them all, which is radically more efficient if what you're doing with
them allows it.
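
A sketch of that forwarding case in Python, assuming the once-only rule
holds.  The names `forward_headers` and `send` are hypothetical.  The
set of names seen is kept only to enforce the rule; a forwarder that
trusts its peer could drop even that.

```python
def forward_headers(headers, send):
    """Forward each header as soon as it is parsed.

    `headers` yields (name, value) pairs as the parser completes them;
    `send` is a hypothetical callable that emits one header downstream.
    No header payload is retained after it has been sent.
    """
    seen = set()  # names only, never payloads
    for name, value in headers:
        key = name.lower()
        if key in seen:
            # Under a once-only rule a repeat is a protocol error.
            raise ValueError(f"duplicate header: {name}")
        seen.add(key)
        send(name, value)
```

Because `headers` can be a generator fed by the parser, each value can
be handed on and its buffer reused before the next header arrives.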

-Andy

> Could you give some more background on why speed-wise, de/serializing
> is okay for you, but duplicate detection isn't?
> 
> > But this time we can shut them all with one single line of text:
> > 
> > 	"Duplicate keys in JSON objects SHALL cause and be treated
> > 	as connection failure."
> 
> How are you going to tell your favorite JSON library to behave that
> way?
> 
> Regards,   Martin.
> 
>
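
On the quoted question of how to make a JSON library reject duplicate
keys: that depends on the library.  As one example, Python's standard
`json` module accepts an `object_pairs_hook`, which sees every key/value
pair of an object before it is collapsed into a dict.  The sketch below
uses it; the helper names are made up, and mapping the error to a
connection failure is left to the caller.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that refuses objects with repeated keys."""
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise ValueError(f"duplicate key: {key!r}")
        obj[key] = value
    return obj


def parse_strict(text):
    # The hook is applied to every object, including nested ones.
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

Without the hook, `json.loads` silently keeps the last value for a
repeated key, so this strictness has to be asked for explicitly.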