Re: If not JSON, what then ?

Alcides Viamontes E <alcidesv@shimmercat.com> Mon, 01 August 2016 17:05 UTC

In-Reply-To: <0BE751D9-95ED-4E6E-87DE-69188DCCCFD7@lukasa.co.uk>
References: <77778.1470037414@critter.freebsd.dk> <7B76F00B-2CAF-42A4-B09C-FA0748A4D025@laposte.net> <52025.1470048651@critter.freebsd.dk> <0BE751D9-95ED-4E6E-87DE-69188DCCCFD7@lukasa.co.uk>
From: Alcides Viamontes E <alcidesv@shimmercat.com>
Date: Mon, 01 Aug 2016 16:10:30 +0200
Message-ID: <CAAMqGzZJMN_duiG4t4VbpNA_1pE+yxEamG0hgn38aSeztaNEsA@mail.gmail.com>
To: HTTP Working Group <ietf-http-wg@w3.org>
Cc: Cory Benfield <cory@lukasa.co.uk>
Subject: Re: If not JSON, what then ?
Archived-At: <http://www.w3.org/mid/CAAMqGzZJMN_duiG4t4VbpNA_1pE+yxEamG0hgn38aSeztaNEsA@mail.gmail.com>

Hi!


TL;DR: I also think that trying to fit HTTP headers into anything other
than their current representation is a bad idea. But creating a semi-formal
compilation of rules and behaviours for core HTTP headers would be worth
it.

Long rant:

Recently we revamped how ShimmerCat handles HTTP headers and we ended up
creating a separate library with a "Headers Document Object Model". The
bare minimum set of different headers we needed to understand and
manipulate to offer basic functionality was 20, and for each of them we
needed to take the following into account:


       * Representation round-trip: How header ASCII values are parsed into
"things that the program can easily manipulate" (see next), and the other
way around, how those are converted back to ASCII values. This is slightly
different for HTTP/1.1 and HTTP/2, because of connection-specific headers,
the "Cookie: " header, and the rather non-trivial dance with "Host: " and
":authority".

       * What in-memory representation makes sense for the program:
"Date: " should be a date, "Cookie: " is a dictionary, "Set-Cookie: " is a
set indexed by cookie name, path and perhaps other attributes (exercising
the RFC with WordPress teaches you one or two things), "Forwarded: " is
actually a list, "Link: " headers, from the point of view of a server doing
HTTP/2 Push, are all different beasts each getting their own thing, and so
on. This of course is very program-specific and probably not generally
interesting, but it is easier to talk about data structures instead of
ASCII text when defining:

       * How header values combine: there shouldn't be more than one
"Date: " in a given response, even if both a proxy server and an
application may try to stamp a "Date: ". However, a server may "add"
cookies to an application response, and the "Forwarded" header needs to be
composed in a sequence. Similar decisions are needed with CORS headers,
Link headers, Cache-Control, ETag and so on.

       * Headers are extensible, so one needs default policies for header
values whose behaviour no RFC specifies.
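
To make the four points above a bit more concrete, here is a minimal Python
sketch. It is not ShimmerCat's code: every function name, the POLICIES
table and the choice of "append" as the default policy are assumptions made
up for illustration. It shows a round-trip for "Date: " and "Cookie: "
(including the HTTP/2 option of sending one "cookie" field per pair), and
per-header combination rules with a fallback for unknown headers.

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def parse_date(value):
    """Parse a "Date: " value into a UTC-aware datetime."""
    return parsedate_to_datetime(value).astimezone(timezone.utc)


def serialize_date(dt):
    """Convert a UTC datetime back to its ASCII header form."""
    return format_datetime(dt, usegmt=True)


def parse_cookie(value):
    """Parse a "Cookie: " value into a dict of name -> value."""
    cookies = {}
    for pair in value.split(";"):
        name, sep, val = pair.strip().partition("=")
        if sep:
            cookies[name] = val
    return cookies


def serialize_cookie(cookies, http2=False):
    """Return the list of field values to emit for a cookie dict.

    HTTP/1.1 gets a single joined value; for HTTP/2 this sketch emits
    one field per cookie pair, which the protocol permits.
    """
    crumbs = [f"{name}={val}" for name, val in cookies.items()]
    if http2:
        return crumbs
    return ["; ".join(crumbs)] if crumbs else []


# Hypothetical combination policies. "existing" comes from the
# application, "new" from the server or proxy in front of it.
def keep_first(existing, new):
    return existing or new


def append(existing, new):
    return existing + new


def merge_keyed(existing, new):
    # Dicts keyed by (cookie name, path); later entries win.
    return {**existing, **new}


POLICIES = {
    "date": keep_first,         # at most one Date per response
    "forwarded": append,        # composed as a sequence
    "set-cookie": merge_keyed,  # a server may add cookies
}
DEFAULT_POLICY = append         # assumed fallback for unknown headers


def combine(name, existing, new):
    """Combine two in-memory values of the header called `name`."""
    policy = POLICIES.get(name.lower(), DEFAULT_POLICY)
    return policy(existing, new)
```

With this, combine("Date", [app_date], [proxy_date]) keeps only the
application's date, while combine("Forwarded", ...) concatenates the two
lists in order.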

I would find the task of fitting all the idiosyncrasies and behaviours of
HTTP headers into a common bytes representation (serialisation) daunting
without some kind of updated compendium of what they do and how they
behave. For example, it would be nice to have a doc similar to RFC 4229,
with formalised candidate data structures and algorithms for how
intermediaries in different roles should handle the headers. Furthermore,
some HTTP headers are more important/common than others (is there anybody
using the "From:" header?), or they are relevant to different roles, so
maybe we need to group headers in some sensible way (so that we can say,
e.g. "my CMS emits core HTTP content headers" or "my server is
caching-compliant because it correctly interprets HTTP core caching
headers" or "my server/application correctly implements the security
measures implied by the core security HTTP headers", whatever any of these
can be).


On Mon, Aug 1, 2016 at 2:30 PM, Cory Benfield <cory@lukasa.co.uk> wrote:

>
> > On 1 Aug 2016, at 11:50, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> >
> > No matter what we decide, we cannot change how JSON defined their
> > dicts, and consequently whatever we do needs to be mapped into JSON,
> > python, $lang's data models somehow.
>
> JSON, sure, but don’t let Python hold you back. All supported versions of
> Python have an OrderedDict in their standard library. And any Python tool
> dealing with HTTP has inevitably had to invent something like a
> CaseInsensitiveOrderedMultiDict in order to deal with HTTP headers, so any
> tool that’s likely to deal with this kind of thing is already swimming in
> dictionary representations that we can use for ordering fields in header
> values.
>
> So just to clarify: the lack of ordering in a JSON object is a reasonable
> problem with using JSON, but that doesn’t mean we can’t use ordered
> representations in other serialisation formats. Programming languages have
> all the abstractions required to do this, and it’s just not that hard to
> write an Ordered Mapping in $LANG that wraps $LANG’s built-in Mapping type.
> (Hell, some Python interpreters have *all* dicts ordered, such that they
> define OrderedDict by simply writing “OrderedDict = dict”).
>
> Cory
>


./Alcides