How to handle content-encoding

Daurnimator <> Tue, 31 May 2016 02:52 UTC

Date: Tue, 31 May 2016 12:47:38 +1000
From: Daurnimator <>
To: HTTP Working Group <>
Subject: How to handle content-encoding

I'm thinking through how to add support for Content-Encoding to lua-http.

A brief digression on lua-http's structure (library terminology is borrowed
from http2):
  - a 'connection' encapsulates a socket, a connection has many streams
  - a 'stream' is a request/response pair (a request can have multiple
header blocks, and many data chunks)
      - The same stream structure is used for both client and server
      - You can implement an HTTP proxy by forwarding items from one stream
to another
  - a 'request' is a pre-prepared object consisting of a request header
block, a function to obtain body chunks, and a destination.
      - `request:go()` returns the 'main' response header block and a
stream (from which you can read the body one chunk at a time)
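
To make the stream model concrete, here is a toy in-memory sketch, written in Python purely for illustration. It is not lua-http's API: the `Stream` class, its methods, and `forward` are hypothetical names, and no sockets are involved. It only shows the idea that a stream carries header blocks and data chunks in order, and that a proxy forwards those items from one stream to another.

```python
from collections import deque


class Stream:
    """One request/response exchange: header blocks and data chunks, in order."""

    def __init__(self):
        self._items = deque()

    def write_headers(self, headers):
        # A stream may carry more than one header block.
        self._items.append(("headers", dict(headers)))

    def write_chunk(self, chunk):
        self._items.append(("chunk", bytes(chunk)))

    def read(self):
        """Return the next (kind, value) item, or None when nothing is queued."""
        return self._items.popleft() if self._items else None


def forward(src, dst):
    """Proxy step: copy every queued item from src to dst unchanged."""
    while True:
        item = src.read()
        if item is None:
            break
        kind, value = item
        if kind == "headers":
            dst.write_headers(value)
        else:
            dst.write_chunk(value)
```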

There is a desire to compress content to save bandwidth. HTTP has had two
main ways to do this: Transfer-Encoding and Content-Encoding.

To me it was simple to add support for Transfer-Encoding, without any
ambiguities or issues. For HTTP1 in the stream logic:
  - (if zlib is installed) we automatically add `TE: gzip, deflate`.
  - On reply, if Transfer-Encoding contains gzip or deflate, we decode it
before passing it onto the caller.
This is permitted as TE and Transfer-Encoding are hop-by-hop headers.
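
A minimal sketch of the hop-by-hop handling described above, in Python for illustration only (lua-http is Lua, and none of these names are its API). `add_te_header` and `decode_transfer_encoding` are hypothetical helpers, headers are modelled as a plain dict with lowercase keys, and `chunked` framing is assumed to have been removed already by the HTTP/1 layer.

```python
import gzip
import zlib

SUPPORTED = ("gzip", "deflate")


def add_te_header(request_headers):
    """Advertise the transfer codings this hop is able to undo."""
    headers = dict(request_headers)
    headers["te"] = ", ".join(SUPPORTED)
    # TE is hop-by-hop, so it is also named in the Connection header.
    existing = headers.get("connection")
    headers["connection"] = "TE" if not existing else existing + ", TE"
    return headers


def decode_transfer_encoding(response_headers, body):
    """Undo gzip/deflate transfer codings before handing the body to the caller."""
    headers = dict(response_headers)
    raw = headers.pop("transfer-encoding", "")
    codings = [c.strip().lower() for c in raw.split(",") if c.strip()]
    # Codings are listed in the order they were applied, so undo them in reverse.
    for coding in reversed(codings):
        if coding == "chunked":
            continue  # assumption: framing was already removed by the HTTP/1 layer
        if coding == "gzip":
            body = gzip.decompress(body)
        elif coding == "deflate":
            body = zlib.decompress(body)  # "deflate" is the zlib format here
        else:
            raise ValueError("unsupported transfer coding: " + coding)
    return headers, body
```

Because both headers are hop-by-hop, the caller never needs to know that compression happened on this hop.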

However, HTTP2 does not support transfer-encoding.
Furthermore, certain servers **stares at** send
`Content-Encoding: gzip` even if you *don't* send `Accept-Encoding: gzip`.
This seems to demand that I support Content-Encoding.

As far as the specifications go, Content-Encoding is *meant* to be used
for end-to-end encoding that intermediate hops do not touch.
  - Intermediaries should cache Content-Encoded bodies in their encoded form
  - ETag is dependent on Content-Encoding

This makes it hard to find a place for it in lua-http's structure.
If I add it transparently in the stream (as done for Transfer-Encoding)
then it will be hop-by-hop (not end-to-end).
This seems to demand (at least for client requests) that it is switched
on/off at the request layer.
From there though, it seems it would need to add some sort of stream body
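
One possible shape for a request-layer switch, sketched in Python for illustration only. Everything here is an assumption rather than lua-http's design: the `Request` class, its `decode_content` flag, `wrap_response`, and `decoded_chunks` are hypothetical names, headers are plain dicts with lowercase keys, and only a single gzip or deflate coding is handled.

```python
import zlib


def decoded_chunks(chunks, coding):
    """Yield body chunks with one content coding undone, one chunk at a time."""
    coding = coding.strip().lower()
    if coding in ("", "identity"):
        yield from chunks
        return
    if coding not in ("gzip", "deflate"):
        raise ValueError("unsupported content coding: " + coding)
    # 32 + MAX_WBITS asks zlib to auto-detect a gzip or zlib header.
    decoder = zlib.decompressobj(32 + zlib.MAX_WBITS)
    for chunk in chunks:
        out = decoder.decompress(chunk)
        if out:
            yield out
    tail = decoder.flush()
    if tail:
        yield tail


class Request:
    def __init__(self, headers, decode_content=True):
        self.headers = dict(headers)
        self.decode_content = decode_content
        if decode_content:
            self.headers.setdefault("accept-encoding", "gzip, deflate")

    def wrap_response(self, response_headers, chunks):
        """Return (headers, chunk iterator) as the caller should see them."""
        if not self.decode_content:
            # End-to-end behaviour (e.g. a proxy): pass everything through untouched.
            return response_headers, chunks
        # Decide from what the response declares, not from what was requested,
        # so a server that sends gzip unasked is still handled.
        coding = response_headers.get("content-encoding", "identity")
        headers = {k: v for k, v in response_headers.items()
                   if k not in ("content-encoding", "content-length")}
        return headers, decoded_chunks(chunks, coding)
```

The sketch leaves any ETag alone even though it describes the encoded representation; what to do with it after decoding is part of the open question.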

How should I be adding this? What have other implementations done? (and
what do they wish they'd done differently?)
The current state seems to be *against* the spec: should the spec be
changed? Should implementations be updated?
HTTP2 has no transfer-encoding equivalent... why not?


Original content-encoding spec
Hop-by-hop headers
  - Current spec
  - Mozilla disregards
Content-Encoding spec
  - lua-http documentation