Re: How to handle content-encoding

Cory Benfield <cory@lukasa.co.uk> Thu, 08 December 2016 09:52 UTC

From: Cory Benfield <cory@lukasa.co.uk>
Message-Id: <E3DD40AF-ECB4-4D5A-8C8E-D3370622022D@lukasa.co.uk>
Date: Thu, 08 Dec 2016 09:49:00 +0000
In-Reply-To: <CAEnbY+cMmPKefxZHW++KT2Rf7F8oL4E-cUP7jDs-6LpR8fBy8g@mail.gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
To: Daurnimator <quae@daurnimator.com>
References: <CAEnbY+fW_n4sFrFQSVcMWBoqxEWw3yoKnhCu1seRXj4GBr6wfA@mail.gmail.com> <CAEnbY+cMmPKefxZHW++KT2Rf7F8oL4E-cUP7jDs-6LpR8fBy8g@mail.gmail.com>
Subject: Re: How to handle content-encoding
Archived-At: <http://www.w3.org/mid/E3DD40AF-ECB4-4D5A-8C8E-D3370622022D@lukasa.co.uk>

> On 7 Dec 2016, at 12:34, Daurnimator <quae@daurnimator.com> wrote:
> 
> On 31 May 2016 at 12:47, Daurnimator <quae@daurnimator.com> wrote:
>> 
>> To me it was simple to add support for Transfer-Encoding, without any
>> ambiguities or issues. For HTTP1 in the stream logic:
>>  -  (if zlib is installed) we automatically add `TE: gzip, deflate`.
>>  - On reply, if Transfer-Encoding contains gzip or deflate, we decode it
>> before passing it onto the caller.
>> This is permitted as TE and Transfer-Encoding are hop-by-hop headers.

In my experience, the gzip and deflate options for Transfer-Encoding are essentially unsupported. RFC 7230 doesn’t require implementations to support them, so they are almost never used.

>> However, HTTP2 does not support transfer-encoding.
>> Furthermore, certain servers **stares at twitter.com** send
>> `Content-Encoding: gzip` even if you *don't* send `Accept-Encoding: gzip`
>> This seems to demand that I support Content-Encoding.

Are you sending Accept-Encoding: identity? RFC 7231 Section 5.3.4 says the following about Accept-Encoding:

1. If the request contains no Accept-Encoding header, then any content-coding is considered acceptable.
2. If the request does contain an Accept-Encoding header but the server cannot satisfy it (for example, because it does not have access to an uncompressed resource), then the server may send a content-coding anyway. This is because the server SHOULD send a response without any content-coding in that case, but is not required to.
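
To make those two rules concrete, here is a minimal Python sketch of how a server might decide whether a content-coding is acceptable. The function names are hypothetical and the qvalue parsing is deliberately simplified; it is an illustration of my reading of RFC 7231 Section 5.3.4, not a complete parser.

```python
def parse_accept_encoding(header):
    """Parse an Accept-Encoding value into {coding: qvalue} (simplified)."""
    prefs = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        q = 1.0  # a coding listed without a qvalue defaults to q=1
        params = params.strip()
        if params.lower().startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # assumption: treat a malformed qvalue as "not acceptable"
        prefs[coding.strip().lower()] = q
    return prefs


def is_acceptable(coding, header):
    """Return True if `coding` is acceptable for this Accept-Encoding value.

    `header` is None when the request carried no Accept-Encoding header.
    """
    if header is None:
        return True  # rule 1: no header, so any content-coding is acceptable
    prefs = parse_accept_encoding(header)
    coding = coding.lower()
    if coding in prefs:
        return prefs[coding] > 0
    if "*" in prefs:
        return prefs["*"] > 0
    # identity stays acceptable unless it was explicitly excluded
    return coding == "identity"
```

Note that even when is_acceptable("gzip", "identity") is false, rule 2 means a server may still send gzip, so a client cannot rely on Accept-Encoding alone to avoid compressed responses.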

>> How should I be adding this? What have other implementations done? (and what
>> do they wish they'd done differently?)
>> The current state seems to be *against* the spec: should the spec be
>> changed? should implementations be updated?
>> HTTP2 has no transfer-encoding equivalent... why not?

Let’s tackle these questions in order.

1. My implementations support only Content-Encoding. My clients support all three major compressed content-encodings. This has more or less been just fine: almost all servers also only support Content-Encoding. Generally speaking, servers that support content-encoding for static content do so not by compressing on the fly but by keeping compressed and uncompressed versions of each resource on disk. For dynamic responses, compression is frequently handled by middleware that applies the content-encoding on the fly. This is generally not an issue: the ETag can be calculated afterwards if that’s required.

2. I don’t see any spec violations here. Can you clarify where you believe the spec violation is occurring?

3. HTTP/2 deliberately removed all support for hop-by-hop headers of this form. For more discussion on this specific issue, see this thread from 2014: <https://lists.w3.org/Archives/Public/ietf-http-wg/2014JanMar/1179.html>. This is a substantial thread with a number of links out to other discussions. The major counter-arguments were that Transfer-Encoding: gzip required logic to detect already-compressed content to avoid double-compression (that is, to avoid compressing resources that were already compressed) and that it was not widely deployed.
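
As a rough illustration of the double-compression concern in point 3, an intermediary applying Transfer-Encoding: gzip would need a check along these lines before compressing. This is only a sketch: the function name, the assumption that header names are already lowercased, and the short list of media types are all hypothetical, and a real deployment would need a much more careful heuristic.

```python
# Hypothetical, deliberately short list of media types that are already compressed.
ALREADY_COMPRESSED_TYPES = {
    "image/jpeg",
    "image/png",
    "application/zip",
    "application/gzip",
    "video/mp4",
}


def should_apply_transfer_gzip(headers):
    """Decide whether compressing at the transfer layer is worthwhile.

    `headers` is assumed to be a dict with lowercased header names.
    """
    content_encoding = headers.get("content-encoding", "identity").strip().lower()
    if content_encoding not in ("", "identity"):
        # The payload already carries a content-coding; compressing it
        # again would cost CPU and gain little or nothing.
        return False
    media_type = headers.get("content-type", "").split(";", 1)[0].strip().lower()
    return media_type not in ALREADY_COMPRESSED_TYPES
```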

Some further attempts have been made to redefine Transfer-Encoding: gzip using HTTP/2 frames, but again there has been relatively lukewarm interest from implementers in supporting that draft, so it has stalled.

Essentially, it seems that the bulk of the HTTP/2 community isn’t interested in Transfer-Encoding: gzip. They’ve concluded, rightly or wrongly, that Content-Encoding: gzip is fine.
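
For a client that takes the Content-Encoding-only route described under point 1, decoding could look roughly like the Python sketch below. It uses only the standard zlib module and covers just gzip and deflate; the function name is hypothetical. The fallback to raw deflate is there because some servers send a raw deflate stream rather than the zlib-wrapped format.

```python
import zlib


def decode_body(body, content_encoding):
    """Undo the codings listed in a Content-Encoding header value.

    Codings are listed in the order they were applied, so they are
    removed in reverse order.
    """
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    for coding in reversed(codings):
        if coding in ("gzip", "x-gzip"):
            # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
            body = zlib.decompress(body, 16 + zlib.MAX_WBITS)
        elif coding == "deflate":
            try:
                body = zlib.decompress(body)  # zlib-wrapped deflate
            except zlib.error:
                # Fall back to a raw deflate stream with no zlib header.
                body = zlib.decompress(body, -zlib.MAX_WBITS)
        elif coding == "identity":
            continue
        else:
            raise ValueError("unsupported content-coding: " + coding)
    return body
```

Because the server may send a content-coding even when the client did not ask for one, a client would call something like this on every response that carries a Content-Encoding header, whatever Accept-Encoding it sent.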

Cory