Re: The use of binary data in any part of HTTP 2.0 is not good

William Chan (陈智昌) <willchan@chromium.org> Mon, 21 January 2013 01:44 UTC

In-Reply-To: <B2A83604-183E-49BF-A962-238AF5F19DA9@mnot.net>
References: <em670f0a0f-3c5a-4f99-88cb-03bd4234ce63@bombed> <7F8E363D-6D6E-4FDD-B8EA-24A31383B1A3@mnot.net> <CAA4WUYiPRpm0OWesf5wTGLX--HWtDmgjFr+wSEEVr-beH8J=qw@mail.gmail.com> <3EFE45C2-8147-432C-8D15-7E8C5AEC39DC@mnot.net> <CAA4WUYhxHbFeaw-M=DUdKKKafhdDE4U5==N2QGY2hd_ptxSHiA@mail.gmail.com> <B2A83604-183E-49BF-A962-238AF5F19DA9@mnot.net>
Date: Sun, 20 Jan 2013 17:42:14 -0800
Message-ID: <CAA4WUYjzG+1CzEo0WJUjoeqZs5bfud3P+V30_+p8pe_MD6bjPQ@mail.gmail.com>
From: "William Chan (陈智昌)" <willchan@chromium.org>
To: Mark Nottingham <mnot@mnot.net>
Cc: "Adrien W. de Croy" <adrien@qbik.com>, Pablo <paa.listas@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>
Archived-At: <http://www.w3.org/mid/CAA4WUYjzG+1CzEo0WJUjoeqZs5bfud3P+V30_+p8pe_MD6bjPQ@mail.gmail.com>

On Sun, Jan 20, 2013 at 4:50 PM, Mark Nottingham <mnot@mnot.net> wrote:
>
> On 21/01/2013, at 11:41 AM, William Chan (陈智昌) <willchan@chromium.org> wrote:
>
>> Many times these intermediaries / 3rd party software are trying to do
>> "legitimate" things like sniffing headers so they can do filtering
>> based on that. If we add a sender-configured option to enable
>> disabling header encoding/compression at the client, it's very likely
>> that 3rd party software will take advantage of that option for popular
>> client software (popular browsers). Thus, even if such an option
>> existed in the spec, I would be nervous about exposing it within
>> Chromium in such a way that 3rd party software could force it on.
>
> Roberto mentioned that the sender could just choose not to compress their output; how would you stop that?

TLS, plus not providing any Chromium-configurable option to disable
header compression, at least on our (Chromium's) end :) If the origin
server disables header compression in its responses, we can't help
that, just as we can't help it if they don't compress their response
bodies. We do what we can. Since Google Chrome includes auto-update
capabilities, we could possibly err on the side of providing an
option and revoke it if it is abused.

>
>
>> And if such a mechanism existed sender-side, I still don't see what
>> (beyond a secured transport like TLS) prevents an ISP or corporate
>> network administrator who has filtering software he/she wants to run
>> from adding a hop that enables this debug option to make it easier to
>> pass through to 3rd party authored filtering software running within
>> the ISP / corporate network.
>>
>> My concerns admittedly err on the side of paranoia, but they are
>> definitely informed by our (Google Chrome) previous experience in
>> this area. Please see
>> https://developers.google.com/speed/articles/use-compression where we
>> provide data on the prevalence of the aforementioned Accept-Encoding
>> stripping issue. "There are a handful of ISPs, where the percentage of
>> uncompressed content is over 95%. One likely hypothesis is that either
>> an ISP or a corporate proxy removes or mangles the Accept-Encoding
>> header." Accept-Encoding is a sender-side controlled header, so unless
>> the proposed option is protected somehow via TLS or something, I would
>> imagine that it likewise would expose us to the issue we've seen with
>> Accept-Encoding stripping.
>
> Again, this isn't negotiation (as currently discussed); it's the sender choosing not to compress.

Sorry, I either misunderstood or was unclear. In my understanding,
while Accept-Encoding (A-E) is generally a negotiation, if an
intermediary simply strips A-E, the result is effectively the same as
"the sender choosing not to compress".
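
To illustrate the point, here is a minimal sketch, not taken from any
real server or middlebox. The function names strip_accept_encoding and
respond are hypothetical, and the server logic is deliberately
simplified (it ignores q-values). It assumes an origin server that
gzips a response body only when the request advertises gzip:

```python
import gzip


def strip_accept_encoding(headers):
    """Hypothetical middlebox: drop Accept-Encoding from a request."""
    return {k: v for k, v in headers.items() if k.lower() != "accept-encoding"}


def respond(request_headers, body):
    """Hypothetical origin server: gzip only if the request advertises it.

    Simplification: q-values (e.g. "gzip;q=0") are ignored.
    """
    lowered = {k.lower(): v for k, v in request_headers.items()}
    accepted = lowered.get("accept-encoding", "")
    tokens = [t.split(";")[0].strip().lower() for t in accepted.split(",")]
    if "gzip" in tokens:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    # No Accept-Encoding seen: the server falls back to an uncompressed body.
    return {}, body


request = {"Host": "example.org", "Accept-Encoding": "gzip, deflate"}
body = b"hello " * 200

direct_headers, direct_payload = respond(request, body)
stripped_headers, stripped_payload = respond(strip_accept_encoding(request), body)
```

In this sketch the server never decides to turn compression off. It
just never sees the header, so from the client's point of view the
outcome is indistinguishable from a sender that chose not to compress.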

>
> Even if it were a negotiation, I suspect you'd find that the dynamics you saw in play with compressing payloads don't play out the same way with headers. Intermediaries strip Accept-Encoding because decompressing and recompressing the response bodies coming through them presents a scalability challenge; if we do our job right with header compression, it shouldn't be nearly as much of a problem for them.

This is speculation of course, but I suspect that many do it simply
for simplicity's sake. While I think intermediaries like
Varnish/HAProxy/Squid may have legitimate scalability concerns if/when
they disable stuff like compression, I think that many intermediaries,
like virus scanners and whatnot, simply disable things like
compression to make traffic easier to process. Maybe I'm more jaded
about the quality of engineering out there than you, but I suspect most
engineers simply consider what's easiest to get their job done, rather
than the overall implications of their actions. Otherwise, I'm pretty
surprised some ISPs strip A-E, since I'd imagine that the bandwidth
costs exceed the scalability costs of implementing their filtering
policies. Of course, I'm no expert here and would love to be educated.

>
> Besides which, if we design HTTP/2 to be unpalatable to middleboxes, they'll merely block it and force all traffic to HTTP/1, meaning that those users won't see *any* benefits from the new protocol.

This is why I asked about the use case, so we could evaluate how
necessary this option is. I'm hoping it is not unpalatable. Adrien's
response leaves me somewhat hopeful, although it's still uncertain.

>
>
>> All options are ripe for abuse. We should be very careful to make sure
>> the option is truly necessary, rather than just potentially useful, in
>> order to counterbalance the downside of possible abuse.
>
> To counter that -- I'm somewhat wary of approaching protocol design as an exercise in controlling how the result is used; people *will* work around your intent. While we can do some social engineering in this process, it's very soft, and very limited, power.

Fair enough. I mostly wanted to chime in to provide the "other"
opinion here, since you stated that the previous reaction had been
pretty positive. Please take my comments as a contrary opinion for
people to consider before we draw any conclusions here.

>
> Cheers,
>
> --
> Mark Nottingham   http://www.mnot.net/
>
>
>