Re: [hybi] workability (or otherwise) of HTTP upgrade

Zhong Yu <zhong.j.yu@gmail.com> Thu, 09 December 2010 09:29 UTC

In-Reply-To: <AANLkTi=bDCroNG2yFVebWUAOTDtWgm8mkv9H85nMdDgo@mail.gmail.com>
References: <AANLkTin6=8_Bhn2YseoSHGh1OSkQzsYrTW=fMiPvYps1@mail.gmail.com> <AANLkTimwiGKdy2eHve9eDezMZg+duuK-AMWpeCR4GH3m@mail.gmail.com> <AB6151A1-A334-469F-BC74-1FA73E6B689A@mnot.net> <221B3DED-A3CC-4961-9CCF-48B6EBCB241F@apple.com> <3605.1291714925.544875@puncture> <AANLkTik4zgrqqbzWSmuRjS78Ur5ZOeejnP=Zu2usXh6D@mail.gmail.com> <AANLkTi=bDCroNG2yFVebWUAOTDtWgm8mkv9H85nMdDgo@mail.gmail.com>
Date: Thu, 09 Dec 2010 03:31:09 -0600
Message-ID: <AANLkTikH-T6a_r6cBJriBrmMO2SBSzYtR9xcFUPrKPnh@mail.gmail.com>
From: Zhong Yu <zhong.j.yu@gmail.com>
To: Collin Jackson <collin@collinjackson.com>
Cc: Server-Initiated HTTP <hybi@ietf.org>
Subject: Re: [hybi] workability (or otherwise) of HTTP upgrade

On Thu, Dec 9, 2010 at 2:38 AM, Collin Jackson <collin@collinjackson.com> wrote:
> On Sat, 4 Dec 2010 2:50 PM, Zhong Yu <zhong.j.yu@gmail.com> wrote:
>> So in the POST experiment, the bytes are
>>
>>     POST /path/of/attackers/choice HTTP/1.1
>>     Host: host-of-attackers-choice.com
>>     Sec-WebSocket-Key: <connection-key>
>>
>>     GET /script.php/<random> HTTP/1.1
>>     Host: target.com
>>
>> In 1376 cases the 2nd request was routed to target.com, presumably
>> because some interceptors parsed it as an HTTP request, and routed it
>> based on Host.
>>
>> In the Upgrade experiment, the bytes are
>>
>>     GET /path/of/attackers/choice HTTP/1.1
>>     Host: host-of-attackers-choice.com
>>     Connection: Upgrade
>>     Sec-WebSocket-Key: <connection-key>
>>     Upgrade: WebSocket
>>
>>     GET /script.php/<random> HTTP/1.1
>>     Host: target.com
>>
>> In only 1 case the 2nd request was routed to target.com. This
>> experiment is apparently done in the same ad display as the POST
>> experiment, and the bytes passed over same intermediaries.
>>
>> Don't you find that odd? How do you explain the difference?
>
> My interpretation is that most (but not all) of the proxies that route
> requests by Host header decided not to parse the 2nd request as an
> HTTP request, because they observed an HTTP Upgrade negotiation in the
> first request.

Thank you, Collin. But this suggests that the overwhelming majority
(>99.9%) of them already understand Upgrade. That is very hard to
believe. If it were true, we should also expect cache poisoning success
to drop dramatically in the Upgrade experiment. Yet it only drops to 8,
from 15 in the POST experiment. You may argue that these are two
different types of interceptors, but they all understand HTTP; yet in
one type 50% don't understand Upgrade, and in the other type only 0.1%
don't understand Upgrade. That's hard to imagine.
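
To make the two readings concrete, here is a minimal Python sketch of a toy interceptor that routes by Host. It is only an illustration under stated assumptions, not a description of any real proxy: the names route, parse_headers, POST_STREAM and UPGRADE_STREAM are hypothetical, the key is a placeholder, and request bodies are ignored. An "Upgrade-aware" interceptor stops parsing once it sees "Connection: Upgrade"; a naive one keeps parsing and routes the 2nd request to target.com. The last two lines just compute the ratios from the counts quoted above.

```python
from typing import Dict, List

ATTACKER = "host-of-attackers-choice.com"

# The 2nd (smuggled) request, identical in both experiments.
SECOND = b"GET /script.php/random HTTP/1.1\r\nHost: target.com\r\n\r\n"

POST_STREAM = (b"POST /path/of/attackers/choice HTTP/1.1\r\n"
               b"Host: host-of-attackers-choice.com\r\n"
               b"Sec-WebSocket-Key: placeholder-key\r\n\r\n") + SECOND

UPGRADE_STREAM = (b"GET /path/of/attackers/choice HTTP/1.1\r\n"
                  b"Host: host-of-attackers-choice.com\r\n"
                  b"Connection: Upgrade\r\n"
                  b"Sec-WebSocket-Key: placeholder-key\r\n"
                  b"Upgrade: WebSocket\r\n\r\n") + SECOND


def parse_headers(block: bytes) -> Dict[str, str]:
    """Parse one header block into a lower-cased name -> value mapping."""
    headers: Dict[str, str] = {}
    for line in block.decode("latin-1").split("\r\n")[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def route(stream: bytes, understands_upgrade: bool) -> List[str]:
    """Return the host each header block in the stream ends up at.

    Assumption: once an Upgrade-aware interceptor sees "Connection: Upgrade",
    it treats the rest of the stream as opaque and keeps sending it to the
    host of that first request.
    """
    destinations: List[str] = []
    opaque_host = None
    for block in (b for b in stream.split(b"\r\n\r\n") if b):
        if opaque_host is not None:
            destinations.append(opaque_host)  # opaque bytes, not re-routed
            continue
        headers = parse_headers(block)
        host = headers.get("host", "")
        destinations.append(host)
        if understands_upgrade and headers.get("connection", "").lower() == "upgrade":
            opaque_host = host
    return destinations


# Ratios implied by the counts quoted in this thread.
routing_ratio = 1 / 1376    # Upgrade vs. POST, requests routed to target.com
poisoning_ratio = 8 / 15    # Upgrade vs. POST, cache poisoning successes
```

In this model, route(UPGRADE_STREAM, False) sends the 2nd request to target.com while route(UPGRADE_STREAM, True) does not. The puzzle is that the routing counts imply nearly every interceptor is of the second kind, while the poisoning counts imply only about half are.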

>
> On Sat, 4 Dec 2010 at 3:39 PM, Zhong Yu <zhong.j.yu@gmail.com> wrote:
>> From the paper: "47,741 (96.9%) reported that no intermediaries were
>> confused when sending the spoofed HTTP request"
>>
>> What's the expected behavior here? The 2nd request is transmitted to
>> host-of-attackers-choice.com verbatim, right?
>>
>> "There were 97 of 49,218 impressions (0.2%) where the spoofed request
>> was routed by IP"
>>
>> The 2nd request is also transmitted to host-of-attackers-choice.com
>> verbatim, right? How does this differ from the 1st case?
>
> The 2nd request was received by the server on a separate network connection.

Got it. The paper describes the server setup as a "standard Apache web
server", so I didn't assume you had checked lower-layer network
information.

>
> On Tue, Dec 7, 2010 at 2:38 PM, Zhong Yu <zhong.j.yu@gmail.com> wrote:
>> On Tue, Dec 7, 2010 at 3:42 AM, Dave Cridland <dave@cridland.net> wrote:
>> > On Mon Dec  6 23:27:02 2010, Maciej Stachowiak wrote:
>> >>
>> >> I'd like to see more detail on the data than is found in the paper, but it
>> >> seems to show a real-world hazard with use of Upgrade, since many
>> >> intermediaries do not understand it and at least a few are confused into
>> >> treating subsequent traffic as additional HTTP requests and responses.
>> >
>> > That's a subtle misread of the paper.
>> >
>> > The paper shows that many intermediaries treat any traffic as HTTP requests
>> > and responses until they find a CONNECT, after which they treat the traffic
>> > as opaque except in a tiny minority of cases (what, 4 out of 54,000?).
>>
>> I do not think the paper corroborates that argument at all.
>>
>> Quoting the paper: "In our experiments, we observed two proxies which
>> appear not to understand CONNECT but simply to treat the request as an
>> ordinary request and then separately route subsequent requests, with
>> all routing based on IP address."
>>
>> Sounds simple and clear, but let's dig a little deeper. The
>> experiments sent bytes in the following form (as far as we know, from
>> conversations on this mailing list)
>>
>>  CONNECT websocket.invalid:443 HTTP/1.1
>>  Host: websocket.invalid:443
>>  Sec-WebSocket-Key: <connection-key>
>>  Sec-WebSocket-Metadata: <metadata>
>>
>>  GET /script.php/<random> HTTP/1.1
>>  Host: target.com
>>
>> that is, two well-formed HTTP requests. An HTTP interceptor that
>> understands CONNECT will treat the load (all bytes after the CONNECT
>> request) as opaque and forward them to the server verbatim.
>>
>> On the other hand, a "CONNECT-agnostic" HTTP interceptor, one that
>> does not "understand CONNECT but simply to treat the request as an
>> ordinary request and then separately route subsequent requests, with
>> all routing based on IP address", will do ... pretty much the same
>> thing! It could have parsed the load into a HTTP request, then sent
>> the request to the server as is, effectively forwarding the load to
>> the server verbatim. Neither the client nor the server could detect
>> the fact that this interceptor parsed the load as HTTP requests.
>>
>> Some CONNECT-agnostic interceptors may have touched the 2nd request in
>> some way, allowing the server to detect them. The two proxies
>> described in the paper may have done something like that. It would be
>> nice if the authors told us how exactly they were detected.
>
> The 2nd request was transmitted on a separate network connection. That
> is how we distinguished it from the normal case, which re-uses the
> initial connection.

Got it. However, my point still stands: the experiment tells us
nothing about the number of interceptors that do parse the bytes after
a CONNECT request. They may reuse the connection, in which case your
method would not detect them. They could be tripped up and detected in
a modified experiment where the bytes after CONNECT do not form a valid
HTTP request. But your experiment didn't do that, so it couldn't detect
these interceptors.
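
As a rough illustration of that modified experiment, here is a minimal Python sketch. Everything in it is an assumption made for the sake of the example: opaque_tunnel and parsing_interceptor are hypothetical models rather than real proxies, and the parsing model is assumed to forward a valid request unchanged on the same connection and to drop anything it cannot parse. Real interceptors might instead rewrite, reject, or truncate the bytes.

```python
from typing import Optional

# Payload used after CONNECT in the original experiment (well-formed HTTP).
WELL_FORMED = b"GET /script.php/random HTTP/1.1\r\nHost: target.com\r\n\r\n"

# Hypothetical probe for a modified experiment: bytes that are not valid HTTP.
NON_HTTP = b"\x00\x01not an HTTP request\xff\r\n\r\n"


def opaque_tunnel(payload: bytes) -> Optional[bytes]:
    """Model of an interceptor that understands CONNECT: forward verbatim."""
    return payload


def parsing_interceptor(payload: bytes) -> Optional[bytes]:
    """Model of an interceptor that parses the bytes after CONNECT as HTTP.

    Assumption: a request that parses is forwarded unchanged on the same
    connection; anything else is dropped (None).
    """
    head, separator, _ = payload.partition(b"\r\n\r\n")
    if not separator:
        return None
    request_line = head.split(b"\r\n", 1)[0].split(b" ")
    looks_like_http = (
        len(request_line) == 3
        and request_line[0].isalpha()
        and request_line[2].startswith(b"HTTP/")
    )
    return payload if looks_like_http else None


def server_can_tell_apart(payload: bytes) -> bool:
    """True if the server receives different bytes from the two models."""
    return opaque_tunnel(payload) != parsing_interceptor(payload)
```

Under these assumptions, server_can_tell_apart(WELL_FORMED) is False, since both models deliver identical bytes, while server_can_tell_apart(NON_HTTP) is True, which is why only the modified probe could expose a parsing interceptor that reuses the connection.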

One of the main points of the paper is that CONNECT is widely
understood and that bytes after CONNECT will most likely be treated as
opaque by intermediaries. Your experiment tested a very narrow subset
of possible byte sequences, and found a lower bound on the number of
bad proxies. It does not give us any idea of the percentage of
interceptors that treat bytes after CONNECT as opaque.

- Zhong Yu

> As discussed in Section V(C)(3) of the paper, we did not observe any
> caching proxies that had this behavior in our experiment. If these
> proxies are a concern, it is possible to prevent them from becoming
> poisoned using encryption.
>
> Collin Jackson
>