Re: what constitutes an "invalid" content-length
Alex Rousskov <rousskov@measurement-factory.com> Tue, 12 July 2016 16:26 UTC
To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
References: <em19b7fba4-42bf-40e8-83a9-132dfdc92698@bodybag>
From: Alex Rousskov <rousskov@measurement-factory.com>
Cc: Adrien de Croy <adrien@qbik.com>
Message-ID: <578518CD.8070305@measurement-factory.com>
Date: Tue, 12 Jul 2016 10:20:29 -0600
Subject: Re: what constitutes an "invalid" content-length
Archived-At: <http://www.w3.org/mid/578518CD.8070305@measurement-factory.com>

On 07/12/2016 07:31 AM, Adrien de Croy wrote:
> just dealing with a site that sends more payload data than is indicated
> in the Content-Length header.

From the standards point of view, that is _not_ what you are dealing
with. You are dealing with a site that sends two responses: the first
response is proper HTTP, and the second response is garbage.

> RFC7230 sections 3.3.2 (Content-Length), 3.3.3 (Message body length),
> and 3.3.4 (Handling incomplete messages) only contemplate issues around
> Content-Length specifying more bytes than are received, not fewer.

From the standards point of view, it is impossible for the
Content-Length to specify fewer bytes than the message has. Setting
aside cases that are irrelevant to this discussion, the message end is
defined by the Content-Length header value. One cannot have more than
what was promised because one stops assembling the message [body] after
the promised number of bytes have been added. Any "leftovers" are
another message or garbage, depending on Connection:close, pipelining,
and similar factors.

> I guess one could argue that a wrong C-L value is "invalid", but it's
> not clear that invalid in this context simply means it doesn't parse, or
> is otherwise non-compliant with the ABNF.

It is valid from the protocol point of view. You know it is "wrong" only
because you can (or you think you can) distinguish garbage from the end
of the content.

> So, it's not clear what the browser and/or proxy response should be.

There is no single right answer to that. A compliant client (including
proxies) ought to treat leftovers as post-message garbage or another
message. A real-world client may identify specific cases where leftovers
are likely to be the end of the message content and ignore
Content-Length in those cases. The cases where such behavior would be a
good idea would vary from agent to agent, from one deployment to
another.

> I would expect it's in everyone's best interest if sites that have
> broken framing are forced to be fixed. This won't happen if browsers
> "just work" for the site.

The ever-popular "force sites to be fixed" approach rarely fixes enough
real-world sites to remove special treatment code. See Patrick's
response for a good illustration.

> Is there a special behaviour we should agree on for such cases?

We could agree to violate the standard in one or two special cases, but
any formal agreement would probably result in a few more broken sites
because more folks will tolerate them, decreasing the probability that
they will be fixed.

I can think of one special case where it is more-or-less safe to ignore
response Content-Length:

* the HTTP/1 connection is not persistent,
* no additional outstanding pipelined requests on that connection,
* the unique Content-Length header field is syntactically valid, and
* more bytes were read during the last network read than C-L promises.

The combination of these conditions can trigger [optional] "robustness"
code that reads until connection closure and re-sends leftovers/garbage
to the next hop (or displays it to the user), opening a message
smuggling attack vector.

Needless to say, there are benign leftover cases that the above
conditions do not cover.

Cheers,

Alex.
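
The framing rule and the four conditions in the message above can be
sketched in code. This is a minimal illustration, not code from the
thread or from any real client: the names split_by_content_length,
is_valid_content_length, ResponseContext, and may_read_until_close are
all hypothetical, header values are assumed to be already stripped of
surrounding whitespace, and the fourth condition is interpreted here as
"more bytes are available after the header section than Content-Length
promises".

```python
from dataclasses import dataclass
from typing import List, Tuple


def split_by_content_length(received: bytes, content_length: int) -> Tuple[bytes, bytes]:
    """Split the bytes that follow the header section into (body, leftovers).

    The body ends after exactly content_length bytes. Anything beyond that
    is not part of this message: it is another message or garbage.
    """
    if content_length < 0:
        raise ValueError("Content-Length cannot be negative")
    if len(received) < content_length:
        raise ValueError("incomplete message: fewer bytes than promised")
    return received[:content_length], received[content_length:]


def is_valid_content_length(values: List[str]) -> bool:
    """True if there is exactly one Content-Length field and it is all ASCII digits."""
    return len(values) == 1 and values[0].isascii() and values[0].isdigit()


@dataclass
class ResponseContext:
    # Hypothetical field names; a real client tracks this state differently.
    persistent: bool                     # will the connection be reused?
    outstanding_pipelined_requests: int  # requests still awaiting a response
    content_length_values: List[str]     # raw values of all Content-Length fields
    bytes_after_headers: int             # bytes available after the header section


def may_read_until_close(ctx: ResponseContext) -> bool:
    """Check the four conditions under which optional "robustness" code
    might ignore Content-Length and read until connection closure."""
    return (
        not ctx.persistent
        and ctx.outstanding_pipelined_requests == 0
        and is_valid_content_length(ctx.content_length_values)
        and ctx.bytes_after_headers > int(ctx.content_length_values[0])
    )


# Standards view: 5 promised bytes form the body, the rest is leftovers.
body, leftovers = split_by_content_length(b"helloGARBAGE", 5)
```

In this sketch the compliant path is split_by_content_length alone;
may_read_until_close only says when the special case applies. As the
message notes, acting on it forwards or displays the leftovers and so
opens a message smuggling attack vector, which is why such code would be
optional.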