Re: what constitutes an "invalid" content-length
"Adrien de Croy" <adrien@qbik.com> Tue, 12 July 2016 22:44 UTC
From: Adrien de Croy <adrien@qbik.com>
To: Alex Rousskov <rousskov@measurement-factory.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Date: Tue, 12 Jul 2016 22:38:47 +0000
Subject: Re: what constitutes an "invalid" content-length
Archived-At: <http://www.w3.org/mid/emec1ffa61-8cc9-4139-a25c-3704194a71e4@bodybag>
Looks like this turns out to be a red herring, sorry all.

Improper header wrapping / a bare linefeed in a response header value was pushing our header byte count out by 2 bytes.

------ Original Message ------
From: "Alex Rousskov" <rousskov@measurement-factory.com>
To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Cc: "Adrien de Croy" <adrien@qbik.com>
Sent: 13/07/2016 4:20:29 a.m.
Subject: Re: what constitutes an "invalid" content-length

>On 07/12/2016 07:31 AM, Adrien de Croy wrote:
>
>> just dealing with a site that sends more payload data than is indicated
>> in the Content-Length header.
>
>From the standards point of view, that is _not_ what you are dealing
>with. You are dealing with a site that sends two responses, the first
>response is proper HTTP. The second response is garbage.
>
>
>> RFC7230 sections 3.3.2 (Content-Length), 3.3.3 (Message body length),
>> and 3.3.4 (Handling incomplete messages) only contemplate issues around
>> Content-Length specifying more bytes than are received, not fewer.
>
>From the standards point of view, it is impossible for the
>Content-Length to specify fewer bytes than the message has. Irrelevant
>for this discussion cases aside, the message end is defined by the
>Content-Length header value. One cannot have more than what was promised
>because one stops assembling the message [body] after the promised
>number of bytes were added. Any "leftovers" are another message or
>garbage, depending on Connection:close, pipelining, and similar factors.
>
>
>> I guess one could argue that a wrong C-L value is "invalid", but it's
>> not clear that invalid in this context simply means it doesn't parse, or
>> is otherwise non-compliant with the ABNF.
>
>It is valid from protocol point of view. You know it is "wrong" only
>because you can (or you think you can) distinguish garbage from the end
>of the content.
>
>
>> So, it's not clear what the browser and/or proxy response should be.
>
>There is no single right answer to that. A compliant client (including
>proxies) ought to treat leftovers as post-message garbage or another
>message. A real-world client may identify specific cases where leftovers
>are likely to be the end of the message content and ignore
>Content-Length in those cases. The cases where such behavior would be a
>good idea would vary from agent to agent, from one deployment to
>another.
>
>
>> I would expect it's in everyone's best interest if sites that have
>> broken framing are forced to be fixed. This won't happen if browsers
>> "just work" for the site.
>
>The ever-popular "force sites to be fixed" approach rarely fixes enough
>real-world sites to remove special treatment code. See Patrick's response
>for a good illustration.
>
>
>> Is there a special behaviour we should agree on for such cases?
>
>We could agree to violate the standard in one or two special cases, but
>any formal agreement would probably result in a few more broken sites
>because more folks will tolerate them, decreasing the probability that
>they will be fixed.
>
>I can think of one special case where it is more-or-less safe to ignore
>response Content-Length:
>
>* the HTTP/1 connection is not persistent,
>* no additional outstanding pipelined requests on that connection,
>* the unique Content-Length header field is syntactically valid, and
>* more bytes were read during the last network read than C-L promises.
>
>The combination of these conditions can trigger [optional] "robustness"
>code that reads until connection closure and re-sends leftovers/garbage
>to the next hop (or displays it to the user), opening a message
>smuggling attack vector.
>
>Needless to say, there are benign leftover cases that the above
>conditions do not cover.
>
>
>Cheers,
>
>Alex.
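
For anyone wanting to experiment with the above, here is a minimal Python sketch of the two ideas in Alex's reply: framing a body strictly by Content-Length (anything past the promised count is "leftovers"), and a check for the four conditions he lists. All names here (split_by_content_length, ResponseState, may_ignore_content_length and the field names) are made up for illustration and are not taken from WinGate or any other implementation. The last condition is read here as "more payload bytes are available than Content-Length promises", which is an assumption about what was meant.

```python
from dataclasses import dataclass
from typing import List, Tuple


def split_by_content_length(payload: bytes, content_length: int) -> Tuple[bytes, bytes]:
    """Frame strictly by Content-Length: return (body, leftovers)."""
    return payload[:content_length], payload[content_length:]


@dataclass
class ResponseState:
    # Hypothetical fields, named only for this sketch.
    persistent: bool                  # is the HTTP/1 connection persistent?
    outstanding_pipelined: int        # other pipelined requests awaiting responses
    content_length_values: List[str]  # raw Content-Length field values received
    payload_bytes_read: int           # payload bytes available after the last read


def may_ignore_content_length(state: ResponseState) -> bool:
    """True only if all four conditions from the quoted message hold."""
    if state.persistent or state.outstanding_pipelined > 0:
        return False
    if len(state.content_length_values) != 1:
        return False  # the field must be unique
    value = state.content_length_values[0].strip(" \t")
    if not (value.isascii() and value.isdigit()):
        return False  # must be a plain run of ASCII digits
    return state.payload_bytes_read > int(value)


if __name__ == "__main__":
    body, leftovers = split_by_content_length(b"helloXX", 5)
    print(body, leftovers)
    print(may_ignore_content_length(ResponseState(False, 0, ["5"], 7)))
```

Even when the check returns True, acting on it is the optional "robustness" path, with the smuggling risk noted in the reply; a strictly compliant client would treat the leftovers as garbage or as another message.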