Re: Sections 3.3.2 and 3.3.3 allow bogus Content-Length?

Alex Rousskov <rousskov@measurement-factory.com> Wed, 15 February 2017 00:35 UTC

To: Adrien de Croy <adrien@qbik.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>

On 02/14/2017 04:18 PM, Adrien de Croy wrote:

> The only true size of a body is what you obtain by counting its bytes. 

I disagree. The only true size of a body is the Content-Length value (in
relevant contexts). The number of bytes counted by the sender or
recipient does not change the message body length. Yes, there could be
garbage after the body and, yes, something may prevent sending or
receiving the whole body, but none of that changes what the body size is.

If you define the message body size as the number of bytes sent or
received (in relevant contexts), then you may add rules about those
numbers, but HTTP specs do not define the message body size that way.

Alex.
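
A minimal Python sketch of this view of framing, assuming a hypothetical
helper named split_by_content_length and an in-memory byte string standing
in for the connection. The body is whatever the Content-Length value
delimits; counting the bytes that actually arrived only tells you whether
the body is complete, not what its size is.

```python
def split_by_content_length(stream: bytes, content_length: int):
    """Frame a body using Content-Length (hypothetical helper).

    Returns (received, rest, complete):
      received - the bytes of the body that have arrived so far
      rest     - bytes after the body (next message or garbage)
      complete - True once all content_length bytes are present
    """
    received = stream[:content_length]
    rest = stream[content_length:]
    complete = len(received) == content_length
    return received, rest, complete


# Extra bytes after the body do not make the body longer.
print(split_by_content_length(b"hello world", 5))

# A short read does not make the body shorter; it is just incomplete.
print(split_by_content_length(b"hel", 5))
```

In both calls the body length is 5, because that is what Content-Length
says; only the completeness flag and the leftover bytes differ.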



> The Content-Length header is not part of that body; it's sent in the
> headers, and we parse it and convert the string to a number.
> 
> Yet we have no rule stating it must be consistent with what is sent.
> 
> I think we could be a little less timid about this and call it like it is.
> 
> Adrien
> 
> 
> ------ Original Message ------
> From: "Alex Rousskov" <rousskov@measurement-factory.com>
> To: "Adrien de Croy" <adrien@qbik.com>; "ietf-http-wg@w3.org"
> <ietf-http-wg@w3.org>
> Sent: 15/02/2017 12:13:56 PM
> Subject: Re: Sections 3.3.2 and 3.3.3 allow bogus Content-Length?
> 
>> On 02/14/2017 03:54 PM, Adrien de Croy wrote:
>>>  I have no problem with the concept of adding a rule that states that
>>>  objects labelled as weighing 5T MUST weigh 5T or the label is
>>>  incorrect/invalid.
>>
>> Thinking of Content-Length as a packaging label gets you into the very
>> trap you want to escape: Yes, rules for labeling accuracy would be fine,
>> but Content-Length (in relevant contexts) is _not_ a label!
>> Content-Length does not merely document weight that you can
>> independently measure and validate. Content-Length _is_ weight.
>>
>> Alex.
>>
>>
>>
>>>  ------ Original Message ------
>>>  From: "Alex Rousskov" <rousskov@measurement-factory.com>
>>>  To: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
>>>  Cc: "Adrien de Croy" <adrien@qbik.com>
>>>  Sent: 15/02/2017 11:38:17 AM
>>>  Subject: Re: Sections 3.3.2 and 3.3.3 allow bogus Content-Length?
>>>
>>>>  On 02/14/2017 02:12 PM, Adrien de Croy wrote:
>>>>
>>>>>   I did quote that section, but it doesn't define what an invalid
>>>>>   C-L is.
>>>>
>>>>  The term "valid" in that section means "syntactically correct". 123 is
>>>>  valid. 0x123 is not. 0123 is valid unless the recipient is paranoid.
>>>>
>>>>
>>>>>   Nowhere does it explicitly state that C-L value must equal the body
>>>>>   size in order to be valid.
>>>>
>>>>  You are correct. The message framing rules (3.3.3.1-5) establish that
>>>>  C-L value and body length are the same concept (for the applicable
>>>>  cases where C-L value is used for framing and only for those cases).
>>>>
>>>>  In other words, one should not add a "C-L value MUST match the body
>>>>  length" or "the body length MUST match the C-L value" rule because the
>>>>  body length _is_ the C-L value (for the applicable cases). Adding such
>>>>  a rule would be like saying "an object with a weight of 5 tons MUST
>>>>  weigh 5 tons".
>>>>
>>>>
>>>>  HTH,
>>>>
>>>>  Alex.
>>>>
>>>>
>>>>>   ------ Original Message ------
>>>>>   From: "Loïc Hoguin" <essen@ninenines.eu>
>>>>>   To: "Adrien de Croy" <adrien@qbik.com>; "ietf-http-wg@w3.org"
>>>>>   <ietf-http-wg@w3.org>
>>>>>   Sent: 15/02/2017 10:05:46 AM
>>>>>   Subject: Re: Sections 3.3.2 and 3.3.3 allow bogus Content-Length?
>>>>>
>>>>>>   On 02/14/2017 09:49 PM, Adrien de Croy wrote:
>>>>>>>
>>>>>>>   The language in RFC 7230 section 3.3.2 is extremely non-committal
>>>>>>>   about whether Content-Length needs to be correct or not.
>>>>>>>
>>>>>>>   I'm currently having a dispute about this with someone who quoted
>>>>>>>   these sections at me as being proof that you can use any value for
>>>>>>>   C-L regardless of the body length.
>>>>>>>
>>>>>>>   I think it could be a lot more forcefully written.
>>>>>>>
>>>>>>>   Or is the person correct and we don't need to have C-L match the
>>>>>>>   body length?
>>>>>>
>>>>>>   It sounds pretty explicit to me:
>>>>>>
>>>>>>      4.  If a message is received without Transfer-Encoding and with
>>>>>>          either multiple Content-Length header fields having differing
>>>>>>          field-values or a single Content-Length header field having an
>>>>>>          invalid value, then the message framing is invalid and the
>>>>>>          recipient MUST treat it as an unrecoverable error.  If this is a
>>>>>>          request message, the server MUST respond with a 400 (Bad Request)
>>>>>>          status code and then close the connection.
>>>>>>
>>>>>>   If it's both invalid and required for handling the request, send a
>>>>>>   400 and close the connection.
>>>>>>
>>>>>>   I suppose the spec allows you to have an invalid Content-Length if
>>>>>>   and only if the request also has a Transfer-Encoding header, however:
>>>>>>
>>>>>>          If a message is received with both a Transfer-Encoding and a
>>>>>>          Content-Length header field, the Transfer-Encoding overrides the
>>>>>>          Content-Length.  Such a message might indicate an attempt to
>>>>>>          perform request smuggling (Section 9.5) or response splitting
>>>>>>          (Section 9.4) and ought to be handled as an error.
>>>>>>
>>>>>>   So sending a 400 and closing does not sound crazy even in that case,
>>>>>>   despite the spec not requiring it.
>>>>>>
>>>>>>   -- Loïc Hoguin
>>>>>>   https://ninenines.eu
>>>>>
>>>>
>>>
>>
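
The distinction drawn in the thread between a syntactically valid
Content-Length and the framing rules quoted from Section 3.3.3 can be
sketched in Python as follows. This is an illustration under stated
assumptions, not anyone's actual implementation: the function names, the
strict flag (standing in for the "paranoid" recipient), and the returned
tuples are all hypothetical, and the sketch ignores the comma-separated
list form of Content-Length as well as every framing case not quoted
above.

```python
import re

# Content-Length is 1*DIGIT in RFC 7230; [0-9] keeps the match ASCII-only.
_DIGITS = re.compile(r"[0-9]+")


def parse_content_length(value: str, strict: bool = False):
    """Return the integer value, or None if syntactically invalid.

    strict=True models a "paranoid" recipient that rejects leading zeros.
    """
    value = value.strip(" \t")
    if not _DIGITS.fullmatch(value):
        return None
    if strict and len(value) > 1 and value.startswith("0"):
        return None
    return int(value)


def request_framing(headers):
    """Decide framing for a request from a list of (name, value) pairs.

    Returns one of (hypothetical result shapes):
      ("transfer-encoding", None)  Transfer-Encoding overrides Content-Length
      ("length", n)                body length is the Content-Length value
      ("error", 400)               invalid framing: respond 400 and close
      ("none", None)               neither header present
    """
    names = [(name.lower(), value) for name, value in headers]
    if any(name == "transfer-encoding" for name, _ in names):
        return ("transfer-encoding", None)
    lengths = {value.strip(" \t") for name, value in names
               if name == "content-length"}
    if not lengths:
        return ("none", None)
    if len(lengths) > 1:  # multiple fields with differing field-values
        return ("error", 400)
    n = parse_content_length(lengths.pop())
    return ("error", 400) if n is None else ("length", n)


print(parse_content_length("123"), parse_content_length("0x123"))
print(request_framing([("Content-Length", "5"), ("Content-Length", "6")]))
```

Note that nothing here compares the Content-Length value with a byte
count: validity is purely syntactic, and once the value is accepted it is
the body length.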