Re: what constitutes an "invalid" content-length

Patrick McManus <mcmanus@ducksong.com> Wed, 13 July 2016 12:45 UTC

On Tue, Jul 12, 2016 at 8:21 PM, Mark Nottingham <mnot@mnot.net> wrote:

> Just curious -- is the heuristic you use more than just "if we only have
> one outstanding request, keep reading bytes and consider them part of the
> response?"



We don't actually do that. It doesn't mesh well with how fast we want to
either recycle connections or notify the in-browser consumer (i.e. the
parser, the decoder, etc.) about EOM.

If doing persistent connections, 'extra' bytes get silently thrown away as
leading garbage on the subsequent response. Otherwise we just read to EOF
and ignore C-L/Chunked as a message delimiter. (A surprising portion of the
servers that get C-L wrong are closing the connection anyhow.)
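
For illustration only, here is a minimal sketch of the "leading garbage"
idea on a persistent connection. It is not the browser's actual code: the
function name skip_leading_garbage and the choice of b"HTTP/" as the marker
for the start of the next response are assumptions made for this example.

```python
MARKER = b"HTTP/"  # assumed start of a status line, e.g. "HTTP/1.1 200 OK"


def skip_leading_garbage(buf: bytes) -> bytes:
    """Drop bytes that precede the first plausible status line.

    If no marker is present yet, keep only the last len(MARKER) - 1 bytes,
    since they might be the beginning of a marker split across reads.
    """
    idx = buf.find(MARKER)
    if idx != -1:
        return buf[idx:]
    return buf[-(len(MARKER) - 1):]


if __name__ == "__main__":
    # Two stray bytes left over from an undersized C-L on the prior response.
    print(skip_leading_garbage(b"xxHTTP/1.1 200 OK\r\n"))
```

The caller would run this on whatever is buffered before it starts parsing
the next response, so the stray bytes never reach the parser.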

My biggest frustration in this space is actually around the unreliability
of truncation detection. *Lots* of non-persistent h1 transactions that seem
to be strongly framed come up short by some criterion: c-l, a missing
zero-chunk terminator, or unclean close terminations like RST or TLS Alerts.
For all practical purposes you need to silently accept what you have
received. Of course some actual network-driven truncations look exactly the
same. That's created a problem in the file downloader (do I show a retry or
not?) and there is a security implication as well (do you really need all
that javascript?) when carried over a non-authenticated transport (or when
we are forced to ignore some aspects of it.)
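
As a rough sketch of the three criteria listed above, the following collects
whichever signals say a response came up short. The ResponseEnd record and
truncation_signals function are hypothetical names invented for this
example, and the sketch assumes chunked framing takes precedence over
Content-Length when both are present.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResponseEnd:
    """Hypothetical summary of how an h1 response ended."""
    content_length: Optional[int]  # declared C-L, or None if absent
    body_bytes: int                # body bytes actually received
    chunked: bool                  # Transfer-Encoding: chunked was used
    saw_zero_chunk: bool           # terminating zero-length chunk arrived
    clean_close: bool              # False for RST or a TLS alert


def truncation_signals(end: ResponseEnd) -> List[str]:
    """Return every criterion by which the response looks truncated."""
    signals = []
    if end.chunked:
        # Chunked framing wins over C-L, so only the terminator is checked.
        if not end.saw_zero_chunk:
            signals.append("missing zero-chunk terminator")
    elif end.content_length is not None and end.body_bytes < end.content_length:
        signals.append("short of content-length")
    if not end.clean_close:
        signals.append("unclean close")
    return signals


if __name__ == "__main__":
    short = ResponseEnd(content_length=1000, body_bytes=900, chunked=False,
                        saw_zero_chunk=False, clean_close=False)
    print(truncation_signals(short))
```

Nothing in these signals separates a sloppy server from a real network
truncation, which is exactly the problem: a non-empty list cannot safely be
treated as an error.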