Re: p1: whitespace in request-target

Willy Tarreau <w@1wt.eu> Thu, 18 April 2013 05:58 UTC

Date: Thu, 18 Apr 2013 07:55:53 +0200
From: Willy Tarreau <w@1wt.eu>
To: Mark Nottingham <mnot@mnot.net>
Cc: "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>, Amos Jeffries <squid3@treenet.co.nz>, Roy Fielding <fielding@gbiv.com>
Message-ID: <20130418055553.GB13063@1wt.eu>
Subject: Re: p1: whitespace in request-target
Archived-At: <http://www.w3.org/mid/20130418055553.GB13063@1wt.eu>

Hi,

On Thu, Apr 18, 2013 at 10:49:10AM +1000, Mark Nottingham wrote:
> p1 3.1.1 says:
> 
> > Unfortunately, some user agents fail to properly encode hypertext
> > references that have embedded whitespace, sending the characters directly
> > instead of properly encoding or excluding the disallowed characters.
> > Recipients of an invalid request-line SHOULD respond with either a 400 (Bad
> > Request) error or a 301 (Moved Permanently) redirect with the
> > request-target properly encoded. Recipients SHOULD NOT attempt to
> > autocorrect and then process the request without a redirect, since the
> > invalid request-line might be deliberately crafted to bypass security
> > filters along the request chain.
> 
>   http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-3.1.1
> 
> I note that the practice of correcting this is fairly widespread; e.g., in
> Squid, the default is to strip the whitespace, and IIRC has been for some
> time:
> 
>   http://www.squid-cache.org/Doc/config/uri_whitespace/

Does Squid log anything that helps to see when it fixes this?

> I think that the Squid documentation needs to be corrected, because the text
> in RFC2396 (and later in 3986) is about URIs in contexts like books, e-mail
> and so forth, not protocol elements:
> 
>   http://tools.ietf.org/html/rfc3986#appendix-C
> 
> My question is why this is a SHOULD / SHOULD NOT. We say that SHOULD-level
> requirements affect conformance unless there's a documented exception here:
> 
>   http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-2.5
> 
> ... but these requirements don't mention any exceptions. Is the security risk
> here high enough to justify a MUST / MUST NOT? If not, they probably need to
> be downgraded to ought (or an exception needs to be highlighted).

Well, FWIW, haproxy is strict on this and rejects requests which don't
exactly match the expected format, which means that requests with embedded
spaces are rejected with a 400 (Bad Request). At the same time, when such a
bad request happens, the whole request is captured. I must say that all the
ones I have seen to date (and they are extremely rare) were made by poorly
written attack scripts, or by stupid web scraping tools that decode the
URL-encoding in links found on web pages before sending the request.
These are the same tools that append '">' to the end of a URL because they
failed to properly parse the "<a href=..." tag.
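
As an illustration only, the strict check described above can be sketched
in Python. This is not haproxy's parser (haproxy is written in C); the
function name check_request_line, the regular expression, and the sample
path are assumptions made for this example. It accepts only
"method SP request-target SP HTTP-version" with no whitespace inside any
field, and shows how a scraper that decodes "%20" before sending ends up
with a request line that is rejected.

```python
import re
from urllib.parse import unquote

# Assumed strict grammar: token, one space, a target with no control
# characters or spaces, one space, then HTTP/<digit>.<digit>.
REQUEST_LINE = re.compile(
    r"([!#$%&'*+.^_`|~0-9A-Za-z-]+) ([^\x00-\x20\x7f]+) (HTTP/\d\.\d)"
)

def check_request_line(line):
    """Return (400, None) unless the whole line matches the strict grammar."""
    m = REQUEST_LINE.fullmatch(line)
    if m is None:
        return 400, None
    return 200, m.groups()

# A link as found in a page, already correctly encoded (hypothetical path).
href = "/docs/my%20file.html"

good = check_request_line("GET " + href + " HTTP/1.1")
# A careless scraper decodes the link first and sends a raw space.
bad = check_request_line("GET " + unquote(href) + " HTTP/1.1")

print(good[0], bad[0])
```

The point of matching the whole line rather than splitting on spaces is
that there is nothing to "fix up": a line either has exactly three fields
or it is refused.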

So in my opinion, we could reasonably use a MUST here. Suggesting that
recipients fix this significantly increases the risk that they do it wrong
and become vulnerable to certain classes of attacks. And we're clearly not
helping to clean up the web by accepting such erroneous behaviours,
especially if browsers have managed to get it right.
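
To make the risk concrete, here is a small Python sketch of two plausible
ways to "repair" a request line with an embedded space. Both functions and
the sample request are hypothetical and do not describe any particular
product. The two repairs yield different request-targets for the same
bytes, so a filter using one and a server using the other would not be
looking at the same URL.

```python
def repair_by_stripping(line):
    """Hypothetical repair: delete spaces found inside the target."""
    method, rest = line.split(" ", 1)
    target, version = rest.rsplit(" ", 1)
    return method, target.replace(" ", ""), version

def repair_by_truncating(line):
    """Hypothetical repair: keep only the target up to its first space."""
    parts = line.split(" ")
    return parts[0], parts[1], parts[-1]

# Hypothetical malformed request line with a space inside the target.
line = "GET /static/x /admin HTTP/1.1"

seen_by_filter = repair_by_truncating(line)
seen_by_server = repair_by_stripping(line)

print(seen_by_filter[1])
print(seen_by_server[1])
```

A recipient that rejects the line outright never has to pick between these
interpretations.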

Regards,
Willy