Re: [HASMAT] Parsing Content-Type

Adam Barth <ietf@adambarth.com> Tue, 13 July 2010 16:40 UTC


On Tue, Jul 13, 2010 at 9:26 AM, Julian Reschke <julian.reschke@gmx.de> wrote:
> On 13.07.2010 18:10, Adam Barth wrote:
>>> Speaking of which: why would the fallback be "text/plain"?
>>
>> It depends on how the UA treats text/potato.  It would not be
>> unreasonable to treat text/potato similarly to text/plain, depending
>> on the UA's purpose.
>
> Hm, no. Imho.
>
> "text/html;" is a parse error according to the ABNF.
>
> So the likely outcomes seem to be:
>
> 1) Treated as missing parameter, thus as "tex/html", or

I don't understand where "tex/html" comes from in this discussion.

> 2) Treated as invalid header value, thus same as absent content-type header.
>
> In case 2 I wouldn't expect a default text handling.

Testing makes perfect:

$ cat testcase
HTTP/1.0 200 OK
Date: Tue, 13 Jul 2010 16:28:45 GMT
Expires: -1
Content-Type: text/html;

Potato
<b>Grass</b>
$ nc -l 9191 < testcase
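
Where nc is not available, roughly the same one-shot test server can be sketched in Python. This is only an illustration, not part of the original test: the names RESPONSE and serve_once are made up here, and the sketch sends CRLF line endings in the header block, whereas nc sends the testcase file with whatever line endings it contains.

```python
import socket

# The canned response from the testcase file. CRLF header line endings are
# an assumption of this sketch; the original file is sent verbatim by nc.
RESPONSE = (
    b"HTTP/1.0 200 OK\r\n"
    b"Date: Tue, 13 Jul 2010 16:28:45 GMT\r\n"
    b"Expires: -1\r\n"
    b"Content-Type: text/html;\r\n"
    b"\r\n"
    b"Potato\n"
    b"<b>Grass</b>\n"
)


def serve_once(listener):
    """Accept one connection on a listening socket, send RESPONSE, close."""
    conn, _addr = listener.accept()
    with conn:
        conn.recv(65536)        # read (and ignore) the browser's request
        conn.sendall(RESPONSE)  # reply with the malformed Content-Type
```

To reproduce the test, call serve_once(socket.create_server(("127.0.0.1", 9191))) and load the URL in a browser.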

Loading <http://localhost:9191/dd> in Firefox, Chrome, Safari, and
Opera causes the text "Grass" to be rendered bold, which means the
response is being treated as HTML.  Based on other information, we
know that this is not a result of the sniffing algorithm but rather a
result of the Content-Type header parser.
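
The difference between a strict reading of the ABNF and the behavior observed in the test can be sketched as two small parsers. This is a hypothetical model only: parse_strict and parse_lenient are names invented for this sketch, the grammar is a simplified rendering of the RFC 2616 media-type rule, and nothing here claims to be what any browser does internally.

```python
import re

# RFC 2616 "token": characters that are neither CTLs nor separators.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
QUOTED = r'"(?:[^"\\]|\\.)*"'
# One parameter: ";" attribute "=" value, with optional whitespace
# tolerated around the ";" (a simplification of this sketch).
PARAM = rf"[ \t]*;[ \t]*{TOKEN}=(?:{TOKEN}|{QUOTED})"

STRICT = re.compile(rf"({TOKEN}/{TOKEN})(?:{PARAM})*[ \t]*")
MEDIA_TYPE = re.compile(rf"{TOKEN}/{TOKEN}")


def parse_strict(value):
    """Return the lowercased type/subtype, or None on any grammar error."""
    m = STRICT.fullmatch(value.strip())
    return m.group(1).lower() if m else None


def parse_lenient(value):
    """Ignore everything from the first ';' on; keep only type/subtype."""
    essence = value.split(";", 1)[0].strip().lower()
    return essence if MEDIA_TYPE.fullmatch(essence) else None


for header in ("text/html", "text/html;", "text/html; charset=utf-8"):
    print(repr(header), parse_strict(header), parse_lenient(header))
```

Under this model, a strict parser rejects "text/html;" outright (Julian's case 2), while the lenient one yields "text/html", which matches what the four browsers were observed to do.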

Whether we should specify this behavior in this document, in another
document, or not at all is a separate question.  So is the question of
what model we would like to imagine is operating internally to generate
the observable behavior.

Adam