Re: comments on draft-abarth-mime-sniff-03

Adam Barth <ietf@adambarth.com> Wed, 20 January 2010 23:15 UTC

MIME-Version: 1.0
In-Reply-To: <C68CB012D9182D408CED7B884F441D4D5FDE79@nambxv01a.corp.adobe.com>
References: <C68CB012D9182D408CED7B884F441D4D5FDE79@nambxv01a.corp.adobe.com>
From: Adam Barth <ietf@adambarth.com>
Date: Wed, 20 Jan 2010 15:14:37 -0800
Message-ID: <7789133a1001201514l47b43b8bw958e42794707dbc9@mail.gmail.com>
Subject: Re: comments on draft-abarth-mime-sniff-03
To: Larry Masinter <masinter@adobe.com>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Cc: Ian Hickson <ian@hixie.ch>, "apps-discuss@ietf.org" <apps-discuss@ietf.org>
Precedence: list

Thanks Larry.  These are great comments.  I'll incorporate them when I
next update the draft.  To answer a couple of your questions:

On Wed, Jan 20, 2010 at 2:41 PM, Larry Masinter <masinter@adobe.com> wrote:
> A message with more than one content-type header
> should be treated as malformed.

What does it mean to treat the response as malformed?  I've seen
examples of servers that blissfully send more than one Content-Type
header.  This document just describes how to process the responses
without making a judgment about whether the server is acting properly or
not.

> The "algorithm for extracting an encoding ...."
[...]
> The nature of the "willful violation"
> (I.e., how it is different) and the
> justification for the "willful violation"
> should be included. I can't fathom any
> justification for it.

Charset sniffing is required to avoid ugly replacement characters from
being shown to the user.  Sad, but true.
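
As a small illustration of where replacement characters come from, here
is a Python sketch. The sample bytes are made up for this example; they
stand in for a response whose real encoding differs from the one the
client assumes.

```python
# Bytes for "café" as a server might send them in windows-1252 / ISO-8859-1.
# (Illustrative sample data, not taken from the draft.)
body = b"caf\xe9"

# Decoding with a wrongly assumed charset turns the stray byte into U+FFFD,
# the replacement character the user ends up seeing.
as_utf8 = body.decode("utf-8", errors="replace")

# Decoding with the charset the bytes were actually written in recovers
# the intended text.
as_cp1252 = body.decode("windows-1252")

print(repr(as_utf8))
print(repr(as_cp1252))
```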

> file extensions:
>
>  Note: It is essential that file extensions
>  are not used for determining the media type
>   for resources fetched over HTTP because
>  file extensions can often by supplied by
>   malicious parties.
>
>  "Often" is dubious. How can file extensions be
> supplied more often than  content-type headers?

For example, the attacker can choose the file extension in most PHP
installations because foo.php happily processes:

http://example.com/foo.php/bar.qux

> What is the security threat?

The security threat is that if you treat an HTML file extension as
evidence the server wants the response to be treated as text/html you
will introduce XSS vulnerabilities into some large number of sites
running PHP (among others).
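
A rough sketch of why the extension is attacker-controlled, in Python.
The helper name url_extension is hypothetical, and the second URL is an
invented variation on the example above. The point is only that a client
deriving an "extension" from the URL sees whatever the attacker appended
after foo.php.

```python
import posixpath
from urllib.parse import urlsplit


def url_extension(url: str) -> str:
    """Return the extension of the last path segment, e.g. '.qux'.

    Hypothetical helper: it mimics a client that guesses a media type
    from the URL instead of from the Content-Type header.
    """
    path = urlsplit(url).path
    return posixpath.splitext(path)[1]


# Per the message, foo.php handles both requests; only the trailing
# path segment, which the attacker picks, changes.
print(url_extension("http://example.com/foo.php/bar.qux"))
print(url_extension("http://example.com/foo.php/evil.html"))
```

A client that treated the second URL's ".html" as a signal for text/html
would be trusting a value the server never chose.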

> I'd think that the behavior of "how to sniff"
> should start out with what the inputs are
> (the first N bytes of some data from a response).

This business about waiting for 512 bytes has to do with a poor
interaction between buffering for sniffing and Comet
<http://en.wikipedia.org/wiki/Comet_(programming)>.  Basically, if you
wait forever for the 512 bytes you need to sniff completely, then you
break things like chat in Gmail.  For example, Gmail chat used to not
work in Safari for this reason.  However, always using the first chunk
of data off the network to sniff means you'll get unpredictable
results based on how exactly the response was chunked.  Hence the
advice to wait for 512 bytes but not a requirement to wait forever.
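
The trade-off can be sketched in Python as follows. This is a simulation
under stated assumptions, not the draft's algorithm: the function name,
the (arrival_time, data) input shape, and the 1.0 second max_wait are all
invented for illustration. Only the 512-byte figure comes from the text
above.

```python
from typing import Iterable, Tuple

SNIFF_BYTES = 512  # buffer size discussed in the message


def collect_sniff_buffer(
    chunks: Iterable[Tuple[float, bytes]],
    max_wait: float = 1.0,  # hypothetical patience limit, in seconds
) -> bytes:
    """Gather up to SNIFF_BYTES for sniffing without waiting forever.

    chunks yields (arrival_time, data) pairs, with times in seconds since
    the response started. This simulates a network stream.
    """
    buffer = b""
    for arrival_time, data in chunks:
        # Stop waiting once the deadline has passed, provided we already
        # have something to sniff. A Comet-style response may not send
        # its next chunk for a long time.
        if arrival_time > max_wait and buffer:
            break
        buffer += data
        if len(buffer) >= SNIFF_BYTES:
            break
    return buffer[:SNIFF_BYTES]


# A Comet-style response: a few bytes now, the rest much later.
comet = [(0.0, b"<html>"), (30.0, b"later")]
print(collect_sniff_buffer(comet))
```

When the data arrives promptly, the result no longer depends on how the
response happened to be chunked, because sniffing sees the same first 512
bytes either way.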

Adam
