Re: [websec] Is sniffing a heuristic? (was Re: more on sniffing)

Bjoern Hoehrmann <derhoermi@gmx.net> Sun, 08 January 2012 23:08 UTC

From: Bjoern Hoehrmann <derhoermi@gmx.net>
To: Adam Barth <ietf@adambarth.com>
Date: Mon, 09 Jan 2012 00:08:27 +0100
Message-ID: <mj5kg717bt1nsq4scnm7022oeo41vglfje@hive.bjoern.hoehrmann.de>
References: <CAJE5ia8dVwtr5Qe3DqyrDiFk7B0_3nEJD50=RewXK5RbB37LMQ@mail.gmail.com>
In-Reply-To: <CAJE5ia8dVwtr5Qe3DqyrDiFk7B0_3nEJD50=RewXK5RbB37LMQ@mail.gmail.com>
Cc: "IETF WebSec WG (websec@ietf.org)" <websec@ietf.org>

* Adam Barth wrote:
>On Sun, Jan 8, 2012 at 12:55 PM, Adam Barth <ietf@adambarth.com> wrote:
>> On Sun, Jan 8, 2012 at 9:12 AM, Larry Masinter <masinter@adobe.com> wrote:
>>>      <t>Sniffing is by its nature a heuristic process, because there are
>>>      many situations where content matches the signatures and capabilities
>>>      of many different possible content-type values.
>>
>> I disagree with this statement as well.  The sniffing we're talking
>> about here is not a heuristic.  It's a historical anomaly that needs
>> to be corrected for in order for user agents to be compatible with
>> some web sites.
>
>Let me expand this point some more.  Does you view the HTML5 parsing
>algorithm a heuristic?  The sniffing algorithm is the same sort of
>thing as the HTML5 parsing algorithm in that it's a somewhat
>unpleasant algorithm for interpreting responses from servers that's
>compatible with existing deployments.

In computer science, heuristics are problem-solving techniques that
provide good but not necessarily correct solutions; they are employed as
a trade-off between correctness and other desirable properties. Sniffing
produces good results that are not always correct, trading correctness
for other properties like "compatibility", so it is a heuristic. The
"HTML5" parsing algorithm also produces good results, but there is no
widely accepted basis for claiming it produces incorrect results.

Saying that the sniffing algorithm always generates correct solutions is
like saying the Content-Type header in HTTP responses always has correct
media type information.
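
To illustrate the point, here is a minimal sketch of signature-based
sniffing. It assumes a made-up signature table and a hypothetical
sniff() function; it is not the algorithm from any draft or browser.
It only shows how a prefix match can give a plausible answer that is
not what the author of the content intended.

```python
# Hypothetical signature table for illustration only; it is not taken
# from any specification or user agent.
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff(body: bytes, declared: str = "text/plain") -> str:
    """Return the type of the first matching signature, else the declared type."""
    for magic, media_type in SIGNATURES:
        if body.startswith(magic):
            return media_type
    return declared


# A body that really is a GIF image is classified as intended.
print(sniff(b"GIF89a\x01\x00\x01\x00"))

# A plain-text note that happens to begin with the same bytes gets the
# same answer, which is plausible but not what its author meant.
print(sniff(b"GIF89a is the magic number of GIF files."))
```

Both calls return the same type because the function only sees the
leading bytes; it has no way to tell which of the two the sender meant.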
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/