Re: [Asrg] Adding a spam button to MUAs

Ian Eiloart <iane@sussex.ac.uk> Mon, 21 December 2009 11:31 UTC

>
> Quoting a research paper from Gordon Cormack [1] :
>
> "When explicitly asked to classify messages, human subjects have been
> reported to exhibit error rates of 3%-7% [30, 16]. Tacitly derived
> labels, such as those obtained from a “report spam” button, where it
> is assumed that unreported messages are ham, may have even higher rates."
>
> References 30 and 16 in this paper come from experiments with real
> data and real subjects (SpamOrHam - Graham-Cumming and Hotmail -
> Yih/Kolcz)
>
> These error rates are most of the time bigger than what can be achieved
> by spam filters. So it's probably a bad idea to consider that user
> feedback is reliable. User interface shall be as simple as possible.

But wait, we're talking about messages that the spam filter hasn't 
rejected. Additional data *has* to be useful.

The false positive rates are only a problem if the admin is stupid enough 
to consider a single report as definitive. If you deliver a message to 100 
users, and three report it as spam, then you probably take no action. If 20 
report it as spam, then you need to take a closer look.
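
To put rough numbers on that, here is a minimal sketch, not a tested 
policy. It assumes each recipient mis-reports a legitimate message 
independently, at the 7% upper error rate quoted above, and asks how likely 
it is to see at least that many reports by mistake alone. The function 
names and the 0.001 cut-off are made up for illustration.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing k or more
    reports out of n recipients if each reports with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def triage(reports, recipients, error_rate=0.07, alpha=0.001):
    """Flag a message only when the report count is very unlikely to be
    explained by individual user error.
    error_rate: assumed per-user mis-report rate (upper figure quoted above).
    alpha: hypothetical cut-off; tune it for your own site."""
    p_value = prob_at_least(reports, recipients, error_rate)
    return "investigate" if p_value < alpha else "no action"

if __name__ == "__main__":
    for reports in (3, 20):
        print(reports, "reports out of 100:", triage(reports, 100))
```

Under those assumptions three reports out of 100 are entirely consistent 
with user error, while 20 are not. Real users are not independent, so treat 
the cut-off as a starting point.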

I certainly don't think a 7% error rate is enough to determine that users 
should not be given the opportunity to distinguish between unwanted mail 
and reportable junk.

You can also combine reporting rates with your Bayesian content analyser or 
SpamAssassin score, or with your reputational score for the sender domain, 
etc.
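
One way such a combination might look, purely as a sketch: feed the three 
signals into a logistic function. The weights, the bias and the 0-to-1 
reputation scale are invented for illustration and would need fitting 
against your own mail. This is not how SpamAssassin or any particular 
filter does it.

```python
from math import exp

def combined_spam_probability(report_fraction, content_score, sender_reputation,
                              weights=(8.0, 0.5, 2.0), bias=-3.0):
    """Blend three signals into a value strictly between 0 and 1.
    report_fraction: share of recipients who pressed the spam button (0..1).
    content_score: content filter score, e.g. a SpamAssassin-style score.
    sender_reputation: assumed scale, 0 (unknown/bad) to 1 (good).
    weights, bias: hypothetical values, not taken from any real filter."""
    w_reports, w_content, w_reputation = weights
    z = (bias
         + w_reports * report_fraction
         + w_content * content_score
         - w_reputation * sender_reputation)  # good reputation lowers the score
    return 1.0 / (1.0 + exp(-z))

if __name__ == "__main__":
    # Same content score and sender, different reporting rates.
    print(combined_spam_probability(0.03, -0.1, 0.5))
    print(combined_spam_probability(0.20, -0.1, 0.5))
```

The point is only that user reports become one input among several, so a 
noisy signal does not have to be definitive to be useful.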

> [1] Cormack, G. V. and Kolcz, A. 2009. Spam filter evaluation with
> imprecise ground truth. In Proceedings of the 32nd international ACM
> SIGIR Conference on Research and Development in information Retrieval
> (Boston, MA, USA, July 19 - 23, 2009). SIGIR '09. ACM, New York, NY,
> 604-611. http://plg.uwaterloo.ca/~gvcormac/cormacksigir09-spam.pdf
>
> JM
>
>>
>> for folks who do serious UX work, they do not guarantee that their
>> suggestions are right, merely that they are worth testing.
>>
>> d/
>
>
> --   ---------------------------------------------------------------
>   Jose Marcio MARTINS DA CRUZ           http://j-chkmail.ensmp.fr
>   Ecole des Mines de Paris
>   60, bd Saint Michel                      75272 - PARIS CEDEX 06
>   mailto:Jose-Marcio.Martins@mines-paristech.fr



-- 
Ian Eiloart
IT Services, University of Sussex
01273-873148 x3148
For new support requests, see http://www.sussex.ac.uk/its/help/