Re: [Asrg] Summary of junk button discussion

Jose-Marcio Martins da Cruz <Jose-Marcio.Martins@mines-paristech.fr> Thu, 25 February 2010 21:01 UTC

Message-ID: <4B86E5A1.5000201@mines-paristech.fr>
Date: Thu, 25 Feb 2010 22:03:29 +0100
From: Jose-Marcio Martins da Cruz <Jose-Marcio.Martins@mines-paristech.fr>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.23) Gecko/20090908 Fedora/1.1.18-1.fc11 SeaMonkey/1.1.18
MIME-Version: 1.0
To: Anti-Spam Research Group - IRTF <asrg@irtf.org>
References: <20100225054546.16850.qmail@simone.iecc.com> <4B86172D.2080702@nortel.com> <4B86AD93.1050800@tana.it> <4B86DD80.8060508@nortel.com>
In-Reply-To: <4B86DD80.8060508@nortel.com>

Chris Lewis wrote:
> On 2/25/2010 12:04 PM, Alessandro Vesely wrote:

...

>> I cannot help distinguishing between IMAP and POP3 here. For IMAP,
>> synchronization of Bayesian data among several servers and clients may
>> be viewed as a generic distributed database problem, possibly
>> complicated by an amount of fuzziness. It is possible to send an abuse
>> report as a consequence of particular user's actions; it is just
>> similar to "move marked junk to junk folder".
> 
> It's an implementation _convenience_ (for IMAP).  Nothing more.
> 
> There is nothing preventing you sending your Bayesian data out-of-band 
> to the server, nor, keeping it local for client-based filtering (as in 
> Thunderbird).
> 
> Heck, SpamAssassin even manages to tune Bayesian without having any 
> end-user feedback at all.

...

>> In case servers maintain _per-user_
>> Bayesian data --as they should-- the whole idea of filtering on the
>> servers seems rather pointless.


>> To recap, junk buttons can be embedded within a more sophisticated
>> architecture (as for IMAP). But not the other way around: anti-spam
>> filter training cannot (in general) be based upon junk buttons and
>> abuse reporting.
> 
> Of course you can train spam filters based on abuse reports.  We've been 
> doing precisely that for 13 years in several different incarnations.
> 
> It may well make sense to include an "tickle IMAP" server as part of a 
> spec, but, also having an abuse reporting mechanism makes sure that you 
> have just about all implementations covered, IMAP or otherwise.
> 
> We could spec both, and leave it up to an installation or user to decide 
> which (or both) to use in any particular instance.

Hmmmm... IMHO, it isn't a good idea to restrict this work based on what one thinks "--as they 
should--" or "most ISP think"...

Bayesian filters aren't the only kind of statistical filters. There are many ways to think about 
spam. There is a lot of research being done in universities and in research departments of private 
companies (sometimes presented at CEAS).

So, we should keep in mind that the mechanism being specified here could be used in ways that we 
cannot anticipate today, knowing only current MUAs and current filter technologies.

my 2 cents.

JM