Re: [Asrg] "Uncaught spam" research project

Martijn Grooten <martijn.grooten@virusbtn.com> Fri, 30 April 2010 16:33 UTC

From: Martijn Grooten <martijn.grooten@virusbtn.com>
To: Anti-Spam Research Group - IRTF <asrg@irtf.org>
Date: Fri, 30 Apr 2010 17:32:45 +0100
Message-ID: <18B53BA2A483AD45962AAD1397BE1325379ED80D77@UK-EXCHMBX1.green.sophos>
References: <18B53BA2A483AD45962AAD1397BE1325379ED80C30@UK-EXCHMBX1.green.sophos> <20100430160658.GR14169@verdi>
In-Reply-To: <20100430160658.GR14169@verdi>

John Leslie wrote:
>    I'd definitely record the AS of the sender's IP.

That should be possible, but would it give information that is statistically useful? I want to find results such as "7% of all spam is sent from Liechtenstein, whereas 12% of hard-to-filter spam is sent from there; we ought to focus more on email coming from Liechtenstein's IP space". I expect each AS's share of the spam to be so small that the differences will hardly ever be significant.
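
To make the significance concern concrete, here is a rough sketch of a two-proportion z-test. It is only an illustration: the counts are made up, the function name is hypothetical, and it assumes the two groups are independent samples (for example, caught spam versus hard-to-filter spam), which is a simplification. The normal approximation it relies on is also unreliable when the counts are as small as they would be for a single AS; an exact test would be safer there.

```python
import math


def two_proportion_z_test(hits_a, total_a, hits_b, total_b):
    """Return (z, two-sided p-value) for H0: both groups share one proportion.

    Uses the pooled-variance normal approximation, which is only
    trustworthy when the expected counts in each group are not tiny.
    """
    p_a = hits_a / total_a
    p_b = hits_b / total_b
    pooled = (hits_a + hits_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        # No variation at all (every message or no message is a hit).
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, NOT measured data:
# (hits in caught spam, caught total, hits in hard-to-filter, hard-to-filter total)
examples = {
    "one country (7% vs 12%)": (700, 10_000, 60, 500),
    "one AS (0.10% vs 0.20%)": (10, 10_000, 1, 500),
}

for label, counts in examples.items():
    z, p = two_proportion_z_test(*counts)
    print(f"{label}: z = {z:.2f}, p = {p:.4f}")
```

With these invented numbers the country-level difference comes out clearly significant, while the single-AS difference does not, even though the rate doubles. That is the effect I am worried about when breaking the data down by AS.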

>    Filters, hopefully, are a moving target; so whatever you publish
> will be of limited use a week later.

I realise that. I do want to run the project over a longer period of time, and I do hope it will yield some general information on where to focus. To be honest, I'm not sure what to expect or whether the results will be useful, but finding that out is one of my reasons for running the project.

> > [1] Spam in the context of this email is spam sent to spam traps.
> > So the real, proper spam, not the perhaps-not-100%-CAN-SPAM-compliant
> > spam.
>
>    It will be necessary to at least sample the "interesting" cases,
> since spamtraps do get some non-spam...

Good point and yes, I will do that.

> > [2] Several of these make use of open source filters (e.g.
> > SpamAssassin), so it's fair to say that most filters are covered.
> > The setup does exclude techniques such as TCP fingerprinting or
> > greylisting though.
>
>    That's OK, though it might be interesting to compare those
> techniques. BTW are you saying that if a (commercial?) spam-filter
> uses those techniques, your setup will exclude them?

No. Just that these features are turned off.

Thanks

Martijn.

Virus Bulletin Ltd, The Pentagon, Abingdon, OX14 3YP, England.
Company Reg No: 2388295. VAT Reg No: GB 532 5598 33.