[Asrg] Statistical Analysis shows SPF should work Pretty Well
mengwong@dumbo.pobox.com (Meng Weng Wong) Fri, 13 June 2003 01:45 UTC
To: asrg@ietf.org
From: mengwong@dumbo.pobox.com
Message-Id: <20030612202450.1BC97DE41@dumbo.pobox.com>
Subject: [Asrg] Statistical Analysis shows SPF should work Pretty Well
Date: Thu, 12 Jun 2003 16:24:50 -0400
Executive Summary: Matching sender domain with client IP is a strong predictor of spamminess.

http://dumbo.pobox.com/spam-sensor/analysis01.png

Analysis:

I analyzed 6,810,374 unique deliveries over a two-month period whose senders claimed to be from aol.com, hotmail.com, and yahoo.com.

Those deliveries came from 1,885,248 distinct email senders.

I classified those senders using statistical methods into 1,775,660 spammer addresses and 109,588 nonspammer addresses.

Of the 1,775,660 addresses which my classifier decided were more likely to be spammers than not-spammers, 4,188 actually originated from aol, hotmail, or yahoo. That is a statistically insignificant number (roughly 0.24% of the flagged addresses) and reflects more on the imperfection of my classifier scheme than anything else.

The classifier scheme is described at http://dumbo.pobox.com/spam-sensor/.

Conclusion 1: aol, hotmail, and yahoo have successfully implemented outbound antispam technology, i.e. ways to ensure that only humans sign up for their accounts, or limits on per-account outbound message volume.

The analysis is described in detail at http://dumbo.pobox.com/spam-sensor/analysis01.txt

The important result of the analysis is a log/log scatterplot:

http://dumbo.pobox.com/spam-sensor/analysis01.png

Each dot represents one or more sender addresses; the color of the dot represents whether the sender domain matched the PTR record of the client IP. This is a sort of proto-SPF, using PTR instead. Dots for different senders can collide at the same point on the plot, but on the whole the output communicates pretty well.

Conclusion 2: Client IPs whose PTR records do not match their sender domains are more likely to be spam than not. That means a scheme like SPF/DMP/RMX should work nicely.
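The PTR-based match described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used for the analysis: the function names, the suffix-matching rule, and the example hostnames are all hypothetical, and the post does not say exactly how a "match" was decided.

```python
import socket


def sender_domain(address):
    """Return the lowercased domain part of an envelope sender address."""
    return address.rsplit("@", 1)[-1].strip().lower().rstrip(".")


def ptr_matches_domain(ptr_name, domain):
    """True if the PTR hostname equals the domain or is a subdomain of it.

    Assumption: a label-boundary suffix match counts as a match, so
    "mx1.aol.com" matches "aol.com" but "evilaol.com" does not.
    """
    host = ptr_name.strip().lower().rstrip(".")
    domain = domain.strip().lower().rstrip(".")
    return host == domain or host.endswith("." + domain)


def lookup_ptr(client_ip):
    """Reverse-resolve an IP address; return None if there is no PTR record."""
    try:
        return socket.gethostbyaddr(client_ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def check_delivery(sender, client_ip, resolver=lookup_ptr):
    """Classify one delivery: does the client IP's PTR match the sender domain?

    The resolver is injectable so the check can be run without live DNS.
    """
    ptr = resolver(client_ip)
    if ptr is None:
        return False
    return ptr_matches_domain(ptr, sender_domain(sender))
```

With a stub resolver, `check_delivery("a@yahoo.com", "192.0.2.1", resolver=lambda ip: "web1.mail.yahoo.com")` is treated as a match, while a client IP with no PTR record is treated as a mismatch. A real SPF/DMP/RMX check would instead consult records published by the sender domain rather than relying on reverse DNS.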