RE: [Asrg] Some data on the validity of MAIL FROM addresses

Kee Hinckley <nazgul@somewhere.com> Tue, 20 May 2003 02:48 UTC

Mime-Version: 1.0
X-Sender: nazgul@somewhere.com@pop.messagefire.com
Message-Id: <p06001208baef4283debf@[192.168.1.104]>
In-Reply-To: <MBEKIIAKLDHKMLNFJODBCENDFDAA.eric@purespeed.com>
References: <MBEKIIAKLDHKMLNFJODBCENDFDAA.eric@purespeed.com>
To: Eric Dean <eric@purespeed.com>
From: Kee Hinckley <nazgul@somewhere.com>
Subject: RE: [Asrg] Some data on the validity of MAIL FROM addresses
Cc: Alan DeKok <aland@freeradius.org>, asrg@ietf.org
Content-Type: text/plain; charset="us-ascii"; format="flowed"
Date: Mon, 19 May 2003 22:46:24 -0400

At 4:37 PM -0400 5/19/03, Eric Dean wrote:
>  >
>>  >  For example, if 90% of spam is forged, then RMX, C/R, and
>>  > authentication schemes could do a lot against spam (modulo their
>>  > other problems).
>
>It's not a large step to estimate that 90% of spam is forged.
>1) However, much of that spam can be filtered using simple sender domain
>checks.  Many spammers use bogus domains and maybe 5-10% of spam is dropped
>accordingly.
>2) The next value is to do a HELO hostname check..about 10-20% is dropped as
>well.  However, there are casualties for very large companies...such as
>bellsouth and verizon whereby I have to punch holes in my filters.
>3) Then I could be more aggressive and apply a reverse-dns check on the
>initiating source IP.  Doing so is also effective, however, all DSL and
>carrier Dial networks in-addr their IP pools...yet many mail admins don't.
>I have about another 5-10% of my spam come from unresolved IPs..but instantly
>the phones light up..cost me money..and I'm out of business.  The tough-love
>approach is suicidal stupidity.
>4) Then OK, so now we go with RBL, to identify the pools..that'll
>work..costs non-trivial money..but it works for that flavor of spam..maybe
>5%.


Well, actually I collected some of this data as well.  But without 
corresponding data on non-spam, it's not very useful.  Certainly each 
of the steps you outline includes an increased number of false 
positives.

There were 7376 unique senders.
4298 had some "problem" with the HELO or DNS information.

10	No A record for the HELO domain
702	The hostname for the HELO doesn't resolve
1330	Unqualified domain in the HELO
2030	Sender domain does not match the HELO
76	DNS Failed or timed out
1000	No DNS A data (not sure how this differs from the first)
8	Bad DNS Q Data Format (?)
101	Pipelined
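
A quick tally of the figures above gives a rough sense of how much the 
categories overlap.  The sketch below only sums the rows and compares 
the total against the 4298 flagged senders.  The category labels are 
shortened for illustration, and no new data is introduced.

```python
# Counts copied from the table above; the dictionary keys are shortened labels.
counts = {
    "no A record for HELO domain": 10,
    "HELO hostname does not resolve": 702,
    "unqualified HELO domain": 1330,
    "sender domain != HELO": 2030,
    "DNS failed or timed out": 76,
    "no DNS A data": 1000,
    "bad DNS Q data format": 8,
    "pipelined": 101,
}

unique_senders = 7376   # all unique senders in the sample
flagged_senders = 4298  # senders with at least one "problem"

# Each sender can be counted in several categories, so the rows sum to
# more than the number of flagged senders.
total_hits = sum(counts.values())
extra_hits = total_hits - flagged_senders

print(f"category hits: {total_hits}, flagged senders: {flagged_senders}")
print(f"hits beyond one per flagged sender: {extra_hits}")
print(f"flagged share of all senders: {flagged_senders / unique_senders:.1%}")
print(f"HELO mismatch share of flagged: {counts['sender domain != HELO'] / flagged_senders:.1%}")
```

The rows add up to 5257 hits across 4298 flagged senders, so at least 
959 hits come from senders that tripped more than one check.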

(Pipelining was detected by spotting cases where an error was 
returned which should have terminated the transaction, but they kept 
sending commands, up to and including the content of the message.)
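
A minimal sketch of that heuristic, assuming the server keeps a 
per-connection transcript of (command line, reply code) pairs.  The 
function name, the transcript format, and the choice to treat any 5xx 
reply as the terminating error are assumptions for illustration, not 
how the data above was actually gathered.

```python
from typing import Iterable, Tuple

# Assumption: after a fatal reply a well-behaved client only sends one of these.
HARMLESS_AFTER_ERROR = {"RSET", "QUIT", "NOOP"}


def kept_sending_after_error(transcript: Iterable[Tuple[str, int]]) -> bool:
    """Return True if the client carried on after a 5xx reply.

    transcript: (command_line, reply_code) pairs in the order received.
    Message content sent after a rejected DATA shows up here as
    unrecognised "commands", which is also flagged.
    """
    failed = False
    for command_line, reply_code in transcript:
        parts = command_line.split()
        verb = parts[0].upper() if parts else ""
        if failed:
            if verb == "RSET":
                failed = False  # client abandoned the failed transaction
            elif verb not in HARMLESS_AFTER_ERROR:
                return True     # client ignored the error and kept going
        if 500 <= reply_code <= 599:
            failed = True
    return False


blaster = [
    ("HELO foo", 250),
    ("MAIL FROM:<a@example.com>", 550),
    ("RCPT TO:<b@example.net>", 503),
    ("DATA", 503),
    ("Subject: hello", 500),
]
polite = [
    ("HELO mail.example.com", 250),
    ("MAIL FROM:<a@example.com>", 550),
    ("QUIT", 221),
]
print(kept_sending_after_error(blaster), kept_sending_after_error(polite))
```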

Obviously those categories overlap a good deal.  Your immediate 
reaction might be to require that the sender domain match the HELO.  
After all, that would catch 2030 of the 4298 problem senders, nearly 
half, right there.  But then again, it would also block most of the 
mail coming from my domain and many 
others.  My mail server always uses the primary domain name in the 
HELO, no matter which domain it sends for.  That's probably true of 
most servers.
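
To make the false-positive problem concrete, here is a rough sketch of 
the two string-level HELO checks from the table (unqualified name, and 
sender domain not matching the HELO).  The function name and the 
example.com / example.net hosts are hypothetical, and no DNS lookups 
are done.

```python
from typing import List


def helo_flags(helo: str, mail_from: str) -> List[str]:
    """Return the string-level HELO problems for one SMTP session."""
    flags = []
    helo_name = helo.strip().rstrip(".").lower()
    # Take everything after the last "@"; a null sender "<>" yields "".
    sender_domain = mail_from.rpartition("@")[2].strip("<> ").rstrip(".").lower()

    # An address literal such as [192.0.2.1] is not a hostname, so skip it here.
    if "." not in helo_name and not helo_name.startswith("["):
        flags.append("unqualified HELO")

    # Accept an exact match or a HELO host inside the sender's domain.
    if sender_domain and helo_name != sender_domain \
            and not helo_name.endswith("." + sender_domain):
        flags.append("sender domain does not match HELO")
    return flags


# A server that always announces its primary name while sending for
# another domain it hosts gets flagged, even though the mail is legitimate.
print(helo_flags("mail.example.com", "<user@example.com>"))
print(helo_flags("mail.example.com", "<user@example.net>"))
print(helo_flags("localhost", "<user@example.com>"))
```

The second call is the case described above: the server is honest 
about its own name, yet a strict match rule would reject the message.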
-- 
Kee Hinckley
http://www.messagefire.com/          Junk-Free Email Filtering
http://commons.somewhere.com/buzz/   Writings on Technology and Society

I'm not sure which upsets me more: that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate
everyone else's.