[Asrg] Some data on the validity of MAIL FROM addresses
Kee Hinckley <nazgul@somewhere.com> Sun, 18 May 2003 07:37 UTC
Date: Sun, 18 May 2003 03:34:14 -0400
Vernon has regularly made the claim that a significant proportion of spam messages have valid MAIL FROMs. That means that bounces will go to the spammer. This has significant ramifications for C/R systems (especially auto-respond ones), since it means that, should they have to, spammers could respond to challenges.

To test this theory, I took a day's worth of bounce logs from somewhere.com (2003-05-15). There's been a bit of an upswing from a recent virus attack, but otherwise these are pretty normal bounce logs for somewhere.com. These are for addresses that do not, and have never, existed. Because they got on the spammers' lists primarily because someone entered the address on a web site, they get a mix of "true" spam and just standard bulk mail. However, if the bulk mailers are doing their job, those addresses should be removed fairly quickly. If they aren't removing on bounces, then they look and smell a lot like spammers.

Known oddities in the data:

  862 messages to wormalert@somewhere.com and variations. These tend to run about 1/3 viruses, 1/3 real messages and 1/3 spam. That set has 533 distinct MAIL FROM addresses.

  12340 messages from olga@somewhere.com to mail@somewhere.com. (Misconfigured Axis video cameras.)

Since all I'm counting here are unique MAIL FROM addresses, neither of these should have a huge impact.

I ran a program which took each MAIL FROM address, parsed out the domain portion, looked up the MX record, and then connected to the SMTP port of the lowest numbered MX server. I did a

    HELO somewhere.com
    MAIL FROM <postmaster+AntiSpamAddressVerification@somewhere.com>
    RCPT TO <appropriate-address>
    QUIT

Note that a few sites bounced me at the HELO prompt (didn't like that I was on DSL, or that my name was somewhere.com). A few bounced at the MAIL FROM (didn't like somewhere.com, and one claimed that + wasn't a legal email character). But the number of either of those was pretty low (less than half a dozen).
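The probe described above could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the program that produced the data: `probe`, the stage labels, the `lookup_mx` callback (standing in for whatever MX resolver is available, since the standard library has none) and the 30-second default timeout are all hypothetical choices. Only the 1001/1003/1007 codes, the HELO name and the sender address come from the post.

```python
import smtplib

# Synthetic codes from the post's table; the stage labels are this sketch's own.
NO_MX, NO_SMTP, BAD_FORMAT = 1001, 1003, 1007


def probe(address, lookup_mx, smtp_factory=smtplib.SMTP,
          helo_name="somewhere.com",
          sender="postmaster+AntiSpamAddressVerification@somewhere.com",
          timeout=30.0):
    """Return (stage, code) for one MAIL FROM address without sending mail."""
    local, at, domain = address.rpartition("@")
    if not (local and at and domain):
        return ("parse", BAD_FORMAT)

    # lookup_mx is a caller-supplied (hypothetical) resolver:
    # domain -> list of (preference, host) pairs, empty if no MX exists.
    records = lookup_mx(domain)
    if not records:
        return ("mx", NO_MX)
    _, host = min(records)  # lowest preference number is tried, as in the post

    try:
        conn = smtp_factory(host, 25, timeout=timeout)
    except OSError:  # also covers smtplib.SMTPException and socket timeouts
        return ("connect", NO_SMTP)

    try:
        steps = (
            ("helo", lambda: conn.helo(helo_name)),
            ("mail", lambda: conn.mail(sender)),
            ("rcpt", lambda: conn.rcpt(address)),
        )
        for stage, command in steps:
            code, _message = command()
            if code >= 400:
                # Rejected here; the stage tells HELO/MAIL FROM refusals
                # apart from genuine RCPT TO results.
                return (stage, code)
        return ("rcpt", code)
    except OSError:
        return ("connect", NO_SMTP)
    finally:
        try:
            conn.quit()  # QUIT after RCPT TO, so no message is ever sent
        except OSError:
            pass
```

Returning the stage alongside the code is a design choice of the sketch: it keeps rejections at HELO or MAIL FROM from being counted as verdicts on the recipient address.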
I'll do a better job of recording those HELO and MAIL FROM rejections separately in the future.

There were 39595 entries in the log, with 34404 distinct SMTP sessions. There were 11559 unique MAIL FROM addresses.

+---------+-------+------------+
| errcode | total | percentage |
+---------+-------+------------+
|       0 |    99 |       0.86 | ???
|     250 |  5796 |      50.14 |
|     450 |     6 |       0.05 |
|     451 |    12 |       0.10 |
|     452 |     8 |       0.07 |
|     473 |     4 |       0.03 |
|     500 |     1 |       0.01 |
|     501 |     1 |       0.01 |
|     521 |     3 |       0.03 |
|     530 |     1 |       0.01 |
|     550 |  2341 |      20.25 |
|     551 |     3 |       0.03 |
|     552 |     2 |       0.02 |
|     553 |   288 |       2.49 |
|     554 |    48 |       0.42 |
|     555 |     1 |       0.01 |
|     556 |     1 |       0.01 |
|     571 |     1 |       0.01 |
|    1001 |  1880 |      16.26 | No MX Record
|    1003 |  1055 |       9.13 | No SMTP Server
|    1007 |     8 |       0.07 | Invalid Email Format
+---------+-------+------------+

In aggregate, 51% of the addresses were valid and 49% were not. Of the ones that were not valid, 52% didn't have a reachable mail server.

Now let's see how it breaks down by domain. Here are the top 5 domains in the MAIL FROMs.

+-------------------------+-------+
| host                    | count |
+-------------------------+-------+
| yahoo.com               |   819 |
| hotmail.com             |   714 |
| aol.com                 |   632 |
| earthlink.net           |   209 |
| msn.com                 |   161 |
+-------------------------+-------+

Let's do the same stats for each of these. Note that I have a 1-2% "No SMTP Server" rate. This could mean that they were rate limiting my queries. More likely it's due to the very short timeout I put on the query. I'll have to adjust that in the future.
+-----------+---------+-------+------------+
| host      | errcode | total | percentage |
+-----------+---------+-------+------------+
| yahoo.com |    NULL |     1 |       0.12 |
| yahoo.com |     250 |   669 |      81.68 |
| yahoo.com |     553 |   129 |      15.75 |
| yahoo.com |    1003 |    20 |       2.44 |
+-----------+---------+-------+------------+

+-------------+---------+-------+------------+
| host        | errcode | total | percentage |
+-------------+---------+-------+------------+
| hotmail.com |    NULL |     1 |       0.14 |
| hotmail.com |     250 |   111 |      15.55 |
| hotmail.com |     550 |   602 |      84.31 |
+-------------+---------+-------+------------+

+---------+---------+-------+------------+
| host    | errcode | total | percentage |
+---------+---------+-------+------------+
| aol.com |       0 |    10 |       1.58 |
| aol.com |     250 |   581 |      91.93 |
| aol.com |     550 |    10 |       1.58 |
| aol.com |    1003 |    31 |       4.91 |
+---------+---------+-------+------------+

+---------------+---------+-------+------------+
| host          | errcode | total | percentage |
+---------------+---------+-------+------------+
| earthlink.net |     250 |    43 |      20.57 |
| earthlink.net |     550 |   149 |      71.29 |
| earthlink.net |     554 |    14 |       6.70 |
| earthlink.net |    1003 |     3 |       1.44 |
+---------------+---------+-------+------------+

+---------+---------+-------+------------+
| host    | errcode | total | percentage |
+---------+---------+-------+------------+
| msn.com |    NULL |     1 |       0.62 |
| msn.com |     250 |    62 |      38.51 |
| msn.com |     550 |    97 |      60.25 |
| msn.com |    1003 |     1 |       0.62 |
+---------+---------+-------+------------+

Interesting that the results vary so much by ISP. Yahoo accounts are mostly valid. Hotmail accounts are mostly invalid. AOL is quite good. Earthlink has a problem. MSN is slightly better, but still more invalid than valid.

In general though, it appears that Vernon is correct. If my sample is representative, a large percentage of spam is coming from real email addresses.

I'll be making this data (and hopefully live updates to it) available on the web, hopefully in the next few days.
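As a cross-check, the aggregate and per-domain percentages quoted above can be recomputed from the table totals. The sketch below copies the counts from the tables. Its one interpretive step is an assumption, not something the post states: the quoted 51% / 49% / 52% figures appear to be reproduced when the 99 unexplained code-0 results are grouped with the 250 acceptances, whereas counting only 250 as valid gives about 50% valid and about 51% of the remainder unreachable.

```python
# Totals per error code, copied from the aggregate table in the post.
TOTALS = {
    0: 99, 250: 5796, 450: 6, 451: 12, 452: 8, 473: 4, 500: 1, 501: 1,
    521: 3, 530: 1, 550: 2341, 551: 3, 552: 2, 553: 288, 554: 48,
    555: 1, 556: 1, 571: 1, 1001: 1880, 1003: 1055, 1007: 8,
}

# (accepted with 250, all addresses) per domain, from the per-domain tables.
DOMAINS = {
    "yahoo.com": (669, 819),
    "hotmail.com": (111, 714),
    "aol.com": (581, 632),
    "earthlink.net": (43, 209),
    "msn.com": (62, 161),
}


def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


total = sum(TOTALS.values())
accepted = TOTALS[250]
unreachable = TOTALS[1001] + TOTALS[1003]  # no MX record + no SMTP server

# Strict reading: only a 250 reply counts as a valid address.
strict_valid = share(accepted, total)
strict_unreachable = share(unreachable, total - accepted)

# Assumption: the code-0 ("???") results are grouped with the 250s.
loose_valid = share(accepted + TOTALS[0], total)
loose_unreachable = share(unreachable, total - accepted - TOTALS[0])

domain_rate = {host: share(ok, n) for host, (ok, n) in DOMAINS.items()}

if __name__ == "__main__":
    print(f"addresses: {total}")
    print(f"strict: {strict_valid:.2f}% valid, "
          f"{strict_unreachable:.2f}% of the rest unreachable")
    print(f"loose:  {loose_valid:.2f}% valid, "
          f"{loose_unreachable:.2f}% of the rest unreachable")
    for host, rate in domain_rate.items():
        print(f"{host:15s} {rate:6.2f}% accepted")
```

Either reading supports the same conclusion: roughly half of the unique MAIL FROM addresses were accepted by the receiving server.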
As an additional anecdotal piece of information: in the past month I've seen five separate email accounts (including two of mine) get Joe-jobbed in a new way. Instead of a major flood of bounces, they just get one or two. It smells like new spam software that uses the same database of addresses for From that it was using for To. The goal might be to get through verification filters like the above. But it's also interesting to consider what havoc that might wreak on C/R systems. How is someone going to react when they get a challenge for a message they didn't send? I predict that if people get used to C/R systems they'll just click send, and the spammer's message will get through.

Finally, as an addendum of sorts, here are the unique messages associated with the above error codes. I've left out 250 and 550 ones--I'm just tracking the less common ones. And they've been normalized to remove email addresses and domain names.

+---------+-------+----------------------------------------------------+
| errcode | count | substring(message,1,50)                            |
+---------+-------+----------------------------------------------------+
|       0 |    99 |                                                    |
|     250 |  5796 | recipient ok                                       |
|     450 |     3 | <EMAILADDRESS>: User unknown in local recipient ta |
|     450 |     2 | <localhost.localdomain>: Helo command rejected: Ho |
|     450 |     1 | Mailbox unavailable.                               |
|     451 |     1 | 4.0.0 Can't create transcript file ./xfh4GNNYv0581 |
|     451 |     1 | 4.3.0 error creating message, status = StatusSpool |
|     451 |     1 | 4.3.5 Error getting LDAP results in map sbcldap:   |
|     451 |     4 | <EMAILADDRESS>: Temporary lookup failure           |
|     451 |     1 | <LOCALPART> ... Recipient mailbox is full          |
|     451 |     1 | Can't connect to bisman.com - psmtp                |
|     451 |     2 | Requested action aborted: local error in processin |
|     451 |     1 | Server Error                                       |
|     452 |     2 | 4.2.1 Mailbox temporarily disabled: EMAILADDRESS   |
|     452 |     2 | 4.2.2 Mailbox full                                 |
|     452 |     2 | 4.4.5 Insufficient disk space; try again later     |
|     452 |     2 | Message for <EMAILADDRESS> would exceed mailbox qu |
|     473 |     4 | EMAILADDRESS relaying prohibited. You should authe |
|     500 |     1 | <EMAILADDRESS>: Recipient address rejected: Recipi |
|     501 |     1 | Syntax error in sender: <postmaster+AntiSpamAddres |
|     521 |     1 | This User has too many concurrents, please try aga |
|     521 |     2 | this mailbox is disabled or invalid (#5.2.1)       |
|     530 |     1 | Delivery not allowed to non-local recipient, try a |
|     550 |  2341 | unknown user                                       |
|     551 |     1 | 5.0.0 Mailbox disabled,storage space exceeded      |
|     551 |     1 | EMAILADDRESS illegal name for an account           |
|     551 |     1 | not our customer                                   |
|     552 |     1 | <EMAILADDRESS>: Recipient address rejected: Sorry, |
|     552 |     1 | Requested action aborted: exceeded storage allocat |
|     553 |     1 | 5.0.0 <EMAILADDRESS>... No such user               |
|     553 |     1 | 5.1.3 <EMAILADDRESS>... Invalid route address      |
|     553 |    17 | 5.3.0 <EMAILADDRESS>... Addressee unknown, relay=[ |
|     553 |     1 | 5.3.0 <EMAILADDRESS>... Delivery ERROR!!!User does |
|     553 |     4 | 5.3.0 <EMAILADDRESS>... No such user               |
|     553 |     6 | 5.3.0 <EMAILADDRESS>... No such user here          |
|     553 |     2 | 5.3.0 <EMAILADDRESS>... That address is not curren |
|     553 |     1 | 5.3.0 <EMAILADDRESS>... Try LOCALPART@symantec.com |
|     553 |     1 | 5.3.0 <EMAILADDRESS>... User LOCALPART mailbox ful |
|     553 |     3 | 5.3.0 <EMAILADDRESS>... User unknown               |
|     553 |     8 | 5.5.3 <EMAILADDRESS>... Invalid                    |
|     553 |     1 | <EMAILADDRESS>... User unknown                     |
|     553 |     2 | No mailbox here by that name, sorry (#5.7.1)       |
|     553 |     1 | RCPT TO:<EMAILADDRESS> refused                     |
|     553 |     7 | Requested action not taken: mailbox name not allow |
|     553 |   143 | VS10-RT Possible forgery or deactivated due to abu |
|     553 |    88 | sorry, that domain isn't in my list of allowed rcp |
|     553 |     1 | sorry, your envelope sender is in my badmailfrom l |
|     554 |     1 | 5.0.0 ADMIN.COM ISN'T THE DOMAIN YOU'RE LOOKING FO |
|     554 |     3 | <EMAILADDRESS>: Recipient address rejected: Access |
|     554 |     1 | <EMAILADDRESS>: Recipient address rejected: Domain |
|     554 |     9 | <EMAILADDRESS>: Recipient address rejected: Not ac |
|     554 |     1 | <EMAILADDRESS>: Recipient address rejected: Relay  |
|     554 |     5 | <EMAILADDRESS>: Relay access denied                |
|     554 |     1 | <localhost.localdomain>: Helo command rejected: Ho |
|     554 |     1 | EMAILADDRESS Mail quota exceeded                   |
|     554 |     1 | Mail for EMAILADDRESS rejected for policy reasons. |
|     554 |    21 | Quota violation for EMAILADDRESS                   |
|     554 |     1 | Relay rejected for policy reasons.                 |
|     554 |     2 | SPAM-Relay detected                                |
|     554 |     1 | recipient <EMAILADDRESS>, Transaction failed       |
|     555 |     1 | sorry, your envelope recipient is in my badrcptto  |
|     556 |     1 | invalid email address EMAILADDRESS (5.5.6)         |
|     571 |     1 | <www.somewhere.com[66.92.72.194]>: Client host rej |
|    1001 |  1880 | No MX Record                                       |
|    1003 |  1055 | No SMTP Connection                                 |
|    1007 |     8 | Bad Address Format                                 |
+---------+-------+----------------------------------------------------+

--
Kee Hinckley
http://www.messagefire.com/ Junk-Free Email Filtering
http://commons.somewhere.com/buzz/ Writings on Technology and Society

I'm not sure which upsets me more: that people are so unwilling to accept
responsibility for their own actions, or that they are so eager to regulate
everyone else's.

_______________________________________________
Asrg mailing list
Asrg@ietf.org
https://www1.ietf.org/mailman/listinfo/asrg