Re: [Asrg] Adding a spam button to MUAs

Rich Kulawiec <> Thu, 04 February 2010 19:47 UTC


On Wed, Dec 16, 2009 at 01:59:04PM -0500, Seth wrote:
> Rich Kulawiec <> wrote:
> > There's the zombie problem.  There is no way for anyone or anything
> > external to an end-user's system to know whether the button click
> > (or equivalent event) was generated by a user or by software working
> > at the behest of the new owner of the user's former system.  Given
> > that the zombie problem is epidemic and presently unstoppable,
> > widescale deployment of any such mechanism will lead to its use by
> > zombie-resident malware as soon as it's advantageous for abusers to
> > do so. Thus, anyone proposing such a "report as spam" mechanism on a
> > large scale must also include in their proposal a workable plan for
> > solving the zombie problem.
> How would it be advantageous for a zombie to report as spam?  Report
> as non-spam, sure, to game the filters.  But with the data being noisy
> to begin with, zombies adding noise don't have much effect; they might
> require tuning of the filters.

Well, the zombie problem is massive enough that *if* spammers found it
useful to do so, they could, on a whim, generate enough noise to swamp
the signal.  After all, they now own every email credential present on,
or used on, all those zombie systems.  If we take very conservative
estimates for (a) zombie population and (b) email credentials/zombie,
which I'll put at 100M and 5, that's half a billion accounts that can be
used for starters.  I think much more realistic estimates are probably
200M and 10, BTW, but we're still talking an order of magnitude around
"billion".  That's not only a large number, but it's much larger than the
number of users who would use such buttons, and *many* orders of magnitude
larger than the number of clueful users, whose fraction of the population
I would place at no more than 1 in 10e5-10e6.
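
To make the arithmetic above concrete, here is a small sketch that multiplies the two estimates.  The function name is hypothetical, and the inputs are only the rough guesses given above (100M zombies with 5 credentials each, or 200M with 10), not measurements.

```python
# Back-of-the-envelope count of email accounts available to abusers.
# The inputs are the rough guesses from the text, not measured values.

def hijacked_accounts(zombies: int, credentials_per_zombie: int) -> int:
    """Accounts available = zombie systems x email credentials on each."""
    return zombies * credentials_per_zombie

# "Very conservative" guess: 100M zombies, 5 credentials apiece.
conservative = hijacked_accounts(100_000_000, 5)

# "More realistic" guess: 200M zombies, 10 credentials apiece.
realistic = hijacked_accounts(200_000_000, 10)

print(f"conservative: {conservative:,}")
print(f"realistic:    {realistic:,}")
```

Either way the result lands within an order of magnitude of a billion: half a billion for the conservative guess, two billion for the other.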

There's also nothing stopping those same spammers from creating an
essentially-unlimited number of email accounts anywhere that those zombies
(or the access/credentials on them) permit them to.  That starts with ~10K
freemail providers and extends to any ISP or email provider that allows
its users to create multiple accounts.  (For example, my DSL provider
allows me to create up to 7.  I've only got 1 in service, but anyone who
hijacked my desktop system would be able to create the other 6 and --
if careful enough -- prevent me from noticing.)  This is fairly common
with major consumer ISPs, which allow users to create email accounts
for additional family members.

All of this combined means that spammers could easily overwhelm any such
reporting system the moment it went live.  They've already beaten it.
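
As a rough illustration of what "overwhelm" means here, the sketch below assumes the reporting system simply takes the share of "spam" votes on a message and compares it to a threshold.  That aggregation rule, the function names, and every number in it are assumptions made up for the example; no real reporting system is being described.

```python
# Toy model: a message is classified by the share of "spam" votes it gets.
# The aggregation rule and all vote counts are illustrative assumptions.

def spam_share(honest_spam, honest_ham, forged_spam=0, forged_ham=0):
    """Fraction of all votes (honest plus forged) that say 'spam'."""
    spam_votes = honest_spam + forged_spam
    total_votes = spam_votes + honest_ham + forged_ham
    return spam_votes / total_votes

def verdict(share, threshold=0.5):
    """Simple threshold rule on the share of 'spam' votes."""
    return "spam" if share >= threshold else "not-spam"

# 1,000 real users vote; 90% of them correctly say "spam".
before = spam_share(honest_spam=900, honest_ham=100)

# The same users, plus 10,000 forged "not spam" votes from zombies.
after = spam_share(honest_spam=900, honest_ham=100, forged_ham=10_000)

print(verdict(before), round(before, 3))
print(verdict(after), round(after, 3))
```

Ten thousand forged votes is tiny next to the account counts estimated earlier, and it is already enough to flip the verdict in this toy model.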

Now as to why they might bother:

They stand to gain considerably by (a) creating FP reports, (b) creating
FN reports, (c) suppressing TP reports, or (d) suppressing TN reports.  And
they can use the same zombies (or other systems) to generate traffic
which can then be processed via (a) through (d).

For example, they could go after their competition with (a).  They could
try to avoid classification of their own spam with (c).  They could
generate targeted traffic -- e.g., a spam run to known-owned accounts --
and use (b) to mark it as not-spam.  There are all kinds of combinations
available to them -- it's just a matter of what goal(s) they might have
and what it's worth to them to pursue those goals.  (And I think I'll
stop outlining scenarios here as I'm sure they're listening.)

Now, certainly some heavy-handed manipulation might be detected, and
I'm sure it would/will be.  But [some] spammers are very crafty, and
are easily smart enough to figure out how to game the system.  And
if/when the system poses an inconvenience to them: they will.  They've
done it repeatedly in the past.  (See "zombies, creation of in order
to evade contemporary DNSBLs".)  It would be a major strategic error
to underestimate either their willingness or their ability to turn
this to their advantage.

So I see this entire approach as pre-failed.  Even in an imaginary
world where users could function as reliable classifiers (and in
this real world, they are absolutely miserable classifiers: we all
know this because we know that if they *weren't*, we would not be
here discussing how to address the spam problem [1]) spammers have
the capability of generating far more inputs to the aggregate system
than all humans combined, and thus dominating the results by a huge
margin -- for whatever purpose suits them.


[1] For example, as Juha-Matti Laurio noted on "funsec" just yesterday:

	"Internet scammers had a bumper year in 2009 with people around
	the world continue to be duped by so-called 419 frauds, with one
	Dutch private investigation company estimating the highest ever
	annual losses occurred in 2009.

	Victims lost at least US$9.3 billion last year, up from $6.3
	billion in 2008, said Frank Engelsman of Ultrascan, a Dutch
	company that investigates 419 scams - also known as advance-fee
	frauds (AFFs) - and other types of crime. Ultrascan will release
	a complete report on Friday on its website.

	AFF scams originating in Nigeria were so pervasive that the
	country's Economic and Financial Crimes Commission stepped up law
	enforcement efforts in recent years to crack down on scammers."

	More at

	It appears that the report is here: