Re: [Asrg] Countering Botnets to Reduce Spam

Rich Kulawiec <rsk@gsp.org> Fri, 14 December 2012 17:45 UTC

Return-Path: <rsk@gsp.org>
X-Original-To: asrg@ietfa.amsl.com
Delivered-To: asrg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id AB74F21F8A80 for <asrg@ietfa.amsl.com>; Fri, 14 Dec 2012 09:45:07 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -6.47
X-Spam-Level:
X-Spam-Status: No, score=-6.47 tagged_above=-999 required=5 tests=[AWL=0.129, BAYES_00=-2.599, RCVD_IN_DNSWL_MED=-4]
Received: from mail.ietf.org ([64.170.98.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CoRYZCw2nhTw for <asrg@ietfa.amsl.com>; Fri, 14 Dec 2012 09:45:06 -0800 (PST)
Received: from taos.firemountain.net (taos.firemountain.net [207.114.3.54]) by ietfa.amsl.com (Postfix) with ESMTP id 3ED8621F8A7D for <asrg@irtf.org>; Fri, 14 Dec 2012 09:45:06 -0800 (PST)
Received: from gsp.org (bltmd-207.114.17.210.dsl.charm.net [207.114.17.210]) by taos.firemountain.net (8.14.5/8.14.5) with ESMTP id qBEHj2KF014430 for <asrg@irtf.org>; Fri, 14 Dec 2012 12:45:03 -0500 (EST)
Date: Fri, 14 Dec 2012 12:44:57 -0500
From: Rich Kulawiec <rsk@gsp.org>
To: Anti-Spam Research Group - IRTF <asrg@irtf.org>
Message-ID: <20121214174457.GA18374@gsp.org>
References: <SNT002-W1393526B62C0940EF697B2C54E0@phx.gbl> <20682.3413.665708.640636@world.std.com> <50CA0E91.2080304@mtcc.com> <20682.23612.451287.246798@world.std.com> <50CA805E.3010100@mtcc.com> <50CAA612.3070000@mustelids.ca> <SNT002-W117523E9206C73F54784577C54D0@phx.gbl> <50CABCB4.1030103@mustelids.ca> <20121214133937.GA23699@gsp.org> <50CB4100.2020408@mustelids.ca>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <50CB4100.2020408@mustelids.ca>
User-Agent: Mutt/1.5.20 (2009-06-14)
Subject: Re: [Asrg] Countering Botnets to Reduce Spam
X-BeenThere: asrg@irtf.org
X-Mailman-Version: 2.1.12
Precedence: list
Reply-To: Anti-Spam Research Group - IRTF <asrg@irtf.org>
List-Id: Anti-Spam Research Group - IRTF <asrg.irtf.org>
List-Unsubscribe: <http://www.irtf.org/mailman/options/asrg>, <mailto:asrg-request@irtf.org?subject=unsubscribe>
List-Archive: <http://www.irtf.org/mail-archive/web/asrg>
List-Post: <mailto:asrg@irtf.org>
List-Help: <mailto:asrg-request@irtf.org?subject=help>
List-Subscribe: <http://www.irtf.org/mailman/listinfo/asrg>, <mailto:asrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Dec 2012 17:45:07 -0000

On Fri, Dec 14, 2012 at 10:08:48AM -0500, Chris Lewis wrote:
> Compromised Linux machines (mostly servers) are now responsible for ~40%
> of all spam.
> 
> The actual _count_ of compromised Linux machines is indeed quite low.
> Say 62K out of 8.6M observed compromised machines.  About .72%. Two 9's ;-)
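
A quick check of the arithmetic quoted above, as a rough sketch. The host counts and the ~40% figure come from the quoted message. The per-host comparison assumes that all of the measured spam originates from the same 8.6M observed machines, which the message does not state, so read the ratio as an order-of-magnitude illustration only.

```python
# Figures quoted in the message above.
linux_hosts = 62_000          # compromised Linux machines observed
all_hosts = 8_600_000         # all compromised machines observed
linux_spam_share = 0.40       # "~40% of all spam"

# Share of the compromised population that is Linux.
linux_host_share = linux_hosts / all_hosts

# ASSUMPTION (not in the source): every unit of spam is sent by one of
# the observed machines, so shares can be divided out per host.
per_host_linux = linux_spam_share / linux_host_share
per_host_other = (1 - linux_spam_share) / (1 - linux_host_share)
ratio = per_host_linux / per_host_other

print(f"Linux share of compromised hosts: {linux_host_share:.2%}")
print(f"Spam per Linux host vs. per other host: about {ratio:.0f}x")
```

Under that assumption, a small count of machines and a large share of volume are compatible: the Linux hosts are about 0.72% of the population, yet each one sends on the order of 90 times as much as an average non-Linux host.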

I believe you.  This suggests two possibilities:

1. Something's broken somewhere in my experimental design between data
acquisition and statistical analysis.

or

2. We're talking apples and oranges and that's why our numbers are so
different.  To clarify: I'm not trying to measure spam volume, just
the number of systems (and their OS types).  And to clarify further:
I classify a system as a bot if it meets a set of criteria that includes
more than sending spam: I may also classify it as a bot if it's doing
brute-force SSH/FTP/IMAP/etc. attacks, if it's doing port scans, etc.
(The "may" is there because some systems engaged in these activities don't
appear to be bots.  Of course that's a judgment call and I'm sure I make
FP and FN mistakes.)

For example, if 190.147.78.102 (Static-IP-cr19014778102.cable.net.co,
thus probably in Colombia) makes 133 different IMAP login attempts,
I'm going to conclude that it's not a bored user in Bogota with
nothing better to do; it's most likely a bot.
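
A minimal sketch of that kind of activity-based classification, not the actual method used here. The event kinds, the `Event` type, the `candidate_bots` function, and every threshold are hypothetical. The output is only a list of candidates, since the final call is a judgment and will include false positives and false negatives.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    src_ip: str
    kind: str  # hypothetical labels, e.g. "spam", "imap_login_fail", "port_scan"


# Hypothetical thresholds: how many events of a kind make a host suspicious.
THRESHOLDS = {
    "spam": 1,
    "imap_login_fail": 50,
    "ssh_login_fail": 50,
    "ftp_login_fail": 50,
    "port_scan": 1,
}


def candidate_bots(events, thresholds=THRESHOLDS):
    """Return {ip: [kinds]} for hosts that meet at least one threshold."""
    counts = Counter((e.src_ip, e.kind) for e in events)
    flagged = {}
    for (ip, kind), n in counts.items():
        limit = thresholds.get(kind)
        if limit is not None and n >= limit:
            flagged.setdefault(ip, []).append(kind)
    return flagged


# The example from the text: 133 IMAP login attempts from one address,
# next to a host (documentation address) with a few failed logins.
events = [Event("190.147.78.102", "imap_login_fail") for _ in range(133)]
events += [Event("192.0.2.10", "imap_login_fail") for _ in range(3)]
flagged = candidate_bots(events)
print(flagged)
```

Counting hosts this way measures population, not spam volume, so its totals need not line up with volume-based figures.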

Do you think #2 explains the difference in our numbers, or do I have
to make a LOT of coffee and dig into #1?

---rsk