Re: [Shutup] [ietf-smtp] Levels of proposals

Russ Allbery <eagle@eyrie.org> Fri, 04 December 2015 04:53 UTC

From: Russ Allbery <eagle@eyrie.org>
To: Ted Lemon <mellon@fugue.com>
In-Reply-To: <1449202966785-573af732-876dd2d9-16d51672@fugue.com> (Ted Lemon's message of "Fri, 04 Dec 2015 04:22:46 +0000")
Organization: The Eyrie
References: <CABa8R6vfT-9=51B32++eUAVeq5xuhTNUuv62yeO+W6AErRFnDQ@mail.gmail.com> <5660F3A1.7060807@mustelids.ca> <1449195108085-9ef6f394-96f931b3-20b99bd2@fugue.com> <87k2ov7xly.fsf@hope.eyrie.org> <1449196775597-73137a19-d32873ba-cad85c2a@fugue.com> <87a8pq98m3.fsf@hope.eyrie.org> <1449202966785-573af732-876dd2d9-16d51672@fugue.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
Date: Thu, 03 Dec 2015 20:53:32 -0800
Message-ID: <874mfy95mb.fsf@hope.eyrie.org>
MIME-Version: 1.0
Content-Type: text/plain
Archived-At: <http://mailarchive.ietf.org/arch/msg/shutup/n1NPqmoT78qx7uy3fUTWv12-3xs>
X-Mailman-Approved-At: Fri, 04 Dec 2015 00:11:43 -0800
Cc: shutup@ietf.org, ietf-smtp@ietf.org
Subject: Re: [Shutup] [ietf-smtp] Levels of proposals

Ted Lemon <mellon@fugue.com> writes:
> Thursday, Dec 3, 2015 10:48 PM Russ Allbery wrote:

>> Why would you throttle per id/password pair?  The attacker doesn't try
>> the same pair more than once.  That would be pointless.

> I think you missed the distinction I was making.  You are describing
> throttling per ip-address, irrespective of the password/id pair.  I am
> asking why you don't simply say "if more than 10 attempts are made on
> this username per hour, lock it out for a while."

Oh, you didn't mean throttle by id/password pair.  You meant throttle
purely by user ID.

There are at least two reasons why this doesn't help as much as it sounds
like it would, particularly in the case of SMTP AUTH.

One is that the attacker just changes the pattern of their brute force
search to spread it across more known accounts or more time, probing more
accounts with fewer passwords each.  It's all the same to them; they're
just doing a combinatoric search, and varying the parameters that way
doesn't do much to reduce their ability to compromise accounts.
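
As a rough illustration of that first point, here is a minimal Python
sketch of the "more than 10 failures per username per hour" rule and a toy
attacker.  The class, function names, and thresholds are all hypothetical,
chosen only for this example; it is not how any particular mail system
implements throttling.

```python
from collections import defaultdict, deque


class UsernameThrottle:
    """Hypothetical limiter: lock a username after more than `limit`
    failed logins within `window` seconds."""

    def __init__(self, limit=10, window=3600):
        self.limit = limit
        self.window = window
        self.failures = defaultdict(deque)  # username -> failure timestamps

    def _prune(self, username, now):
        # Drop failures that have aged out of the window.
        q = self.failures[username]
        while q and q[0] <= now - self.window:
            q.popleft()
        return q

    def record_failure(self, username, now):
        self._prune(username, now).append(now)

    def is_locked(self, username, now):
        return len(self._prune(username, now)) > self.limit


def simulate_attack(throttle, usernames, guesses_per_user):
    """Make one wrong guess per account per simulated second.

    Returns (guesses that reached the password check, accounts locked).
    """
    attempts = 0
    for second in range(guesses_per_user):
        for user in usernames:
            if not throttle.is_locked(user, second):
                throttle.record_failure(user, second)
                attempts += 1
    locked = sum(throttle.is_locked(u, guesses_per_user) for u in usernames)
    return attempts, locked
```

With these assumed numbers, ten guesses against each of 1,000 accounts
gets all 10,000 guesses through with nothing locked, while 50 guesses
against a single account is cut off after the 11th.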

Another reason is more specific to SMTP AUTH: dozens of incorrect login
attempts for the same username are a common *legitimate* pattern for users
who have a single misconfigured device (a typo in their password, for
instance).  (And thanks to cell phones, those failed logins often come from
a huge variety of IP addresses.)  SMTP UAs often retry authentication
pretty aggressively and will happily look like an attacker.  If you lock
an account for that pattern, you can lock out that user's other legitimate
devices.  Web sites usually don't have to worry as much about automated
failed logins from legitimate users.

Now, you can do more complex things like checking whether the failures are
all the same password or are different passwords; I'm not saying there
aren't multiple approaches here.  My point, rather, is that figuring out
what IP addresses are probably part of botnets is really useful and not
something to be casually discarded as additional information for your
abuse prevention.
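
The "same password or different passwords" check could be sketched
roughly as follows.  This is only an illustration under assumptions of
mine: the class name, the threshold of five distinct passwords, and the
`SERVER_KEY` value are hypothetical, and failed passwords are kept only as
keyed digests so that near-miss typos are not stored in the clear.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical per-server secret used to digest failed passwords.
SERVER_KEY = b"example-key-change-me"


class FailureClassifier:
    """Track, per username, how many *distinct* wrong passwords were tried."""

    def __init__(self, distinct_limit=5):
        self.distinct_limit = distinct_limit
        self.seen = defaultdict(set)   # username -> digests of failed passwords
        self.total = defaultdict(int)  # username -> total failed attempts

    def record_failure(self, username, password):
        digest = hmac.new(
            SERVER_KEY, password.encode("utf-8"), hashlib.sha256
        ).digest()
        self.seen[username].add(digest)
        self.total[username] += 1

    def looks_like_guessing(self, username):
        # Many distinct wrong passwords suggests a search; one wrong
        # password repeated many times suggests a misconfigured device.
        return len(self.seen.get(username, ())) > self.distinct_limit
```

A phone retrying one mistyped password 40 times records 40 failures but a
single distinct digest, so it is not flagged; 20 different guesses against
one account are.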

This example is obviously simplified in an attempt to answer your original
question, since you said you were wholly ignorant of this whole area of
abuse handling.  Real-world scenarios get more complicated and involve
more complex parameters to rate limiting and ways of figuring out what's
going on.  The point is that threat intelligence about compromised IPs is
really useful input into those systems, and that can be garnered from spam
email Received headers in some fairly effective ways, as touched on by
Chris in his original message.

Note that login abuse detection is something to which many companies
devote an entire full-time team.  Those people use all the relevant tools
they can get their hands on because this is an arms race against quite
sophisticated attackers.

I was (mostly successfully) staying out of this thread, but one other
comment I'll make while I'm here: I suspect a lot of people who work on
email abuse detection and mail filtering would make the argument that they
care just as much about user privacy as you do, but consider phishing
attacks a much more serious threat to the privacy of the average user than
the information that leaks out in email headers.  It's one thing to get a
user's home address; it's quite another to get all of their bank
information and access to their personal email.  Obviously none of us want
those two goals to be in conflict, but please remember that spam filtering
is also *phishing* filtering, and therefore is *also* privacy defense for
users, and quite important privacy defense at that.

> Yes, I have heard that before.  It makes sense.  However, it does make
> me wonder why you don't just stop accepting mail from sites that are
> this badly run until they shape up.  You seem to be suggesting that it's
> because you value the intelligence that you glean from their
> incompetence.  Did I misunderstand?

Yes, in a couple of ways.

For one, just deciding to ban whole sending sites from your email system
and saying tough luck if someone at that site wants to mail you is the
sort of thing that you can do if you're running a hobbyist or personal
mail service.  When you have a large-scale email system for paying
customers, it turns out that those customers are unimpressed by your
decision to not let specific people talk to them because you don't like
how they run their mail system.  Or, similarly, if you're running a
large-scale mail system for a company, deciding that your sales people
don't get to hear from certain prospects, or that your business people
simply don't get to talk to certain domains, based on your opinions about
their email policy doesn't go over well.  You have to be pretty damn sure
that the sending site is 100% garbage before you can do things like that,
as opposed to something more sophisticated.

For another, I think you're missing that you can garner useful information
from the Received headers of messages that you *reject*.  You can accept
the whole message body first and then reject it or throw it away.  If
you're pretty sure for other reasons that a collection of a few million
messages is all spam or phishing, you can analyze the headers of those
messages to pull out patterns such as a list of submission IP addresses.
Using those IP addresses as an additional data point in analyzing more
borderline messages, or as a set of IP addresses for which to do
additional login throttling for your own authentication systems, can be
very helpful.  Chris had some concrete numbers in his original message.
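
A very rough sketch of that kind of header analysis, using only the Python
standard library, might look like the following.  The function names are
hypothetical, and the heuristic is an assumption made for illustration: it
naively treats the earliest hop that carries a public IP address as the
submission address.  A real system would only trust Received headers
added by hosts it has reason to trust, since anything below that point
can be forged.

```python
import ipaddress
import re
from collections import Counter
from email import message_from_string

# Matches bracketed address literals such as [192.0.2.1] or [IPv6:2001:db8::1].
IP_LITERAL = re.compile(r"\[(?:IPv6:)?([0-9A-Fa-f:.]+)\]")


def submission_ip(raw_message):
    """Return the first public IP found in the earliest Received header
    that has one, or None if there is no such header."""
    msg = message_from_string(raw_message)
    # Each hop prepends its header, so the last one is the earliest hop.
    for header in reversed(msg.get_all("Received") or []):
        for candidate in IP_LITERAL.findall(header):
            try:
                ip = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # bracketed text that is not an IP address
            if ip.is_global:  # skip loopback, private, and reserved ranges
                return str(ip)
    return None


def count_submission_ips(raw_messages):
    """Count apparent submission IPs across messages already judged to be
    spam or phishing by other means."""
    counts = Counter()
    for raw in raw_messages:
        ip = submission_ip(raw)
        if ip is not None:
            counts[ip] += 1
    return counts
```

The resulting counts are only one more input; they would feed into
scoring of borderline messages or into login throttling alongside
everything else.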

-- 
Russ Allbery (eagle@eyrie.org)              <http://www.eyrie.org/~eagle/>