Re: Last Call: draft-irtf-asrg-dnsbl (DNS Blacklists and Whitelists)

Keith Moore <moore@network-heretics.com> Sat, 08 November 2008 22:37 UTC

Message-ID: <49161485.1050205@network-heretics.com>
Date: Sat, 08 Nov 2008 17:36:53 -0500
From: Keith Moore <moore@network-heretics.com>
User-Agent: Thunderbird 2.0.0.17 (Macintosh/20080914)
MIME-Version: 1.0
To: Chris Lewis <clewis@nortel.com>
Subject: Re: Last Call: draft-irtf-asrg-dnsbl (DNS Blacklists and Whitelists)
References: <4915DE02.2010803@nortel.com> <4915EA94.6020706@network-heretics.com> <491601C4.3090803@nortel.com>
In-Reply-To: <491601C4.3090803@nortel.com>
Cc: john-ietf@jck.com, "Livingood, Jason" <Jason_Livingood@cable.comcast.com>, ietf@ietf.org

Chris Lewis wrote:
> Keith Moore wrote:
>> I think you're missing the point.
> 
> Oh, no, I fully understand the point.  In contrast, I think you're
> relying on false dichotomies.
> 
> For example:
> 
>> Better "interoperation" of a facility that degrades the reliability of
>> email, by encouraging an increased reliance on dubious filtering
>> criteria and rumors, is not a desirable goal.
> 
> There is no such thing as a filtering mechanism that is 100% accurate,
> and hence all are dubious to one degree or another. 

Fine.  So please explain to me why, and under what conditions, using IP
addresses as a basis for filtering is reliable.

And please also explain why, and under what conditions, using DNS as a
means of communicating whether a particular IP address is spamming is a
good way to do that.  In particular, please explain how it encourages or
discourages selection of good criteria for filtering, and how it enhances or
degrades accountability for the party blacklisting the site (and/or the
party trusting the blacklist).  Explain the limitations of DNS for doing
this kind of rumor mongering and how they are addressed.  Explain the
mechanism for reporting blacklisting errors to the parties most affected
(which would include not only the senders of inappropriately filtered
messages but the intended recipients of inappropriately filtered
messages) and how use of DNS enhances or degrades those parties'
ability to improve their email reliability.  Explain how DNS enhances
or degrades a party's ability to correct both a blacklist's mislabeling
of a site and a user's ability to correct his MSP's inappropriate use of
a blacklist.  Explain how the DNS TTL should be chosen and how it should
vary on a per-domain basis.  Explain the security risks associated with
trusting DNS as a blacklist query protocol, and in particular what
potentials exist for denial-of-service attacks and what can be done
about them.
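
To make the mechanism in question concrete, here is a minimal sketch of
how a client conventionally forms a DNSBL lookup for an IPv4 address:
reverse the octets and append the list's zone.  The zone name
dnsbl.example.net and the function name are assumptions made up for
illustration, not taken from any real list.  Note that the question
carries nothing but the address, which is part of why the points above
about criteria, timing and accountability need answers.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a client would look up for an IPv4 address.

    Conventional scheme: reverse the four octets and append the zone.
    The zone passed in below is a hypothetical example.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    # 192.0.2.99 is a documentation address, not a real mail host.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.net"))
    # prints 99.2.0.192.dnsbl.example.net
```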

See, it really appears that the design work that we would rationally
expect out of any standards-track protocol has not been done in this
case.  It appears that large-scale use of IP addresses as identifiers
for potential spammers is not well-considered, especially in an era with
widespread use of NAT.  It also appears that use of DNS as a query
protocol is not well considered beyond the fact that such a query could
be implemented by a sendmail rewrite rule.

In other words, this is a hack that has gotten way out of hand.  And not
a particularly well-designed hack at that.

> A demonstrable assertion that "IP x is demonstrably infected with
> Srizbi and anything it emits is probably crap" and communicated by
> DNS is much less dubious than "this combination of random content
> fragments makes it look like a Nigerian 419, sorta, marginally".

I'd probably agree with that statement on its face, but (a) comparing
DNSBL to SA-like schemes is damning DNSBL with faint praise.  To my
knowledge nobody has requested that IETF standardize SA, but just
because SA sucks doesn't mean that DNSBL is standards-track quality even
if it sucks a bit less.  (b) DNSBL, in practice, doesn't actually
provide anywhere near that level of assurance.  It doesn't say that a
host is infected with Srizbi, it just alleges that a host is bad.   And
in fact the alleged badness might well have been "this host sent out a
combination of random content fragments that look like a Nigerian 419".
Indeed, DNSBL doesn't actually say that a host is bad, it associates
the badness with an IP address which might or might not be associated
with the same host now as it was when the badness was observed.  It
doesn't say when traffic from that IP address was observed to be bad.  etc.

> Indeed, experience shows that a correctly chosen set of DNSBLs is often
> not only more effective than other techniques in correctly identifying
> malicious email, can often have far fewer false positives than just
> about any other mechanism.

In other words, if you make the selection of the filtering criteria someone
else's problem, you can pretend that the problem has gone away - even if
that "someone else" is not responsible to either sender or recipient,
and that "someone else" is not accountable for misrepresentation.

That certainly doesn't sound like something that would pass muster for
standards track.  Maybe the security folks would like to comment on the
sanity of extending trust to unaccountable third parties?

> It certainly is counterintuitive that "source reputation" might be more
> accurate than "per email evaluation".  But, experience in the field
> demonstrates the true reality - the current state of the art is that
> intelligently chosen DNSBLs (the aforementioned BCP is intended to help
> that) often work much better than complete reliance on the latter.

I can accept that it makes some sense to identify compromised hosts, for
instance, and that blocking mail from hosts known to be compromised with
a spamming virus can in some cases be more reliable than blocking
content based on actual message content.

But there are still several problems: (1) the problem of "intelligently
choosing" DNSBLs (since many mail admins - even those working for large
ISPs - don't seem to be able to choose very well).  (2) the problem of
extending trust to a party not accountable to either sender or
recipient.  (3) the problem of trusting IP addresses as host
identifiers.  (4) various problems associated with using DNS as a query
protocol for this - including caching, TTLs, spoofing, lack of ability
to carry all of the information needed for a reasonable decision to be
made, etc.
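
The caching part of problem (4) can be illustrated with a rough sketch,
assuming a simple resolver-style cache that honors the TTL handed out
with a listing.  The class names, the one-hour TTL and the timings are
all hypothetical.  The point is only that a cached "listed" answer keeps
being served after the list operator has removed the entry, until the
TTL runs out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedAnswer:
    listed: bool
    expires_at: float  # absolute time, in seconds


class TtlCache:
    """Toy TTL cache; time is passed in explicitly to keep it testable."""

    def __init__(self) -> None:
        self._entries: Dict[str, CachedAnswer] = {}

    def put(self, name: str, listed: bool, ttl: float, now: float) -> None:
        self._entries[name] = CachedAnswer(listed, now + ttl)

    def get(self, name: str, now: float) -> Optional[bool]:
        entry = self._entries.get(name)
        if entry is None or now >= entry.expires_at:
            return None  # a real resolver would re-query here
        return entry.listed


NAME = "99.2.0.192.dnsbl.example.net"  # hypothetical query name
cache = TtlCache()
cache.put(NAME, listed=True, ttl=3600, now=0)

# Suppose the operator delists the address at t=600.  The cache cannot
# know that, so at t=1800 it still answers "listed".
stale_answer = cache.get(NAME, now=1800)

# Only once the TTL has expired does the cache stop answering.
after_expiry = cache.get(NAME, now=3600)
```

A shorter TTL narrows that window at the cost of more queries, which is
the tradeoff the TTL question is about.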

> Furthermore, incorrect DNSBL listings are easier to cope with than, say,
> random combinations of SpamAssassin scores that just happen to zap a
> desirable email.

Marginally easier, yes, because some DNSBLs include TXT records, and
some ISPs return NDNs with those TXT records.  And if the sender is
lucky, his ISP might not drop the NDN.   And so there's at least some
chance that the sender can figure out why his mail was bounced - or ask
someone to do so.  But this has nothing to do with use of a DNSBL per se
- it's equivalent to saying that SpamAssassin, as usually configured,
doesn't bounce mail.

And when I've been asked to track down why a blacklist rejected a
client's mail, the criteria for blacklisting the host have been bogus in
every single case.  For instance I recently found out that a client's
mail had been bounced because the server that sent out the mail didn't
have a PTR record pointing to a DNS name with the letters "mail" or "mx"
in it... and some idiot decided that this would be a good reason to
bounce perfectly legitimate mail.
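
A minimal sketch of that kind of criterion, assuming it amounts to a
substring test on the PTR name (the actual rule used by that blacklist
is not known, and the host names below are invented), shows how poorly
it separates good senders from bad ones:

```python
def ptr_looks_like_mail_server(ptr_name: str) -> bool:
    """Assumed heuristic: accept only PTR names containing 'mail' or 'mx'."""
    name = ptr_name.lower()
    return "mail" in name or "mx" in name


# Hypothetical host names, chosen only to exercise the rule.
legit_but_rejected = ptr_looks_like_mail_server("smtp-out.example.org")
legit_and_accepted = ptr_looks_like_mail_server("mail.example.org")
dubious_but_accepted = ptr_looks_like_mail_server("adsl-mx-1.dynamic.example.net")
```

Under this assumed rule a properly run outbound server named smtp-out is
rejected, while anything that happens to have "mx" in its name passes.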

Obviously there's a problem with "intelligently choosing" blacklists -
maybe mail system administrators are not, overall, very intelligent?
Maybe a system that expects mail system administrators to be able to
evaluate good blacklist criteria is not well-designed?

> I generate several DNSBLs for use here only.  Why shouldn't there be a
> standard so that mail server software I buy/lease/license will
> interoperate with DNSBLs I create?

Having a standard for reputation servers might be a good idea.  Basing
that reputation on IP address might not be a good idea.  Using DNS to
transmit that reputation might not be a good idea either.

> "Not well-established"?  You don't seem to have any idea how prevalent
> the use of DNSBLs is. 

I most assuredly do - because I'm constantly trying to figure out why
people's mail has bounced and running into the damn things.

But I think you misread what I wrote earlier.  I didn't say that DNSBLs
weren't well-established.   I said the benefit of reputation servers was
not well-established.

> The reality is: almost everybody uses DNSBLs, and ALL of the very big
> sites do.

Sounds like "almost everybody" needs a clue.  No surprise there.

Or maybe what the industry needs are well-vetted mechanisms for
establishing a host or network's reputation, and well-engineered
mechanisms for reporting them, ensuring repeatability, reporting the
problems to people who can get them fixed, making the reputation
services accountable for their actions, and so forth.  Not the blessing of
a poorly designed hack.

>> use of DNS to communicate
>> such reputation, and basing such reputation on IP addresses, is itself
>> very dubious.
> 
> You may think it dubious, but it is usually more reliable and effective
> than any other, and its popularity alone means it needs to be standardized.

Nope.  Go read RFC 2026.  Carefully.  The protocol has to be technically
sound (it's not) and it has to have rough consensus of the whole IETF
community (it doesn't).

> Rejecting the standardization of DNSBL protocols because the entries of
> a specific DNSBL _might_ be dubious is in itself a dubious position.

It strikes me that part of the problem with evaluating the quality of
reputation services used to bounce mail is that legitimate users rarely
see when they work well.  They mostly see when they don't work well,
when their mail is filtered for bogus reasons.

>> The problem isn't just the rumor passing mechanism, but
>> the mechanism is indeed part of the problem.
>>
>> It's not as if we're arguing about whether to standardize a facility
>> that is widely believed to work well.
> 
> It is widely believed to work well.  That's part of the point.

It's only believed to work well among mail system administrators who are
isolated from their users.   I've never met an end user who thought
DNSBLs worked well.  I've never met an MSA who worked directly with end
users who thought DNSBLs worked well.

>> We're arguing about whether to
>> standardize a facility that causes problems for everyone.
> 
> No more so than filtering in general.

Perhaps not, but that doesn't mean it's worthy of standardization.

>> Back when I started working with email, getting a message reliably
>> delivered was something of a black art, because you had to know how to
>> thread your message through the hodgepodge of Internet, uucp, BITNET,
>> DECnet, fidonet, and whatnot.  That was in some sense understandable
>> because Internet access was not widely available, and there wasn't
>> really any common network that everybody could tap into.
>>
>> Today, getting a message reliably delivered is once again a black art.
>> But today, it's not for lack of standards or network connectivity.  It's
>> because so many messages are filtered for dubious reasons, or on the
>> basis of what are essentially unsubstantiated rumors, or because of
>> over-reliance on IP source addresses as identifiers.
> 
> DNSBLs exist and their use is extremely widespread because the industry
> _does_ believe in them. 

The industry might believe in them.  But users don't believe in the
reliability of email, because their experience is that email is not
reliable.  And DNSBLs (and mail filtering in general) are the primary
reasons for that.

> We have to get our heads out of the sand and put it on a solid standards
> footing to finish the job of de-black-arting it.

The first step is to stop assuming a priori that DNS is a suitable
protocol for communicating reputation, and that IP addresses are
suitable as host identifiers.

Keith