Re: [Asrg] What are the IPs that sends mail for a domain?

Douglas Otis <dotis@mail-abuse.org> Thu, 18 June 2009 19:00 UTC


On Jun 17, 2009, at 5:51 PM, der Mouse wrote:

>> Factors that make managing an IPv6 block-listing service fairly  
>> impractical go beyond 96 additional address bits.
>>
>> 1) Publishing behavior based address lists require evidence  
>> collection in the event of disputes, and this does not scale well.  
>> (expensive)
>
> I can't see why v6 versus v4 makes any difference at all to this.

You have concluded that resolving an IPv6 address reputation can
safely ignore portions of the address.  Assuming the lower bits don't
matter (64 bits of interface or 16 bits of site-level identifiers,
although in some cases they will), that still leaves 53 or 37 bits of
unicast address space in which to contend.

This also represents a space likely to evolve as routes become  
consolidated.  In an effort to constrain zone growth and to thwart  
address hopping, IPv4 addresses may have been consolidated into CIDRs  
that don't cross route announcements when the majority of addresses  
abuse email.  This might then result in a CIDR of /30 that contains a  
single IP address where evidence had not been collected.  Invariably,  
the user of that one IP address will complain and request removal.   
Imagine who might be affected when CIDR consolidation goes from groups
of 256 or 4096 to groups of 18x10^18 or 1.2x10^24 based upon the
evidence of a single IP address?
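
As a rough check on those group sizes, here is a minimal sketch.  It
assumes the IPv4 groups of 256 and 4096 correspond to /24 and /20
listings, and the IPv6 groups to /64 and /48 listings; that mapping is
inferred from the numbers and is not stated above.  The helper name
addresses_in_prefix is made up for illustration.

```python
def addresses_in_prefix(prefix_len: int, total_bits: int) -> int:
    """Number of addresses covered by one listing of the given prefix length."""
    return 2 ** (total_bits - prefix_len)

# Assumed mapping of the group sizes to prefix lengths (not stated in the text).
ipv4_slash24 = addresses_in_prefix(24, 32)    # group of 256
ipv4_slash20 = addresses_in_prefix(20, 32)    # group of 4096
ipv6_slash64 = addresses_in_prefix(64, 128)   # roughly 18x10^18
ipv6_slash48 = addresses_in_prefix(48, 128)   # roughly 1.2x10^24

for name, count in [
    ("IPv4 /24", ipv4_slash24),
    ("IPv4 /20", ipv4_slash20),
    ("IPv6 /64", ipv6_slash64),
    ("IPv6 /48", ipv6_slash48),
]:
    print(f"{name}: {count:.3e} addresses per listing")
```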

In addition, the collection of IP address related evidence must be  
retained and reviewed.  IPv6's increased range ensures a greater  
workload for review and a much greater need for storage.  This  
increase will entail a sizable expenditure.

>> 2) IPv6 to IPv4 and IPv6 to IPv6 NATs obfuscates who might be  
>> involved.  (problematic)
>
> I can't see why this is any worse than the v4-to-v4 NATs the net is  
> already full of.

Carrier-grade NATs are not as common today.  Excluding these
sources has already become problematic in a few countries, and will
only get worse.

>> 3) Reverse DNS scanning does not scale well. (slow)
>
> True.  DNSBLs that depend on rDNS scanning may die.  There are  
> plenty of DNSBLs, including some of the most useful, that do not.

A reactive system that lists addresses as abused and then automatically
expires them gives bad-actors two advantages:

a) bad-actors can avoid effective blocking by rapidly moving both  
source and target.

b) professional bad-actors can quickly identify the location of spam  
traps.

>> 4) Diverse and rapidly expanding address space allows bad-actor's  
>> activity to stay ahead of the massive amounts of IP address related  
>> information publishing.  (futile)
>
> I see no real difference here between a v4 list that lists at the / 
> 32 level and a v6 list that lists at the /48 (or maybe even /64)  
> level.

It already takes minutes to build zones and distribute information.
Making zones larger increases processing and transfer times, which
already erode effectiveness.

>> 5) An extremely low cost for IP addresses allows bad actors to  
>> persist at sporadic use for many years.  (futile)
>
> And this differs from v4...how?

It is fairly common to see abusive activity cycle through a range of  
addresses for one day every few weeks.  With IPv6, the addresses being  
abused could then cycle through a range for 1 minute every decade.   
What policy deals effectively with that strategy?
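
To put a number on that, here is a back-of-the-envelope sketch.  It
assumes one reading of the scenario: the sender is active continuously,
uses each address for one minute, and does not return to it for a
decade.  The 365-day year and the variable names are simplifications
chosen for illustration.

```python
# Assumption: one address per minute, each address reused once per decade.
MINUTES_PER_DAY = 24 * 60
DAYS_PER_DECADE = 10 * 365          # leap days ignored for simplicity

minutes_per_decade = MINUTES_PER_DAY * DAYS_PER_DECADE
addresses_needed = minutes_per_decade   # one fresh address per minute

# Host bits needed to hold that many addresses, and the matching IPv6 prefix.
bits_needed = (addresses_needed - 1).bit_length()
prefix_len = 128 - bits_needed

# Fraction of a single /64 that such a rotation would consume.
fraction_of_slash64 = addresses_needed / 2 ** 64

print(f"addresses needed: {addresses_needed}")
print(f"fits in a /{prefix_len} ({bits_needed} host bits)")
print(f"fraction of one /64: {fraction_of_slash64:.2e}")
```

Under those assumptions the rotation fits inside a tiny fraction of a
single /64, which is why listing individual addresses cannot keep up.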

>> The collection of evidence is often constrained by the related  
>> identifier, such as the IP address.  Unfortunately, IPv6 allows a  
>> new IP address to be used for each message sent.
>
> So, collect evidence at the /64, or even /48, level, rather than at  
> the individual address level.
>
> Even to the extent that these problems are real, they are  
> theoretical. It certainly behooves us to think about them ahead of  
> time, but absent experience demonstrating that they are more than  
> potential, I don't see them as a reason to give up on v6 DNSBLs  
> without even trying.

It seems insane to repeatedly do the same thing and then expect
different results each time.  IPv4 is already approaching the point
where the majority of all addresses are blocked.  It will not be long
before this transitions into listing only IP addresses registered as
outbound mail servers.  The registration process might be as simple as
having both an MX and CSV record published.

>> Pushing responsibility to the edge does not work, and email  
>> provides ample evidence.
>
> It's not that doing that has been tried and found wanting; rather,  
> it has not been tried.

Have you heard of SPF?

SPF represents a strategy to shift MTA accountability onto the domains
of their customers (the proverbial edge).  This scheme may entail a
maximum of 10 or 11 SPF record transactions, which may then include 10
transactions per contained MX target.  For each legitimate message,
there are typically at least 20 abusive messages (which also publish
SPF records).  So the potential of 10 to 200 additional SPF
transactions (when Sender-ID is included) should be multiplied by 20,
where the 200 transactions become 4000 per legitimate message.
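
The arithmetic can be sketched as follows.  The factor of 2 for
Sender-ID is an assumption about how the upper bound of 200 arises (a
second evaluation of the same records); the other figures are taken
from the paragraph above.  The constant names are illustrative only.

```python
# Figures from the paragraph above.
SPF_TRANSACTIONS = 10            # SPF record transactions per evaluation
PER_MX_TARGET = 10               # further transactions per contained MX target
ABUSIVE_PER_LEGITIMATE = 20      # abusive messages per legitimate message

# Assumption: Sender-ID doubles the work by evaluating the records again.
SENDER_ID_FACTOR = 2

low_per_message = SPF_TRANSACTIONS
high_per_message = SPF_TRANSACTIONS * PER_MX_TARGET * SENDER_ID_FACTOR

low_per_legitimate = low_per_message * ABUSIVE_PER_LEGITIMATE
high_per_legitimate = high_per_message * ABUSIVE_PER_LEGITIMATE

print(f"per message: {low_per_message} to {high_per_message} transactions")
print(f"per legitimate message: {low_per_legitimate} to {high_per_legitimate}")
```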

The ratio of legitimate to abusive is getting worse.  At what point
does one admit that SPF, when used as designed, does not scale, that
it imperils the integrity of DNS, and that it imperils the viability
of SMTP?  If nothing else, SPF/Sender-ID has been responsible for
delaying more scalable and safer solutions.  Just as you can't ask  
lenders using securitization to dodge lending accountability how they  
might be held accountable, you can't ask a group of large providers  
this question either and expect a reasonable answer.  The system needs  
to hold providers directly accountable.  That is what CSV attempted,  
where publishing too many such records or using too many EHLOs should  
also be considered abusive.

> (Actually, it has been tried in a limited way; there are pieces of  
> the net that _do_ push responsibility to the end user.  Oddly  
> enough, they are basically nonexistent as far as abuse emitters go;  
> what evidence I see indicates that it _does_ work.)

Can you provide some specifics?

-Doug