Re: [Asrg] Adding a spam button to MUAs

Douglas Otis <> Thu, 17 December 2009 19:59 UTC

On 12/17/09 10:34 AM, Nathaniel Borenstein wrote:
> There's a difference between having an "A or B" bit, and having
> separate "A" and "B" bits that may be muddled.  If you don't know the
> right way to interpret the two bits, then the single bit is
> potentially much less misleading.
> But as I said, this is something that could really be resolved by
> empirical study, and I'd prefer not to trust my intuition or anyone
> else's if we can figure out the facts about how people would actually
> use two buttons.  If the error rate is low enough, then two buttons
> will make sense, but I've rarely managed to underestimate the
> competence of the average user.   Two buttons is way more than twice
> as confusing as one, in this case.  -- Nathaniel
> On Dec 17, 2009, at 1:25 PM, Seth wrote:
>> Nathaniel Borenstein<>  wrote:
>>> On Dec 17, 2009, at 11:27 AM, Ian Eiloart wrote:
>>>> Twitter seems to think that users are smart enough to
>>>> distinguish between "unwanted" and "spam". They give you a
>>>> button for each. It's an important distinction that most people
>>>> can make.
>>> Twitter isn't always right, and my intuition differs from yours
>>> on this one.  Fortunately it's something that could be resolved
>>> empirically.  I'd like to see such a study, because it wouldn't
>>> take very many users who *can't* properly make that distinction
>>> to render the two-button solution counterproductive.  I'd rather
>>> have one bit of meaningful data than two bits of muddled data.
>>> -- Nathaniel
>> One button is the "OR" of the two buttons, so there's no less
>> information available.  Given enough data, it should be easy to
>> get pretty accurate statistics on how reliable _each_ user is, and
>> the unreliable ones can be mapped into the one-button treatment.

Agreed. After all, can users recognize an auto-response in Chinese, or
know the difference between real and spoofed DSNs?  Of course not.
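
The per-user reliability idea quoted above could be sketched roughly as
follows. This is only an illustration under assumptions that are not in
the thread: it presumes some reference label is available for a sample
of messages (for example from later review), and the names
`user_reliability`, `effective_label`, and the 0.9 threshold are all
hypothetical.

```python
from collections import defaultdict


def user_reliability(reports):
    """Return {user: fraction of reports matching the reference label}.

    `reports` is an iterable of (user, user_label, reference_label)
    tuples. The reference label is an assumed ground truth; the thread
    does not say where it would come from.
    """
    agree = defaultdict(int)
    total = defaultdict(int)
    for user, user_label, reference_label in reports:
        total[user] += 1
        if user_label == reference_label:
            agree[user] += 1
    return {user: agree[user] / total[user] for user in total}


def effective_label(user, label, reliability, threshold=0.9):
    """Keep the two-button label for reliable users only.

    Users below the (hypothetical) threshold, or with no history, are
    mapped into the one-button treatment: a single "flagged" bit.
    """
    if reliability.get(user, 0.0) >= threshold:
        return label
    return "flagged"
```

With this sketch, a user whose "spam" and "unwanted" clicks usually
match the reference keeps both labels, while everyone else contributes
only the combined bit.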

All that can be determined from user feedback metrics is that 'X' number
of messages from some source were "unwanted".  Filters can be trained to
remove "unwanted" messages for specific users based upon their feedback.
However, the usefulness of this information is limited without knowing
overall source volumes, and whether the messages might have been
foreign-language auto-responses or valid DSNs.

Treating "unwanted" feedback as equivalent to "spam-trap" feedback will
create a number of complaints, especially when dealing with a diverse
range of users.  Spam-trap email addresses do not solicit email and are
good at excluding auto-responses and valid DSNs.  This allows spam-trap
feedback to be applied over a larger range of users with less regard to
overall source volume.  Unfortunately, since no provider is able to
ensure that none of its customers are 0wned, even "spam-trap" feedback
can benefit when overall volume is shared.

Our process consolidates metrics for all activity seen from each
specific source and evaluates related metrics in a non-linear fashion.
Adding a "source activity rate" field to ARF feedback, as a way to
share the number of message events (good or bad) seen over some number
of seconds, would not entail a sizable increase in report size.

Volume metrics added to an ARF report might look something like:

IP-Activity: [source-IP], [#messages], [#seconds];
DKIM-Activity: [d=], [#messages], [#seconds];
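
A minimal sketch of how such lines might be written and read back is
given here. It assumes the square brackets in the layout above are
placeholders rather than literal characters, and that the counts are
plain integers. The field names come from the proposal above and are
not part of any published ARF specification; the function names and
the derived `rate` value are hypothetical.

```python
import ipaddress

KINDS = ("IP-Activity", "DKIM-Activity")


def format_activity(kind, source, messages, seconds):
    """Render one activity line: '<kind>: <source>, <messages>, <seconds>;'."""
    if kind not in KINDS:
        raise ValueError(f"unknown field name: {kind}")
    if kind == "IP-Activity":
        # Raises ValueError if source is not a valid IPv4/IPv6 address.
        ipaddress.ip_address(source)
    if messages < 0 or seconds <= 0:
        raise ValueError("need messages >= 0 and seconds > 0")
    return f"{kind}: {source}, {messages}, {seconds};"


def parse_activity(line):
    """Parse one activity line into a dict, adding messages-per-second."""
    # Split on the first colon only, so IPv6 sources stay intact.
    kind, sep, rest = line.partition(":")
    kind = kind.strip()
    if not sep or kind not in KINDS:
        raise ValueError(f"not an activity field: {line!r}")
    parts = [p.strip() for p in rest.strip().rstrip(";").split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three values: {line!r}")
    source, messages, seconds = parts[0], int(parts[1]), int(parts[2])
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return {
        "kind": kind,
        "source": source,
        "messages": messages,
        "seconds": seconds,
        "rate": messages / seconds,  # message events per second
    }
```

For instance, `format_activity("IP-Activity", "192.0.2.1", 1200, 3600)`
uses a documentation-range address, not a real source, and
`parse_activity` recovers the three values from the resulting line.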

Such as

When metrics are passed through a feedback conduit, each feedback source
should be identified as providing either spam-trap feedback or user
feedback, so that feedback metrics from multiple sources can be
consolidated into at least two separate categories.
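
The consolidation described above might be sketched as follows. This is
an illustration only, not the process referred to earlier in this
message: the tuple layout, the names `reporter` and `sender`, and the
function `consolidate` are assumptions made for the example.

```python
from collections import defaultdict

CATEGORIES = ("spam-trap", "user")


def consolidate(reports):
    """Sum complaint counts per mail sender, kept separate by category.

    `reports` is an iterable of (reporter, category, sender, count)
    tuples, where `reporter` is the feedback source, `category` says
    which kind of feedback it provides, and `sender` is the mail source
    (an IP address or DKIM d= domain) the complaints are about.
    """
    totals = defaultdict(lambda: {c: 0 for c in CATEGORIES})
    for reporter, category, sender, count in reports:
        if category not in CATEGORIES:
            raise ValueError(f"{reporter}: unqualified category {category!r}")
        totals[sender][category] += count
    return {sender: dict(counts) for sender, counts in totals.items()}
```

Keeping the two totals apart lets spam-trap counts be weighed
differently from user "unwanted" counts for the same sender.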