Re: [dmarc-ietf] p=quarantine

Alessandro Vesely <vesely@tana.it> Tue, 22 December 2020 18:46 UTC

Original-From: Alessandro Vesely <vesely@tana.it>
To: dmarc@ietf.org
References: <20201211173722.6B4DF29782C7@ary.qy> <ea074aad-971b-abc6-d557-ea2f433b3cc7@gmail.com> <CAH48ZfxEjGHv99z3RGj+Z+KJaFVPvm6RG4UzkKuOoDQDVCmb3g@mail.gmail.com> <A5E108DC-2692-4927-B2C1-AE3FED6DA8AA@wordtothewise.com> <CAH48ZfwkPEgexwGvyMT_PevMM5ngBT_XRfHYi7Wy1yxMw1LP1A@mail.gmail.com> <A07FA3DE-4C51-48C4-A2E7-067987200E1F@wordtothewise.com> <CAH48ZfwykEJM9AXKrp+SS4SgM4N1W70eLqHW+PXB18a_TrV6iw@mail.gmail.com> <02f786e5-b7cd-9a89-e3e3-73594f3bcda0@mtcc.com> <CAHej_8nHfn4uT4oeFJN-pbd-u3vrv2WiSnmAH-2v35OBmSi1cg@mail.gmail.com> <e715de9a-5f24-8077-0038-14c664850bd1@mtcc.com> <CAHej_8=FoTmCg8goD-yC2nTPKzoMUTNjfJ4aeTC4j7vYJEf0sw@mail.gmail.com> <cb659179-a3f0-bec6-d6be-7d2bd665d78e@tana.it> <CAHej_8k1Z-=_0gRrbV32r1vnGKAcB287utuEL5kR=Xru_Uth2A@mail.gmail.com>
From: Alessandro Vesely <vesely@tana.it>
Message-ID: <a1cdfe8d-2b34-a4f9-aa7c-082c4efff517@tana.it>
Date: Tue, 22 Dec 2020 19:46:29 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0
MIME-Version: 1.0
In-Reply-To: <CAHej_8k1Z-=_0gRrbV32r1vnGKAcB287utuEL5kR=Xru_Uth2A@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/O9mN6obGwNBdrTyvX7XmR9lKfN4>
Subject: Re: [dmarc-ietf] p=quarantine

On Tue 22/Dec/2020 16:41:43 +0100 Todd Herr wrote:
> On Mon, Dec 21, 2020 at 12:47 PM Alessandro Vesely <vesely@tana.it> wrote:
>> On Sun 20/Dec/2020 18:10:03 +0100 Todd Herr wrote:
>>>
>>> Lists are a specific instance of the more general case of indirect mail
>>> flows.
>>
>> How many kinds of indirect mail flows do rewrite From:?
>>
>> Specific methods might prove more effective than general ones.
>
> Sender Rewriting Scheme (SRS) exists for the rewriting of the RFC5321.From
> address, and is sometimes used in mail that is forwarded by rule, say from
> @alum.institution.edu to @consumerMailboxProvider.com


VERP as done by MLMs is more specific (and better thought out) than generic
SRS.  I don't have numbers, but off the top of my head I'd reckon MLMs account
for a major share of RFC5321.MailFrom rewriters.  If you additionally select by
RFC5322.From rewriting, I'd guess you're left with exactly MLMs.
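
To illustrate why VERP is the more specific of the two, here is a rough sketch
of the encoding, assuming the "+user=domain" layout commonly seen in list
bounce addresses.  The function names and the example addresses are made up
for illustration; real MLMs may use a different delimiter or layout.

```python
def verp_encode(bounce_local: str, list_domain: str, recipient: str) -> str:
    """Build a per-recipient envelope sender for a list message."""
    user, _, domain = recipient.rpartition("@")
    # Fold the recipient into the local part, replacing "@" with "=".
    return f"{bounce_local}+{user}={domain}@{list_domain}"


def verp_decode(envelope_sender: str) -> str:
    """Recover the recipient address from a VERP envelope sender."""
    local, _, _ = envelope_sender.rpartition("@")
    # Everything after the first "+" is the encoded recipient.
    _, _, encoded = local.partition("+")
    user, _, domain = encoded.rpartition("=")
    return f"{user}@{domain}"


sender = verp_encode("dmarc-bounces", "ietf.org", "alice@example.com")
print(sender)               # dmarc-bounces+alice=example.com@ietf.org
print(verp_decode(sender))  # alice@example.com
```

The envelope sender stays in the list's own domain and a bounce maps back to
exactly one subscriber, which is what makes it recognizable as MLM traffic.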


> It seems to me that we can either work to find a way to ensure that
> [forwarding] failures don't happen, or we can work to find a reliable and 
> trustworthy way to record authentication results along the way so that the 
> failures can be mitigated and not result in failed delivery of wanted mail.

Actually, people seem to be doing both.  Avoiding gratuitous autoconversions,
anti-virus results appended as footers, and the like has become a must for most
modern MTA software.  And there are several mailing lists which operate that
way as well.  OTOH, modifications are sometimes unavoidable and we still need
to cope.


>>> Since the receiver typically can't perform the same checks under the
>>> same conditions that existed when the intermediary performed them (if it
>>> could, we wouldn't need something like ARC) then the receiver has to
>>> decide if the message is consistent with messages it's previously seen
>>> through direct mail flows using that same authenticated identity that
>>> was captured by the intermediary in the ARC header set.
>>
>> Doesn't that assume some kind of omniscience at the receiver's? 
>> Consistency with previous messages by the same source is not
>> straightforward.  Using the same selector?  Signing more or less the same
>> set of header fields? Choice of vocabulary?  HTML vs. plain text style
>> (e.g. line length)?  A.I.?
>
> Not omniscience, no, but certainly a method of tracking an authenticated
> identity's reputation, and consistency of reputation is what I'm speaking
> of. Allow me to try again to get across the idea that I'm so far failing to
> make clear.
> 
> I'm not currently working for a mailbox provider, but I have in the past,
> and so I will role play as one here.
> 
> As a mailbox provider, I have a system for authenticating the identity or
> identities associated with messages that come directly to me.
> 
> These authenticated identities can include some or all of:
> 
>     - The DKIM signing domain(s)
>     - The DKIM signing domain(s) and selector(s)
>     - The RFC5322.From domain (authenticated using DMARC)
>     - The RFC5321.From domain (SPF)
>     - The IP address of the client that passed the message to me
>     - The domain associated with the PTR record of that IP address
>     - Others as I deem useful


Except for PTR records, that's the data I collect to send out aggregate reports.
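
As a minimal sketch of what collecting that data can look like, the following
groups messages by the identifiers that a DMARC aggregate report row carries
and counts them.  The ReportKey layout, its field names and the sample values
are assumptions made for illustration, not the schema of any particular
implementation.

```python
from collections import Counter
from typing import NamedTuple, Tuple


class ReportKey(NamedTuple):
    source_ip: str                          # client that handed over the message
    header_from: str                        # RFC5322.From domain
    envelope_from: str                      # domain checked by SPF
    dkim: Tuple[Tuple[str, str, str], ...]  # (domain, selector, result) per signature
    spf_result: str


def aggregate(keys):
    """Count messages that share the same set of identifiers."""
    return Counter(keys)


direct = ReportKey("192.0.2.1", "example.org", "example.org",
                   (("example.org", "sel1", "pass"),), "pass")
forwarded = ReportKey("198.51.100.7", "example.org", "lists.example.net",
                      (("example.org", "sel1", "fail"),), "pass")

for key, count in aggregate([direct, direct, forwarded]).most_common():
    print(count, key.source_ip, key.header_from, key.dkim)
```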


> For each of these authenticated identities, I can and will track how my
> users/customers/mailbox holders engage with the mail they receive, thus
> over time establishing in my system a reputation to associate with each
> authenticated identity. If, for example, mail that is DKIM signed using
> selector S and domain D is mail that my users demonstrate through their
> actions (opening it, clicking on links in it, etc.) is wanted mail, then
> that authenticated identity (S._domainkey.D) will be associated with a good
> reputation (however I define "good") in my system. Lather, rinse, and
> repeat for other authenticated identities associated with the message, and
> allow that both good and bad reputations can be earned.


That's a delicate job.  For one thing, 20161025._domainkey.gmail.com is not the
kind of identifier you want to associate with users' liking, as it is shared
among so many messages with diverging characteristics.  For another, there are
domains that change selector very often (taugh.com changes it on every
message), so a selector can identify too narrow a message set.
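
The too-narrow case can be sketched as follows, assuming engagement is logged
as (selector, domain, engaged) tuples.  The event data, the domain
rotating.example and the helper names are hypothetical.

```python
from collections import defaultdict


def by_selector(selector, domain):
    """Reputation key at selector granularity."""
    return f"{selector}._domainkey.{domain}"


def by_domain(selector, domain):
    """Reputation key at signing-domain granularity."""
    return domain


def tally(events, key):
    """Return {key: (engaged, seen)} for the chosen granularity."""
    totals = defaultdict(lambda: [0, 0])
    for selector, domain, engaged in events:
        bucket = totals[key(selector, domain)]
        bucket[0] += int(engaged)
        bucket[1] += 1
    return {k: tuple(v) for k, v in totals.items()}


# A domain that signs every message with a fresh selector.
events = [
    ("k1", "rotating.example", True),
    ("k2", "rotating.example", True),
    ("k3", "rotating.example", False),
]

print(tally(events, by_selector))  # three keys, one message each
print(tally(events, by_domain))    # {'rotating.example': (2, 3)}
```

Keyed by selector, no identity ever accumulates a history.  The opposite
failure, a single key shared by millions of unrelated messages, is not fixed by
changing the key function.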

Characterizing by author (a.k.a. RFC5322.From) probably works better.  An
author authenticated by her submission server (a.k.a. author's domain) looks
like a good identifier.  You still have to sort out authors who have multiple
email addresses and use them interchangeably.

However, as forwarding of modified messages settles on From: rewriting,
recovering that identifier becomes fuzzy.  ARC does not cover that bit.  If we
want to reliably use the author as an identifier, we need those forwarders who
rewrite the From: header field (let's call 'em MLMs) to adopt a standard way to
convey the original value.  In my mlm-transform draft, I propose
Original-From:.  IETF uses X-Original-From:.  Mailman uses Reply-To: or Cc:.
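
Lacking a standard, a receiver is left with guesswork along these lines.  This
is only a sketch using Python's standard email module; the order in which the
fields are tried is my assumption, and Reply-To: and Cc: are inherently
ambiguous because they may just as well name someone other than the author.

```python
from email import message_from_string
from email.utils import parseaddr

# Fields that may carry the pre-rewrite author, most explicit first.
CANDIDATE_FIELDS = ("Original-From", "X-Original-From", "Reply-To", "Cc")


def original_author(raw_message: str):
    """Return (field_name, address) for the best guess at the author."""
    msg = message_from_string(raw_message)
    for name in CANDIDATE_FIELDS:
        value = msg.get(name)
        if value:
            _, addr = parseaddr(value)
            if addr:
                return name, addr
    # Nothing better found: fall back to the (possibly rewritten) From:.
    return "From", parseaddr(msg.get("From", ""))[1]


raw = (
    "From: Author via list <list@lists.example.net>\n"
    "X-Original-From: Author <author@example.org>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(original_author(raw))  # ('X-Original-From', 'author@example.org')
```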


> Now here comes a message that is DKIM-signed by D with selector S, and it
> fails DKIM validation when I do my checks. However, in scanning the
> message, I see that there is an ARC header set, one that was signed and
> sealed by A, and in that ARC header set is an ARC-Authentication-Results
> header that says that the message passed DKIM validation with signing
> domain D and selector S when A did its check.
> 
> My conundrum here is "Do I trust A's claim that the message was correctly
> DKIM signed by domain D with selector S?"
> 
> The answer to that question will depend on how my users treat the message
> and others like it, assuming that I accept it and deliver it. If my users
> treat such messages in a manner that's consistent with how they treat
> direct mail flow that is DKIM-signed by D with selector S, then A's
> reputation as an ARC-sealer/signer will be positive, because A will
> establish with me a history of being a source of ARC-sealed/signed mail
> with ARC header sets that can be believed. On the other hand, if my users
> consistently treat such messages differently than they do direct mail flow
> that is DKIM-signed by D with selector S, then A's reputation as an
> ARC-sealer/signer will be negatively impacted with me, because I will not
> have evidence in hand that this is a path for mail with an authenticated
> identity of S._domainkey.D to take.


Here you're pairing a method which is fuzzy in nature with one, digital
signatures, designed to be strictly binary.  You need a very large user base to
achieve meaningful results.  Most medium and small mailbox providers, from
company servers to family MXes, will not even want to spy on users to determine
their liking.
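
To put a rough number on "very large", here is a sketch that compares the
engagement rate of direct mail with that of mail arriving via an ARC sealer,
using 95% Wilson score intervals.  The 60% and 40% rates and the sample sizes
are invented; the point is only how the interval width depends on volume.

```python
import math


def wilson_interval(engaged: int, seen: int, z: float = 1.96):
    """Wilson score interval for an engagement rate (z=1.96 for ~95%)."""
    if seen == 0:
        return (0.0, 1.0)
    p = engaged / seen
    denom = 1 + z * z / seen
    centre = (p + z * z / (2 * seen)) / denom
    half = z * math.sqrt(p * (1 - p) / seen + z * z / (4 * seen * seen)) / denom
    return (centre - half, centre + half)


def overlap(a, b):
    """True if two intervals cannot be told apart."""
    return a[0] <= b[1] and b[0] <= a[1]


for n in (20, 2000):
    direct = wilson_interval(int(0.6 * n), n)      # direct flow, 60% engaged
    via_sealer = wilson_interval(int(0.4 * n), n)  # via sealer A, 40% engaged
    print(n, overlap(direct, via_sealer))
```

With 20 messages per flow the two intervals overlap, so nothing can be
concluded about A; with 2000 they separate.  A family MX will rarely see 2000
messages from one signer through one forwarder.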


> The point here is that ARC (or any system designed to capture intermediate
> authentication results) can only succeed if the downstream recipients of
> the message trust the information that the intermediary host(s) record in
> the message. What I'm describing above is one way to determine if that
> information can be trusted, one that I'm trying to design to scale beyond
> "Let me just whitelist these X hosts as reliable ARC sealer/signers."


Overall, your plan confirms that ARC better suits global mailbox providers.


Best
Ale
--