Re: [dmarc-ietf] Prove me wrong!

Bron Gondwana <brong@fastmailteam.com> Wed, 16 August 2017 01:35 UTC

Message-Id: <1502847341.2162341.1074690552.30E822B4@webmail.messagingengine.com>
From: Bron Gondwana <brong@fastmailteam.com>
To: Brandon Long <blong@fiction.net>
Cc: dmarc@ietf.org
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: multipart/alternative; boundary="_----------=_150284734121623412"
In-Reply-To: <CABa8R6vYDQaj64ahhivONp4DVv0-zrv8D8ZFZOAT71xC0q+FfA@mail.gmail.com>
Date: Wed, 16 Aug 2017 11:35:41 +1000
References: <1502762779.1086232.1073552432.22DE7E98@webmail.messagingengine.com> <CABa8R6vYDQaj64ahhivONp4DVv0-zrv8D8ZFZOAT71xC0q+FfA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/yb-tFAARpnvtxtYP5CP-ppYTZak>
Subject: Re: [dmarc-ietf] Prove me wrong!
Precedence: list

On Wed, 16 Aug 2017, at 09:55, Brandon Long wrote:
> Since AMS i=1 doesn't pass, the information included in Set 2 only
> says that site3 claims that site2 said that spf passed, whereas in
> Set 1, the AS allows you to verify that site2 actually claimed that
> spf passed.
Yes, but that doesn't mean anything about this message, only that SPF
passed for _a_ message.
> Now, since the i=1 AMS doesn't pass, it is true that the i=1 headers
> in both cases could have been either made out of whole cloth (in Set
> 2) or copied from an existing message (in Set 1).
>
> Your formulation also means that who asserted anything in i=1 is now
> missing.  You could include information in the i=1 aar on who made the
> assertion (srv-id?) or not strip the broken AMS for i=1.
>
> Looking at my whitelist based local policy override code, it
> doesn't care about the seal, the seal is only used to verify the
> intact arc chain.
>
> So, assuming some changes to what you're saying, your handling of a
> single message is not different based on the two versions above.
>
> That is not the only utility of the arc chain, however.
> 
> AS allows me to verify that what was asserted was asserted by the
> actual service, but not that that assertion applies to this message.
> The fact that it applies to this message is based on trusting the
> services which handled receipt, yes.  But your version allows a
> malicious actor to invent the path the message went through.  With AS,
> they have to copy an existing chain, without it they can just write
> whatever they want.
Yes, everything BEFORE the last bad actor is entirely untrustworthy in
set 2, it can be made out of whole cloth.
In the set 1 example, you can tell that _a_ message went through that
particular set of servers in that particular order, and they verified
particular facts about the message (SPF, DKIM, etc).  In my example the
bad actor can fake more things about the path beforehand.
So the question becomes - is there any value in knowing that mail flows
along a particular path between services and that a theoretical bad
actor can read that mail (or at least the headers from it)?
> This distinction becomes more important when using the information as
> training data for learning which paths to trust and which not to
> trust.  The AS certainly contains more information there, but perhaps
> that more information is only useful for the largest actors, and then
> maybe only as some small percentage of decreased false positives or
> the ability to allow trust further down the long tail.  Without
> sufficient data and implementation, it's hard to judge whether the
> utility of this extra information is useful.
I will certainly agree that if you process a large fraction of the
world's mail flow, you can ascertain things about a mail flow having
been present (and if you have an index of every ARC-Seal that's flowing
in the world, you can detect reuse!)
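
As a rough illustration of what "an index of every ARC-Seal" would mean in practice, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `SealIndex` class, the idea of keying on the seal's signature value, and the use of a message digest to tell "same message seen again" from "same seal on a different message" are hypothetical, not part of ARC.

```python
import hashlib


class SealIndex:
    """Toy index of ARC-Seal signature values (hypothetical, not part of ARC)."""

    def __init__(self):
        # Maps a hash of the seal signature to the digest of the first
        # message that was observed carrying it.
        self._first_seen = {}

    def observe(self, seal_signature, message_digest):
        """Record a seal; return True if it was already seen on a different message."""
        key = hashlib.sha256(seal_signature.encode()).hexdigest()
        first = self._first_seen.setdefault(key, message_digest)
        return first != message_digest


index = SealIndex()
index.observe("b=abc123", "digest-of-original")      # first sighting
reused = index.observe("b=abc123", "digest-of-spam")  # same seal, other message
```

The sketch only works for whoever actually sees both messages, which is the point being made here: the check needs a view of a large share of the world's mail flow.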
I doubt even Google has quite that much ability to correlate the world's
mail flow, particularly the bits that never go to a Google server.
Maybe there are security agencies with this level of capability, but
they're not going to want to tip their hand by helping us identify spam.
Regardless, what you're getting here is a cryptographically verifiable
log that an email (which may have never been intended for you in the
first place) was passed between a set of services and then accessed by a
bad actor.  You can only tell that the bad actor listened at a
particular spot OR LATER, because ARC doesn't protect against
truncating.  You can read the message at i=8 and truncate back to just
the i=1 and i=2 headers then restart the chain yourself.
There do exist cryptographic methods for making a non-truncatable chain
- but that would require everyone to rewrite older headers at each step
- and is not the design we have.
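
To make the truncate-and-restart point concrete, here is a toy Python model. It is a sketch under loud assumptions: HMAC with made-up per-domain secrets stands in for the DKIM-style public-key signatures that ARC really uses (so the verifier here holds the same keys as the signers), the `seal` and `seals_verify` helpers and the `KEYS` table are hypothetical, and site4.com is just an illustrative extra hop. As in ARC, each seal covers the earlier ARC sets and not the message itself.

```python
import hashlib
import hmac

# Hypothetical per-domain secrets; real ARC uses public-key signatures.
KEYS = {d: f"secret-{d}".encode()
        for d in ("site1.com", "site2.com", "site3.com", "site4.com")}


def _covered(earlier_sets, i, domain, claims):
    # A seal covers every earlier seal plus this set's own fields,
    # but nothing about the message body.
    return "".join(s["seal"] for s in earlier_sets) + f"{i}|{domain}|{claims}"


def seal(chain, domain, claims):
    """Return a new chain with one more ARC set appended by `domain`."""
    i = len(chain) + 1
    data = _covered(chain, i, domain, claims).encode()
    sig = hmac.new(KEYS[domain], data, hashlib.sha256).hexdigest()
    return chain + [{"i": i, "domain": domain, "claims": claims, "seal": sig}]


def seals_verify(chain):
    """Check instance numbers and every seal in the chain."""
    for n, s in enumerate(chain):
        data = _covered(chain[:n], s["i"], s["domain"], s["claims"]).encode()
        expected = hmac.new(KEYS[s["domain"]], data, hashlib.sha256).hexdigest()
        if s["i"] != n + 1 or not hmac.compare_digest(expected, s["seal"]):
            return False
    return True


# An honest message passes through site1, site2 and site4.
honest = seal(seal(seal([], "site1.com", "spf=pass"),
                   "site2.com", "arc=pass"), "site4.com", "arc=pass")

# The bad actor reads it, keeps only i=1 and i=2, and restarts the chain.
replayed = seal(honest[:2], "site3.com", "arc=pass")
print(seals_verify(replayed))  # True

# Inventing a new claim for site1 fails, because site1's seal no longer matches.
forged = [dict(honest[0], claims="dkim=pass")] + honest[1:]
print(seals_verify(forged))  # False
```

The truncated chain verifies because nothing in the i=1 and i=2 seals depends on later sets or on the message body, while the forged claim fails because the bad actor cannot re-sign as site1.com.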
If I was a bad actor, I'd probably subscribe to mailing lists or read
public archives and slurp up ARC headers from there, then decide whether
to truncate back or not.  I guess a large enough actor (e.g. Google)
could ALSO slurp public mailing lists and blacklist the ARC headers, but
that makes it a big challenge for everybody else!
 I don't know if I like the idea of "Google can create a huge
 database of cryptographic proof that mail flowed from site X to site
 Y even if the original message wasn't intended for Google" as a spam
 prevention measure!
In summary - tell me if I'm wrong, but what I'm getting as the
difference is:
Set 1:
* the receiver (site5.com) can verify that an email (not necessarily
  this one) was sent from site1.com to site2.com.
Set 2:
* if site3.com is a bad actor, site5.com does not know that any email
  ever flowed between site1.com and site2.com, that "fact" could be
  faked as well.
Is that it?  That an email was sent from site1.com to site2.com?  I
don't think you can tell from ARC who the sender or recipient was on
that email that went from site1.com to site2.com, so you know nothing
other than that "an email existed that traveled this path".
All you know from ARC-Seals is that there is AN EMAIL that started its
life by following that path, and somebody could read it and inject the
headers from it to an email generated at site3.com.
Since site3.com is "bad", it can falsify all the mail flow after
site2.com added its ARC headers, and it can falsify it on any email that
followed a different path out of site2 originally.  So we don't even
know that site2.com sends any email on to site3.com.  Just that if
site2.com isn't bad, then it receives email from site1.com.
Bron.

--
  Bron Gondwana, CEO, FastMail Pty Ltd
  brong@fastmailteam.com
