Re: [dmarc-ietf] Composition Kills: A Case Study of Email Sender Authentication

Dave Crocker <dhc@dcrocker.net> Wed, 22 April 2020 20:28 UTC

From: Dave Crocker <dhc@dcrocker.net>
Reply-To: dcrocker@bbiw.net
To: dmarc@ietf.org
References: <2656238.kvSPeydUtl@sk-desktop>
Organization: Brandenburg InternetWorking
Message-ID: <51be5654-94c4-38c6-8f6b-dca403d6680a@dcrocker.net>
Date: Wed, 22 Apr 2020 13:28:44 -0700
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0
MIME-Version: 1.0
In-Reply-To: <2656238.kvSPeydUtl@sk-desktop>
Content-Type: multipart/alternative; boundary="------------0DF52F749A3E4F34FCD241FF"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/c8Xg2k07Bex7-y8lE-TwMdsAaZ0>
Subject: Re: [dmarc-ietf] Composition Kills: A Case Study of Email Sender Authentication

The paper concerns valid authentication and effective use of 
authenticated information.  That's an excellent goal.  The paper offers 
extensive technical analyses and empirical demonstrations of operational 
failures. Exercises like these are valuable and too infrequent.  Besides 
prompting specific repairs, work like this should motivate careful 
industry review of systems-level issues and standards-based documentation.

The paper does not adequately distinguish between claims of protocol 
design/specification errors, versus implementation errors, yet that 
distinction is essential.

The paper has a simplistic model of email anti-abuse processing and this 
distorts some of its analysis.  At the least, the paper needs to 
distinguish between theory and practice.

The paper has a flawed model of the role of recipient users in 
anti-abuse work and therefore the paper makes a basic error in its 
assessment of the importance of From: field validation.

The paper identifies a number of significant implementation deficiencies 
in various systems.


Detailed comments:

> to authenticate the sender’s purported identity, as the basis for 
> displaying in email clients assurances of validity to users when they 
> read messages

That's a rather basic misunderstanding of how these mechanisms are 
actually used.


> In practice, attackers usually exploit SMTP by running their own email 
> servers or clients.

This is largely a non-sequitur.  It has built-in assumptions that aren't 
explained and are wrong with respect to any of the domain name-based 
protection mechanisms.  (And it isn't SMTP that is being exploited, per 
se. That's like saying that if I send a postal letter purporting to be 
the president of the United States, I've exploited the postal system. )


> SMTP’s design includes multiple “identities” when handling messages.

Since this has been documented, they should cite it, for background.


> Both the MAIL FROM and From headers identify the email sender, but they 
> have different meanings in an SMTP conversation. The first represents 
> the user who transmitted the message, and is usually not displayed to 
> the recipient.

It would be clearer and more helpful to say 'operator' or 'service 
provider' rather than 'user'.  A separate issue is whether their 
characterization of the differences between the two identifiers is valid 
in practice... The paper uses 'sender' in a way that encourages confusion 
in the reader.  Since this is a technical paper, it should use language 
that carefully distinguishes roles, especially since terminology for 
making these distinctions is readily available.


> In addition, SMTP introduces multiple other sender identities, such 
> as the HELO command, Sender and Resent-From headers.

I suspect that the failure to cite DKIM's d= field until later is a 
significant oversight.


> http://www.ietf.org/internet-drafts/draft-blank-ietf-bimi-00.txt

heh.  the URL isn't valid.  Should be:

https://www.ietf.org/archive/id/draft-blank-ietf-bimi-00.txt
<https://datatracker.ietf.org/doc/draft-blank-ietf-bimi/>


> DKIM. DomainKeys Identified Mail (DKIM) uses cryptography to 
> authenticate senders...

I'm being too picky, right?  Formal DKIM semantics don't produce exactly 
that result.


> The general idea behind DKIM is to let senders sign parts of messages 
> so that receivers can validate them.

This is poorly written, wrong, or both.  'them' is ambiguous.  And the 
signing is to affix the d= value, not to validate any of the data that 
is part of the signature.


> it only need to have the same registered domain

Will typical readers understand what this means, in the context of this 
paper, since all of the domain name's components are 'registered'? It's 
not obvious what language would make the meaning clearer.
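
As an illustration of what "same registered domain" is usually taken to 
mean, here is a minimal sketch in Python.  It assumes the intended notion 
is the organizational domain, that is, a public suffix plus one more 
label.  The suffix set below is a toy stand-in for the real Public Suffix 
List (which also has wildcard and exception rules), and the function 
names are hypothetical.

```python
# Toy stand-in for the Public Suffix List; an assumption for illustration.
TOY_SUFFIXES = {"com", "co.uk"}


def organizational_domain(name, suffixes=TOY_SUFFIXES):
    """Return the public suffix plus one label, e.g. example.co.uk."""
    labels = name.lower().rstrip(".").split(".")
    # Scanning from the left finds the longest matching suffix first.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            return ".".join(labels[max(i - 1, 0):])
    # No known suffix: fall back to the name itself.
    return name.lower().rstrip(".")


def relaxed_aligned(domain_a, domain_b):
    """True when both names share the same organizational domain."""
    return organizational_domain(domain_a) == organizational_domain(domain_b)


print(organizational_domain("mail.example.co.uk"))
print(relaxed_aligned("news.example.com", "example.com"))
```

Under this reading, news.example.com and example.com are "the same 
registered domain", while example.com and example.co.uk are not.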


> If the email passes the DMARC verification, it enters the user’s inbox.

This is simply wrong.  It needs to place this phase of processing inside 
a complex filtering engine, which is the primary venue for using any of 
the mechanisms discussed in the paper.

> usually the MUA only displays the From header as the message sender.

These days, this isn't true either.  Most users only see the 
un-validated Display-Name.
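
A minimal sketch of the distinction, using Python's standard 
email.utils.parseaddr.  The header value is invented for illustration; 
the point is only that the display name is free text chosen by whoever 
composed the message, while domain-based mechanisms look at the address.

```python
from email.utils import parseaddr

# Hypothetical From: field value, made up for this example.
from_field = '"Example Bank Security" <attacker@unrelated.example>'

display_name, address = parseaddr(from_field)
domain = address.rpartition("@")[2]

# An MUA that shows only display_name gives the user nothing that any
# domain-based authentication mechanism has evaluated.
print(display_name)
print(domain)
```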


> Thus, the From header provides the key identity relevant for gaining the 
> user’s trust

On its face, this seems a reasonable view.  In practice, it isn't.  Mail 
with obviously bogus email addresses in the rfc5322.From field is still 
effective for phishing, because what real users actually pay attention 
to is the body of the message and the identification information in it.  
That's why assertions about the display of validated source/author 
information to the end user are demonstrably wrong.


> Malicious users of legitimate email providers exploit the failure of 
> some email providers to perform sufficient validation of emails 
> received from local MUAs. These attackers can send emails with 
> spoofed From headers. The exploited email providers may automatically 
> attach DKIM signatures to their outgoing emails, enabling the attackers 
> to impersonate other users of the email provider

The writing of the paragraph's conclusion seems to be fundamentally 
misleading or wrong.  It seems to be saying that the DKIM signature will 
be for the domain in the From: field, which it won't.
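
A minimal sketch of that point in Python: the d= tag in a DKIM-Signature 
names the signer and is set independently of the rfc5322.From field.  The 
message, domains, and the dkim_tags helper are all invented for 
illustration, and no cryptographic verification is attempted here.

```python
from email import message_from_string
from email.utils import parseaddr


def dkim_tags(value):
    """Split a DKIM-Signature header value into a tag -> value dict."""
    tags = {}
    for part in value.split(";"):
        name, sep, val = part.strip().partition("=")
        if sep:
            # Remove folding whitespace inside the tag value.
            tags[name.strip()] = "".join(val.split())
    return tags


# Hypothetical message: signed by the provider, From: names someone else.
RAW = (
    "DKIM-Signature: v=1; a=rsa-sha256; d=provider.example; s=sel1;\n"
    " h=from:to:subject; bh=AAAA; b=BBBB\n"
    "From: Alice <alice@victim.example>\n"
    "To: bob@receiver.example\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

msg = message_from_string(RAW)
signing_domain = dkim_tags(msg["DKIM-Signature"])["d"].lower()
from_domain = parseaddr(msg["From"])[1].rpartition("@")[2].lower()

# A valid signature would vouch for signing_domain only; whether it says
# anything about from_domain is a separate alignment question.
print(signing_domain, from_domain, signing_domain == from_domain)
```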


> Security requirement. To achieve this goal,

I'm not sure exactly what goal they are referring to.


> (1) The From header of the email that S sends matches the authenticated 
> username (other users of S cannot spoof Alice’s address);


This requirement applies only in the presence of DMARC.


> (3) SPF/DKIM and DMARC components in R consistently authenticate the same 
> identifier


Except that SPF and DKIM are permitted to validate other identifiers, 
not just the one in the rfc5322.From field.


> This requirement, although intuitive, implies a set of semantic 
> binding relations that every component in the email processing 
> chain must respect.


This is imposing a requirement for deep, internal systems knowledge 
about features that are not, in fact, deeply embedded in email history 
or processing.  Arguably, the demand for having every component enforce 
this model is a basic mistake.

Rather the requirement is to prevent assumptions inside the system that 
serve to violate the policies needed at evaluation points.


> robust principle

robustness

https://en.wikipedia.org/wiki/Robustness_principle


> be permissive in how they process malformed inputs


No, Postel did not direct accepting 'malformed' inputs.

And RFC 1122, Section 1.2.2 elaborated on this issue very helpfully:

> Software should be written to deal with every conceivable
> error, no matter how unlikely;


> 4.1 HELO/MAIL FROM confusion

This seems to imply that tightening is needed in DMARC, so that it uses 
a domain that SPF has actually validated? I think the issue is that SPF 
validation needs to inform DMARC what domain it has validated, rather 
than have DMARC decide which domain to fetch.


> SPF implementations treat “(any@legitimate.com” as an empty MAIL FROM 
> address, and thus forward the results of checking HELO to the DMARC 
> component, because the string in the parentheses can be parsed as a 
> comment according to RFC 5322 [10]. Some DMARC 
> implementations, however, may take it as a normal non-empty address,


1. This issue goes away if SPF supplies DMARC the domain name, rather 
than DMARC having to fetch it.

2. I doubt this otherwise needs changes to language in the DMARC spec, 
but it's worth making sure.
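
A minimal sketch of the interface suggested in point 1, in Python.  The 
SpfOutcome type and spf_aligned function are hypothetical names, not 
taken from any specification or library; the idea is only that the SPF 
component reports which domain it evaluated, and the alignment check 
consumes that value instead of re-parsing MAIL FROM on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpfOutcome:
    """What a hypothetical SPF component hands to the DMARC component."""
    result: str    # "pass", "fail", "none", ...
    identity: str  # which identity was checked: "mailfrom" or "helo"
    domain: str    # the domain SPF actually evaluated


def spf_aligned(outcome: SpfOutcome, from_domain: str) -> bool:
    """Strict alignment, using only the domain SPF says it validated."""
    return (outcome.result == "pass"
            and outcome.domain.lower() == from_domain.lower())


# Scenario from the quoted text: SPF treated the MAIL FROM as empty and
# checked HELO, which is under the attacker's control.
outcome = SpfOutcome(result="pass", identity="helo", domain="attacker.example")

# Because the check never parses "(any@legitimate.com" itself, the two
# components cannot disagree about which domain was authenticated.
print(spf_aligned(outcome, "legitimate.com"))
```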


> 4.3 Authentication results injection


Another focus on what results are communicated and how.

The paper asserts that AR is used as DMARC input.  I suspect that is 
rarely, if ever, true.  Yes? No?


> Generally, we can divide From header-related processing into two phases: 
> 1) parsing a MIME message to extract the From header; 


The rfc5322.From: field is independent of MIME. MIME pertains only to 
the body.


> Microsoft: disregarded our report (which included our paper and a 
> video demoing the A10 attack) because the threats rely on social 
> engineering, which they view as outside the scope of security 
> vulnerabilities.


That's oddly impressive.


> Improving MUA display
...

> We note however that experiences with such approaches for promoting 
> HTTPS (via browsers displaying trusted icons for websites with valid 
> TLS certificates) have demonstrated the challenges of ensuring that 
> users correctly interpret the icons and do not get fooled by imposters


"Challenges" is incorrect.  "Failure to be useful" is more appropriate 
language.


-- 
Dave Crocker
Brandenburg InternetWorking
bbiw.net