[dmarc-ietf] DMARC aggregate reports XML Schema inconsistencies
"Freddie Leeman" <freddie@leemankuiper.nl> Wed, 31 July 2019 09:47 UTC
From: Freddie Leeman <freddie@leemankuiper.nl>
To: dmarc@ietf.org
Date: Wed, 31 Jul 2019 11:47:29 +0200
Message-ID: <008401d54784$f8300750$e89015f0$@leemankuiper.nl>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/kZI0bytLU9uaMmh4yQB_FTJSKX4>
I've been processing millions of DMARC aggregate reports from many different organizations, and have been trying to make sense of them for quite some time now. I've noticed that most of them, even those from large parties like Google and Yahoo!, fail to follow the DMARC RFC guidelines (Appendix C. DMARC XML Schema). I've written a blog post about this that can be found here: https://www.uriports.com/blog/dmarc-reports-ietf-rfc-compliance/

The bottom line is that RFC 7489 Appendix C is a mess and contradicts itself numerous times, in both the schema and its comments. I think it's important to be clearer and stricter about the XML elements and their values. Too much of this section is open to interpretation. Some examples:

The report has an element named "policy_published". This name indicates that the elements within it contain the domain's published policy. The comments, however, mention "applied" and "apply". Most organizations that send aggregate reports do not send failure reports and thus do not "apply" the "fo" (failure reporting options) element. This is why parties like Google leave this element out of their reports. This particular element's comment ("failure reporting options in effect") also implies that it is optional. On the other hand, the element has the default "minOccurs" value of 1, so it should not be omitted.

The RFC should also be clearer about what to do with policy elements that are unspecified in the domain's DNS record. I think it is best to fill these elements in the report with their respective default values. So when 'pct' is not specified in the domain's policy, the report should state '100'. When 'sp' is not specified, it should have the value of the 'p' element.

I've also noticed that most parties do not specify the PolicyOverrideType, even when both SPF and DKIM alignment fail. This element should be made mandatory whenever alignment fails and the disposition doesn't follow the domain's DMARC policy.
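The default-filling suggested above could be sketched roughly as follows. This is only an illustration, not part of RFC 7489: the function name `published_policy_for_report` is hypothetical, the parsing is deliberately simplistic, and the defaults for 'adkim', 'aspf' and 'fo' are taken from the RFC's tag definitions rather than from this message.

```python
def published_policy_for_report(record: str) -> dict:
    """Parse a DMARC TXT record and fill in defaults for unspecified tags.

    Hypothetical helper for illustration; it assumes a well-formed record
    that contains a 'p' tag and does no validation of tag values.
    """
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            tags[name.strip().lower()] = value.strip()

    return {
        "p": tags["p"],
        # 'sp' falls back to the value of 'p' when it is not published.
        "sp": tags.get("sp", tags["p"]),
        # Defaults from the RFC 7489 tag definitions.
        "adkim": tags.get("adkim", "r"),
        "aspf": tags.get("aspf", "r"),
        "pct": tags.get("pct", "100"),
        "fo": tags.get("fo", "0"),
    }


policy = published_policy_for_report("v=DMARC1; p=reject; rua=mailto:dmarc@example.com")
print(policy["pct"], policy["sp"])
```

With a record that only publishes 'p', the report would then carry an explicit 'pct' and 'sp' instead of leaving the reader to guess what the reporter assumed.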
The RFC guidelines for aggregate reports should also state that empty elements with a minOccurs of 0 should be omitted rather than left blank. It should also be specified that if a message is not signed with DKIM, the 'DKIMAuthResultType' element ('dkim') should be omitted, so that the 'DKIMResultType' value 'none' would never be used. When a message has no signatures, it also has no 'domain' (d=) (minOccurs 1) or 'selector' (s=) (minOccurs 0) to report. What happens now is that some organizations report unsigned messages with the 'dkim' element and fill 'domain' and 'selector' with a bogus 'none' value.

There are also multiple mentions of minOccurs="1", even though the document states that, unless otherwise specified in the schema, the minOccurs and maxOccurs values for each element are set to 1. This adds to the confusion.

DMARC reporting capabilities are a valuable aspect of the DMARC mechanism. They can help domain owners set up and harden their DKIM/SPF/DMARC policy. But unless these reports follow strict guidelines, they just pile up into a lot of inconsistent data that is open to interpretation and guesswork. Domain owners should be able to understand the data without the need for a spiritual voodoo DMARC guru (trademark pending) to make sense of it all.

Kind regards,
Freddie Leeman
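As an illustration of the omission rules proposed in this message, a report generator might build the DKIM part of 'auth_results' along these lines. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper names and the shape of the `signatures` input are assumptions, and the required 'spf' element is left out for brevity.

```python
import xml.etree.ElementTree as ET


def add_optional(parent, tag, value):
    """Append <tag>value</tag> only when value is non-empty (minOccurs 0)."""
    if value is None or value == "":
        return None
    child = ET.SubElement(parent, tag)
    child.text = str(value)
    return child


def build_auth_results(signatures):
    """Build <auth_results> with one <dkim> element per evaluated signature.

    'signatures' is a hypothetical list of dicts with 'domain' and 'result'
    keys and an optional 'selector'. An unsigned message yields no <dkim>
    element at all, instead of one filled with placeholder 'none' values.
    """
    auth = ET.Element("auth_results")
    for sig in signatures:
        dkim = ET.SubElement(auth, "dkim")
        ET.SubElement(dkim, "domain").text = sig["domain"]
        add_optional(dkim, "selector", sig.get("selector"))
        ET.SubElement(dkim, "result").text = sig["result"]
    return auth


unsigned = build_auth_results([])
signed = build_auth_results([{"domain": "example.com", "selector": "", "result": "pass"}])
print(ET.tostring(signed, encoding="unicode"))
```

Here the empty 'selector' is dropped rather than emitted as a blank element, and the unsigned message produces no 'dkim' element.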