[dmarc-ietf] Tree walk max depth concern and impact on reporting for domain owners working as expected
Seth Blank <seth@valimail.com> Wed, 05 April 2023 22:54 UTC
From: Seth Blank <seth@valimail.com>
Date: Wed, 05 Apr 2023 15:54:15 -0700
Message-ID: <CAOZAAfN9AN5gYTsB420o9K0a9iWT_j3Uz=icU1x86fe4q8DuxQ@mail.gmail.com>
To: IETF DMARC WG <dmarc@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/GoExCeJYWhxnvH8lwjbr7nAcFh4>
Subject: [dmarc-ietf] Tree walk max depth concern and impact on reporting for domain owners working as expected
I believe there’s a critical use case we missed with the tree walk, specifically around policy and reporting discovery, not around determining organizational domain alignment.

One of the reasons we discussed a tree walk for DMARCbis in the first place was a specific problem with larger, more complicated organizations for whom policy discovery in RFC 7489 does not work as needed. These organizations include governments, universities, and healthcare organizations, and they have shared some details both on this list and at M3AAWG.

Specifically, in complex organizations whose sub-organizations have separate reporting and policy needs, the sub-organization policy/reporting was being skipped by the policy lookup, and the reports wound up with people unable to act on them or route them to the right place.

In the most extreme cases (US federal government), we see the following paradigm with some frequency: bounce.sender-subdomain.division.agency.department.gov with an RFC5322.From generally of division.agency.department.gov or agency.department.gov. Sometimes the full bounce domain is in the From (especially when things are first set up but not yet configured well for DMARC, hence the need for appropriate reporting!). This is rare, but it is essential to get reporting right when it happens.

The reporting and policy that matter here are those of the division or agency, and rarely the department. Where we have a long PSD, this reporting and policy would be skipped by the currently proposed algorithm with N=5. For example, with sender.agency.department.example.gov.ccTLD, the current discovery mechanism would skip the agency policy and reporting (which is what is wanted) and instead land on the department one (which is not).

When we first proposed the tree walk, we thought we’d just walk up 5 labels, and then stop if nothing had been found. This handles this complex organization use case cleanly for legitimate mail, which was the initial intent.
However, it a) does not handle the abuse use case well (what policy do you choose if you exhaust the lookups without a policy answer?), and b) as the group rightly pointed out, misses the use case of determining organizational domain alignment, which is essential for DMARC overall. The current jump in the algorithm handles both (a) and (b) effectively.

John Levine suggested I was twisting myself in knots trying to solve both these use cases, when the simplest solution was to leave the algorithm exactly as it is, and just revisit N.

So how do we handle this? What’s the worst case?

Looking at the above example, the longest “complex org” would be 5 labels long. I think we’ve already agreed, backed by data from the PSL, that the longest PSD would be 4 labels long.

This suggests that an N of (max complex labels + max PSD labels + 1) = 5 + 4 + 1 = 10 would cover even the most complex use cases without needing to change the normative part of the document. Maybe there’s a better N at 8 or 9. We should discuss.

Below, I’ve proposed some explanatory text and updated examples in case we do want to revisit N. I’ve used N=10 as a placeholder, so if we end up at a lesser N, we only need to remove examples, not generate more.

To be clear, due to the current policy discovery mechanics (check the Author Domain, then jump to the Organizational Domain), I'm not aware of any of these complex orgs setting DMARC policies on Author Domains at such a depth; i.e., N=5 today would not break anything currently in place. However, the tree walk now enables these complex orgs to set policy much deeper in their hierarchy, which would then potentially not work as expected and possibly send reports to the wrong destination due to the current N=5.

I don't feel strongly about N=10, but I do feel strongly that N=5 is insufficient. My gut feel is that 6 or 7 is likely more than enough to cover all real world examples, but it's a gut feel only and not backed by data.

Have at it!
Seth, as an individual

---

I propose that the existing text in rev -27 be slightly modified. The current text is shown under OLD, and the text with my proposed modifications is shown under NEW.

OLD

## DNS Tree Walk {#dns-tree-walk}

The DMARC protocol defines a method for communicating information through the publishing of records in DNS. Both the content of the records and their location in the DNS hierarchy are used for two purposes: policy discovery (see Section 4.7 <https://datatracker.ietf.org/doc/html/draft-ietf-dmarc-dmarcbis-27#dmarc-policy-discovery>) and Organizational Domain determination (see Section 4.8 <https://datatracker.ietf.org/doc/html/draft-ietf-dmarc-dmarcbis-27#organizational-domain-discovery>).

The relevant DMARC record for these purposes is not necessarily the DMARC policy record found in DNS at the same level as the name label for the domain in question. Instead, some domains will inherit their DMARC policy records from parent domains one level or more above them in the DNS hierarchy. Similarly, the Organizational Domain may be found at a higher level in the DNS hierarchy.

These records are discovered through the technique described here, known colloquially as the "DNS Tree Walk". The target of any DNS Tree Walk is a valid DMARC policy record, but the rules defining required content for that record depend on the reason for performing the Tree Walk.

To prevent possible abuse of the DNS, a shortcut is built into the process so that domains that have more than five labels do not result in more than five DNS queries.

The generic steps for a DNS Tree Walk are as follows:

1. Query the DNS for a DMARC TXT record at the appropriate starting point for the Tree Walk. A possibly empty set of records is returned.
2. Records that do not start with a "v=" tag that identifies the current version of DMARC are discarded. If multiple DMARC records are returned, they are all discarded. If a single record remains and it contains a "psd=n" tag, stop.
3. Determine the target for additional queries (if needed; see the note in Section 4.8 <https://datatracker.ietf.org/doc/html/draft-ietf-dmarc-dmarcbis-27#organizational-domain-discovery>), using steps 4 through 8 below.
4. Break the subject DNS domain name into a set of ordered labels. Assign the count of labels to "x", and number the labels from right to left; e.g., for "a.mail.example.com", "x" would be assigned the value 4, "com" would be label 1, "example" would be label 2, "mail" would be label 3, and so forth.
5. If x < 5, remove the left-most (highest-numbered) label from the subject domain. If x >= 5, remove the left-most (highest-numbered) labels from the subject domain until 4 labels remain. The resulting DNS domain name is the new target for the next lookup.
6. Query the DNS for a DMARC TXT record at the DNS domain name matching this new target. A possibly empty set of records is returned.
7. Records that do not start with a "v=" tag that identifies the current version of DMARC are discarded. If multiple DMARC records are returned for a single target, they are all discarded. If a single record remains and it contains a "psd=n" or "psd=y" tag, stop.
8. Determine the target for additional queries by removing a single label from the target domain as described in step 5 and repeating steps 6 and 7 until the process stops or there are no more labels remaining.

To illustrate, for a message with the arbitrary RFC5322.From domain of "a.b.c.d.e.mail.example.com", a full DNS Tree Walk would require the following five queries to locate the policy or Organizational Domain:

- _dmarc.a.b.c.d.e.mail.example.com
- _dmarc.e.mail.example.com
- _dmarc.mail.example.com
- _dmarc.example.com
- _dmarc.com

NEW

## DNS Tree Walk {#dns-tree-walk}

The DMARC protocol defines a method for communicating information through the publishing of records in DNS.
Both the content of the records and their location in the DNS hierarchy are used for two purposes: policy discovery (see Section 4.7 <https://datatracker.ietf.org/doc/html/draft-ietf-dmarc-dmarcbis-27#dmarc-policy-discovery>) and Organizational Domain determination (see Section 4.8 <https://datatracker.ietf.org/doc/html/draft-ietf-dmarc-dmarcbis-27#organizational-domain-discovery>).

The relevant DMARC record for these purposes is not necessarily the DMARC policy record found in DNS at the same level as the name label for the domain in question. Instead, some domains will inherit their DMARC policy records from parent domains one level or more above them in the DNS hierarchy. Similarly, the Organizational Domain may be found at a higher level in the DNS hierarchy.

These records are discovered through the technique described here, known colloquially as the "DNS Tree Walk". The target of any DNS Tree Walk is a valid DMARC policy record, but the rules defining required content for that record depend on the reason for performing the Tree Walk.

The Tree Walk described here is designed with two goals in mind. First, it had to ensure that it could discover DMARC policies that might be published many levels deep in the DNS hierarchy by both simple and complex organizations. Examples of complex organizations include governments, educational institutions, and healthcare organizations, which tend to use longer than average RFC5322.From domains (e.g., sub-org.org.division.agency.department.gov) and which distribute DMARC policy management rather than maintaining central control. Second, it had to ensure unambiguous answers for organizational domain alignment and policy discovery regardless of the number of labels in the author domain, without opening up a DNS lookup abuse vector.
To meet both of these goals, the tree walk is designed to handle domains that have up to ten labels, and a shortcut is built into the process so that domains that have more than ten labels do not result in more than ten DNS queries.

The generic steps for a DNS Tree Walk are as follows:

1. Query the DNS for a DMARC TXT record at the appropriate starting point for the Tree Walk. A possibly empty set of records is returned.
2. Records that do not start with a "v=" tag that identifies the current version of DMARC are discarded. If multiple DMARC records are returned, they are all discarded. If a single record remains and it contains a "psd=n" tag, stop.
3. Determine the target for additional queries (if needed; see the note in Section 4.8 <https://datatracker.ietf.org/doc/html/draft-ietf-dmarc-dmarcbis-27#organizational-domain-discovery>), using steps 4 through 8 below.
4. Break the subject DNS domain name into a set of ordered labels. Assign the count of labels to "x", and number the labels from right to left; e.g., for "a.mail.example.com", "x" would be assigned the value 4, "com" would be label 1, "example" would be label 2, "mail" would be label 3, and so forth.
5. If x < 10, remove the left-most (highest-numbered) label from the subject domain. If x >= 10, remove the left-most (highest-numbered) labels from the subject domain until 9 labels remain. The resulting DNS domain name is the new target for the next lookup.
6. Query the DNS for a DMARC TXT record at the DNS domain name matching this new target. A possibly empty set of records is returned.
7. Records that do not start with a "v=" tag that identifies the current version of DMARC are discarded. If multiple DMARC records are returned for a single target, they are all discarded. If a single record remains and it contains a "psd=n" or "psd=y" tag, stop.
8. Determine the target for additional queries by removing a single label from the target domain as described in step 5 and repeating steps 6 and 7 until the process stops or there are no more labels remaining.

To illustrate, for a message with the arbitrary RFC5322.From domain of "a.b.c.d.e.f.g.h.i.j.k.l.m.n.mail.example.com", a full DNS Tree Walk would require the following ten queries, in order, to locate the policy or Organizational Domain:

- _dmarc.a.b.c.d.e.f.g.h.i.j.k.l.m.n.mail.example.com
- _dmarc.i.j.k.l.m.n.mail.example.com
- _dmarc.j.k.l.m.n.mail.example.com
- _dmarc.k.l.m.n.mail.example.com
- _dmarc.l.m.n.mail.example.com
- _dmarc.m.n.mail.example.com
- _dmarc.n.mail.example.com
- _dmarc.mail.example.com
- _dmarc.example.com
- _dmarc.com

--
Seth Blank | Chief Technology Officer
e: seth@valimail.com
p: 415.273.8818
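As a rough illustration of steps 4 through 8 (not part of the proposed draft wording), the sequence of query names can be sketched in Python. This is a minimal sketch under stated assumptions: the function name `tree_walk_targets` and its `max_queries` parameter (standing in for N) are hypothetical, no DNS lookups are performed, and the walk is assumed never to stop early on a "psd=n" or "psd=y" record, so it shows the worst-case "full" walk only.

```python
from typing import List


def tree_walk_targets(domain: str, max_queries: int = 10) -> List[str]:
    """Return the _dmarc names a full Tree Walk would query, in order.

    Hypothetical helper: assumes no record ends the walk early and that
    max_queries plays the role of N in the discussion above.
    """
    labels = domain.rstrip(".").split(".")
    x = len(labels)                       # step 4: count the labels

    targets = [labels]                    # step 1: start at the domain itself

    # Step 5: drop one label, or jump straight down to (N - 1) labels.
    keep = x - 1 if x < max_queries else max_queries - 1

    # Steps 6-8: query, then remove one label at a time until none remain.
    while keep >= 1:
        targets.append(labels[-keep:])    # right-most `keep` labels
        keep -= 1

    return ["_dmarc." + ".".join(t) for t in targets]


if __name__ == "__main__":
    # The long-PSD example from the message, under N=5 and N=10.
    domain = "sender.agency.department.example.gov.ccTLD"
    for n in (5, 10):
        print(f"N={n}")
        for name in tree_walk_targets(domain, max_queries=n):
            print("  " + name)
```

With N=5 the six-label example jumps from the full name directly to department.example.gov.ccTLD, so the agency-level record is never queried; with N=10 no jump occurs and the agency-level name is the second query.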