Re: [dmarc-ietf] Some Proposed Language for a New pct Tag Defintion
Douglas Foster <dougfoster.emailstandards@gmail.com> Sun, 01 August 2021 18:57 UTC
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/R--Wr7qCMca64DfwB9yt2rZHUOI>
Ale,

I tried to explain my objections in the original post. However, it is a very important question, so I am happy to revise and extend my points. Forgive me for being long-winded; I am trying to be thorough because I see problems at many levels.

Doug Foster

Random guessing can increase the volume of wrong decisions

The basic math does not work. Assume that each message in a stream has a probability P of being unwanted, and a probability Q = 1-P of being wanted. Does it make sense to use a random number based on P to discard messages?

Probability of outcomes:

· P*P: unwanted messages, correctly blocked
· P*Q: unwanted messages, incorrectly accepted
· Q*P: wanted messages, incorrectly blocked
· Q*Q: wanted messages, correctly accepted

Total error rate is 2*P*Q. We have exchanged a one-sided error (allowing P unwanted messages) for a two-sided error distribution. Does it improve the overall error rate? Specifically, when is 2*P*Q < P? Cancelling P from both sides (P > 0) yields 2*Q < 1, so Q < 0.5.

If the message stream is more than 50% unwanted, then random guessing might produce fewer total errors than allow-all. If the message stream is at least 50% wanted, then random guessing is no better than allow-all, and it is worse once Q exceeds 50%.

Other filtering stages will raise Q and lower P

Since the specific issue is failed DMARC authentication, we also need to consider how this task fits into the evaluation process. I believe my process is typical:

· First, messages from known-bad senders are blocked.
· Second, sender authentication is performed, at which point some messages may be discarded.
· Third, content filtering is applied, and suspicious content is blocked.
· Fourth, end-user activity occurs, where some messages are ignored or discarded.

One effect of the first stage is that it lowers P and raises Q. During sender authentication, Q is likely to be above 50% even if the initial mail stream has a Q below 50%.
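
To make the arithmetic concrete, here is a minimal Python sketch that compares allow-all with random guessing for a few values of P. The function names and the sample values of P are assumptions chosen for illustration, not measurements of any real mail stream.

```python
def allow_all_error(p: float) -> float:
    """Error rate when every message is accepted: all unwanted mail gets through."""
    return p


def random_guess_error(p: float) -> float:
    """Error rate when each message is blocked with probability p, independently
    of whether it is wanted: P*Q (unwanted accepted) + Q*P (wanted blocked)."""
    q = 1.0 - p
    return 2.0 * p * q


if __name__ == "__main__":
    # Illustrative values of P only (assumed, not observed).
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(
            f"P={p:.1f} Q={1 - p:.1f} "
            f"allow-all={allow_all_error(p):.2f} "
            f"random={random_guess_error(p):.2f}"
        )
```

With P = 0.3 (Q = 0.7), allow-all errs on 30% of messages while random guessing errs on 42%. Only when P rises above 0.5 does random guessing come out ahead, for example 42% against 70% at P = 0.7.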

If a false negative occurs during sender authentication, causing an unwanted message to be allowed, the message may be blocked during content filtering or it may be ignored by the user. Consequently, if the probability P is applicable during sender authentication, the probability of a threat being successful is less than P.

Random guessing will increase the volume of unrecoverable errors

If a false positive occurs during sender authentication, causing a wanted message to be blocked, there is no opportunity for recovery. Therefore, false positives are a greater problem than false negatives, and the random guessing algorithm has the effect of replacing false negatives with false positives.

Sender's probability has no relation to evaluator's probability

For any single domain, incoming messages can be broken into three categories:

· Legitimately-sourced messages which arrive with valid credentials.
· Legitimately-sourced messages which arrive with failed credentials.
· Impersonation messages which arrive with failed credentials.

For simplicity, assume that sender and receiver interests are aligned: the receiver wants to accept all legitimately-sourced messages from the domain. Since the sender is moving toward p=reject and the recipient wants to enforce p=reject, we will also assume that mailing lists are not part of the mail stream.

Neither sender nor receiver knows the volume of unwanted impersonating messages. This means that the denominator is unknown, but would be determined by the volume of impersonation + legitimate messages. The numerator for computing wanted message rates (Q) is all of the legitimate messages. The numerator for computing unwanted message rates (P) is all of the impersonation messages. Because the recipient wants all of the legitimately-sourced messages, the percentage of legitimate messages sent with imperfect credentials is irrelevant.

Assume that the source domain knows the volume of messages which are sent without complete credentials, and publishes a percentage based on that knowledge. Can the evaluator benefit from that information? I don't think so.

Credentials at origin are determined by whether the source is configured to apply correct SPF and DKIM credentials or not. The source domain could determine message volumes by server to compute a weighted statistic for the percentage of messages with correct credentials. But any single evaluator will not necessarily see the same weighted distribution of message sources. It may not receive any messages from non-compliant servers, it may receive messages only from non-compliant servers, or it may see any other possible weighted distribution. Applying the source domain's percentage estimate to the received message stream would only make sense if the weighting is comparable.

More importantly, the assumed goal for both sender and receiver is to have all legitimately-sourced messages accepted. Arbitrarily blocking some wanted messages, for the sake of notifying about credentialling problems, works against the goal of the evaluator and his user base. It is too high a price to pay.

On Sun, Aug 1, 2021 at 5:13 AM Alessandro Vesely <vesely@tana.it> wrote:

> On Sun 01/Aug/2021 01:47:12 +0200 Douglas Foster wrote:
> >
> > My core objection is the partial-enforcement algorithm. I cannot believe that
> > it would be wise for me, or any other receiver, to implement that algorithm.
>
> Why not? What's wrong with it?
>
>     if DMARC fail and (p=quarantine or p=reject) then
>         if (random mod 100) < pct then
>             apply policy
>
> > In the face of ambiguity, the only way to get a correct disposition is to
> > collect more data. If I had more time, I would quarantine all
> > unauthenticated mail until I could determine whether the sender should be
> > authenticated by local policy or blacklisted by local policy.
>
> If you collect millions of DMARC-fail messages every day for some years and
> calculate the exact percentage you will get the same result as the algorithm
> above applied on each message as it arrives. See:
> https://en.wikipedia.org/wiki/Monte_Carlo_method#Overview
>
> If you collect unauthenticated messages, besides the implied delay, you'll have
> the problem of choosing which ones to select until the percentage is
> fulfilled. The first ones? Distribute evenly in time or in size? Select the
> ones with highest score? Luckily we don't have to do so.
>
> Best
> Ale
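
For concreteness, the partial-enforcement rule in the quoted pseudocode can be sketched in Python roughly as follows. The function name, the argument names, and the use of a seeded random.Random instance are assumptions made for illustration; they are not taken from any DMARC specification text or from any receiver's implementation.

```python
import random


def should_apply_policy(dmarc_pass: bool, policy: str, pct: int,
                        rng: random.Random) -> bool:
    """Return True if the published policy is applied to this message.

    Mirrors the quoted pseudocode: only DMARC failures under p=quarantine
    or p=reject are candidates, and each candidate is selected when a
    random integer in 0..99 falls below pct.
    """
    if dmarc_pass:
        return False
    if policy not in ("quarantine", "reject"):
        return False
    return rng.randrange(100) < pct


if __name__ == "__main__":
    rng = random.Random(2021)  # fixed seed so the demonstration is repeatable
    trials = 10_000
    hits = sum(should_apply_policy(False, "reject", 25, rng) for _ in range(trials))
    print(f"policy applied to {hits} of {trials} failing messages at pct=25")
```

Note that the selection is independent of whether a given failing message is wanted, which is the property the objections in this message are about.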