Re: [dmarc-ietf] Which DKIM(s) should be reported? (Ticket #38)

Douglas Foster <dougfoster.emailstandards@gmail.com> Wed, 27 January 2021 11:32 UTC

MIME-Version: 1.0
References: <MN2PR11MB4351BD7203D41DB25771D3B3F7BD9@MN2PR11MB4351.namprd11.prod.outlook.com> <CAH48Zfwat5MmXrvfEp-G=0pTZe2fwwDOJ6s6M1FSWs6M50yk0w@mail.gmail.com> <MN2PR11MB43513C20B5A598496FFBA4AAF7BD9@MN2PR11MB4351.namprd11.prod.outlook.com> <7231cfb1-1553-fd11-e356-57b960c5bfdc@tana.it> <CAH48ZfwvBj3abrAEz1uK2UNyMOBAM1q3pH8cOmazn8VBow3ACQ@mail.gmail.com> <adcede1d-a260-7b78-9439-63eb706989e2@tana.it>
In-Reply-To: <adcede1d-a260-7b78-9439-63eb706989e2@tana.it>
From: Douglas Foster <dougfoster.emailstandards@gmail.com>
Date: Wed, 27 Jan 2021 06:31:51 -0500
Message-ID: <CAH48Zfzp5zDpGkyOwud55-OgNqTkHO5Vo4yL0mT9o2DR+-P51Q@mail.gmail.com>
To: IETF DMARC WG <dmarc@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000a56e4d05b9e01e18"
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/90VoFX7k8Rlo37T1FpLhaZurdVs>
Subject: Re: [dmarc-ietf] Which DKIM(s) should be reported? (Ticket #38)

Is this already a settled issue?  The specification already calls for a
complete A-R data set, so all signatures are supposed to be included if
they are evaluated.  Are the largest reporting sources already providing a
complete list of DKIM signatures?

However, there are significant technical problems with aggregating a list
with a variable number of members, because the list must be converted into
a list with a fixed number of elements before aggregation can be
performed.

- One technique is to convert the list into a variable-length text string,
so that the entire list is handled as one element.  Including all
signatures in an A-R record, and then grouping on the A-R text, would be an
example of this approach.  The technique works up to the maximum text
string length supported by the data management system.  The maximum
number of list elements depends on the mechanism used to build the text
string, the information being reported, and the maximum text size.  The
supported number of list elements therefore becomes unpredictable, but in
many data management systems it will be larger than the expected number of
signatures in a message, unless a message is specifically constructed to
trigger a denial-of-service attack.

- Another approach, based on E. F. Codd's data normalization rules for
relational databases, is to have a table of messages keyed on a
message ID, and a table of signatures keyed on message ID and
sequence number.  Then an outer join can be used to append the list
element with sequence number N to the message record.  A separate outer
join is required for each sequence number being appended, so the
implementation must choose a maximum number of list elements to append.
One recent poster said that he was using this approach.  Outer joins are
generally inefficient, and this approach might work for up to 4 list
elements, but it will not work acceptably for a list with 100 elements.
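
To make the two techniques concrete, here is a minimal sketch using Python
and an in-memory SQLite database.  The sample data, the table and column
names, and the MAX_SIGS cap are all assumptions chosen for illustration;
they do not come from the DMARC specification or from any real reporting
system.

```python
import sqlite3
from collections import Counter

# Hypothetical sample data: message id -> DKIM signing domains evaluated.
messages = {
    1: ["example.com", "ietf.org"],
    2: ["example.com", "ietf.org"],
    3: ["example.com"],
}

# Technique 1: flatten each variable-length list into a single text key,
# then aggregate on that key as if it were one element.
def signature_key(domains):
    return ";".join(sorted(domains))

counts_by_key = Counter(signature_key(d) for d in messages.values())

# Technique 2: normalized tables, then one outer join per sequence number.
MAX_SIGS = 2  # assumed cap; the implementation has to pick some number

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (msg_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE signature ("
    "msg_id INTEGER, seq INTEGER, domain TEXT, PRIMARY KEY (msg_id, seq))"
)
for msg_id, domains in messages.items():
    conn.execute("INSERT INTO message VALUES (?)", (msg_id,))
    for seq, domain in enumerate(sorted(domains), start=1):
        conn.execute(
            "INSERT INTO signature VALUES (?, ?, ?)", (msg_id, seq, domain)
        )

# Build one LEFT JOIN and one output column per sequence number.
seqs = range(1, MAX_SIGS + 1)
cols = ", ".join(f"s{i}.domain" for i in seqs)
joins = " ".join(
    f"LEFT JOIN signature s{i} ON s{i}.msg_id = m.msg_id AND s{i}.seq = {i}"
    for i in seqs
)
sql = (
    f"SELECT {cols}, COUNT(*) AS n FROM message m {joins} "
    f"GROUP BY {cols} ORDER BY n DESC"
)
rows = conn.execute(sql).fetchall()

print(dict(counts_by_key))
print(rows)
```

In the second technique every additional supported signature adds another
join and another grouping column, which is why the cap has to be chosen in
advance; signatures beyond MAX_SIGS are silently dropped from the result.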

For report sources with a fixed limit, it seems appropriate to have a
metadata element where the report provider states the maximum number of
signatures that its system might report.  An indicator would also be
needed to signal "many, with no pre-determined limit".
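
As a rough sketch of how a reporter might model such a limit, the Python
below uses None to stand for "no pre-determined limit".  The function
names and the "unbounded" label are hypothetical; no such metadata element
is defined in the current report format.

```python
from typing import List, Optional, Tuple

def cap_signatures(
    domains: List[str], max_reported: Optional[int]
) -> Tuple[List[str], bool]:
    """Return the signatures to report and whether any were dropped.

    max_reported is a hypothetical provider setting; None means the
    provider reports every signature it evaluated.
    """
    if max_reported is None:
        return list(domains), False
    return list(domains[:max_reported]), len(domains) > max_reported

def limit_label(max_reported: Optional[int]) -> str:
    """Render the hypothetical metadata value for the report."""
    return "unbounded" if max_reported is None else str(max_reported)

print(cap_signatures(["a.example", "b.example", "c.example"], 2))
print(limit_label(None))
```

A report consumer could then tell a message that carried only two
signatures apart from one whose list was cut off at the provider's limit.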

Doug

On Tue, Jan 26, 2021 at 7:50 AM Alessandro Vesely <vesely@tana.it> wrote:

> On Tue 26/Jan/2021 13:02:46 +0100 Douglas Foster wrote:
> > DKIM Scopes
> > I have not heard a compelling argument to require information about
> > authentication tests that are unrelated to alignment testing.  For DKIM
> > specifically, I think one scope should be sufficient, on this hierarchy:
> >
> > - The best-aligned scope that verified, or
> > - the best-aligned scope that failed verification, or
> > - a no-signature result otherwise.
> >
> > Anything more complex imposes a gratuitous data collection burden on the
> > reporting domain and reduces aggregation significantly.  On the technical
> > side, it has already been noted that variable-length lists are particularly
> > problematic for calculating aggregates.
>
>
> Let me attach an HTML rendering of a report I received today, so we can talk
> about something real.
>
> Lines with IP 4.31.198.44 bear an ietf.org identifier.  I see no reason to
> remove it.  It is useful for understanding the mailflow, which is what DMARC
> reporting is designed to do.
>
>
> > Aggregation Controls
> >
> > We have discussed whether the target domain should be included in the
> > report.  I understand that doing so is not reasonable for the large hosting
> > services.  On the other hand, including the target domain would be a
> > trivial matter for smaller operations, and I think it would be valuable for
> > some research.  Similarly, DKIM scopes are known to be useful for most
> > investigations, but John has already observed that proliferation of DKIM
> > scopes can be used to force disaggregation down to the individual recipient
> > level.
>
>
> Even if this is a small example, learning the disaggregated, or even individual
> recipients does not help my understanding.  Authentication is obviously
> conditioned by how the Mediator treats my messages.
>
> I expect that Fastmail Pty Ltd carries out SPF and DKIM validation using the
> same algorithm, irrespective of the recipient.  That is what I, as a sender, am
> interested in.  Splitting the report in 66 lines wouldn't tell me anything
> more, it would just consume more eyeballs.  And is useless for people who sum
> up all reports and just look at the totals.  In any case, I cannot verify if
> the messages I didn't send directly are real.
>
> If a multi-domain host allows personalized validation algorithms for some
> domains, I'd expect they send separated aggregate reports, if any.
>
>
> Best
> Ale