Re: [dmarc-ietf] Reversing modifications from mailing lists

Wei Chuang <weihaw@google.com> Thu, 25 November 2021 08:07 UTC

Return-Path: <weihaw@google.com>
X-Original-To: dmarc@ietfa.amsl.com
Delivered-To: dmarc@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 9D69B3A0FFD for <dmarc@ietfa.amsl.com>; Thu, 25 Nov 2021 00:07:58 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -17.599
X-Spam-Level:
X-Spam-Status: No, score=-17.599 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, HTML_MESSAGE=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=google.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id jIfvVoONuQqj for <dmarc@ietfa.amsl.com>; Thu, 25 Nov 2021 00:07:53 -0800 (PST)
Received: from mail-il1-x12c.google.com (mail-il1-x12c.google.com [IPv6:2607:f8b0:4864:20::12c]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 8FA763A0FFC for <dmarc@ietf.org>; Thu, 25 Nov 2021 00:07:53 -0800 (PST)
Received: by mail-il1-x12c.google.com with SMTP id r2so5032241ilb.10 for <dmarc@ietf.org>; Thu, 25 Nov 2021 00:07:53 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=YxNYyGLrIVsXM16qEapI8PGAvXxG5NKS91CDUz0PmQ8=; b=lAD6qa1RwtKEcgYa2BskL9pVPGKbHbS14wWO11qJoGgtaf0L+H+9OEakI/BscoM/aN 6lALOQAtLDUGMy9CG6SpC9UsqUhV8DUIB0NEKSwWgDyxkXJVOm+C1zpfb/LW7Jxac0n2 H7UnxbcLDYJ7bQXYzQcYXKa6NtUI+kqUMgxzYXADcPlWc4aIBXwysxbDQHx2HuzGjJ8v pY8O9jRZyDw03KmzYoC3wMZ2AFLVAMwPwFkdKPCMq7jDY09/2rb8A1efxbca4jSmQBdc SeuhLL1HRSDJZtuQfqRGRKnMOG3unTiN3xUD2aB1Q6i7ceVdFtvREgtdlKQpRRkVc5Oz LvCw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=YxNYyGLrIVsXM16qEapI8PGAvXxG5NKS91CDUz0PmQ8=; b=W+E4IYmHdAvHzh0oRwDHoY3zC7Ido3oJ1I50gjqyjooX87M7nzV0faGgPJE28ccrVX 9X7UKIuBe79Lm87P+B3VD6R49NL2TpO7jctTtB8eECc2HYeT1W936u+AyLoqBg0gFNhK Es/GY+LZnBJ0SxetJaLq03XeIRGXNeoktVN7PyV5gzJwXl4auoFejSsxK+GTL/uf480i Y4mq/LZGdYZYKwIwhlAl0Ogg9yPrF0YE+6hnqOdT/Q0qjHvHNlwrLv7xCmzmO0m9gG/8 ciQysOi7c/Wz5jGshmcYGYdetJsJbyeGPTOqRGS2LH65EUxO0J4kFCyFf0G5vtfcFBFp LvRQ==
X-Gm-Message-State: AOAM532vDrbUeiCo18SSQp/8/Tmir4Z4bBu/W2jJPHVgmPQzbY1USFbp 7idRTI3toj26jHGbURCGAcu6P61SpjAPwZKnBrOfIA==
X-Google-Smtp-Source: ABdhPJxjkGiJUxo8jJIDgr6CjvPac7M8awCt8oIkqibpwxEAmZNSq6jVzcTK/A1on2cDbKFOfvDpgeEchn2hRaU5J+E=
X-Received: by 2002:a05:6e02:156b:: with SMTP id k11mr20035822ilu.77.1637827671501; Thu, 25 Nov 2021 00:07:51 -0800 (PST)
MIME-Version: 1.0
References: <CAAFsWK3qshdYDeeTOLPJEnk=gHFrRp==QJLvoG6RAYHau6Fy8g@mail.gmail.com> <6aad0642-f73c-ba6f-d26c-1c1fd90e2c9a@tana.it>
In-Reply-To: <6aad0642-f73c-ba6f-d26c-1c1fd90e2c9a@tana.it>
From: Wei Chuang <weihaw@google.com>
Date: Thu, 25 Nov 2021 00:07:36 -0800
Message-ID: <CAAFsWK23GGfe+uSyPqa2wxFgRn3mk7G9ajtjfz6cKw-FaoFM_A@mail.gmail.com>
To: Alessandro Vesely <vesely@tana.it>
Cc: dmarc@ietf.org, "Murray S. Kucherawy" <superuser@gmail.com>
Content-Type: multipart/alternative; boundary="000000000000909cd405d1987833"
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/mmxSA6V7RSYMKtLTI30NLr1eBgg>
Subject: Re: [dmarc-ietf] Reversing modifications from mailing lists
X-BeenThere: dmarc@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Domain-based Message Authentication, Reporting, and Compliance \(DMARC\)" <dmarc.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dmarc>, <mailto:dmarc-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/dmarc/>
List-Post: <mailto:dmarc@ietf.org>
List-Help: <mailto:dmarc-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dmarc>, <mailto:dmarc-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Nov 2021 08:07:59 -0000

Thanks for the feedback and answers.

On Wed, Nov 24, 2021 at 3:01 AM Alessandro Vesely <vesely@tana.it> wrote:

> Hi,
>
> On Tue 23/Nov/2021 00:28:01 +0100 Wei Chuang wrote:
> >
> > 1. Ale's draft suggests not reversing all possible transforms, and
> > rather focuses on a subset caused by mailing lists that are reversible
> >    * Could ARC be suitable for those other scenarios? Could we expect
> >      that forwarders that do more substantial irreversible rewriting,
> >      such as modifying URLs in a spam/phishing filter MTA, already have
> >      a strong relationship with the receiver?  Presumably, might they
> >      be trusted by the receiver and their ARC result could be used?
>
>
> Sure.  Note that if the receiver trusts the MLM, simply recognizing it
> would be enough to pass DMARC per the "mailing_list" policy override.
> ARC additionally provides the ability to learn the authentication
> status of the message when it was received by the MLM.  That way,
> reputation can be reckoned with great precision.
>
>
> > 2. Footers must only be added as a) an append to a single text/plain
> > part, b) a MIME part appended to multipart/mixed, or c) a MIME wrap
> > where a footer is added in a new multipart/mixed.
> >    * It's not very clear to me how Ale's draft handles the b) and c)
> >      scenarios.  (There is mention of "reason="transformed"", but this
> >      still seems incomplete.)  I saw that Murray has a draft,
> >      draft-kucherawy-dkim-list-canon
> >      <https://datatracker.ietf.org/doc/html/draft-kucherawy-dkim-list-canon>,
> >      that identifies addition of new MIME parts, which could be helpful
> >      there.
>
>
> I tried and implemented Murray's draft, but it requires that MLMs
> declare which transformation they do.  Since they don't, you need a
> pre-parser that guesses the transformation type.  That's the difference
> between the two drafts.
>
> If there are two top-level MIME parts, the transformation must be (c),
> because no one writes a MIME structure with just one part.  Otherwise
> it's (b).
>
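To check my understanding of the (b)/(c) guess, here is a minimal Python
sketch of such a pre-parser.  It is only an illustration of the rule as
stated above, not the code from either draft; the function name
guess_footer_transform and the return labels are made up for this example.

```python
from email.message import Message
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def guess_footer_transform(msg: Message) -> str:
    """Guess how an MLM may have added a footer, from the top-level MIME shape.

    Returns "c" (MIME wrap), "b" (part appended to an existing
    multipart/mixed), or "a-or-none" for anything that is not
    multipart/mixed at the top level.
    """
    if msg.get_content_type() != "multipart/mixed" or not msg.is_multipart():
        return "a-or-none"
    parts = msg.get_payload()  # list of sub-messages for a multipart
    # Exactly two top-level parts: wrapped original + footer.  An author
    # would not have written a multipart/mixed with a single part.
    if len(parts) == 2:
        return "c"
    return "b"


def build_mixed(texts):
    """Helper for the example: a multipart/mixed with one text part per item."""
    outer = MIMEMultipart("mixed")
    for text in texts:
        outer.attach(MIMEText(text, "plain"))
    return outer


wrapped = build_mixed(["original body", "list footer"])
appended = build_mixed(["original body", "original attachment", "list footer"])
plain = MIMEText("original body", "plain")
```

Here guess_footer_transform(wrapped) gives "c", appended gives "b", and the
single-part message falls through to "a-or-none".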
>
> > 3.  Footers added to text/plain must be identified with at least four
> > "_" as a separator.
> >    * Would the DKIM length "l=" field be helpful?  Understood there are
> >      abuse risks.
>
>
> Yes, l= could be a useful hint.
>
> The risk of l= is that an attacker could exploit a poor HTML
> interpreter to add a part that completely hides the original content
> when rendered.  Requiring attachments to be plain text avoids that risk.
>
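For the text/plain case in point 3, the separator rule could be sketched
roughly as below.  This assumes the separator is a line made up only of
four or more "_" characters and that everything from the last such line
onward is the footer; the draft may define this more precisely, and the
name strip_list_footer is hypothetical.

```python
import re

# A line consisting of at least four underscores, optionally followed by
# trailing blanks.  \r? tolerates CRLF line endings.
SEPARATOR = re.compile(r"^_{4,}[ \t]*\r?$", re.MULTILINE)


def strip_list_footer(body: str) -> str:
    """Return the body with everything from the last separator line removed.

    If no separator line is found, the body is returned unchanged.
    """
    matches = list(SEPARATOR.finditer(body))
    if not matches:
        return body
    return body[: matches[-1].start()]


original = "Hello list,\r\nsee my comments below.\r\n"
footer = (
    "_______________________________________________\r\n"
    "dmarc mailing list\r\n"
    "dmarc@ietf.org\r\n"
)
recovered = strip_list_footer(original + footer)
```

The length of the recovered text plays the same role an l= value would:
it marks where the signed content ends and the list's addition begins.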
>
> > 4. "quoted-printable encoding must not be used for... single-part
> > text/plain messages, as it is impossible to guess original soft line
> > breaks after re-encoding"
> >     * Are you suggesting quoted-printable encoding isn't fully
> >       reversible?  Actually, could the RFC2045 canonical encoding of the
> >       message be used as the source for doing the DKIM content hashing?
> >       This would bypass having to worry about additional transfer
> >       re-encodings by forwarders.
>
>
> Mailman can copy MIME structures without changes, but simple text is
> often re-encoded.  Many messages on this list are converted to base64.
> If the original text was quoted printable, its form depends on the
> agent.  An agent can choose where to break lines, whether to encode
> some characters or represent them as ASCII, whether to break lines at
> column 76 or, to increase readability, at white spaces.  That can vary
> too widely.
>

If the RFC2045 canonical representation at the final destination can be the
same as the canonical representation at the original sender (and assuming
there isn't some content modification such as adding inline footers), then
hashing that canonical form might be a way of sidestepping some of the
issues with quoted-printable re-encoding.  Understood that this would be a
departure from how DKIM/ARC hashing and verification work today.
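
To make the idea concrete, here is a rough Python sketch.  It is not how
DKIM computes its body hash; it only illustrates hashing the decoded
content instead of the transfer-encoded bytes.  The function name
canonical_body_hash is made up, and the CRLF normalization assumes a text
part.

```python
import base64
import hashlib
import quopri


def canonical_body_hash(payload: bytes, cte: str) -> str:
    """Hash the decoded body rather than its transfer-encoded form."""
    cte = cte.lower()
    if cte == "quoted-printable":
        decoded = quopri.decodestring(payload)
    elif cte == "base64":
        decoded = base64.b64decode(payload)
    else:  # 7bit, 8bit, binary: taken as-is
        decoded = payload
    # Normalize line endings to CRLF (assumes text content).
    decoded = decoded.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    decoded = decoded.replace(b"\n", b"\r\n")
    return hashlib.sha256(decoded).hexdigest()


text = "Caf\u00e9 prices rose again this year, said the report.\r\n".encode("utf-8")

# Two agents may produce different quoted-printable forms of the same text,
# here differing only in where a soft line break ("=" at end of line) falls.
qp_a = b"Caf=C3=A9 prices rose again this year, said the report.\r\n"
qp_b = b"Caf=C3=A9 prices rose=\r\n again this year, said the report.\r\n"
b64 = base64.b64encode(text)

hash_a = canonical_body_hash(qp_a, "quoted-printable")
hash_b = canonical_body_hash(qp_b, "quoted-printable")
hash_c = canonical_body_hash(b64, "base64")
```

All three hashes agree, while hashing qp_a and qp_b directly would not,
which is the re-encoding problem Ale describes.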


>
> > 5. Finding the original FROM by looking at From, Author, Original-From,
> > X-Original-From, Reply-To, and Cc.
> >    * Can this be standardized to a fixed location such as Author?
> >      (Sorry, I'm unfamiliar with the discussion on Author.)
>
>
> That's exactly the purpose of Author.  However, no one is using it yet.
>
>
> > 6. Subject
> >    * Agreed that some simple heuristic as proposed in the draft is a
> >      good approach.  Perhaps the original subject suffix length also
> >      might work here too.
>
>
> I don't get this, I'm afraid.  What is the subject suffix length?
>

Sorry, I wasn't too clear here.  It's largely the same idea as the DKIM body
length "l=" field above, except reformulated for the Subject header and
its mailing list mutations.  The original sender would encode the length of
the original subject as, say, "s.l=<value>".  A receiver would hash only the
rightmost <value> characters of the Subject when validating a Subject hash
from the original sender.  This assumes that mailing lists only prepend a
string, typically for identification.
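
A small Python sketch of what I have in mind follows.  The "s.l=" tag is
only my strawman and the function name subject_suffix_hash is
hypothetical; real DKIM header hashing also covers canonicalization and
other fields, which this ignores.

```python
import hashlib


def subject_suffix_hash(subject: str, length: int) -> str:
    """Hash only the rightmost `length` characters of a Subject value."""
    if not 0 <= length <= len(subject):
        raise ValueError("signed length exceeds the received Subject")
    # Slice from an explicit start index; subject[-0:] would return everything.
    suffix = subject[len(subject) - length:]
    return hashlib.sha256(suffix.encode("utf-8")).hexdigest()


# Sender side: hash the whole original Subject and record its length.
original = "Reversing modifications from mailing lists"
signed_length = len(original)  # would travel as the hypothetical s.l= value
signed_hash = subject_suffix_hash(original, signed_length)

# The list prepends its tag; the receiver hashes only the signed suffix.
received = "[dmarc-ietf] " + original
suffix_ok = subject_suffix_hash(received, signed_length) == signed_hash
whole_ok = subject_suffix_hash(received, len(received)) == signed_hash
```

Here suffix_ok is true and whole_ok is false.  As with l=, whatever gets
prepended is not covered by the hash, so the same kind of abuse risk
applies.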

thanks again,
-Wei


>
> Best
> Ale
> --