Re: [dmarc-ietf] Reversing modifications from mailing lists

Alessandro Vesely <vesely@tana.it> Wed, 24 November 2021 11:06 UTC

Return-Path: <vesely@tana.it>
X-Original-To: dmarc@ietfa.amsl.com
Delivered-To: dmarc@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 9B4E53A0D6B for <dmarc@ietfa.amsl.com>; Wed, 24 Nov 2021 03:06:04 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.95
X-Spam-Level:
X-Spam-Status: No, score=-3.95 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, NICE_REPLY_A=-1.852, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=unavailable autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=neutral reason="invalid (unsupported algorithm ed25519-sha256)" header.d=tana.it header.b=OtHgvlZd; dkim=pass (1152-bit key) header.d=tana.it header.b=ASOFsHAI
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 3Rzb9pJyx0dM for <dmarc@ietfa.amsl.com>; Wed, 24 Nov 2021 03:05:58 -0800 (PST)
Received: from wmail.tana.it (wmail.tana.it [62.94.243.226]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id BF6DA3A0D6D for <dmarc@ietf.org>; Wed, 24 Nov 2021 03:05:57 -0800 (PST)
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=tana.it; s=epsilon; t=1637751643; bh=PADTa1Xre5gqOllFVmEmX36t/S4M3LIMGBt8qNITiQk=; l=3779; h=Subject:To:Cc:References:From:Date:In-Reply-To; b=OtHgvlZdOGixzF0/nUItGoz0rKGkFHzImB2lkWf4kSqZnHVeo5pdd4mJ4ZcaUCOco LsA34aPiSKz/nfEFCmBBQ==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tana.it; s=delta; t=1637751643; bh=PADTa1Xre5gqOllFVmEmX36t/S4M3LIMGBt8qNITiQk=; l=3779; h=To:Cc:References:From:Date:In-Reply-To; b=ASOFsHAIxbhnb4SBmj4ktFh3dIDRfL9qSNym41EmE0dpKyZ2MeVQijbBZU/3mPIv6 nou0+L2qZ54Zw7+LohPeO0vBXSanLfTXhQmV5COxKn10qIUsiRjCK7iAr8tsFLyaF0 U69RrBeF4/l0VU7ITvTvuQAEWv4N6mIPJliYGLI8ozcddQrBtTsJdP4ogF5Ia
Authentication-Results: tana.it; auth=pass (details omitted)
Original-From: Alessandro Vesely <vesely@tana.it>
Original-Cc: "Murray S. Kucherawy" <superuser@gmail.com>
Received: from [172.25.197.111] (pcale.tana [172.25.197.111]) (AUTH: CRAM-MD5 uXDGrn@SYT0/k, TLS: TLS1.3, 128bits, ECDHE_RSA_AES_128_GCM_SHA256) by wmail.tana.it with ESMTPSA id 00000000005DC0EA.00000000619E1B5B.00001F6D; Wed, 24 Nov 2021 12:00:43 +0100
To: Wei Chuang <weihaw=40google.com@dmarc.ietf.org>, dmarc@ietf.org
Cc: "Murray S. Kucherawy" <superuser@gmail.com>
References: <CAAFsWK3qshdYDeeTOLPJEnk=gHFrRp==QJLvoG6RAYHau6Fy8g@mail.gmail.com>
From: Alessandro Vesely <vesely@tana.it>
Message-ID: <6aad0642-f73c-ba6f-d26c-1c1fd90e2c9a@tana.it>
Date: Wed, 24 Nov 2021 12:00:43 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0
MIME-Version: 1.0
In-Reply-To: <CAAFsWK3qshdYDeeTOLPJEnk=gHFrRp==QJLvoG6RAYHau6Fy8g@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/IxACyOpx_8xwRTNd8oWt46SAe2A>
Subject: Re: [dmarc-ietf] Reversing modifications from mailing lists
X-BeenThere: dmarc@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Domain-based Message Authentication, Reporting, and Compliance \(DMARC\)" <dmarc.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dmarc>, <mailto:dmarc-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/dmarc/>
List-Post: <mailto:dmarc@ietf.org>
List-Help: <mailto:dmarc-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dmarc>, <mailto:dmarc-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 24 Nov 2021 11:06:05 -0000

Hi,

On Tue 23/Nov/2021 00:28:01 +0100 Wei Chuang wrote:
> 
> 1. Ale's draft suggests not reversing all possible transforms, and rather 
> focuses on a subset caused by mailing lists that are reversible
>    * Could ARC be suitable for those other scenarios? Could we expect that 
> forwarders that do more substantial irreversible rewriting such as modifying 
> URLs in a spam/phishing filter MTA, already have a strong relationship with the 
> receiver?  Presumably, might they be trusted by the receiver and their ARC 
> result could be used?


Sure.  Note that if the receiver trusts the MLM, simply recognizing it would be 
enough to override a DMARC failure, reported as the "mailing_list" policy 
override.  ARC additionally lets the receiver learn the authentication status 
the message had when the MLM received it.  That way, reputation can be assessed 
much more precisely.


> 2. Footers must only be added with as a) append on single text/plain part b) 
> mime part appended to multipart/mixed c) mime wrap where a footer is added in a 
> new multipart/mixed.
>    * It's not very clear to me how Ale's draft handles the b) and c) scenario.  
>   (There is mention of "reason="transformed"", but this still seems incomplete) 
>   I saw that Murray has a draft draft-kucherawy-dkim-list-canon 
> <https://datatracker.ietf.org/doc/html/draft-kucherawy-dkim-list-canon> that 
> identifies addition of new mime parts that could be helpful there.


I implemented Murray's draft as a trial, but it requires that MLMs declare which 
transformations they apply.  Since they don't, you need a pre-parser that 
guesses the transformation type.  That's the difference between the two drafts.

If there are two top-level MIME parts, the transformation must be (c), because 
no one writes a MIME structure with just one part.  Otherwise it's (b).
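
As a minimal sketch of that guess, assuming Python's standard email package: 
the function name guess_footer_transform and the sample message are made up 
for illustration, and the real pre-parser does more than count parts.

```python
from email.message import Message
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def guess_footer_transform(msg: Message) -> str:
    """Guess which footer transformation, (a), (b) or (c), an MLM may have applied."""
    if not (msg.is_multipart() and msg.get_content_type() == "multipart/mixed"):
        # No multipart/mixed container: treat it as a candidate for (a),
        # a footer appended to the text itself.
        return "a"
    top_level_parts = msg.get_payload()  # list of sub-parts for multipart messages
    # Two top-level parts: original + footer in a new wrapper, i.e. (c),
    # since nobody writes a multipart with a single part.  Otherwise (b).
    return "c" if len(top_level_parts) == 2 else "b"


# Hypothetical sample: a single-part message wrapped together with a footer.
wrapped = MIMEMultipart("mixed")
wrapped.attach(MIMEText("original body\n", "plain"))
wrapped.attach(MIMEText("____\nlist footer\n", "plain"))
print(guess_footer_transform(wrapped))
```

The result is only a hypothesis; it still has to be confirmed by undoing the 
transformation and checking that the original signature verifies.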


> 3.  Footers added to text/plain must be identified with at least four "_" as a 
> separator.
>    * Would the DKIM length "l=" field be helpful?  Understood there are abuse 
> risks.


Yes, l= could be a useful hint.

The risk of l= is that an attacker could exploit a poor HTML interpreter to add 
a part that completely hides the original content when rendered.  Requiring 
attachments to be plain text avoids that risk.
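
To illustrate how l= works as a hint, here is a rough sketch of a DKIM body 
hash limited to the first l octets of the relaxed-canonicalized body, as RFC 
6376 describes.  The helper names relaxed_body and body_hash are mine, and the 
canonicalization is simplified (it assumes CRLF line endings), so take it as 
an illustration rather than a verifier.

```python
import base64
import hashlib
import re
from typing import Optional


def relaxed_body(body: bytes) -> bytes:
    """Simplified relaxed body canonicalization; assumes CRLF line endings."""
    lines = body.split(b"\r\n")
    # Collapse runs of whitespace to one space, drop whitespace at end of line.
    lines = [re.sub(rb"[ \t]+", b" ", ln).rstrip(b" ") for ln in lines]
    # Ignore empty lines at the end of the body.
    while lines and lines[-1] == b"":
        lines.pop()
    return b"".join(ln + b"\r\n" for ln in lines)


def body_hash(body: bytes, length: Optional[int] = None) -> str:
    """Base64 SHA-256 of the canonical body, truncated to `length` octets (l=)."""
    canon = relaxed_body(body)
    if length is not None:
        canon = canon[:length]
    return base64.b64encode(hashlib.sha256(canon).digest()).decode("ascii")


# Hypothetical bodies: the list appends a footer after the signed content.
original = b"Hi  there \r\n\r\n"
with_footer = b"Hi there\r\n\r\n____\r\nList footer\r\n"
signed_length = len(relaxed_body(original))
print(body_hash(with_footer, signed_length) == body_hash(original))
```

Anything past the signed length is not covered by the hash, which is exactly 
why the appended content has to be constrained.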


> 4. "quoted-printable encoding must not be used for... single-part text/plain 
> messages, as it is impossible to guess original soft line breaks after re-encoding"
>     * Are you suggesting quoted printable encoding aren't fully reversible?  
> Actually, could the RFC2045 canonical encoding of the message be used as the 
> source for doing the DKIM content hashing?  This would bypass having to worry 
> about additional transfer re-encodings by forwarders.


Mailman can copy MIME structures without changes, but simple text is often 
re-encoded.  Many messages on this list are converted to base64.  If the 
original text was quoted-printable, its exact form depends on the agent.  An 
agent can choose whether to encode some characters or leave them as ASCII, and 
whether to break lines at column 76 or, to improve readability, at whitespace.  
The result varies too widely for the original encoding to be guessed.
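
A small sketch of the problem, using Python's quopri module.  The two encoded 
strings are made-up examples of what two different agents might emit for the 
same text; the helper name same_content is hypothetical.

```python
import quopri


def same_content(encoded_a: bytes, encoded_b: bytes) -> bool:
    """True if two quoted-printable encodings decode to the same octets."""
    return quopri.decodestring(encoded_a) == quopri.decodestring(encoded_b)


# One agent encodes only what it must and keeps the text on a single line.
agent_1 = b"caff=C3=A8 e latte"
# Another encodes a plain ASCII letter and inserts a soft line break ("=" + CRLF).
agent_2 = b"caff=C3=A8 =65=\r\n latte"

print(same_content(agent_1, agent_2))  # same decoded text
print(agent_1 == agent_2)              # different encoded octets
```

Decoding is unambiguous, but going back from the decoded text to the octets 
that the author's agent actually signed is not.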


> 5. Finding the original FROM by looking at From, Author, Original-From, 
> X-Original-From, Reply-To, and Cc.
>    * Can this be standardized to a fixed location such as Author?  (Sorry I'm 
> unfamiliar with the discussion on Author)


That's exactly the purpose of Author.  However, no one is using it yet.
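
Until it is, the search has to look in several places.  A minimal sketch, 
assuming the candidates are filtered by comparing their domain with the d= of 
the signature being checked: the function name, the exact-or-subdomain match 
and the sample headers are illustrative assumptions, not the draft's 
algorithm.

```python
from email import message_from_string
from email.message import Message
from email.utils import getaddresses

# Header fields named in the question, searched in that order.
CANDIDATE_HEADERS = ("From", "Author", "Original-From", "X-Original-From",
                     "Reply-To", "Cc")


def candidate_authors(msg: Message, signing_domain: str) -> list:
    """Collect addresses whose domain equals, or is a subdomain of, signing_domain."""
    d = signing_domain.lower()
    found = []
    for name in CANDIDATE_HEADERS:
        for _display_name, addr in getaddresses(msg.get_all(name, [])):
            domain = addr.rpartition("@")[2].lower()
            if (domain == d or domain.endswith("." + d)) and addr not in found:
                found.append(addr)
    return found


# Hypothetical message as rewritten by a list (the From line is invented).
sample = message_from_string(
    "From: Alessandro Vesely via dmarc <dmarc@ietf.org>\n"
    "Original-From: Alessandro Vesely <vesely@tana.it>\n"
    "Cc: \"Murray S. Kucherawy\" <superuser@gmail.com>\n"
    "\n"
    "body\n"
)
print(candidate_authors(sample, "tana.it"))
```

Each candidate would then be tried as the original From when re-verifying the 
signature.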


> 6. Subject
>    * Agreed that some simple heuristic as proposed in the draft is a good 
> approach.  Perhaps the original subject suffix length also might work here too.


I don't get this, I'm afraid.  What is the subject suffix length?


Best
Ale
--