Re: [dmarc-ietf] Reversing modifications from mailing lists

ned+dmarc@mrochek.com Sun, 28 November 2021 03:34 UTC

From: ned+dmarc@mrochek.com
Cc: dmarc@ietf.org, weihaw@google.com
Message-id: <01S6OJDEJX3O005PTU@mauve.mrochek.com>
Date: Sat, 27 Nov 2021 19:04:34 -0800
In-reply-to: "Your message dated Thu, 25 Nov 2021 23:29:45 -0500" <20211126042946.6F86F3091B76@ary.local>
References: <CAAFsWK23GGfe+uSyPqa2wxFgRn3mk7G9ajtjfz6cKw-FaoFM_A@mail.gmail.com> <20211126042946.6F86F3091B76@ary.local>
To: John Levine <johnl@taugh.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/XCQkvqG_FeHxqrWBXbpl2frHSd0>
Subject: Re: [dmarc-ietf] Reversing modifications from mailing lists

> It appears that Wei Chuang  <weihaw@google.com> said:
> > If the RFC2045 canonical representation at the final destination can be the
> > same as the canonical representation at the original sender, ...

> When we were working on DKIM canonicalization we had lengthy discussions about
> what to do about MIME and we decided not to even try.

A mistake IMO.

> There is no canonical
> representation of a MIME message and nobody to my knowledge has ever tried to
> describe what it would mean for two MIME messages to be equivalent, since they
> could vary in a fantastic number of ways.

First, a canonical form doesn't have to produce a 100% reliable equivalency
test in order to be useful.

Second, there can be more to a hash computation than a canonical form. This
is especially true given that a MIME message is a tree.

> Part separators can change, the
> pieces of multipart/whatever might change, line breaks in quoted-printable
> and base64 can change, spacing and capitalization of headers can change, and
> that's just what I can think of in two minutes.

If you treat the message as a Merkle tree, with:

o Separate header and body hashes
o Message bodies decoded prior to hashing
o The already-defined unfolding/capitalization stuff from DKIM applied
  to part headers
o The CTE field removed from the header, along with the boundary value
  in CT fields

You end up with a value that:

o Is invariant with regard to part separator changes
o Is invariant with regard to CTE changes
o Is invariant with regard to many/most common header changes
o Allows for rapid computation of hashes for large numbers of large messages
  that share common content, since the hash of an unchanged subtree can be
  reused.

Which I note takes care of your list.
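
To make the above concrete, here is a rough sketch in Python of what such a
hash might look like. It is only an illustration under several assumptions
of mine, not a proposal: the function names (mime_hash, _canon_headers) are
made up, SHA-256 is an arbitrary choice, only Content-* fields are hashed at
each node, header fields are sorted so that reordering doesn't matter, and
header values are whitespace-collapsed roughly in the spirit of DKIM's
"relaxed" canonicalization rather than following it exactly.

```python
import hashlib
import re
from email import message_from_bytes
from email.message import Message

# Matches a boundary parameter inside a Content-Type value (assumed syntax).
_BOUNDARY = re.compile(r';\s*boundary\s*=\s*("[^"]*"|[^;\s]+)', re.I)


def _canon_headers(part: Message) -> bytes:
    """Canonicalize the MIME headers of one part.

    Assumptions of this sketch: only Content-* fields are covered, the CTE
    field is dropped, the boundary parameter is stripped from Content-Type,
    names are lowercased, and folding/whitespace runs are collapsed.
    """
    lines = []
    for name, value in part.items():
        name = name.lower()
        if not name.startswith("content-"):
            continue
        if name == "content-transfer-encoding":
            continue
        value = str(value)
        if name == "content-type":
            value = _BOUNDARY.sub("", value)
        value = re.sub(r"\s+", " ", value).strip()
        lines.append(f"{name}:{value}\r\n".encode("utf-8"))
    # Sorting makes the result independent of header order.
    return b"".join(sorted(lines))


def mime_hash(part: Message) -> bytes:
    """Return a Merkle-style digest for a MIME part and everything below it."""
    header_hash = hashlib.sha256(_canon_headers(part)).digest()
    if part.is_multipart():
        # Interior node: hash the child digests; separators never enter.
        children = b"".join(mime_hash(child) for child in part.get_payload())
        body_hash = hashlib.sha256(b"node" + children).digest()
    else:
        # Leaf: hash the decoded content, so base64 vs. QP doesn't matter.
        data = part.get_payload(decode=True) or b""
        body_hash = hashlib.sha256(b"leaf" + data).digest()
    return hashlib.sha256(header_hash + body_hash).digest()
```

Calling mime_hash(message_from_bytes(raw)) on two copies of a message that
differ only in boundary string, header folding, or base64 versus
quoted-printable encoding of a part should give the same digest. Because each
part's digest depends only on its own subtree, digests for shared attachments
could be cached and reused across messages.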

But the question is, as always, whether or not defining such a thing is worth
the trouble. At this point I think the answer is "no".

				Ned