Re: [dmarc-ietf] Reversing modifications from mailing lists

Wei Chuang <weihaw@google.com> Sun, 28 November 2021 23:31 UTC

MIME-Version: 1.0
References: <CAAFsWK23GGfe+uSyPqa2wxFgRn3mk7G9ajtjfz6cKw-FaoFM_A@mail.gmail.com> <20211126042946.6F86F3091B76@ary.local> <01S6OJDEJX3O005PTU@mauve.mrochek.com>
In-Reply-To: <01S6OJDEJX3O005PTU@mauve.mrochek.com>
From: Wei Chuang <weihaw@google.com>
Date: Sun, 28 Nov 2021 15:31:06 -0800
Message-ID: <CAAFsWK27b=SBnUfdpTUq7n6p9nZ9E=ddkB+krEa_wYbZWFcksw@mail.gmail.com>
To: Ned Freed <ned.freed@mrochek.com>
Cc: John Levine <johnl@taugh.com>, dmarc@ietf.org
Content-Type: multipart/alternative; boundary="000000000000bfbc6c05d1e1b8f5"
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/azOsZNP8vJJKx3U7RgC3BgQ_irc>
Subject: Re: [dmarc-ietf] Reversing modifications from mailing lists

On Sat, Nov 27, 2021 at 7:34 PM Ned Freed <ned.freed@mrochek.com> wrote:

> > It appears that Wei Chuang <weihaw@google.com> said:
> > > If the RFC2045 canonical representation at the final destination can
> > > be the same as the canonical representation at the original sender, ...
>
> > When we were working on DKIM canonicalization we had lengthy discussions
> > about what to do about MIME and we decided not to even try.
>
> A mistake IMO.
>
> > There is no canonical
> > representation of a MIME message and nobody to my knowledge has ever
> > tried to describe what it would mean for two MIME messages to be
> > equivalent, since they could vary in a fantastic number of ways.
>
> First, a canonical form doesn't have to produce a 100% reliable equivalency
> test in order to be useful.
>
> Second, there can be more to a hash computation than a canonical form. This
> is especially true given that a MIME message is a tree.
>
> > Part separators can change, the
> > pieces of multipart/whatever might change, line breaks in
> > quoted-printable and base64 can change, spacing and capitalization of
> > headers can change, and that's just what I can think of in two minutes.
>
> If you treat the message as a Merkle tree with:
>
> o Separate header and body hashes
> o Decoding message bodies prior to hashing
> o Applying the already-defined unfolding/capitalization stuff from DKIM
>   to part headers.
> o Removing the CTE field and boundary value from CT fields in the header
>
> You end up with a value that's:
>
> o Invariant in regards to part separator changes
> o Invariant in regards to CTE changes
> o Invariant in regards to many/most common header changes
> o Allows for rapid computation of hashes for large numbers of large
>   messages that share common content.
>
> Which I note takes care of your list.
>

This approach, and its benefits, is what I was thinking could be feasible as
well.  The cited draft-kucherawy-dkim-list-canon
<https://datatracker.ietf.org/doc/html/draft-kucherawy-dkim-list-canon>
notes your contribution to the concept described there, i.e. performing the
hashing over the MIME tree (though that draft doesn't decode the
content-transfer-encoding).
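
A minimal sketch of the tree hashing in the quoted list, written against
Python's standard email package. It is not taken from the draft. The
choices of SHA-256, of covering only Content-* fields in each part, and of
sorting the canonicalized fields are assumptions made for illustration, and
part_hash, tree_hash and demo_message are hypothetical names.

```python
import base64
import hashlib
import re
from email import message_from_bytes
from email.message import Message

# Matches the boundary parameter of a Content-Type value so it can be dropped.
BOUNDARY_RE = re.compile(r';\s*boundary\s*=\s*("[^"]*"|[^;\s]*)', re.IGNORECASE)


def relaxed_header(name: str, value: str) -> bytes:
    """Roughly DKIM "relaxed" header canonicalization: lowercase the name,
    unfold the value, collapse whitespace runs, trim the ends."""
    value = re.sub(r"\r?\n", "", value)
    value = re.sub(r"[ \t]+", " ", value).strip()
    return f"{name.lower()}:{value}\r\n".encode("utf-8", "replace")


def part_hash(part: Message) -> bytes:
    """Hash one node of the MIME tree from separate header and body hashes."""
    lines = []
    for name, value in part.items():
        lname = name.lower()
        # Assumption: only Content-* fields are covered, and the CTE field
        # is left out because the body is hashed in decoded form.
        if not lname.startswith("content-") or lname == "content-transfer-encoding":
            continue
        value = str(value)
        if lname == "content-type":
            value = BOUNDARY_RE.sub("", value)
        lines.append(relaxed_header(name, value))
    # Assumption: sorting makes the result independent of field order.
    header_hash = hashlib.sha256(b"".join(sorted(lines))).digest()

    if part.is_multipart():
        # Interior node: the body hash covers the child hashes, in order.
        children = b"".join(part_hash(child) for child in part.get_payload())
        body_hash = hashlib.sha256(children).digest()
    else:
        # Leaf: undo base64 / quoted-printable before hashing.
        body_hash = hashlib.sha256(part.get_payload(decode=True) or b"").digest()
    return hashlib.sha256(header_hash + body_hash).digest()


def tree_hash(raw: bytes) -> str:
    return part_hash(message_from_bytes(raw)).hex()


def demo_message(boundary: str, cte: str, encoded: str) -> bytes:
    """Build a one-part multipart message for the comparison below."""
    return (
        f'Content-Type: multipart/mixed; boundary="{boundary}"\r\n'
        "\r\n"
        f"--{boundary}\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Transfer-Encoding: {cte}\r\n"
        "\r\n"
        f"{encoded}\r\n"
        f"--{boundary}--\r\n"
    ).encode("ascii")


original = demo_message("AAA", "base64", base64.b64encode(b"Hello, list!").decode())
relayed = demo_message("BBB", "quoted-printable", "Hello, list=21")
tampered = demo_message("AAA", "quoted-printable", "Hello, list=21 [footer]")

print(tree_hash(original) == tree_hash(relayed))   # same content, new boundary and CTE
print(tree_hash(original) == tree_hash(tampered))  # content actually changed
```

Because each node's hash depends only on its own subtree, a verifier could
cache the hashes of parts it has already seen, which is where the savings
for messages that share common content would come from.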


> But the question is, as always, whether or not defining such a thing is
> worth the trouble. At this point I think the answer is "no".
>

What kind of concern do you have?  Is it algorithmic complexity, runtime
cost, or header size overhead?

-Wei