Re: [perpass] draft-josefsson-email-received-privacy

"John Levine" <johnl@taugh.com> Sat, 24 October 2015 22:46 UTC

Return-Path: <johnl@taugh.com>
X-Original-To: perpass@ietfa.amsl.com
Delivered-To: perpass@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D749A1A877E for <perpass@ietfa.amsl.com>; Sat, 24 Oct 2015 15:46:46 -0700 (PDT)
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 2GTBi1haPkzV for <perpass@ietfa.amsl.com>; Sat, 24 Oct 2015 15:46:46 -0700 (PDT)
Received: from miucha.iecc.com (abusenet-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:1126::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id BC3401A877A for <perpass@ietf.org>; Sat, 24 Oct 2015 15:46:45 -0700 (PDT)
Received: (qmail 25997 invoked from network); 24 Oct 2015 22:46:45 -0000
Received: from unknown (64.57.183.18) by mail1.iecc.com with QMQP; 24 Oct 2015 22:46:45 -0000
Date: Sat, 24 Oct 2015 22:46:21 -0000
Message-ID: <20151024224621.15562.qmail@ary.lan>
From: John Levine <johnl@taugh.com>
To: perpass@ietf.org
In-Reply-To: <871tcl3f03.fsf@latte.josefsson.org>
Organization:
X-Headerized: yes
Mime-Version: 1.0
Content-type: text/plain; charset="utf-8"
Content-transfer-encoding: 8bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/perpass/tCoYUEasVsqoic5uigqyJL2l0oA>
Cc: simon@josefsson.org
Subject: Re: [perpass] draft-josefsson-email-received-privacy
X-List-Received-Date: Sat, 24 Oct 2015 22:46:47 -0000

>I agree with your recommendation to retain the Received header but
>modify what to put in it.  I believe the FROM clause should be removed
>completely, or we put in a "magic" (syntactically valid but semantically
>invalid) IPv4 or IPv6 address in it.  Similarly, implementations could
>put a magic time-stamp value in the field if they don't want to reveal
>when they received a particular message.

I was going to say it would make it impossible for me to send spam
reports, since I would have no way to tell who sent it, but then I
realized it would make no difference, since each Received header my
system adds has a sequence number I can look up in the mail logs to
find the connecting IP and time and a lot of other stuff.
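
A minimal sketch of that kind of lookup, assuming the identifier shows up
as an "id" clause in the Received header (as in the Postfix-style headers
on this message) and that the logs have been indexed by that identifier.
The MAIL_LOG dictionary, its field names, and the example addresses are
hypothetical stand-ins, not any real MTA's log format.

```python
import re

# Matches the optional "id <token>" clause of a Received header value.
ID_CLAUSE = re.compile(r"\bid\s+([^\s;]+)", re.IGNORECASE)


def received_id(header_value):
    """Return the id token from a Received header value, or None if absent."""
    match = ID_CLAUSE.search(header_value)
    return match.group(1) if match else None


# Hypothetical index built from the receiving system's own mail logs.
# The header below hides the sender, but the log entry does not.
MAIL_LOG = {
    "BC3401A877A": {
        "client_ip": "2001:db8::25",
        "received_at": "2015-10-24T22:46:45Z",
    },
}


def lookup_sender(header_value, log=MAIL_LOG):
    """Map a (possibly redacted) Received header back to its log entry."""
    queue_id = received_id(header_value)
    return log.get(queue_id) if queue_id is not None else None


# A Received header whose FROM clause carries a placeholder address.
redacted = (
    "from [192.0.2.0] by mx.example.net (Postfix) with ESMTPS "
    "id BC3401A877A for <list@example.net>; "
    "Sat, 24 Oct 2015 15:46:45 -0700 (PDT)"
)

print(lookup_sender(redacted))
```

The point of the sketch is only that redacting the FROM clause changes
what downstream readers see, not what the receiving operator can recover
from its own logs.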

But I'd like to back up a little.  You know how crypto people feel
when someone shows up with a wonderful new crypto scheme, and then
says, well, just tell me what's wrong with it?  Mail is a lot like
that.  It's much more complex and subtle than it appears, even to
people who've used it casually for a long time.

There are lots and lots of ways that a mail system can leak PII that
are unrelated to Received headers.  For example, MTAs look up the
connecting IP of each message in DNSBLs, they check SPF records, they
look up DKIM key records (which in my mail are unique for every
message), they look up DMARC records, they swap checksums with bulk
counting systems, they put stuff in Authentication-Results: headers,
and that's just what I can think of in two minutes.  When a mail system
is large enough that it has to spread the load across multiple MTAs,
there's more traffic among them to keep things in sync.
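
To make the DNSBL case concrete, here is a rough sketch of how a lookup
name is conventionally formed: the connecting address is reversed (by
octet for IPv4, by nibble for IPv6) and prepended to the list's zone.
The zone "dnsbl.example" and the function name are placeholders for
illustration. Nothing here touches the network; it only shows that the
query name itself carries the connecting IP to whoever answers or
observes the DNS query.

```python
import ipaddress


def dnsbl_query_name(client_ip, zone="dnsbl.example"):
    """Build the DNS name an MTA would query to check client_ip in a DNSBL.

    The zone is a placeholder; real deployments configure their own lists.
    """
    addr = ipaddress.ip_address(client_ip)
    if addr.version == 4:
        # IPv4: one label per octet, in reverse order.
        labels = str(addr).split(".")
    else:
        # IPv6: one label per hex nibble of the fully expanded address.
        labels = list(addr.exploded.replace(":", ""))
    return ".".join(reversed(labels)) + "." + zone


print(dnsbl_query_name("192.0.2.99"))
print(dnsbl_query_name("2001:db8::1"))
```

Every such query hands the sender's address to a third party, whatever
the Received header ends up saying.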

My suggestion would be to start by finding people who have experience
in large mail systems (Ned would be a good start if he has the time),
and then state clearly what you're trying to do.  It looks like it's
identifying and minimizing the amount of PII collected, reported (to
downstream consumers), and logged (for internal users) for incoming
mail.  Once you've done that, it'd be quite interesting to try and see
what gets collected, and what the tradeoffs are if you don't collect
it or don't report or log it.

R's,
John