Re: [Shutup] [ietf-smtp] real life privacy tradeoffs, was Proposed Charter

Ted Lemon <mellon@fugue.com> Wed, 02 December 2015 15:12 UTC

From: Ted Lemon <mellon@fugue.com>
To: shutup@ietf.org
In-Reply-To: <20151202144953.22592.qmail@ary.lan>
References: <20151202144953.22592.qmail@ary.lan>
Date: Wed, 02 Dec 2015 15:12:44 +0000
Message-Id: <1449069164873-53573cc1-d1798c08-2eeb118d@fugue.com>
MIME-Version: 1.0
Archived-At: <http://mailarchive.ietf.org/arch/msg/shutup/h72AndL1oNJxlPMkWIGSy9pCchc>
Cc: ietf-smtp@ietf.org
Subject: Re: [Shutup] [ietf-smtp] real life privacy tradeoffs, was Proposed Charter

Wednesday, Dec 2, 2015 9:49 AM John Levine wrote:
> Different people are different and it is not helpful to pretend that
> all end users are the same.  Most people say they care about privacy,
> but their actions show that they actually don't, e.g., they'll trade
> their password and SSN for a candy bar.

Show your data, or please stop making generalizations like this. This is really not helpful.

> Some people really do care about privacy.  I don't know if you've ever
> talked to someone who runs a battered women's shelter, but I have.
> For them, their privacy is really a matter of life and death, and they
> have to deal with impressively complex threats.  I've heard direct
> reports of malware that installs keyloggers that report back to the
> hostile spouse.  These people boot their computers from a CD to use
> webmail through Tor, and buy burner phones in bulk.  The kind of stuff
> we're talking about redacting here is completely irrelevant to them,
> since as I said, they are not so dim as to depend on their mail
> provider's logging practices for their safety.

So what you are saying is that there are two kinds of caring about privacy: not at all, or extremely. I'm sure these stereotypes are based on real personal experience, but anecdotes are not data, and this is not what actual research on this topic appears to show. We tend not to remember reasonable people in ordinary situations. We remember complete idiots, because it's funny or disturbing, and we remember people in trouble. Basing policy decisions on personal recollections tends to get things badly wrong.

> Christian's point about bulk collection is a reasonable one, but just
> as the collection affects a lot of people, the security benefits from
> good header logging affect a lot of people, too.  We need to start by
> understanding how they're really used and what the benefits are.

I agree that we need to understand this. I've been asking people who say they are in the know whether they could share some data with us. Since I only asked yesterday, it would be unreasonable to expect an answer already. Hopefully we will get some data.

> From what we've heard here from people who run significant mail
> systems for real users, the benefits are substantial.

We've heard unsubstantiated assertions to this effect, not accompanied by any data, yes.

I'm skeptical about this for two reasons. First, the one example someone presented that seemed to support the case for including IP-address-identifying information from the mail submission server actually looks to me like it makes things worse, not better, at least in the presence of a competently operated mail submission server.

Second, I'm pretty sure that if you are filtering spam, and then you add a heuristic that pays attention to the initial sender's source address, you will see some increase in messages identified as spam. However, if you saw that effect five years ago when you installed that heuristic, and your filtering software has since become a lot more sophisticated in its reliance on ML, you might still believe that the heuristic is making a big difference when it is now really making only a small one.

And you might never have instrumented it in such a way that you could discover the present truth of the matter.   And that's precisely the perfectly reasonable mindset from which claims that "it makes a big difference" would come without any data at all to back them up.
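
To make the instrumentation point concrete, here is a minimal sketch of the kind of measurement that would settle it: score the same labeled messages with and without the source-address heuristic and count how much spam is caught only because the heuristic is switched on. Everything in it is an assumption made for illustration: the Message fields, the additive scoring, the threshold of 5.0, and the sample scores are hypothetical and do not come from any real filter or from data anyone has shared on this list.

```python
from dataclasses import dataclass


@dataclass
class Message:
    is_spam: bool           # ground-truth label (hypothetical sample data)
    content_score: float    # score from content-based filtering (hypothetical)
    source_ip_score: float  # score from the source-address heuristic (hypothetical)


def caught(messages, threshold, use_source_ip):
    """Count spam messages whose total score meets the threshold."""
    count = 0
    for m in messages:
        score = m.content_score
        if use_source_ip:
            score += m.source_ip_score
        if m.is_spam and score >= threshold:
            count += 1
    return count


def marginal_gain(messages, threshold=5.0):
    """Spam caught only because the source-address heuristic was enabled."""
    with_heuristic = caught(messages, threshold, use_source_ip=True)
    without_heuristic = caught(messages, threshold, use_source_ip=False)
    return with_heuristic - without_heuristic


# The same four spam messages, scored by a weaker (older) content filter
# and by a stronger (newer) one. The heuristic's own score is unchanged.
older_filter = [Message(True, c, 2.0) for c in (3.0, 4.0, 6.0, 2.0)]
newer_filter = [Message(True, c, 2.0) for c in (6.0, 7.0, 6.0, 4.0)]

print("gain with older filter:", marginal_gain(older_filter))
print("gain with newer filter:", marginal_gain(newer_filter))
```

In this made-up sample the heuristic's marginal gain shrinks as the content filter improves, even though the heuristic itself has not changed. A real measurement would also have to count the legitimate mail that the heuristic pushes over the threshold, which this sketch leaves out.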


--
My PGP key: https://keys.whiteout.io/mellon@fugue.com