RE: Next steps on Web Analytics Project

Roman Danyliw <rdd@cert.org> Mon, 07 October 2019 20:37 UTC

From: Roman Danyliw <rdd@cert.org>
To: S Moonesamy <sm+ietf@elandsys.com>, Stephen Farrell <stephen.farrell@cs.tcd.ie>, "ietf@ietf.org" <ietf@ietf.org>
Subject: RE: Next steps on Web Analytics Project
Date: Mon, 07 Oct 2019 20:37:06 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/hSJQaRZkQfF3CEPVx4zaAxCE7fs>

Hi!

> -----Original Message-----
> From: S Moonesamy [mailto:sm+ietf@elandsys.com]
> Sent: Friday, September 27, 2019 7:29 PM
> To: Stephen Farrell <stephen.farrell@cs.tcd.ie>; ietf@ietf.org
> Cc: Roman Danyliw <rdd@cert.org>
> Subject: Re: Next steps on Web Analytics Project
> 
> Hi Stephen,
> At 03:11 PM 27-09-2019, Stephen Farrell wrote:
> >Yes, tracking what and when becomes possible.
> >
> >I'm also unhappy with that. Is there no way to ensure that addresses
> >and geolocated regions are sufficiently aggregated so as to not
> >identify individuals?
> >
> >/16's and countries are not sufficient for all IETFers.
> >
> >I'm sure someone who reads this list would have a fair chance at
> >(re-)identifying various individuals based on time, /16 or /48, and
> >URL.
> 
> It is technically possible to identify a person or a small set of persons even if
> the IPv4 addresses are aggregated by /14.  I suggest stepping back a little.
> The technical solution is being used to drive the policy statement.  Would it
> be better to do the reverse to figure out what is feasible [1]?  That would
> entail flushing out the policy statement to get a sense of what information
> IESG members [2][3] would find useful.

I'm not entirely following how the technical solution is driving the policy statement (i.e., the motivation for the project).  Section 1.1 identifies what information is useful -- the use cases and questions that would be helpful to answer in order to improve the website.  Section 2.2 describes a candidate solution based on the needs dictated by those use cases.  Section 2.3 provides a mapping between the individual data elements that will be collected by the solution and these motivating use cases.  Sections 3 and 4 acknowledge that there are security and privacy issues in implementing this policy and provide a series of mitigations.

You're right that, ultimately, the technology solution (Matomo) does drive some of the mitigations, as it provides only certain types of anonymization and aggregation primitives.
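
To make the aggregation question in the quoted text concrete, here is a minimal sketch of prefix-based address aggregation using Python's standard ipaddress module. It is an illustration only and is not the analytics tool's configuration or API: the function names, the default prefix lengths (/16 and /48, taken from the figures mentioned in this thread), and the threshold k are all assumptions made for the example.

```python
import ipaddress
from collections import Counter


def aggregate_address(addr: str, v4_prefix: int = 16, v6_prefix: int = 48) -> str:
    """Map an address to its enclosing network (hypothetical helper)."""
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False zeroes the host bits instead of raising an error.
    return str(ipaddress.ip_network(f"{addr}/{prefix}", strict=False))


def small_buckets(addrs, k: int = 2, **prefixes):
    """Return aggregated networks containing fewer than k distinct addresses.

    Such buckets are the ones where aggregation alone may not hide an
    individual; k is an illustrative threshold, not a recommended value.
    """
    counts = Counter(aggregate_address(a, **prefixes) for a in set(addrs))
    return sorted(net for net, n in counts.items() if n < k)


if __name__ == "__main__":
    log = ["192.0.2.77", "192.0.9.1", "198.51.100.5", "2001:db8:abcd:12::1"]
    for a in log:
        print(a, "->", aggregate_address(a))
    print("buckets below threshold:", small_buckets(log, k=2))
```

The second function reflects the concern raised in the thread: a fixed prefix length does not guarantee that each bucket contains many visitors, so a sparsely populated /16 or /48 can still point at one person when combined with time and URL.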

Regards,
Roman

> The data processor could then use a
> "custom dimension" to decrease the probability of identification of that
> small set of persons.
>
> Regards,
> S. Moonesamy
> 
> 1. Please see the P.S. in your email
> 2. One of the issues is that web analytics usually use the IP addresses to
> aggregate by country.  Does an IESG member need to know whether Country
> X has expressed an interest in, for example, the IESG history of appeals?
> 3. Does an IETF LLC Director need to know who is reading the monthly
> financial statements?