Re: [Pearg] Feedback and suggestions on draft-irtf-pearg-safe-internet-measurement-08

Mallory Knodel <> Mon, 13 November 2023 14:22 UTC


These are great, thank you, Jeroen. I'm adding them to the GitHub
issue tracker for the next version.


On 11/13/23 7:57 AM, Jeroen van der Ham wrote:
> Hi,
> The draft was brought to my attention by a colleague. I later saw that some comments were already made, my apologies if my comments are duplicates.
> Some background comments:
> * It is good to see that a document on this topic is being prepared, especially as this kind of activity is getting more and more attention. Ethics review is being performed more and more, and academic conferences on internet measurements are requiring it more as well.
> We should consider aligning the best practices here and for the conferences.
> See:
> * Furthermore, as a member of a computer science ethics committee we recently described our approach to reviewing ethics requests:
> There we take a more general angle at measurements on the internet, and also consider scanning the internet for security vulnerabilities. That angle seems to be completely missing in this draft, even though it is a very closely related activity.
> Both these resources provide some background reading that may help in general in improving this draft.
> Some more direct comments on the current text (-08):
> * Section 1.2: there is no definition for “One-/two-ended”?
> * Section 1.3: Describing ‘Measurement Studies’ as ‘attacks’ does not make sense to me. I can understand you want to call attention to the user impact of a measurement, but then I would describe that in terms of ‘privacy impact’, ‘information leakage’, etc.
> * Section 2.1.1: The risks described here are conjecture for a hypothetical example. That is not really the best way to accessibly describe this.
> As a suggestion:
> The experiment can carry substantial risk for the user depending on their local context. Trying to access censored material can be seen as (network) policy infringement or breaking laws. Even if the experimenter wants to expose volunteers to this kind of risk, they must thus be fully informed, and voluntarily give consent to run the measurement. And even then experimenters should seriously consider designing their experiment in another way.
> * Section 2.1.3: the A/B example seems very convoluted in its description. It reads like the author had some kind of example in mind, but tried too hard to abstract away from it, or was using two examples at the same time.
> * Section 2.4: There is no conclusion drawn from the fact that it may be possible to infer members of the "do not scan" list.
> * Section 2.5: replace "it" with "data" in all the section titles.
> * Section 2.5.1: you mean data that the measurement generated but also the data generated as response from the subjects?
> * Suggested change: For performance benchmarking, [RFC2544] requires that any frames ..
> * Section 2.5.2: we published a survey on masking data several years ago, but I don't think things changed that much since then:
> Most important consideration from our research: masking can do pseudonymization (i.e. making it harder to immediately infer identity), but there is almost no masking that can provide anonymization (i.e. making it impossible to infer identity).
> Happy to help out further.
> Regards,
> Jeroen van der Ham-de Vos.
Mallory Knodel
CTO :: Center for Democracy and Technology
newsletter ::