[Mtgvenue] Exploration of a "posting" metric - draft-elkins-mtgvenue-participation-metrics

"Fred Baker (fred)" <fred@cisco.com> Wed, 20 July 2016 09:37 UTC

From: "Fred Baker (fred)" <fred@cisco.com>
To: "nalini.elkins@insidethestack.com" <nalini.elkins@insidethestack.com>
Date: Wed, 20 Jul 2016 09:37:08 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/mtgvenue/VuRwDFfCriJ_eXnCnYIgeyQF4Ok>
Cc: "mtgvenue@ietf.org" <mtgvenue@ietf.org>
Subject: [Mtgvenue] Exploration of a "posting" metric - draft-elkins-mtgvenue-participation-metrics

In your draft and in your talk, you mention the possibility of measuring a person's participation or contribution in terms of email postings. You also mention a number of other possible models, such as whether their name winds up being "acknowledged" or listed as a "contributor" in a draft or RFC.

I took a crack at extracting information about email postings from the mail archive. You can find a very preliminary result at https://www.dropbox.com/sh/2t1d3m96fi4brbf/AACqZ8-fQ08bbtFwJA9ZQczQa?dl=0

The point of that was to identify low-hanging fruit. We can and do measure a person's registration for or attendance at face-to-face meetings. We can also determine whether their email address shows up in an internet draft (I find the email address easier to think about than their name, as names can have several formats, wrap around line breaks, and so on). In this prototype, I rsync'd the mailman archive, and for each list and each month's box capture, parsed the file to find "From:" lines from emails.
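As an illustration only, the extraction step might look like the Python sketch below. It is not the script used for the prototype. It assumes each monthly archive file is in mbox format (messages separated by a line starting with "From ", headers ending at the first blank line), and the function name `senders_from_mbox_text` is hypothetical.

```python
from email.utils import parseaddr


def senders_from_mbox_text(text):
    """Yield the lower-cased sender address of each message in mbox-style text.

    Assumption: every message starts with a "From " separator line and its
    headers end at the first blank line, as in a conventional mbox file.
    """
    in_headers = False
    for line in text.splitlines():
        if line.startswith("From "):
            # mbox separator line: a new message (and its header block) begins
            in_headers = True
        elif in_headers and line == "":
            # a blank line ends the header block; body lines are ignored
            in_headers = False
        elif in_headers and line.lower().startswith("from:"):
            # parseaddr splits 'Display Name <user@host>' into (name, address)
            _, addr = parseaddr(line[5:])
            if addr:
                in_headers = False  # count at most one sender per message
                yield addr.lower()
```

Restricting the match to the header block keeps quoted "From:" lines inside message bodies from being counted as postings.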

The output is in the CSV file in that dropbox link. There's nothing magic in there; we could easily sort or group things differently. What I tracked in this version was each sender's email address, the lists they posted to, the total number of messages they sent to each such list, and how many messages they sent to it in each month.
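A tally of that shape could be sketched as follows. This is an assumption about layout, not the actual format of the CSV in the link: one row per (address, list) pair, a total column, then one column per month. The names `posting_table` and `to_csv` and the "YYYY-MM" month keys are hypothetical.

```python
import csv
import io
from collections import defaultdict


def posting_table(records):
    """Build rows of per-sender, per-list message counts.

    records: iterable of (address, list_name, month) tuples, one per message,
    where month is a sortable string such as "2016-07" (an assumed format).
    """
    counts = defaultdict(lambda: defaultdict(int))
    months = set()
    for addr, list_name, month in records:
        counts[(addr, list_name)][month] += 1
        months.add(month)
    months = sorted(months)
    # header row: fixed columns followed by one column per month seen
    rows = [["address", "list", "total"] + months]
    for (addr, list_name), per_month in sorted(counts.items()):
        rows.append([addr, list_name, sum(per_month.values())]
                    + [per_month.get(m, 0) for m in months])
    return rows


def to_csv(rows):
    """Render the rows as CSV text."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()
```

Sorting by a different key, or pivoting so that months become rows, would only change the last loop.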

My thinking is that it's pretty easy to associate an email address with a mailing list and determine that address's transmission rate. If one knew that a given person used a given email address, one could determine that person's transmission rate on a monthly basis. What isn't readily found is anything about topic, or about whether the contribution is positive or negative, useful or otherwise, etc. Semantic analysis, and perhaps the distinction between "participation" and "contribution", is not, from my perspective, "low-hanging fruit".

I'm wondering here what you want in terms of a participation metric.