Re: [Last-Call] OT: change BCP 83 [Re: Last Call: BCP 83 PR-Action Against Dan Harkins]

Toerless Eckert <> Wed, 05 October 2022 17:29 UTC

Date: Wed, 05 Oct 2022 19:29:18 +0200
From: Toerless Eckert <>
To: Colin Perkins <>
Cc: Adam Roach <>, Stephen Farrell <>, Eliot Lear <>, Mladen Karan <>,, IETF Chair <>, Ravi Shekhar <>

On Wed, Oct 05, 2022 at 10:07:20AM +0100, Colin Perkins wrote:


> The IETF still maintains IMAP access to the mail archive for lists hosted on
>, so access to the emails is straightforward.

For me, using mutt as a high-speed text mail reader:

cat ~/bin/ietf-list 
#!/bin/sh -f
mutt -f "imaps:// Folders/$1"

Alas, not very fast for "old" mailing lists with many messages (does IMAP suck for
bulk access? No idea, too little experience, could just be my client).
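
One way to tell whether the slowness is the protocol or the client would be to
time a single batched header fetch outside of mutt. A minimal sketch with
Python's imaplib, assuming an already logged-in connection; the helper name
fetch_date_headers and the mailbox name passed to it are hypothetical, and
nothing here is specific to the IETF server:

```python
import imaplib  # only needed for the usage shown in the comment below


def fetch_date_headers(conn, mailbox):
    """Return the raw Date header of every message in `mailbox`.

    `conn` is assumed to be a logged-in imaplib.IMAP4_SSL object (or
    anything offering the same select/fetch methods).
    """
    # Mailbox names containing spaces have to be quoted by the caller.
    typ, data = conn.select('"%s"' % mailbox, readonly=True)
    if typ != "OK":
        raise RuntimeError("cannot select %s" % mailbox)
    count = int(data[0])
    if count == 0:
        return []
    # One FETCH for the whole range instead of one round trip per message.
    typ, parts = conn.fetch("1:%d" % count,
                            "(BODY.PEEK[HEADER.FIELDS (DATE)])")
    if typ != "OK":
        raise RuntimeError("fetch failed")
    # imaplib returns (envelope, payload) tuples interleaved with b")" items.
    return [p[1].decode("ascii", "replace").strip()
            for p in parts if isinstance(p, tuple)]


# Usage (host and credentials are placeholders):
#   conn = imaplib.IMAP4_SSL("imap.example.org")
#   conn.login("user", "password")
#   dates = fetch_date_headers(conn, "Some Folder/some-list")
```

If this returns quickly for a large list, the bottleneck is more likely the
client's per-message handling than IMAP itself.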

B) Regarding "classifying" messages:

Given that moderation (of forums/lists/submissions) is a big business,
it would at least be interesting to learn how much an external, commercial
service provider would charge for what type of moderation service, and how
the policies for moderation could be set.
I would guess a good number of app/mail participants in the IETF might have an idea.
I am mostly curious, not suggesting that this would be a solution, or for what problem.
The main point of interest is whether it is possible to get less biased
but still useful evaluations by explicitly NOT using community members to do the assessment.
(aka: I can see how it's easy for non-community members to be less biased, but
 I also fear they could be less effective at figuring out what is and what is not
 a valid technical argument...).


> > Then you either:
> > 
> >  * Take a suitably large random sample of messages over the past 37
> >    years (work out the size of the corpus and determine what you want
> >    your confidence interval to be), and assign a team to score which
> >    ones they believe meet some relevant criteria (e.g., violate today's
> >    code of conduct). You'll want at least two people -- and preferably
> >    more -- of differing backgrounds to look at each message to
> >    countervail certain kinds of biases. Or
> This would be time-consuming and expensive, but would likely give an
> interesting result.
> >  * Use one of the several available forum management tools to
> >    automatically score each message. Details vary, but most such tools
> >    will generate both "toxicity" and "sentiment" scores that you can
> >    plot over time. The ones I'm familiar with are run as a service, so
> >    you'd need to perform some light API integration (which might be as
> >    easy as piping formail into a curl command); although it's entirely
> >    possible that offline tools are also available.
> > 
> > Again, I know how to do this, but can't invest the resources. Let me
> > know if you're earnest, and I'll happily consult with you on getting it
> > to work.
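
For the random-sample option, the "work out the size of the corpus and
determine your confidence interval" step can be sketched with the textbook
sample-size formula for estimating a proportion (Cochran's formula with a
finite-population correction). This is only an illustration: the function
name and the corpus size used in the usage comment are hypothetical, not
figures from the archive.

```python
import math
from statistics import NormalDist


def sample_size(corpus_size, margin=0.05, confidence=0.95, p=0.5):
    """Messages to sample so that an estimated proportion (e.g. the share
    violating a code of conduct) has the given margin of error.

    p=0.5 is the worst case and gives the largest sample.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinitely large corpus.
    n0 = z * z * p * (1 - p) / (margin * margin)
    # Finite-population correction for a corpus of known size.
    n = n0 / (1 + (n0 - 1) / corpus_size)
    return math.ceil(n)


# Hypothetical corpus of 100000 messages, +/-5% at 95% confidence:
#   sample_size(100000)
```

The sample size grows only slowly with the corpus, so the cost is driven
mostly by the margin of error and by how many reviewers look at each message.
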
> I’m part of [a project]( that’s doing mailing
> list analysis of IETF data, and the recent [IAB AID
> workshop]( also explored this
> topic.
> We haven’t spent too much time looking at sentiment analysis, but my
> colleagues took a quick look at messages on the list.
> The plots below show the average extent, expressed in the range 0…1, to
> which text in emails sent to that list in each year rate as positive,
> negative, or neutral sentiment, according to the [VADER Sentiment
> Analysis]( library:
> [Plot: average positive, negative, and neutral sentiment per year]
> Redrawing the plot with a different range, to focus on the positive and
> negative sentiment categories, it’s clear that messages labelled as positive
> sentiment outweigh those labelled as negative, but there’s a significant
> fraction of negativity. Proportions don’t look to be changing significantly
> over time.
> [Plot: positive and negative sentiment per year, rescaled range]
> Sentiment analysis, of course, is a crude measure that doesn’t necessarily
> correlate with toxicity. It’d be interesting to analyse further, and look at
> the other mailing lists too.
> If anyone’s interested in exploring the data further, [our
> project](  will be at the hackathon at the
> London IETF in a few weeks - come talk to us.
> Colin
> (with thanks to Mladen Karan and Ravi Shekhar, cc’d, for wrangling the data)
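
The per-year averaging described above could look roughly like the sketch
below. It is not the project's actual code: the function name and the
(year, text) input shape are assumptions, and the scorer is passed in as a
parameter so the aggregation does not depend on any particular library.

```python
from collections import defaultdict


def yearly_mean_sentiment(messages, score):
    """Average the positive/negative/neutral shares per year.

    `messages` is an iterable of (year, text) pairs; `score` maps a text
    to a dict with "pos", "neg" and "neu" values in the range 0..1.
    """
    sums = defaultdict(lambda: {"pos": 0.0, "neg": 0.0, "neu": 0.0})
    counts = defaultdict(int)
    for year, text in messages:
        result = score(text)
        for key in ("pos", "neg", "neu"):
            sums[year][key] += result[key]
        counts[year] += 1
    # Divide each yearly sum by the number of messages seen that year.
    return {year: {k: v / counts[year] for k, v in sums[year].items()}
            for year in sorted(sums)}
```

With the vaderSentiment package installed, the polarity_scores method of its
SentimentIntensityAnalyzer should fit as the score argument, since it returns
a dict with "pos", "neg" and "neu" entries (plus "compound", which this
sketch ignores).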
