Re: [Last-Call] OT: change BCP 83 [Re: Last Call: BCP 83 PR-Action Against Dan Harkins]

On Wed, Oct 05, 2022 at 10:07:20AM +0100, Colin Perkins wrote:

A)

> The IETF still maintains IMAP access to the mail archive for lists hosted on
> ietf.org, so access to the emails is straightforward.

For me, using mutt as the high-speed text mail reader:

cat ~/bin/ietf-list 
#!/bin/sh -f
mutt -f "imaps://anonymous:tteATcs.fau.de@imap.ietf.org/Shared Folders/$1"

Alas, this is not very fast for "old" mailing lists with many messages (does IMAP
suck for bulk access? No idea, too little experience; it could just be my client).
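
To check whether it is the protocol or the client, one could time a single
bulk FETCH of just the headers outside mutt. A minimal Python sketch, assuming
the same anonymous login as in the script above (the password argument is
whatever address you use there). The function names are made up for
illustration, and I have not measured this against the IETF server:

```python
import email
import imaplib
from email.message import Message

HOST = "imap.ietf.org"  # same server as in the mutt script above


def quote_mailbox(list_name: str) -> str:
    """Mailbox names containing spaces must be sent as IMAP quoted strings."""
    return '"Shared Folders/{}"'.format(list_name)


def parse_headers(fetch_data) -> list[Message]:
    """Turn imaplib FETCH output into parsed header objects.

    imaplib returns a list that mixes (envelope, payload) tuples with bare
    bytes such as b')'; only the tuples carry message data.
    """
    return [email.message_from_bytes(item[1])
            for item in fetch_data if isinstance(item, tuple)]


def fetch_all_headers(list_name: str, password: str) -> list[Message]:
    """Fetch Date/From/Subject of every message with one FETCH command."""
    with imaplib.IMAP4_SSL(HOST) as conn:
        conn.login("anonymous", password)
        typ, count = conn.select(quote_mailbox(list_name), readonly=True)
        if typ != "OK" or not count[0] or int(count[0]) == 0:
            return []
        # BODY.PEEK avoids setting the \Seen flag; 1:* covers the whole folder.
        typ, data = conn.fetch(
            "1:*", "(BODY.PEEK[HEADER.FIELDS (DATE FROM SUBJECT)])")
        return parse_headers(data) if typ == "OK" else []
```

Calling `fetch_all_headers("last-call", "you@example.org")` (placeholder
address) issues one FETCH for the whole folder, so if that returns quickly the
slowness is more likely in how the client walks the folder than in IMAP itself.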

B) wrt "classifying" messages:

Given how moderation (of forums/lists/submissions) is a big business,
it would at least be interesting to learn how much an external, commercial
service provider would charge for what type of moderation service, and how
the policies for moderation could be set.
I would guess a good number of app/mail participants in the IETF might have an idea.
I am mostly curious, not suggesting that this would be a solution, or for what problem.
The main point of interest is whether it is possible to get less biased
but still useful evaluations by explicitly NOT using community members to do the assessment
(aka: I can see how it's easy for non-community members to be less biased, but
 I also fear they could be less effective at figuring out what is and what is not
 a valid technical argument...).

Cheers
    Toerless

> 
> > Then you either:
> > 
> >  * Take a suitably large random sample of messages over the past 37
> >    years (work out the size of the corpus and determine what you want
> >    your confidence interval to be), and assign a team to score which
> >    ones they believe meet some relevant criteria (e.g., violate today's
> >    code of conduct). You'll want at least two people -- and preferably
> >    more -- of differing backgrounds to look at each message to
> >    countervail certain kinds of biases. Or
> 
> This would be time-consuming and expensive, but would likely give an
> interesting result.
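
For a feel of how large such a sample would need to be, here is a rough sketch
using Cochran's formula with a finite population correction. The corpus size
below is a placeholder (I don't know the real number), and the 95% confidence
level and 5% margin are just common defaults, not something proposed in this
thread:

```python
import math


def sample_size(population: int, margin: float = 0.05,
                z: float = 1.96, p: float = 0.5) -> int:
    """Messages to sample so that an estimated proportion is within
    +/- margin at the confidence level implied by z (1.96 ~ 95%).

    p = 0.5 is the worst case, used when the true share of
    offending messages is unknown.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    # Cochran's formula for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction shrinks the sample for small corpora.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# Hypothetical corpus of 100,000 messages.
print(sample_size(100_000))
```

With those defaults the result never exceeds 385 messages, however large the
archive is, so the cost is mostly in having two or more people read each one.
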
> 
> >  * Use one of the several available forum management tools to
> >    automatically score each message. Details vary, but most such tools
> >    will generate both "toxicity" and "sentiment" scores that you can
> >    plot over time. The ones I'm familiar with are run as a service, so
> >    you'd need to perform some light API integration (which might be as
> >    easy as piping formail into a curl command); although it's entirely
> >    possible that offline tools are also available.
> > 
> > Again, I know how to do this, but can't invest the resources. Let me
> > know if you're earnest, and I'll happily consult with you on getting it
> > to work.
> 
> I’m part of [a project](https://sodestream.github.io) that’s doing mailing
> list analysis of IETF data, and the recent [IAB AID
> workshop](https://www.iab.org/activities/workshops/aid/) also explored this
> topic.
> 
> We haven’t spent too much time looking at sentiment analysis, but my
> colleagues took a quick look at messages on the ietf@ietf.org list.
> 
> The plots below show the average extent, expressed in the range 0…1, to
> which text in emails sent to that list in each year rate as positive,
> negative, or neutral sentiment, according to the [VADER Sentiment
> Analysis](https://github.com/cjhutto/vaderSentiment) library:
> 
> 
> ![](cid:213515C1-D217-49DF-917F-1917027BFFE5@csperkins.org "Unknown.png")
> 
> Redrawing the plot with a different range, to focus on the positive and
> negative sentiment categories, it’s clear that messages labelled as positive
> sentiment outweigh those labelled as negative, but there’s a significant
> fraction of negativity. Proportions don’t look to be changing significantly
> over time.
> 
> ![](cid:DEBEA8F9-D09E-400E-B002-2092F65EF089@csperkins.org "Unknown.png")
> 
> Sentiment analysis, of course, is a crude measure that doesn’t necessarily
> correlate with toxicity. It’d be interesting to analyse further, and look at
> the other mailing lists too.
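
For other lists, the per-year averaging could look roughly like the sketch
below. It takes the scorer as a parameter: with VADER that would be
`SentimentIntensityAnalyzer().polarity_scores`, which returns a dict with
`pos`, `neg`, `neu` (and `compound`) values. The function name and the input
shape (pairs of Date header and body text) are my assumptions, not how the
sodestream project does it:

```python
from collections import defaultdict
from email.utils import parsedate_to_datetime
from statistics import mean


def yearly_means(messages, score):
    """Average pos/neg/neu scores per calendar year.

    messages: iterable of (date_header, body_text) pairs.
    score:    callable mapping text to a dict with 'pos', 'neg', 'neu'.
    """
    by_year = defaultdict(list)
    for date_header, text in messages:
        try:
            year = parsedate_to_datetime(date_header).year
        except (TypeError, ValueError):
            continue  # skip messages whose Date header cannot be parsed
        by_year[year].append(score(text))
    return {
        year: {key: mean(s[key] for s in scores)
               for key in ("pos", "neg", "neu")}
        for year, scores in sorted(by_year.items())
    }
```

The resulting dict maps each year to its three averages, which is the shape
one would need to redraw plots like the ones quoted here for another list.
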
> 
> If anyone’s interested in exploring the data further, [our
> project](https://sodestream.github.io) will be at the hackathon at the
> London IETF in a few weeks - come talk to us.
> 
> Colin
> (with thanks to Mladen Karan and Ravi Shekhar, cc’d, for wrangling the data)


-- 
---
tte@cs.fau.de