Re: Documents with no authors
John C Klensin <john-ietf@jck.com> Fri, 23 November 2018 17:25 UTC
Date: Fri, 23 Nov 2018 12:25:41 -0500
From: John C Klensin <john-ietf@jck.com>
To: Doug Royer <douglasroyer@gmail.com>, ietf@ietf.org
Subject: Re: Documents with no authors
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/bZWXx1ZQV5eJDBM99Mg65axAEqg>
--On Friday, November 23, 2018 07:57 -0700 Doug Royer <douglasroyer@gmail.com> wrote:

> On 11/23/18 3:29 AM, Stewart Bryant wrote:
>> https://datatracker.ietf.org/stats/document/authors/
>>
>> Why do such a high proportion of our documents (for example
>> 1929 RFCs) have no authors?
>
> Well, RFC-1929 does have an author. So I am guessing the
> automated tools can not (or did not) parse the older text only
> documents.

A different guess would be that whatever tool/algorithm produces this graph counts a document with "only" an editor as having no author. If that is the way things are counted, this sort of statistic would not be surprising. Indeed, if documents that came out of a WG and that were ultimately compendiums of input from many WG participants were identified as having an editor and not an author or handful of authors, I'd expect 16.79% to be somewhat low and hope 25.21% (of RFCs only) would be low too.

Doug, if the problem were "text only", then one would expect a much larger number. If you intended "XML available" then that wasn't defined until RFC 2629 and, IIRC, the RFC Editor didn't start accepting the XML files, much less archiving them, until much later. If it were "XML or nroff", I don't know: it might depend on whether the documents that were submitted/archived on paper and then scanned and converted passed through an nroff phase.

More important to this little detective job, if one adds up the numbers in the right column of the "RFCs" tab, one ends up with 8311, a fair approximation to the largest RFC number as of this morning (8521), and a closer one if the number "not issued" (79) is subtracted (8442). Could the difference of about 210 be documents that have been issued numbers but are still in the publication queue? I don't know, but, given the highest issued numbers are 8496, 8505, and 8521, it doesn't seem entirely implausible.
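For anyone who wants to check the arithmetic above, here is a minimal sketch using only the three figures quoted in this message. The variable names are illustrative and are not taken from the datatracker. It shows that the gap is 210 against the highest RFC number and 131 once the never-issued numbers are subtracted.

```python
# Figures quoted in the message (as of 23 November 2018).
rfcs_counted = 8311        # sum of the right column of the "RFCs" tab
highest_rfc_number = 8521  # largest RFC number at the time
never_issued = 79          # RFC numbers reported as "not issued"

# Gap between the highest RFC number and the number of RFCs counted.
raw_gap = highest_rfc_number - rfcs_counted

# RFC numbers that could correspond to a real document.
issuable = highest_rfc_number - never_issued

# Gap remaining after the never-issued numbers are removed.
adjusted_gap = issuable - rfcs_counted

print(raw_gap, issuable, adjusted_gap)
```

Whether the remaining gap is in fact made up of documents still in the publication queue is the open question; the sketch only confirms the subtraction.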
Similar comments would apply to I-Ds: as far as I know, it has never been possible to post one without an identifiable author or editor. There are definitely a few pseudonyms but those are still authors for the purpose of this type of count. Possibly something slipped through the cracks, but I'd expect that number to be in single digits.

Moreover, counting an RFC as having "no author" when it was really "not parsed" would be seriously irresponsible and I would not expect that of the tools team. FWIW, I would expect pages like these to show the date compiled (perhaps there was a lag between the RFC list or I-D list and the compilation date/time) and exactly what is reported as "0 authors".

--your friendly statistical detective