Re: [Tools-discuss] Google Scholar not indexing Internet-Drafts

John R Levine <johnl@taugh.com> Fri, 30 July 2021 11:35 UTC

Date: Fri, 30 Jul 2021 07:35:11 -0400
Message-ID: <e415ab95-9863-78c9-b111-6b0dd2aef@taugh.com>
From: John R Levine <johnl@taugh.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Tools Discussion <tools-discuss@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-discuss/gh9IhWkcwYtZVt60P7v9jgdlb68>

> Clearly, Google Scholar should be taught to find the canonical RFCs at https://rfc-editor.org/rfc - having many secondary copies is an SEO anathema.

Right.  I hope some sitemaps will help.
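For illustration only, here is a minimal sketch of what such a sitemap could look like when generated with Python. The base URL is the one quoted above; the `rfcNNNN.html` file-name pattern, the helper names `rfc_url` and `build_sitemap`, and the sample RFC numbers are assumptions made for the example, not a description of what the RFC Editor site actually publishes.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Base URL taken from the quoted message.
BASE = "https://rfc-editor.org/rfc"


def rfc_url(number):
    # Assumption: canonical copies follow an rfcNNNN.html naming pattern.
    return f"{BASE}/rfc{number}.html"


def build_sitemap(rfc_numbers):
    """Return a sitemap XML string with one <url> entry per RFC number."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for number in rfc_numbers:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = rfc_url(number)
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample input: two RFC numbers chosen for the example.
    print(build_sitemap([2026, 8446]))
```

Listing only the canonical location of each document is the point: the sitemap tells a crawler which copy to prefer, but it does not by itself remove the secondary copies from an index.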

> Internet-Drafts, hmm.  Many are not arxiv quality, and we don’t want the 
> I-D repository as a dumping ground for people who just want to get their 
> garbage into Scholar.  But the main problem is that the references to 
> the drafts will stay active even when the RFC has been published(*) (or 
> the I-D replaced), so we should be very careful with what we offer under 
> the URI that will be indexed.

I agree they're not what Scholar is looking for.

I'm not worried about people gaming it; you should be able to get anything
into Scholar if you put it on a web server with the right metadata. But
the last thing I want is yet another reason for people to imagine that
every I-D is an Internet Standard.

Regards,
John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly