Re: [rfc-i] Wrong Internet search results for new RFCs

Toerless Eckert <tte@cs.fau.de> Tue, 03 May 2022 17:56 UTC

Date: Tue, 03 May 2022 19:56:05 +0200
From: Toerless Eckert <tte@cs.fau.de>
To: John R Levine <johnl@taugh.com>
Cc: rfc-interest@rfc-editor.org
Message-ID: <YnFstfHL/ppX22Nd@faui48e.informatik.uni-erlangen.de>
References: <YnCECATOh9mI1HY4@faui48e.informatik.uni-erlangen.de> <20220503021720.69EDC3F4BACA@ary.qy> <YnCkoCzNTr3aquH/@faui48e.informatik.uni-erlangen.de> <710830b2-0361-83e0-314f-3b39ee9422d2@taugh.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <710830b2-0361-83e0-314f-3b39ee9422d2@taugh.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rfc-interest/mykWDUc1Q9UDWVbKdkljOOJpxGo>
Subject: Re: [rfc-i] Wrong Internet search results for new RFCs

On Tue, May 03, 2022 at 01:18:09PM -0400, John R Levine wrote:
> > Sure, but they need to find the URL first, and given that these "authors"
> > URLs are typically only shared in private email, I wouldn't know where the
> > search engines would find the URLs other than from the index URL.
> 
> They only add URLs that work.

Sure. Absent a sitemap, Google would have had to find some place that
included the URL https://www.rfc-editor.org/rfc/authors/, then find the list
of URLs there, check them, and add them, expiring them only periodically.
And at that point in time, rfc9148.html was valid.
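
To illustrate the discovery path described above, here is a minimal sketch of how a crawler could turn one index page into a list of candidate URLs. The SAMPLE_INDEX markup and the helper names (LinkCollector, discover_links) are assumptions made up for this example; this is not how Google's crawler is actually implemented, and the real index page may look different.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href=...> tags on a single page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they were found on.
                self.links.append(urljoin(self.base_url, value))


def discover_links(base_url, html_text):
    """Return every link found in html_text, resolved against base_url."""
    parser = LinkCollector(base_url)
    parser.feed(html_text)
    return parser.links


# Hypothetical directory listing; the real page content is an assumption.
SAMPLE_INDEX = (
    '<html><body>'
    '<a href="rfc9148.html">rfc9148.html</a> '
    '<a href="rfc9148.txt">rfc9148.txt</a>'
    '</body></html>'
)

links = discover_links("https://www.rfc-editor.org/rfc/authors/", SAMPLE_INDEX)
for url in links:
    print(url)
```

Once a crawler has such a list, each URL that answers successfully can be indexed, and it stays indexed until the next recheck notices it is gone.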

> > The site-map would list a URL like https://www.rfc-editor.org/rfc/authors/rfc9148.html
> > given how it's temporary... ?!
> 
> I wrote the script that generates the site map and it never included
> anything in the authors directory.

Well, either there is something wrong with the sitemap, or Google _also_
discovers URLs through the non-sitemap route described above, and some place
is pointing to https://www.rfc-editor.org/rfc/authors/. I just can't
find which place that would be through Google itself.
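
One way to rule out the first possibility would be to check a copy of the sitemap for entries under the authors directory. The sketch below assumes the sitemap follows the standard sitemaps.org urlset format; SAMPLE_SITEMAP and the helper name urls_under_prefix are made up for illustration and do not reflect the actual sitemap contents or the script that generates it.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def urls_under_prefix(sitemap_xml, prefix):
    """Return all <loc> URLs in the sitemap that start with prefix."""
    root = ET.fromstring(sitemap_xml)
    hits = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        url = (loc.text or "").strip()
        if url.startswith(prefix):
            hits.append(url)
    return hits


# Hypothetical sitemap excerpt; the real file would be fetched separately.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.rfc-editor.org/rfc/rfc9148.html</loc></url>
  <url><loc>https://www.rfc-editor.org/info/rfc9148</loc></url>
</urlset>"""

hits = urls_under_prefix(SAMPLE_SITEMAP, "https://www.rfc-editor.org/rfc/authors/")
if hits:
    print("authors URLs listed in sitemap:", hits)
else:
    print("no authors URLs in sitemap")
```

If a check like this comes back empty on the real sitemap, the remaining explanation is a link from some other page.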

Cheers
    Toerless
