Re: [rfc-i] Wrong Internet search results for new RFCs

Toerless Eckert <tte@cs.fau.de> Tue, 03 May 2022 03:42 UTC

Date: Tue, 03 May 2022 05:42:24 +0200
From: Toerless Eckert <tte@cs.fau.de>
To: John Levine <johnl@taugh.com>
Cc: rfc-interest@rfc-editor.org
Message-ID: <YnCkoCzNTr3aquH/@faui48e.informatik.uni-erlangen.de>
References: <YnCECATOh9mI1HY4@faui48e.informatik.uni-erlangen.de> <20220503021720.69EDC3F4BACA@ary.qy>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20220503021720.69EDC3F4BACA@ary.qy>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rfc-interest/RizBjWKNERa7T0PFQtCP4U0345A>
Subject: Re: [rfc-i] Wrong Internet search results for new RFCs

On Mon, May 02, 2022 at 10:17:18PM -0400, John Levine wrote:
> It appears that Toerless Eckert  <tte@cs.fau.de> said:
> >When i search for RFC9148 on google or bing, i do only get
> >
> >https://www.rfc-editor.org/rfc/authors/rfc9148.html
> >
> >And of course the URL already does not exist anymore.
> >And yes, i did at least send a note to google.
> >
> >I assume that google/bing get these URLs from the searchable directory
> >
> >https://www.rfc-editor.org/rfc/authors/
> 
> No, that's not how search engines work. They only add working URLs to
> their index,

Sure, but they need to find the URL first, and given that these "authors"
URLs are typically only shared in private email, I wouldn't know where the
search engines would find the URLs other than from the index URL.

> and this is particularly strange since I helped the RPC
> put site maps on the rfc-editor site that have the right URLs for all
> of the RFCs.

Would the site map really list a URL like
https://www.rfc-editor.org/rfc/authors/rfc9148.html, given that it's only
temporary?!
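
One way to check this would be to look through the site map for anything
under /rfc/authors/. Below is a rough sketch, not the RPC's actual tooling:
the sample XML is made up for illustration (I have not verified what the
real site map contains), and the function names are hypothetical. It only
assumes the standard sitemaps.org XML format.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample, NOT the real rfc-editor.org site map.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.rfc-editor.org/rfc/rfc9148.html</loc></url>
  <url><loc>https://www.rfc-editor.org/rfc/authors/rfc9148.html</loc></url>
</urlset>
"""

# Assumed prefix of the temporary AUTH48 pages, taken from the URL above.
AUTHORS_PREFIX = "https://www.rfc-editor.org/rfc/authors/"


def sitemap_urls(xml_text):
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def temporary_author_urls(urls, prefix=AUTHORS_PREFIX):
    """Keep only the URLs that point into the temporary authors directory."""
    return [u for u in urls if u.startswith(prefix)]


if __name__ == "__main__":
    for url in temporary_author_urls(sitemap_urls(SAMPLE_SITEMAP)):
        print("temporary URL listed in site map:", url)
```

If a run against the real site map printed nothing, the search engines
must have picked up the authors URL from somewhere else.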

> When I do a search I see the nonexistent authors page and also a bunch
> of pages at https://sandbox-ng.ietf.org/.  This suggests that something got
> spidered very strangely during the datatracker upgrade.
> 
> I can see if I can ask the RPC to resubmit the site index which should help
> it clean out the bad rfc-editor URLs.

Thanks!
    Toerless
