Re: [rfc-i] archiving outlinks in RFCs

Alexis Rossi <rsce@rfc-editor.org> Thu, 27 April 2023 19:29 UTC

From: Alexis Rossi <rsce@rfc-editor.org>
Message-Id: <8D7BD550-7236-43D6-80C5-994A262198E7@rfc-editor.org>
Date: Thu, 27 Apr 2023 12:29:16 -0700
In-Reply-To: <06938c9e-7d71-fdb3-62ed-88908a66642c@it.aoyama.ac.jp>
Cc: rfc-interest@rfc-editor.org
To: "\"Martin J. Dürst\"" <duerst@it.aoyama.ac.jp>
References: <E024D9AC-2B92-4720-9713-519592D2362B@rfc-editor.org> <06938c9e-7d71-fdb3-62ed-88908a66642c@it.aoyama.ac.jp>
X-Mailer: Apple Mail (2.3696.120.41.1.3)
Archived-At: <https://mailarchive.ietf.org/arch/msg/rfc-interest/DaaSJSfzmpTj-tBY9Hzq5q8w6wQ>
Subject: Re: [rfc-i] archiving outlinks in RFCs

Yes, I think it makes sense to work out, for a given domain, which URLs are meant to be durable over a long period of time. For example, on rfc-editor.org I think we would probably agree that a URL of the form https://www.rfc-editor.org/info/rfc# should always point to the official RFC, and that we intend for that never to change. Going through that exercise and documenting it for the other domains that we collectively control would probably be useful (I intend to do that for rfc-editor.org as part of the archival policy draft that I’m pulling together). And of course, we would want to encourage people to use those durable URLs in their references where possible.
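
As a rough illustration of the durable URL form mentioned above, here is a minimal Python sketch. It assumes that the "#" in https://www.rfc-editor.org/info/rfc# stands for the RFC number; the function names (durable_info_url, rfc_number_from_url) are hypothetical and are not part of any rfc-editor.org tooling.

```python
import re

# Assumption: "rfc#" in the message means "rfc" followed by the RFC number.
INFO_URL_PREFIX = "https://www.rfc-editor.org/info/rfc"

# Matches only the durable info-page form, with no leading zeros in the number.
_DURABLE_RE = re.compile(r"https://www\.rfc-editor\.org/info/rfc([1-9][0-9]*)")


def durable_info_url(rfc_number: int) -> str:
    """Build the info-page URL for an RFC number (hypothetical helper)."""
    if rfc_number < 1:
        raise ValueError("RFC numbers start at 1")
    return f"{INFO_URL_PREFIX}{rfc_number}"


def rfc_number_from_url(url: str):
    """Return the RFC number if url has the durable form, else None."""
    match = _DURABLE_RE.fullmatch(url)
    return int(match.group(1)) if match else None
```

A reference checker could use rfc_number_from_url to flag references to RFCs that use some other URL form and suggest the durable one instead.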

Alexis

> On Apr 25, 2023, at 10:27 PM, Martin J. Dürst <duerst@it.aoyama.ac.jp> wrote:
> 
> It's interesting that the broken link found first is a link to one of "our own" documents. This would suggest that (besides some of the proposals that Alexis and others brought up about tweaking/fixing RFCs) we should make sure that the relevant sites such as www.ietf.org have policies and procedures in place (and follow them) to make sure they keep their content stable.
> 
> Regards,   Martin.
> 
> On 2023-04-26 03:50, Alexis Rossi wrote:
>> Hi all,
>> I wanted to let the community know about something I’ve been working on. As you might know, one of my previous jobs was running the Wayback Machine, so when I started working with this collection of RFCs one of my first thoughts was, “I wonder how many broken links are in these RFCs from the past few decades?”
>> In general, the average lifespan of a URL before the content changes or disappears is on the order of 100 days. Fortunately for us, the links used in RFC references seem to be much more stable than that. For instance, so far I’ve only found one broken link in an RFC from the past 6 months [1].
> 
>> [1] RFC9311 published in September 2022, in Section 11 (Informative References) this link is 404: https://www.ietf.org/how/meetings/98/bits-n-bites/
>> [2] https://archive-it.org/organizations/2540
>
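
The kind of check described in the quoted message (finding reference links that now return 404) could be sketched roughly as follows. This is only an assumption about how one might do it with the Python standard library, not the method actually used for the survey; http_status and find_broken are hypothetical names.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no HTTP response arrived.

    Sends a HEAD request; some servers reject HEAD, so a real survey
    might need to retry with GET (not done in this sketch).
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is still useful.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, and similar.
        return None


def find_broken(urls, fetch_status=http_status):
    """Return (url, status) pairs for links that failed or returned >= 400."""
    broken = []
    for url in urls:
        status = fetch_status(url)
        if status is None or status >= 400:
            broken.append((url, status))
    return broken
```

Passing the status function in as a parameter keeps the reporting logic testable without network access. A check like this only catches links that fail outright; it cannot tell when a page still loads but its content has changed.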