Re: [rfc-i] archiving outlinks in RFCs

Martin Thomson <mt@lowentropy.net> Thu, 04 May 2023 01:59 UTC

Date: Thu, 04 May 2023 11:59:16 +1000
From: Martin Thomson <mt@lowentropy.net>
To: rfc-interest@rfc-editor.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/rfc-interest/5jIe8WYE3DVGz1nXfqFAW8M-uHg>

I've followed this thread with some amount of discomfort.  I'm going to try to articulate that here.

On Wed, May 3, 2023, at 08:15, Alexis Rossi wrote:
> Goal #1: Encourage authors to choose the appropriate URL when creating an RFC. 

This is existing practice and I don't see any evidence to suggest that people are doing poorly at this.  The RPC is particularly careful in this regard.  I don't see how we could do better in terms of choosing URLs.

> - References to URLs where the information you want to cite is the 
> exact same information you want all future readers to access should use 
> archived URLs in their references (ie take a “snapshot” of the info as 
> it is at this point in time and use that archived snapshot as the 
> reference).

This is an interesting idea, but a challenging one for a couple of reasons:

1. Superficial archival doesn't really work.  https://rfc-editor.org/info/rfcXXXX is our preferred form of link, but that references a cover page, not the important content.  Many sites that host the sorts of publications we often cite (IEEE, IACR, arXiv, ITU) tend to do the same thing.
2. Paywalls, copyrights, etc...  We can't just take a copy of a spec and host it without running afoul of various constraints.
3. Formats.  The reason for cover pages can be to offer alternative formats.  Archival forces us to contend with format choices at the point of archival and then at retrieval time.

These are surmountable if we are willing to make the archival discretionary to some degree.  Anything mandatory will just keep hitting those obstacles.

> Goal #2: Allow RPC to fix broken links in a version of published RFCs 
> with appropriate approval.

I don't think that this is a good idea, as worded, pending conclusions in RSWG about changing XML.  I tend to think that this sort of change is over the line.

However, I do think that it would be worthwhile to offer an alternative presentation form that shows, sufficiently clearly, any archived forms or updated links *as alternatives*, potentially also marking broken links.

For instance, HTML is easy to tweak, and you could have links annotated.  Interacting with the annotation could show an overlay with information.  For example:

🔗 https://broken.link.example/X6.92

might become:

🔗[  https://broken.link.example/X6.92 (marked broken 2021-02-28)
      Alternative link: https://alternative.source.example/ (added 2023-05-04)
      Alternative link: https://another.alternative/ (added 2021-02-28)
      Archived copies of this resource: _HTML_, _PDF_, _TXT_ (added 2023-05-04) ]

(The added links could link to a record of the transaction.)  

This marker could be added easily to HTML renderings, without needing to query the database until the interaction occurs, meaning that you wouldn't need to regenerate the RFC rendering often; realistically, only when the process for rendering this stuff needs to change.
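
To make the overlay idea a little more concrete, here is a minimal sketch of what the lookup result and its rendering might look like.  Everything in it is an assumption for illustration only: the LinkRecord type, its field names, and render_annotation are hypothetical, not an existing RPC tool or schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LinkRecord:
    """Hypothetical metadata kept for one cited URL."""
    url: str
    broken_since: Optional[str] = None  # ISO date the link was marked broken
    # (alternative URL, date added)
    alternatives: list[tuple[str, str]] = field(default_factory=list)
    # (format name, date added)
    archives: list[tuple[str, str]] = field(default_factory=list)


def render_annotation(record: LinkRecord) -> str:
    """Build the overlay text shown when a reader interacts with a link."""
    head = record.url
    if record.broken_since:
        head += f" (marked broken {record.broken_since})"
    lines = [head]
    for alt_url, added in record.alternatives:
        lines.append(f"Alternative link: {alt_url} (added {added})")
    if record.archives:
        formats = ", ".join(fmt for fmt, _ in record.archives)
        # ISO dates sort lexicographically, so max() picks the latest one.
        latest = max(added for _, added in record.archives)
        lines.append(
            f"Archived copies of this resource: {formats} (added {latest})"
        )
    return "\n".join(lines)
```

The rendered RFC would only carry the marker; something like render_annotation would run when the reader interacts with it, which is why the HTML itself would rarely need regenerating.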

Note that adding archived copies might be part of the RPC service for newly published documents, but this wouldn't require that to happen.

> - When the RPC receives notification of a broken link, they can 
> identify a suggested replacement, obtain approval from the appropriate 
> entity, and update an html version of the RFC with the approved link.

> - Approval of replacement links for a document is provided by the same 
> entities who approve errata for the document.

This makes sense in the abstract, but I'd like to see this improved in this case, because the errata process leans more heavily on area directors than this process would seem to require.  If this only affects rendering and some of the metadata that is maintained for published RFCs (as my above proposal would), then a lighter process might help avoid this turning into a reason that we get even further behind on processing errata.

One thing that might make this easier, but which might also overturn this point, is that the same links are very common across the entire RFC catalogue.  So if https://broken.example/ appears in 15 RFCs, maybe the key to that database isn't the RFCs, but the URL itself.  That has several advantages: new RFCs that cite URLs already in the database don't need new copies of the documents, one change can fix many RFCs, etc...  Maybe you can use any editor from any of those RFCs to approve the change.

That's harder, but I think that a citation-centred database has enough advantages to justify considering it.
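
As a rough sketch of the citation-centred idea, assuming nothing about how the RPC actually stores this data, the database could be keyed by URL and track which RFCs cite each one.  The CitationDatabase class, its method names, and the RFC numbers in the usage note are all hypothetical.

```python
from collections import defaultdict


class CitationDatabase:
    """Hypothetical store keyed by cited URL rather than by RFC."""

    def __init__(self) -> None:
        # cited URL -> numbers of the RFCs that cite it
        self.citing_rfcs: dict[str, set[int]] = defaultdict(set)
        # cited URL -> approved replacement URL
        self.replacements: dict[str, str] = {}

    def cite(self, url: str, rfc: int) -> bool:
        """Record that an RFC cites a URL.

        Returns True if the URL is new to the database, which is the
        only case where a fresh archived copy might be needed.
        """
        is_new = url not in self.citing_rfcs
        self.citing_rfcs[url].add(rfc)
        return is_new

    def replace(self, url: str, new_url: str) -> list[int]:
        """Record one approved replacement; return every RFC it affects."""
        if url not in self.citing_rfcs:
            raise KeyError(url)
        self.replacements[url] = new_url
        return sorted(self.citing_rfcs[url])
```

With this shape, citing https://broken.example/ from two made-up RFCs (say 9000 and 9001) and then calling replace() once would return both numbers, so a single approval covers every document that shares the link.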
