Re: [rfc-i] Meta decorations in generated HTML

Michael Richardson <mcr+ietf@sandelman.ca> Fri, 27 May 2022 14:07 UTC

From: Michael Richardson <mcr+ietf@sandelman.ca>
To: John R Levine <johnl@taugh.com>, rfc-interest@rfc-editor.org
In-Reply-To: <0ab66d2e-aa7d-eb17-83dc-2774e9d021a7@taugh.com>
References: <20220525203826.8606A41A4E93@ary.qy> <f0f92d4c-8cc4-c3bb-0f0d-96c3ad422303@gmx.de> <C826D239-7CCB-404E-9591-B33C34ED82C9@tzi.org> <5afe0f29-ab5a-b79e-cad4-7c18cf8fc5d3@gmx.de> <0ab66d2e-aa7d-eb17-83dc-2774e9d021a7@taugh.com>
Date: Fri, 27 May 2022 10:07:27 -0400
Message-ID: <27659.1653660447@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rfc-interest/Y42ElRwRpghQHIYtL_PTM_xqoQw>
Subject: Re: [rfc-i] Meta decorations in generated HTML

John R Levine <johnl@taugh.com> wrote:
    > We're talking about seven small pieces of bibliographic metadata that are
    > added mechanically to the HTML.  They're all the ones listed on the
    > Scholar web site that Google uses to decide where to index the pages they
    > include.  While I suppose they might still include us if we left some of
    > them out, I don't see why we would want to make it harder to index our RFCs
    > accurately.

I haven't looked at all into what's in the tags, except by view-source.
I compared
view-source:https://www.rfc-editor.org/rfc/rfc6698.html (which was an open tab)
to:
view-source:https://www.rfc-editor.org/rfc/rfc8995.html

and in the former, I see lots of interesting <meta name="citation_...
stuff, but in the latter, I do not.

In the former, I do see an almost good self-reference:
   <meta name="citation_pdf_url" content="https://www.rfc-editor.org/rfc/pdfrfc/rfc6698.txt.pdf"/>
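For anyone who wants to repeat this comparison without reading view-source by
eye, here is a minimal sketch that lists the citation_* meta tags in a page.
It uses only the Python standard library. The class name, function name and
sample string are my own illustration, not anything the RFC Editor tooling
provides, and the sample contains just the one tag quoted above plus an
unrelated tag to show the filtering.

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collect <meta name="citation_..."> tags from an HTML document."""

    def __init__(self):
        super().__init__()
        self.tags = []  # (name, content) pairs, in document order

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here, because
        # the default handle_startendtag() delegates to handle_starttag().
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        if name.startswith("citation_"):
            self.tags.append((name, attr.get("content") or ""))


def citation_meta(html_text):
    """Return the citation_* meta tags found in html_text."""
    parser = CitationMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.tags


# Hypothetical sample: the one tag quoted above plus an unrelated meta tag.
sample = (
    '<html><head>'
    '<meta name="citation_pdf_url" '
    'content="https://www.rfc-editor.org/rfc/pdfrfc/rfc6698.txt.pdf"/>'
    '<meta name="viewport" content="width=device-width"/>'
    '</head></html>'
)

for name, content in citation_meta(sample):
    print(name, content)
```

Feeding it the saved source of the two pages above should show the difference
directly: a page with no citation_* tags simply produces an empty list.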

I looked this up because I was thinking that some of our problems with AUTH48
drafts being indexed might be dealt with if the indexers could be told what
the canonical URL of the document is.

    > I would have thought it was self-evident why we have redundant bibliographic
    > tags in the HTML (not the XML).  After three decades of disorganized
    > evolution, different indexes use different tags.  None of them are large or
    > hard to create, so if we add them all, we help get our documents indexed
    > better.

I agree.  :-)




--
Michael Richardson <mcr+IETF@sandelman.ca>   . o O ( IPv6 IøT consulting )
           Sandelman Software Works Inc, Ottawa and Worldwide



