Re: [Tools-discuss] How do we diagnose DOI errors?

Carsten Bormann <cabo@tzi.org> Mon, 19 October 2020 06:45 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-discuss/7NZU6xhRuCldgyUb64HlHyb7ikU>


> On 2020-10-19, at 08:19, Martin Thomson <mt@lowentropy.net> wrote:
> 
> 
> On Mon, Oct 19, 2020, at 16:39, Carsten Bormann wrote:
>> On 2020-10-19, at 03:58, Martin Thomson <mt@lowentropy.net> wrote:
>>> 
>>> I've come to rely on this mechanism for citing documents, but it isn't good if it isn't reliable.  Some better information about how this operates would be great.
>> 
>> As far as I understand, the DOI bibxml is based on the doilit tool that 
>> is part of kramdown-rfc2629 (which in turn is using 
>> https://dx.doi.org/).  A 24-h cache helps keep the load down while 
>> keeping the data reasonably fresh.  As far as I can see, all kinds of errors are 
>> mapped to 404.
>> 
>> One tweak that can help cope with doi.org outages might be keeping 
>> the cache indefinitely.  Freshness could still be achieved by *trying* 
>> to get a refresh after 24 h, but keeping the cached value in place if 
>> that fails for some reason.
> 
> That seems like a reasonable enhancement.  These outages are far too frequent for my liking.  This is hardly the first occurrence.

Indeed, the reason kramdown-rfc doesn’t do the processing entirely on its own is to take advantage of the cache at ietf.org to improve availability.

> You can tune the timeout right down at the same time; these rarely change and anyone who uses these often (through CI, for instance) will ensure that they are kept fresh enough.  (stale-while-revalidate...)

Exactly.
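
To make the idea concrete, here is a minimal Python sketch of a cache that keeps entries indefinitely, tries a refresh once an entry is older than a configurable age, and falls back to the stale value if the refresh fails.  This is an illustration only, not how the bibxml service or doilit is implemented: the class name `StaleOnErrorCache`, the `fetch` callable (standing in for whatever resolves a DOI to reference data) and the `max_age` and `clock` parameters are all hypothetical.  Unlike true stale-while-revalidate, the refresh here is attempted synchronously.

```python
import time
from typing import Callable, Dict, Tuple


class StaleOnErrorCache:
    """Keep entries forever; refresh when older than max_age, serve stale on failure."""

    def __init__(
        self,
        fetch: Callable[[str], str],          # hypothetical upstream lookup (e.g. a doi.org query)
        max_age: float = 24 * 3600,           # seconds before a refresh is attempted
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._max_age = max_age
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # doi -> (time fetched, value)

    def get(self, doi: str) -> str:
        now = self._clock()
        entry = self._entries.get(doi)

        # Fresh enough: answer from the cache without touching upstream.
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]

        try:
            value = self._fetch(doi)
        except Exception:
            if entry is not None:
                # Refresh failed: keep the old entry in place and serve it.
                return entry[1]
            # Nothing cached and upstream failed: report the error.
            # The failure is not stored, so there is no negative caching.
            raise

        self._entries[doi] = (now, value)
        return value
```

Because a failed refresh leaves the old timestamp untouched, the next request simply tries upstream again, so entries recover as soon as the upstream does.  Lowering `max_age` only changes how often a refresh is attempted; it never causes a cached entry to be dropped.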

> You might need to consider whether or not the cache needs an occasional flush to avoid unbounded accumulation of old entries.

I don’t think so; the number of DOIs in existence is sufficiently limited (unless we try to do negative caching, but I see no point in that).

Grüße, Carsten