Re: [Sidrops] Benjamin Kaduk's Discuss on draft-ietf-sidrops-6486bis-09: (with DISCUSS and COMMENT)

Rob Austein <sra@hactrn.net> Sun, 27 February 2022 00:25 UTC

Date: Sat, 26 Feb 2022 19:25:36 -0500
From: Rob Austein <sra@hactrn.net>
To: Benjamin Kaduk <kaduk@mit.edu>
Cc: Job Snijders <job@fastly.com>, The IESG <iesg@ietf.org>, sidrops-chairs@ietf.org, morrowc@ops-netman.net, sidrops@ietf.org, draft-ietf-sidrops-6486bis@ietf.org
In-Reply-To: <20220225235526.GY12881@kduck.mit.edu>
References: <164366773060.21391.16732854790829264927@ietfa.amsl.com> <YgZTmoUhfxlsQKMJ@snel> <20220225235526.GY12881@kduck.mit.edu>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) Emacs/27.1 Mule/6.0
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset="US-ASCII"
Message-Id: <20220227002536.3516E2EA009D@minas-ithil.hactrn.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/-fZC3suTFyGSe1tRhyFY5UOgAMw>

Apologies for jumping in late here, I've been distracted by $dayjob
(which has not been RPKI for many years now) and only just saw this.

I fear that I must object, strongly, to Job's proposed change:

> > ----- section 4: Manifest Definition ----
> > 
> > OLD section 4.2.1 second paragraph:
> >    Because a "one-time-use" EE certificate is employed to verify a
> >    manifest, it is RECOMMENDED that the EE certificate have a validity
> >    period that coincides with the interval from thisUpdate to nextUpdate
> >    in the manifest, to prevent needless growth of the CA's CRL.
> > 
> > NEW Section 4.2.1:
> >    Because a "one-time-use" EE certificate is employed to verify a
> >    manifest, the EE certificate MUST be issued with a validity period
> >    that coincides with the interval from thisUpdate to nextUpdate in the
> >    manifest, to prevent needless growth of the CA's CRL.
> > 
> > ----- Section 5: Manifest Generation ----
> > 
> > OLD Section 5.1 last paragraph:
> >        It is RECOMMENDED that the validity interval of the EE
> >        certificate exactly match the thisUpdate and nextUpdate times of
> >        the manifest.
> >        Note: An RP MUST verify all mandated syntactic constraints, i.e.,
> >        constraints imposed on a CA via a "MUST".
> > 
> > NEW section 5.1:
> >        The validity interval of the EE certificate MUST exactly match
> >        the thisUpdate and nextUpdate times specified in the manifest's
> >        eContent. (An RP MUST NOT consider misalignment of the validity
> >        interval misalignment in and of itself to be an error.)

This business of pinning the manifest EE certificate's validity
period to the manifest's thisUpdate/nextUpdate interval has always
been a bad idea.  I've objected to it in the WG before, and we have a
long sad history of operational disasters resulting from
implementations that followed that recommendation.

The typical failure mode (demonstrated at least once each by three of
the RIRs) is to pick a thisUpdate/nextUpdate cycle on the order of a
day or so (reasonable), then go home for a long weekend, during which
something breaks and no new updates occur.

The manifest thisUpdate/nextUpdate semantics are (deliberately) very
much like the CRL semantics: failure to issue a new manifest by the
nextUpdate time does not automatically invalidate all the existing
data, it's just a hint that something might be wrong.  Think of it as
a staleness indication.  Slightly stale, log a warning but no big
deal (yet).  Gets stale enough, maybe you don't want to eat that.

Certificate expiration, on the other hand, is a hard failure.  When a
manifest's EE certificate expires without being replaced, the data
listed in the manifest is just gone.

So when an RIR does this with the manifest directly under their root
certificate, a large portion of the total database goes away.  Oops.
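
The long-weekend failure described above can be sketched in a few
lines of Python.  This is an illustration only, not code from any real
validator: the function name, the result strings, and the one-week
grace threshold are all assumptions made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local policy: how stale a manifest may get before a
# relying party stops using it.  Not a value from any specification.
STALE_GRACE = timedelta(days=7)


def assess_manifest(now, next_update, ee_not_after):
    """Classify a manifest as seen by a relying party at time `now`."""
    if now > ee_not_after:
        # Hard failure: the EE certificate has expired, so the manifest
        # signature no longer validates and the listed data is gone.
        return "expired"
    if now <= next_update:
        return "ok"
    if now - next_update <= STALE_GRACE:
        # Soft failure: past nextUpdate, log a warning and carry on.
        return "stale-warn"
    return "stale-reject"


# Manifest issued Friday with a one-day cycle; nothing is reissued
# over the weekend, and the relying party looks again on Monday.
friday = datetime(2022, 2, 25, 12, 0, tzinfo=timezone.utc)
next_update = friday + timedelta(days=1)
monday = friday + timedelta(days=3)

# EE certificate pinned to thisUpdate/nextUpdate.
pinned = assess_manifest(monday, next_update, ee_not_after=next_update)

# EE certificate valid well beyond nextUpdate (a year, for the example).
long_lived = assess_manifest(
    monday, next_update, ee_not_after=friday + timedelta(days=365)
)

print(pinned, long_lived)
```

With the pinned certificate, the same missed update that would
otherwise produce a warning turns into a hard validation failure.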

Manifests only have EE certificates because they're a patch on the
side of X.509.  If X.509 had allowed us to sign manifests using the CA
certificate, as with CRLs, we would have done that.  But it doesn't,
so we have the manifest EE certificates, OK, but that's no reason to
kill large portions of the database every time an operator near the
root of the tree says "oops".

All of this has been discussed before, and should be both in WG
archives and in notes from face to face meetings (probably both the
SIDR and SIDROPS WGs, not a new topic).

My own RPKI CA implementation uses the validity interval of the parent
CA certificate as the validity interval of the manifest EE
certificate, and has for about fifteen years now.  I'm not aware of
any operational problems that have arisen as a result.

Yes, there's a potential issue of a CRL getting too long, but if it
happens at all, it happens at a predictable rate, and one can always
rekey if it's really becoming big enough to be a problem.
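
As a rough sketch of what "a predictable rate" means, assume that
every superseded manifest EE certificate is revoked and stays on the
CRL until the CA rekeys.  Under that assumption the CRL gains one
entry per manifest reissue.  The function name and the reissue rates
below are hypothetical, chosen only for the arithmetic.

```python
def crl_entries_from_manifests(days_since_rekey, manifests_per_day=1.0):
    """Estimate CRL entries caused by long-lived manifest EE certificates.

    Assumes one revoked EE certificate per reissued manifest and no
    entries aging off the CRL before the next rekey.
    """
    return int(days_since_rekey * manifests_per_day)


# One manifest per day for a year, then one per hour for a year.
daily = crl_entries_from_manifests(365)
hourly = crl_entries_from_manifests(365, manifests_per_day=24)
print(daily, hourly)
```

The growth is linear in time and in reissue frequency, so an operator
can estimate in advance when a rekey would be worth doing.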

Job's proposed change makes the bad idea a requirement rather than
just bad advice, hence my objection.