Re: [Sidrops] Benjamin Kaduk's Discuss on draft-ietf-sidrops-6486bis-09: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Fri, 25 February 2022 23:55 UTC

Date: Fri, 25 Feb 2022 15:55:26 -0800
From: Benjamin Kaduk <kaduk@mit.edu>
To: Job Snijders <job@fastly.com>
Cc: The IESG <iesg@ietf.org>, sidrops-chairs@ietf.org, morrowc@ops-netman.net, sidrops@ietf.org, draft-ietf-sidrops-6486bis@ietf.org
Message-ID: <20220225235526.GY12881@kduck.mit.edu>
References: <164366773060.21391.16732854790829264927@ietfa.amsl.com> <YgZTmoUhfxlsQKMJ@snel>
In-Reply-To: <YgZTmoUhfxlsQKMJ@snel>
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/N64WXLOto_6zlsH9mwT23aBthNg>
Subject: Re: [Sidrops] Benjamin Kaduk's Discuss on draft-ietf-sidrops-6486bis-09: (with DISCUSS and COMMENT)

Hi Job,

My pleasure!  Sorry to have taken a bit long to get back to you.

On Fri, Feb 11, 2022 at 01:16:26PM +0100, Job Snijders wrote:
> Dear Benjamin,
> 
> To me it's always a pleasure to receive your thorough reviews. Thank
> you for sharing your insights!
> 
> On Mon, Jan 31, 2022 at 02:22:11PM -0800, Benjamin Kaduk via Datatracker wrote:
> > ----------------------------------------------------------------------
> > DISCUSS:
> > ----------------------------------------------------------------------
> > 
> > It looks like we may have a setup where a compliant RP and compliant
> > CA/repository fail to interoperate.  This is sufficiently surprising that
> > I want to confirm that it's the intended behavior.  In particular, both
> > manifests and CRLs have thisUpdate and nextUpdate fields, and since
> > issuing an update to one requires issuing an update to the other (though
> > the CRL is always actually generated first), it is only natural for us to
> > give guidance that the times in question should match between manifest and
> > corresponding CRL.  However, we do this only as RECOMMENDED/SHOULD-level
> > guidance, and accompany it by guidance to RPs that they SHOULD NOT reject
> > a manifest if the fields do not match the CRL.  Accordingly, when a CA
> > violates the first SHOULD and issues manifest+CRL with mismatched
> > thisUpdate/nextUpdate, and an RP violates the second SHOULD (NOT) and
> > rejects such a setup, the RP will be unable to get any RPKI data for that
> > CA.  (As a tangent, we also have one place where we give related guidance
> > that the validity period of the single-use EE cert that signs the manifest
> > match the thisUpdate/nextUpdate period, which we might want to keep in
> > mind if we make any changes in this space.)
> > 
> > It looks like RFC 6486 had a conditional MUST-level requirement that *if* a
> > manifest encompasses a CRL, then the "nextUpdate" fields MUST match (no
> > guidance on thisUpdate), which we change to a statement of fact that each
> > manifest does encompass a CRL and guidance that the "nextUpdate"s SHOULD
> > match.
> > Additionally, RFC 6486 had a MUST-level requirement for the validity
> > period of the EE cert to exactly match the thisUpdate/nextUpdate time
> > interval of the manifest, which we currently are relaxing to a SHOULD.
> > 
> > Often in this scenario we would strengthen one of the SHOULDs to be a
> > MUST so that interoperability is guaranteed, but I'm not sure that I see a
> > clear argument for which requirement is better to make the MUST.
> 
> I agree with your analysis and think your suggestion to reinforce one of
> the two SHOULDs into a MUST is a reasonable way to resolve this friction.
> From my perspective, upgrading some normative terms in this context
> won't be perceived as controversial by the WG, as the current 'watered
> down' text is the result of concerns in relationship to retaining
> interoperability with CAs 'in the field'.
> 
> Some background on the ability for RP's to deploy a strict ("MUST") at
> this point in time: https://mailarchive.ietf.org/arch/msg/sidrops/VNG77j05I2JXOwv4qkSy-DCgRNE/
> 
> One CA implementation rectified the lack of validity window alignment:
> https://github.com/NLnetLabs/krill/commit/51c58ec58b266697dd1843a5504e79bfc385932d
> There appears to be agreement on enforcing alignment on the CA side
> 'going forward', while RPs remain relaxed:
> https://mailarchive.ietf.org/arch/msg/sidrops/W-SrezAc5Oa1HoTW0w64EdaD1gk/
> 
> Thus, I propose the following adjustments to resolve this DISCUSS.
> 
> ----- section 4: Manifest Definition ----
> 
> OLD section 4.2.1 second paragraph:
>    Because a "one-time-use" EE certificate is employed to verify a
>    manifest, it is RECOMMENDED that the EE certificate have a validity
>    period that coincides with the interval from thisUpdate to nextUpdate
>    in the manifest, to prevent needless growth of the CA's CRL.
> 
> NEW Section 4.2.1:
>    Because a "one-time-use" EE certificate is employed to verify a
>    manifest, the EE certificate MUST be issued with a validity period
>    that coincides with the interval from thisUpdate to nextUpdate in the
>    manifest, to prevent needless growth of the CA's CRL.
> 
> ----- Section 5: Manifest Generation ----
> 
> OLD Section 5.1 last paragraph:
>        It is RECOMMENDED that the validity interval of the EE
>        certificate exactly match the thisUpdate and nextUpdate times of
>        the manifest.
>        Note: An RP MUST verify all mandated syntactic constraints, i.e.,
>        constraints imposed on a CA via a "MUST".
> 
> NEW section 5.1:
>        The validity interval of the EE certificate MUST exactly match
>        the thisUpdate and nextUpdate times specified in the manifest's
>        eContent. (An RP MUST NOT consider misalignment of the validity
>        interval in and of itself to be an error.)

This seems like a good approach to me, thanks.
In this last chunk, should we add a clause at the beginning of the
parenthetical like "In order to remain compatible with implementations of a
previous version of this specification", to motivate the "MUST NOT"?
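
To make the asymmetry concrete, here is a rough Python sketch of how I read
the proposed split: strict on the CA side, tolerant on the RP side.  The
names (Interval, ca_may_issue, rp_accepts) are hypothetical and come from
neither the draft nor any implementation, and the assumption that the RP
still checks each interval for currency on its own is mine.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Interval:
    """A closed time interval [start, end]."""
    start: datetime
    end: datetime

    def contains(self, t: datetime) -> bool:
        return self.start <= t <= self.end


def ca_may_issue(ee_validity: Interval, manifest_window: Interval) -> bool:
    """CA side (proposed MUST): the EE validity period exactly equals the
    manifest's thisUpdate..nextUpdate interval."""
    return ee_validity == manifest_window


def rp_accepts(ee_validity: Interval, manifest_window: Interval,
               now: datetime) -> bool:
    """RP side (proposed MUST NOT): the two intervals are never compared
    with each other.  Assumption: each is still checked for currency."""
    return ee_validity.contains(now) and manifest_window.contains(now)


# Example: an older CA whose EE certificate outlives the manifest by a week.
this_update = datetime(2022, 2, 25, 12, 0, tzinfo=timezone.utc)
next_update = this_update + timedelta(hours=24)
manifest_window = Interval(this_update, next_update)
legacy_ee = Interval(this_update, next_update + timedelta(days=7))
now = this_update + timedelta(hours=1)

print(ca_may_issue(legacy_ee, manifest_window))     # CA-side check on the legacy pair
print(rp_accepts(legacy_ee, manifest_window, now))  # RP-side check on the same pair
```

With both rules in place, a misaligned pair from a deployed CA does not by
itself cost the RP that CA's data, which is the interoperability failure
the DISCUSS was about.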

> > ----------------------------------------------------------------------
> > COMMENT:
> > ----------------------------------------------------------------------
> > 
> > My reading of the changes in the diff from RFC 6486 indicates that there is
> > something of a fundamental shift in the processing model being made, but
> > this does not seem to be mentioned specifically in the list of changes in
> > Appendix B (or the introduction/abstract).  Specifically, it seems that
> > the old RFC 6486 model involved the manifest being a tool that's useful
> > for the RP and SHOULD be used, but that ultimately the RP policy decides
> > which signed objects from the repository to use and fundamentally places
> > trust in the signatures on those objects; the new model in this draft
> > seems to place the manifest as the primary control on what signed objects
> > to use, with phrasing like "MUST use the current manifest"/"files not
> > listed on the manifest MUST NOT be used" and the error-handling behavior
> > in §6.6 being to fall back to the previous cached state (at a SHOULD-level
> > requirement), which as I understand it would mean using the previous
> > cached manifest.  (Maybe I'm wrong about that last bit.) While the new
> > focus doesn't seem problematic per se, and it's perfectly reasonable from
> > a process perspective to make that sort of behavior change in a Proposed
> > Standard, it seems that if this was a deliberate decision it ought to be
> > emphasized a bit more.  The final bullet point in the change list of
> > removing the notion of "local policy" is perhaps related to this
> > fundamental shift, but seems to only cover part of the shift.
> 
> I appreciate the observations in the above text, but don't see an
> actionable remark. However, I can provide a bit of background:
> 
> A write-up on the /why/ of RPKI Manifests can be found here:
> https://blog.apnic.net/2020/11/10/rpki-manifests-securely-declare-contents/
> 
> Manifests are the only way to verify the completeness of a SET of signed
> objects. The notion of 'local policy' in the original RFC appears to
> have been the result of an awkward compromise (this was before my time).
> I think by now all implementers understand that just like car seatbelts,
> Manifests are not optional, and never were optional.

The background is much appreciated, thanks.

My intention here was mostly to ask for another bullet point in the list of
changes in Appendix B.  In light of the above clarifications, it might take
the form of replacing the last bullet ("Removed the notion of 'local
policy'") or it might be a standalone point.  If replacing/modifying the
existing item, maybe:

* Clarified that use of a manifest is the only way to ensure that a
  consistent and complete view of a publication point's contents is used,
  removing the notion of "local policy" that modifies manifest processing
  and use.

There might also be room for wording tweaks in the Abstract/Introduction to
reflect this change of emphasis, but I'm inclined to say that the potential
benefit is not really worth the effort of thinking about them.

> > Section 3
> > 
> >    The CA MUST sign only one manifest with each generated private key,
> >    and MUST generate a new key pair for each new version of the
> >    manifest.  This form of use of the associated EE certificate is
> >    termed a "one-time-use" EE certificate [RFC6487]
> > 
> > Not really a comment on *this* document (6486bis), but since RFC 6487
> > generically defines the "sequential use" EE certificate that we are
> > disallowing for RPKI Manifest use, it seems reasonable to ask whether the
> > issues that cause us to forbid sequential use in this context might also
> > apply to other scenarios where "sequential use" certificates might have
> > been used.  (I did not make a survey of the RPKI specification corpus to
> > investigate whether such scenarios are likely to exist.)
> 
> I didn't perform an extensive survey either, but AFAIK the Manifest -bis
> document is one of the few places to be so explicit about the use of
> one-time-use certificates. I believe the primary motivation for this
> requirement is that Manifests / CRLs are the "#1 top churning object"
> type in the RPKI ecosystem. CRLs (and thus Manifests) generally have a
> validity window measured in hours, whereas other object types such as
> ROAs often times have a validity window of 12 to 18 months, or longer.
> 
> I expect that going forward, where needed, new object types or -bis
> documents will be more explicit about whether to "recycle keypairs" or mandate
> one-time-use.
> 
> As you highlighted, your above comment is not really a comment on *this*
> document.

Indeed, no action for this document, and thanks for the insight into the
RPKI ecosystem.

> > Section 4.2.1
> > 
> >    fileHashAlg:
> >       This field contains the OID of the hash algorithm used to hash the
> >       files that the authority has placed into the repository.  The hash
> >       algorithm used MUST conform to the RPKI Algorithms and Key Size
> >       Profile specification [RFC6485].
> > 
> > RFC 6485 has been obsoleted by RFC 7935.
> 
> Correct, this has been rectified in my working copy, also noted in
> https://mailarchive.ietf.org/arch/msg/sidrops/NzCcNzcN46FD7RgV3D5ULMnEhxs/
> 
> > Section 5.1
> > 
> >    2.  Issue an EE certificate for this key pair.  The CA MUST revoke
> >        the EE certificate used for the manifest being replaced.
> > 
> > Do the mechanics of revocation still need to be undertaken even if the EE
> > certificate in question has expired?
> 
> No, in the wild one can observe how the property of expired Manifest EE
> certificates is a helpful utility to prevent needless growth of CRLs.
> Also, trimming of CRLs is permitted. Do you have suggested text to
> capture that in a concise manner? Or leave as-is?

Maybe "revoke or otherwise ensure invalidity of"?
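
In code terms, the decision I have in mind is roughly the following.  This
is a sketch only: needs_crl_entry is a made-up helper, not anything from
the draft or from a CA implementation, and it assumes the usual X.509
convention that notAfter is inclusive.

```python
from datetime import datetime, timedelta, timezone


def needs_crl_entry(not_after: datetime, now: datetime) -> bool:
    """Return True when the replaced manifest's EE certificate is still
    inside its validity period and so has to be revoked explicitly.
    An already-expired certificate is invalid without a CRL entry."""
    return now <= not_after


# Example: the manifest is replaced at noon UTC.
replaced_at = datetime(2022, 2, 25, 12, 0, tzinfo=timezone.utc)
still_valid_ee = replaced_at + timedelta(hours=6)   # notAfter in the future
expired_ee = replaced_at - timedelta(minutes=5)     # notAfter already passed

print(needs_crl_entry(still_valid_ee, replaced_at))  # old EE cert is still valid
print(needs_crl_entry(expired_ee, replaced_at))      # old EE cert has lapsed
```

Either branch leaves the old EE certificate invalid, which is all the
suggested wording asks for.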

> > Section 6.x
> > 
> > The old RFC 6486 procedures allowed ("MAY") the RP to verify that each
> > file at a publication point is listed in exactly one current manifest,
> > which is no longer allowed.  I think I can see how this was a problematic
> > thing to try, but want to confirm that it is intentionally removed.
> 
> Ack.
> 
> > Similarly, RFC 6486 wanted a warning produced if there were objects
> > present in the repository that are not listed in any manifest, which does
> > not seem particularly useful and I do not mind seeing removed.
> 
> Ack.
> 
> > RFC 6486 also wanted a warning issued if a publication point contained
> > both valid and invalid manifests; removing that guidance seems correct to
> > me.
> 
> Ack.
> 
> > Section 6.3
> > 
> > The two "proceed to Section 6.4"s seem redundant (I suggest removing the
> > last one).
> > 
> > Section 6.5
> > 
> >    the publication point.  If the computed hash value of a file listed
> >    on the manifest does not match the hash value contained in the
> >    manifest, then the fetch has failed and the RP MUST proceed to
> >    Section 6.6; otherwise proceed to Section 6.6.
> > 
> > This looks an awful lot like "if X; goto Y; else; goto Y", which could
> > just be written as "goto Y".  However, since §6.6 is the "failed fetches"
> > processing, in the successful verification case perhaps we don't want to
> > do that and should instead proceed to use the content of the files listed
> > in the manifest.
> 
> Nice catch! I'll remove the last sentence.
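
For reference, the control flow I was expecting in 6.5 looks roughly like
the sketch below: only a mismatch routes to the failed-fetch handling, and
the success case goes on to use the listed files.  The function name and
the dict-based inputs are hypothetical; SHA-256 is assumed as the file hash
algorithm per the RPKI algorithm profile, and treating a missing file the
same as a mismatch is my reading of the draft.

```python
import hashlib
from typing import Dict


def manifest_files_verified(listed: Dict[str, bytes],
                            fetched: Dict[str, bytes]) -> bool:
    """listed:  file name -> expected SHA-256 digest taken from the manifest.
    fetched: file name -> file content retrieved from the publication point.
    Returns False (failed fetch) on any missing file or hash mismatch."""
    for name, expected_digest in listed.items():
        content = fetched.get(name)
        if content is None:
            return False  # a listed file could not be retrieved
        if hashlib.sha256(content).digest() != expected_digest:
            return False  # content does not match the manifest entry
    return True


# Example with made-up file names and contents.
roa = b"example ROA bytes"
crl = b"example CRL bytes"
listed = {
    "example.roa": hashlib.sha256(roa).digest(),
    "example.crl": hashlib.sha256(crl).digest(),
}

good_fetch = {"example.roa": roa, "example.crl": crl}
tampered_fetch = {"example.roa": roa, "example.crl": crl + b"!"}

for fetch in (good_fetch, tampered_fetch):
    if manifest_files_verified(listed, fetch):
        print("use the files listed in the manifest")
    else:
        print("fetch has failed: proceed to Section 6.6")
```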
> 
> > Section 8
> > 
> > Perhaps it is sufficiently obvious so as to go without saying, but I trust
> > that in a prolonged period of failed fetches, the human operator is
> > expected to go figure out (via out of band channels) what's up with the
> > CA/repository in question and whether an attack is underway.
> > 
> >                              The requirement for every repository
> >    publication point to contain at least one manifest allows an RP to
> >    determine if the manifest itself has been occluded from view.  [...]
> > 
> > Do we want to say anything about how these properties change during a CA
> > key rollover or if multiple "unrelated" CAs are publishing at the same
> > publication point (the latter of which would be quite unusual, if I
> > understand correctly)?
> 
> Multiple 'repositories' can exist in a single rsync directory, akin to
> how multiple CA Repositories can be fetched via a single RRDP
> publication server. My preference would be to leave this as-is, I do not
> recall a perception of ambiguity in the wider CA operator community about
> this aspect.

Okay, let's leave it as-is, then.

> > Section 9
> > 
> > I see that the latest update from IANA has the registry (entry) references
> > being updated from RFC 6488 to [this document], which is great and
> > obviates what I would otherwise have remarked upon.
> 
> Ack.
> 
> > Section 11.1
> > 
> > Now that we're obsoleting RFC 6486, that should probably bump it off into
> > the Informative References section.
> 
> Done!
> 
> > RFCs 6482, 6493, 8209, and 8488 appear to not actually be referenced
> > from anywhere and thus could probably be removed.
> 
> Done, this was addressed in
> https://mailarchive.ietf.org/arch/msg/sidrops/NzCcNzcN46FD7RgV3D5ULMnEhxs/
> 
> > Section 11.2
> > 
> > Do we really want to reference RFC 3370 ("CMS Algorithms") for "the CMS
> > wrapper of the object" rather than something like RFC 5652 ("CMS")?
> 
> Good suggestion, I've replaced 3370 with 5652.
> 
> Please let me know if the suggested changes to Section 4 and 5 seem
> appropriate, if so I'll make the changes and share a patchset.

They look good; please go ahead and publish at your convenience.

Thanks again,

Ben