Re: [Sidrops] Benjamin Kaduk's Discuss on draft-ietf-sidrops-6486bis-09: (with DISCUSS and COMMENT)

Job Snijders <job@fastly.com> Fri, 11 February 2022 12:16 UTC

Date: Fri, 11 Feb 2022 13:16:26 +0100
From: Job Snijders <job@fastly.com>
To: Benjamin Kaduk <kaduk@mit.edu>
Cc: The IESG <iesg@ietf.org>, sidrops-chairs@ietf.org, morrowc@ops-netman.net, sidrops@ietf.org, draft-ietf-sidrops-6486bis@ietf.org
Message-ID: <YgZTmoUhfxlsQKMJ@snel>
References: <164366773060.21391.16732854790829264927@ietfa.amsl.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <164366773060.21391.16732854790829264927@ietfa.amsl.com>
X-Clacks-Overhead: GNU Terry Pratchett
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/r-d3nCXU1G9RR7F3cTejVVj_go0>
Subject: Re: [Sidrops] Benjamin Kaduk's Discuss on draft-ietf-sidrops-6486bis-09: (with DISCUSS and COMMENT)
List-Id: A list for the SIDR Operations WG <sidrops.ietf.org>

Dear Benjamin,

It is always a pleasure to receive your thorough reviews. Thank you for
sharing your insights!

On Mon, Jan 31, 2022 at 02:22:11PM -0800, Benjamin Kaduk via Datatracker wrote:
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> It looks like we may have a setup where a compliant RP and compliant
> CA/repository fail to interoperate.  This is sufficiently surprising that
> I want to confirm that it's the intended behavior.  In particular, both
> manifests and CRLs have thisUpdate and nextUpdate fields, and since
> issuing an update to one requires issuing an update to the other (though
> the CRL is always actually generated first), it is only natural for us to
> give guidance that the times in question should match between manifest and
> corresponding CRL.  However, we do this only as RECOMMENDED/SHOULD-level
> guidance, and accompany it by guidance to RPs that they SHOULD NOT reject
> a manifest if the fields do not match the CRL.  Accordingly, when a CA
> violates the first SHOULD and issues manifest+CRL with mismatched
> thisUpdate/nextUpdate, and an RP violates the second SHOULD (NOT) and
> rejects such a setup, the RP will be unable to get any RPKI data for that
> CA.  (As a tangent, we also have one place where we give related guidance
> that the validity period of the single-use EE cert that signs the manifest
> match the thisUpdate/nextUpdate period, which we might want to keep in
> mind if we make any changes in this space.)
> 
> It looks like RFC 6486 had a conditional MUST-level requirement that *if* a
> manifest encompasses a CRL, then the "nextUpdate" fields MUST match (no
> guidance on thisUpdate), which we change to a statement of fact that each
> manifest does encompass a CRL and guidance that the "nextUpdate"s SHOULD
> match.
> Additionally, RFC 6486 had a MUST-level requirement for the validity
> period of the EE cert to exactly match the thisUpdate/nextUpdate time
> interval of the manifest, which we currently are relaxing to a SHOULD.
> 
> Often in this scenario we would strengthen one of the SHOULDs to be a
> MUST so that interoperability is guaranteed, but I'm not sure that I see a
> clear argument for which requirement is better to make the MUST.

I agree with your analysis and think your suggestion to reinforce one of
the two SHOULDs into a MUST is a reasonable way to resolve this friction.
From my perspective, upgrading some normative terms in this context
won't be perceived as controversial by the WG, as the current 'watered
down' text is the result of concerns about retaining interoperability
with CAs 'in the field'.

Some background on the ability of RPs to deploy a strict ("MUST") check
at this point in time:
https://mailarchive.ietf.org/arch/msg/sidrops/VNG77j05I2JXOwv4qkSy-DCgRNE/

One CA implementation rectified the lack of validity window alignment:
https://github.com/NLnetLabs/krill/commit/51c58ec58b266697dd1843a5504e79bfc385932d
There appears to be agreement on enforcing alignment on the CA side
'going forward', while RPs remain relaxed:
https://mailarchive.ietf.org/arch/msg/sidrops/W-SrezAc5Oa1HoTW0w64EdaD1gk/

Thus, I propose the following adjustments to resolve this DISCUSS.

----- Section 4: Manifest Definition ----

OLD Section 4.2.1 second paragraph:
   Because a "one-time-use" EE certificate is employed to verify a
   manifest, it is RECOMMENDED that the EE certificate have a validity
   period that coincides with the interval from thisUpdate to nextUpdate
   in the manifest, to prevent needless growth of the CA's CRL.

NEW Section 4.2.1:
   Because a "one-time-use" EE certificate is employed to verify a
   manifest, the EE certificate MUST be issued with a validity period
   that coincides with the interval from thisUpdate to nextUpdate in the
   manifest, to prevent needless growth of the CA's CRL.

----- Section 5: Manifest Generation ----

OLD Section 5.1 last paragraph:
       It is RECOMMENDED that the validity interval of the EE
       certificate exactly match the thisUpdate and nextUpdate times of
       the manifest.
       Note: An RP MUST verify all mandated syntactic constraints, i.e.,
       constraints imposed on a CA via a "MUST".

NEW Section 5.1:
       The validity interval of the EE certificate MUST exactly match
       the thisUpdate and nextUpdate times specified in the manifest's
       eContent. (An RP MUST NOT consider misalignment of the validity
       interval in and of itself to be an error.)
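
To illustrate the intended split between CA and RP behaviour, here is a
minimal Python sketch. It is not taken from any implementation: the
Window type and both function names are hypothetical, and the RP check
assumes the RP still requires the EE certificate and the manifest to each
be current, which is my reading rather than text from the draft.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Window:
    """A validity window (hypothetical type); use consistent datetimes."""
    start: datetime
    end: datetime


def ca_may_issue(ee_validity: Window, manifest_window: Window) -> bool:
    """CA side (proposed MUST): the EE certificate's notBefore/notAfter
    equal the manifest's thisUpdate/nextUpdate."""
    return ee_validity == manifest_window


def rp_accepts(ee_validity: Window, manifest_window: Window, now: datetime) -> bool:
    """RP side: misalignment in and of itself is not an error.

    Assumption: the RP still checks that each window contains `now`;
    it simply does not compare the two windows with each other.
    """
    ee_current = ee_validity.start <= now <= ee_validity.end
    manifest_current = manifest_window.start <= now <= manifest_window.end
    return ee_current and manifest_current
```

With this split, a CA that follows the new text never produces a
mismatch, and an RP that follows it never rejects a legacy CA solely
because of one, so the two cannot fail to interoperate on this point.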

> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> My reading of the changes in the diff from RFC 6486 indicate that there is
> something of a fundamental shift in the processing model being made, but
> this does not seem to be mentioned specifically in the list of changes in
> Appendix B (or the introduction/abstract).  Specifically, it seems that
> the old RFC 6486 model involved the manifest being a tool that's useful
> for the RP and SHOULD be used, but that ultimately the RP policy decides
> which signed objects from the repository to use and fundamentally places
> trust in the signatures on those objects; the new model in this draft
> seems to place the manifest as the primary control on what signed objects
> to use, with phrasing like "MUST use the current manifest"/"files not
> listed on the manifest MUST NOT be used" and the error-handling behavior
> in §6.6 being to fall back to the previous cached state (at a SHOULD-level
> requirement), which as I understand it would mean using the previous
> cached manifest.  (Maybe I'm wrong about that last bit.) While the new
> focus doesn't seem problematic per se, and it's perfectly reasonable from
> a process perspective to make that sort of behavior change in a Proposed
> Standard, it seems that if this was a deliberate decision it ought to be
> emphasized a bit more.  The final bullet point in the change list of
> removing the notion of "local policy" is perhaps related to this
> fundamental shift, but seems to only cover part of the shift.

I appreciate the observations in the above text, but don't see an
actionable remark. However, I can provide a bit of background:

A write-up on the /why/ of RPKI Manifests can be found here:
https://blog.apnic.net/2020/11/10/rpki-manifests-securely-declare-contents/

Manifests are the only way to verify the completeness of a SET of signed
objects. The notion of 'local policy' in the original RFC appears to
have been the result of an awkward compromise (this was before my time).
I think by now all implementers understand that just like car seatbelts,
Manifests are not optional, and never were optional.
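
As a rough illustration of the completeness point: a signature on an
individual object cannot reveal that a sibling object was withheld, but
a comparison against the manifest's file list can. The sketch below is
Python with a hypothetical function name, and it models the file list
and the fetched publication point as plain sets of file names.

```python
def compare_with_manifest(listed: set, fetched: set) -> tuple:
    """Compare the manifest's file list with what was actually fetched.

    Returns (missing, unlisted):
      missing  - named on the manifest but absent from the fetch
      unlisted - present in the fetch but not named on the manifest
    """
    missing = listed - fetched
    unlisted = fetched - listed
    return missing, unlisted
```

A non-empty "missing" set is exactly what cannot be detected without a
manifest; files in "unlisted" fall under the draft's "files not listed
on the manifest MUST NOT be used".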

> Section 3
> 
>    The CA MUST sign only one manifest with each generated private key,
>    and MUST generate a new key pair for each new version of the
>    manifest.  This form of use of the associated EE certificate is
>    termed a "one-time-use" EE certificate [RFC6487]
> 
> Not really a comment on *this* document (6486bis), but since RFC 6487
> generically defines the "sequential use" EE certificate that we are
> disallowing for RPKI Manifest use, it seems reasonable to ask whether the
> issues that cause us to forbid sequential use in this context might also
> apply to other scenarios where "sequential use" certificates might have
> been used.  (I did not make a survey of the RPKI specification corpus to
> investigate whether such scenarios are likely to exist.)

I didn't perform an extensive survey either, but AFAIK the Manifest -bis
document is one of the few places to be so explicit about the use of
one-time-use certificates. I believe the primary motivation for this
requirement is that Manifests / CRLs are the "#1 top churning object"
type in the RPKI ecosystem. CRLs (and thus Manifests) generally have a
validity window measured in hours, whereas other object types such as
ROAs often have a validity window of 12 to 18 months, or longer.

I expect that going forward, where needed, new object types or -bis
documents will be more explicit about whether to "recycle keypairs" or
mandate one-time-use.

As you highlighted, your above comment is not really a comment on *this*
document.

> Section 4.2.1
> 
>    fileHashAlg:
>       This field contains the OID of the hash algorithm used to hash the
>       files that the authority has placed into the repository.  The hash
>       algorithm used MUST conform to the RPKI Algorithms and Key Size
>       Profile specification [RFC6485].
> 
> RFC 6485 has been obsoleted by RFC 7935.

Correct, this has been rectified in my working copy, also noted in
https://mailarchive.ietf.org/arch/msg/sidrops/NzCcNzcN46FD7RgV3D5ULMnEhxs/

> Section 5.1
> 
>    2.  Issue an EE certificate for this key pair.  The CA MUST revoke
>        the EE certificate used for the manifest being replaced.
> 
> Do the mechanics of revocation still need to be undertaken even if the EE
> certificate in question has expired?

No. In the wild one can observe that letting Manifest EE certificates
expire is a helpful way to prevent needless growth of CRLs. Also,
trimming of CRLs is permitted. Do you have suggested text to capture
that in a concise manner? Or leave as-is?
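
For concreteness, the behaviour I have in mind can be sketched as
follows. This is illustrative Python only: both function names are
hypothetical, and it assumes the CA keeps the notAfter time of each
revoked certificate alongside its serial number.

```python
from datetime import datetime


def needs_revocation(old_ee_not_after: datetime, now: datetime) -> bool:
    """An already-expired manifest EE certificate need not be put on the CRL."""
    return old_ee_not_after > now


def trim_crl(revoked: dict, now: datetime) -> dict:
    """Drop CRL entries whose certificate has expired.

    `revoked` maps certificate serial number -> that certificate's notAfter.
    """
    return {
        serial: not_after
        for serial, not_after in revoked.items()
        if not_after > now
    }
```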

> Section 6.x
> 
> The old RFC 6486 procedures allowed ("MAY") the RP to verify that each
> file at a publication point is listed in exactly one current manifest,
> which is no longer allowed.  I think I can see how this was a problematic
> thing to try, but want to confirm that it is intentionally removed.

Ack.

> Similarly, RFC 6486 wanted a warning produced if there were objects
> present in the repository that are not listed in any manifest, which does
> not seem particularly useful and I do not mind seeing removed.

Ack.

> RFC 6486 also wanted a warning issued if a publication point contained
> both valid and invalid manifests; removing that guidance seems correct to
> me.

Ack.

> Section 6.3
> 
> The two "proceed to Section 6.4"s seem redundant (I suggest removing the
> last one).
> 
> Section 6.5
> 
>    the publication point.  If the computed hash value of a file listed
>    on the manifest does not match the hash value contained in the
>    manifest, then the fetch has failed and the RP MUST proceed to
>    Section 6.6; otherwise proceed to Section 6.6.
> 
> This looks an awful lot like "if X; goto Y; else; goto Y", which could
> just be written as "goto Y".  However, since §6.6 is the "failed fetches"
> processing, in the successful verification case perhaps we don't want to
> do that and should instead proceed to use the content of the files listed
> in the manifest.

Nice catch! I'll remove the last sentence.
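
The intended control flow is that only a mismatch leads to the failed
fetch handling; on success the RP goes on to use the listed files. A
minimal Python sketch, with a hypothetical function name, the file list
modelled as a dict of file name to raw digest, and SHA-256 as the hash
algorithm per RFC 7935:

```python
import hashlib


def verify_listed_files(manifest_entries: dict, fetched: dict) -> bool:
    """Return True only if every listed file is present and its hash matches.

    manifest_entries: file name -> expected SHA-256 digest (bytes)
    fetched:          file name -> file content (bytes)
    False means the fetch has failed (Section 6.6 handling); True means
    the RP proceeds to use the files listed on the manifest.
    """
    for name, expected_digest in manifest_entries.items():
        content = fetched.get(name)
        if content is None:
            return False  # listed file is missing from the fetch
        if hashlib.sha256(content).digest() != expected_digest:
            return False  # hash mismatch
    return True
```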

> Section 8
> 
> Perhaps it is sufficiently obvious so as to go without saying, but I trust
> that in a prolonged period of failed fetches, the human operator is
> expected to go figure out (via out of band channels) what's up with the
> CA/repository in question and whether an attack is underway.
> 
>                              The requirement for every repository
>    publication point to contain at least one manifest allows an RP to
>    determine if the manifest itself has been occluded from view.  [...]
> 
> Do we want to say anything about how these properties change during a CA
> key rollover or if multiple "unrelated" CAs are publishing at the same
> publication point (the latter of which would be quite unusual, if I
> understand correctly)?

Multiple 'repositories' can exist in a single rsync directory, akin to
how multiple CA Repositories can be fetched via a single RRDP
publication server. My preference would be to leave this as-is; I do not
recall any perception of ambiguity in the wider CA operator community
about this aspect.

> Section 9
> 
> I see that the latest update from IANA has the registry (entry) references
> being updated from RFC 6488 to [this document], which is great and
> obviates what I would otherwise have remarked upon.

Ack.

> Section 11.1
> 
> Now that we're obsoleting RFC 6486, that should probably bump it off into
> the Informative References section.

Done!

> RFCs 6482, 6493, 8209, and 8488 appear to not actually be referenced
> from anywhere and thus could probably be removed.

Done, this was addressed in
https://mailarchive.ietf.org/arch/msg/sidrops/NzCcNzcN46FD7RgV3D5ULMnEhxs/

> Section 11.2
> 
> Do we really want to reference RFC 3370 ("CMS Algorithms") for "the CMS
> wrapper of the object" rather than something like RFC 5652 ("CMS")?

Good suggestion, I've replaced 3370 with 5652.

Please let me know if the suggested changes to Sections 4 and 5 seem
appropriate; if so, I'll make the changes and share a patchset.

Kind regards,

Job