Re: [Sidrops] draft-ietf-sidrops-6486bis-04.txt

Steve,

Thank you for your response.

> On 15 Jun 2021, at 01:57, Stephen Kent <stkent=40verizon.net@dmarc.ietf.org> wrote:
> 
> Tim,
>> Krill defaults to 24 hours for the next update time on CRLs and Manifests. The not-after time on the manifest EE currently still defaults to 7 days following discussions we had about this in sidr many years ago. I am quite happy to change this to match the next update time instead as this document now instructs.
> So, to be clear, a new CRL and manifest are issued every 24 hours, but the manifest EE cert has a 7-day lifespan, right? Since you say that the manifest EE cert is a one-time use, this clearly contradicts the text in the manifest spec. Please do change the validity interval.

More than happy to do so.

Fwiw, this goes back to a discussion we had in sidr over 10 years ago. Of course I can't find the slides or email threads anymore, but I remember putting this specific question to the WG. The outcome back then, as I understood it, was that it would be good, desired even, to have a slightly longer lifespan on the manifest EE certs, and that 'stale' manifests and CRLs could result in warnings as per local policy, while expired manifests would always be invalid. So, right or wrong (probably), this behaviour was based on what I believed that outcome to be.

Anyway, I am more than happy to change this. I actually prefer to have the lifespan of the EE match the next update time, and to eliminate the difference between stale and expired manifests, which depends on local policy and may still be present in older deployed RP software.

> 
>> But in any case a new Manifest and CRL are issued by default 8 hours before they would go stale. I don't dare to leave this much longer, I think a CA operator needs this window to be able to deal with unforeseen outages.
>> 
>> When a manifest is replaced the previous EE cert is revoked - and its not-after (expire) time is remembered. When that time has passed the entry is removed.
>> 
>> All these values are configurable. I don't know for sure how other CA implementations do this. It would be good to hear from them. Based on memory at least the RIPE NCC managed CA uses a similar strategy.
>> 
> It would seem that each old manifest cert lives on a CRL for about 6 days, so there might be 5 or 6 of them on the CRL at any time, needlessly. That's not awful, but it's not pretty either.

I agree that the amount is not too bad.
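
To put rough numbers on this, here is a minimal sketch of the steady-state CRL size. It assumes exactly one manifest EE cert is revoked per reissue, and that each revocation stays listed until the cert's not-after time has passed. The function name and the model are illustrative only, not taken from any CA implementation.

```python
from datetime import timedelta


def max_revoked_ee_entries(ee_lifetime: timedelta,
                           reissue_interval: timedelta) -> int:
    """Upper bound on replaced manifest EE certs listed on the CRL.

    Assumption: an EE cert is revoked one reissue_interval after it was
    issued, and its CRL entry is dropped once its not-after has passed.
    """
    if ee_lifetime <= reissue_interval:
        # The cert expires no later than its replacement is issued, so
        # nothing needs to linger on the CRL.
        return 0
    time_on_crl = ee_lifetime - reissue_interval
    # One new entry per reissue interval; ceiling division gives the peak.
    return -(-time_on_crl // reissue_interval)


# Reissuing every 24 hours with a 7-day EE cert.
print(max_revoked_ee_entries(timedelta(days=7), timedelta(hours=24)))
# Reissuing 8 hours early (every 16 hours) with a 7-day EE cert.
print(max_revoked_ee_entries(timedelta(days=7), timedelta(hours=16)))
# EE lifetime equal to the 24-hour next update time, reissued every 16 hours.
print(max_revoked_ee_entries(timedelta(hours=24), timedelta(hours=16)))
```

Under this model the early reissue increases the number of lingering entries for a 7-day EE cert, while matching the EE lifetime to the next update time brings it down to at most one.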

Still it is not clear to me whether I MUST, or MAY, or am RECOMMENDED to revoke replaced manifests.

Remember that you would only find the revocation if you had found a new manifest and accepted the new CRL revoking the now irrelevant previous manifest. Given the choice I would rather *not* revoke them ever.

>>>> ...
>>>> 
>>>> One could therefore argue that rather than revoking previous manifest, the 'thisUpdate' or possibly 'manifestNumber' should be leading in selecting an eligible manifest - should an RP have multiple validly signed and current manifests for the same CA somehow. Not revoking previous manifests would help to avoid the "chicken and egg" issue raised in section 6. It is not clear to me that this would really be less safe.
>>>> 
>>>> But look, really, rather than trying to pick an argument over this, I raise this because I would like to see explicit text in the document that says what's expected here :)
>>>> 
>>>> 
>>> as I said, unless a CA issues a new manifest before the next scheduled issue time, this ought not be an issue.
>>> 
>> As above, I don't dare to leave this to the last moment.
>> 
>> Furthermore, there could be other changes like a change in ROAs that require a new manifest to be issued.
>> 
> I am curious- do you track how often a new manifest needs to be issued because a ROA has changed? It would be useful to know if manifests are being reissued frequently only because a timer dictates it, not because any object covered by the manifest has actually changed.

I do not track how users of our software do this. It does not call home :) But, this can be observed by anyone looking at the RPKI with sufficient granularity.

Generally speaking any change in ROAs or delegated certificates results in a changed manifest. I believe that most RPKI CAs try to publish such changes within seconds to minutes.

For our own prefixes this is very infrequent - we just have a few prefixes out there, and they don't change unless we are running experiments. For an RIR or NIR, delegated certificates change multiple times per day. For big operators, ROAs may change much more frequently than in our case.

> 
>> Which reminds me.. I am not sure now if or where this is required, but in order to keep things simple to reason about I always issue a new manifest and CRL together.
> I don't think any text requires that a new CRL MUST be issued with every new manifest, although the opposite is obviously true if the old manifest EE cert is revoked. If a new manifest is issued 8 hours before the old one expires, and if there are no changes in the objects covered by the manifest, other than a new CRL, and if that CRL is also unchanged in terms of the revoked certs being listed, then there seems to be a lot of churn for no obvious reason.

One reason is that it's just easier to reason about the next update time for publishing both.

Another reason is that I am always issuing new manifests before they would expire, because I don't want a publication point to be rejected if I am a few minutes late because of some issue. And the old manifest is revoked, so that means the CRL changes.

Furthermore I use the same 'next update' time on CRLs and manifests. And I want to update the CRL before it would go stale for the same reason as updating manifests. Longer lived CRLs could only work if I used different times for both.
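
As a minimal sketch of that timing, assuming the defaults mentioned earlier in this thread (24 hours to next update, reissue 8 hours before that) and the EE not-after aligned with next update as the document now instructs. The names `Schedule` and `plan_issuance` are hypothetical and not taken from any implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Schedule:
    this_update: datetime
    next_update: datetime   # shared by the CRL and the manifest
    ee_not_after: datetime  # manifest EE cert expiry
    reissue_at: datetime    # when the CA plans to publish replacements


def plan_issuance(now: datetime,
                  validity: timedelta = timedelta(hours=24),
                  reissue_before: timedelta = timedelta(hours=8)) -> Schedule:
    """Plan one CRL + manifest issuance, both published together."""
    next_update = now + validity
    return Schedule(
        this_update=now,
        next_update=next_update,
        ee_not_after=next_update,  # assumption: EE lifetime matches next update
        reissue_at=next_update - reissue_before,
    )


plan = plan_issuance(datetime(2021, 6, 15, 0, 0, tzinfo=timezone.utc))
print(plan.reissue_at.isoformat(), plan.next_update.isoformat())
```

With a single next update time there is only one deadline to track per publication point, which is the simplification referred to above.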

Note that if I do not need to revoke previous manifests, this CRL is generally small, which reduces churn as well.

> 
>> It's not that the CRLs get huge, because revocations for expired EE certificates get removed as well. But, then again, I still think these manifest EE revocations are most likely not needed at all, because with this -bis RPs would only accept a new CRL with a changed hash when they have already seen the replacement manifest.
>> 
>> 
>>> ... 
>> I am not vested in the text I proposed.
>> 
>> I believe that for normal top-down validation RPs need not consider any objects not found on manifests. This seems to align with what you are saying.
>> 
> cert path validation is always top-down, as per 5280 and its predecessors.
>> But what I am after is the following. RTA as proposed supports the idea of out-of-band exchanges of RTA objects, where those objects may NOT appear in RPKI. I believe RSC supports this option as well (but I need to re-read). So, in short, they may or may not appear on manifests. So, I would prefer that RPs can still do ad-hoc validation of RTA objects which they received out-of-band. Typically the RTA EE certificate would refer to a CRL and parent CA certificate which can be found in the RPKI (and manifests).
> Well, we've been reminded by Job that RSC doesn't assume distribution via the RPKI, so ...
> 
> I propose the following text to begin Section 6:
> 
>   Each RP MUST use the current manifest of a CA to control addition of listed
>   files to the set of signed objects the RP employs for validating basic RPKI
>   objects: certificates, ROAs, and CRLs. Any files not
>   listed on the manifest MUST NOT be used for validation of these objects.
>   However, files not listed on a manifest MAY be employed to validate other
>   signed objects, if the profile of the object type explicitly states that
>   such behavior is allowed. Note that relying on files not listed in a
>   manifest may allow an attacker to effect substitution attacks against such
>   objects.

That works for me.
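
To check my reading of the proposed text, here is a minimal sketch of the rule as an RP-side filter. It is one interpretation only: it keys the decision on file extension, and the `unlisted_ok_extensions` parameter stands in for "the profile of the object type explicitly states that such behavior is allowed". The function name and the `.rta` extension in the usage lines are made up for illustration.

```python
import os

# Basic RPKI object types named in the proposed text.
BASIC_EXTENSIONS = frozenset({".cer", ".roa", ".crl"})


def may_use(filename: str,
            manifest_entries: frozenset,
            unlisted_ok_extensions: frozenset = frozenset()) -> bool:
    """Decide whether an RP may use a file, given the current manifest."""
    if filename in manifest_entries:
        return True
    ext = os.path.splitext(filename)[1].lower()
    if ext in BASIC_EXTENSIONS:
        # Unlisted certificates, ROAs and CRLs MUST NOT be used.
        return False
    # Other object types MAY be used only if their profile allows it.
    return ext in unlisted_ok_extensions


listed = frozenset({"ca.crl", "as64496.roa"})
print(may_use("as64496.roa", listed))                      # listed
print(may_use("stray.roa", listed))                        # unlisted basic object
print(may_use("doc.rta", listed, frozenset({".rta"})))     # hypothetical profile opt-in
```

The hash check against the manifest entry is deliberately left out here; this only covers the listed versus unlisted decision.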

I have a question similar to the one about revocation of replaced manifests.

If I replace a .cer or .roa file with another one that has the same name and a new hash, do I still need to revoke the old file as well, given that a hash mismatch would be seen as a 'failed fetch'?

Currently I do. I can keep doing it. But I observe that the CRL and manifest are signed under the same CA certificate, so I am not convinced that doing both (replace hash and revoke old) adds security, while it does make the CRL longer if there is a lot of churn in ROAs or delegated CA certificates.
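
To illustrate why the hash already catches a replayed old file, here is a minimal sketch of an RP comparing fetched files against manifest entries. It assumes SHA-256 file hashes and models the manifest as a plain name-to-hex-digest mapping; the function name and the sample data are hypothetical.

```python
import hashlib
from typing import List, Mapping


def mismatched_files(manifest_hashes: Mapping[str, str],
                     fetched: Mapping[str, bytes]) -> List[str]:
    """Return listed names that are missing or whose content hash differs."""
    problems = []
    for name, expected_hex in manifest_hashes.items():
        content = fetched.get(name)
        if content is None:
            problems.append(name)
        elif hashlib.sha256(content).hexdigest() != expected_hex:
            problems.append(name)
    return problems


old_roa = b"old ROA bytes (placeholder)"
new_roa = b"new ROA bytes (placeholder)"

# The current manifest lists the replacement under the same file name.
manifest = {"example.roa": hashlib.sha256(new_roa).hexdigest()}

print(mismatched_files(manifest, {"example.roa": new_roa}))  # nothing to report
print(mismatched_files(manifest, {"example.roa": old_roa}))  # old file is flagged
```

In this model the old file is rejected whether or not its EE cert appears on the CRL, which is the redundancy I am asking about.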

So, given the choice I would rather revoke objects only when their name goes out of scope, or perhaps even never - as they "MUST NOT" be used for validation if they are not listed.

Note that I see full value in the CRL for validating out-of-band RSC/RTA objects, or any object type which may be out of scope for manifests by design.

So, I hope that it's clear why I keep going on about the CRLs in this context. I can keep revoking things like I do today, but I would appreciate a clarification of why this is better, and I think it may be worthwhile documenting this explicitly.

Tim

> Steve