[sidr] comments on validation revisited -01

Stephen Kent <kent@bbn.com> Thu, 19 March 2015 15:00 UTC

Message-ID: <550AE49C.9010207@bbn.com>
Date: Thu, 19 Mar 2015 11:00:44 -0400
From: Stephen Kent <kent@bbn.com>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.5.0
MIME-Version: 1.0
To: sidr <sidr@ietf.org>
Content-Type: multipart/alternative; boundary="------------060508090109060903040206"
Archived-At: <http://mailarchive.ietf.org/arch/msg/sidr/-pKD4X18ElGNwxT_42uzfmcgZFU>
Subject: [sidr] comments on validation revisited -01

I reviewed draft-ietf-sidr-rpki-validation-reconsidered-01.txt.

There are a number of minor changes to the text, but the big change is 
the addition of a specification for the proposed, revised validation 
procedure. Thus many of my comments repeat concerns cited with respect 
to the -00 version of this document and previously posted to this list.

At the SIDR meeting in Honolulu John Curran delivered a briefing 
coauthored by Geoff Huston. That briefing described the 
potential for errors and the impact of such errors when a CA reissues a 
certificate with a smaller set of INRs. This presentation referred to 
the analysis in the -00 version of this document as “not detailed” and 
noted that the motivation for changing the validation algorithm was “not 
based on operational experience.” As a result I expected to see a more 
detailed analysis and discussion of operational experience with respect 
to over-claiming (slide 4). For example, the importance of addressing 
this issue would be clearer if there were statistics detailing the 
number of INR transfers, by region, for the past few years. Such stats 
should indicate when the transfers were for “live” (in-use) vs. unused 
INRs. Finally, to understand the implications for the RPKI, the stats 
should indicate how transfers relate to the RPKI hierarchy, i.e., how 
many tiers were involved in a transfer.

John’s briefing also suggested that a standard procedure for certificate 
management during INR transfers be documented (slide 5). The changes 
that resulted in the -01 version of this document do not address any 
of these issues.

At the previous meeting, in Toronto, I offered to provide edits to 
improve the text when sentences run on, as they do in many places. I 
failed to do so, but since most of the text is unchanged from the -00 
version, I offer such revisions now, at the end of this message. Below 
are other comments on the -01 version of the document. The BEFORE/AFTER 
text is the bulk of this message.

The second and third paragraphs of Section 3 argue that over-claiming 
may be the result of errors by a CA in the course of normal operation, 
independent of an INR transfer. This is inconsistent with the stated 
rationale from the earlier sections of the document. However, I agree 
with the concern cited here, i.e., we should examine ways to make the 
RPKI tolerant of errors made by CAs (especially higher tier CAs) and 
repository operators. Yet, the proposed mechanism addresses only one 
type of error. I suggest that an analysis is required to explore the 
range and types of errors that might occur, so that we can pursue a 
solution that addresses more than just the one type of error cited in 
this document.

Later in Section 3 the document cites one possible way that a transfer 
might be effected, and notes that this would result in problems as 
illustrated in Section 2. However, the document does not examine other 
possible transfer procedures that might avoid this problem. It seems 
inappropriate to cite one example of how to effect a transfer and use 
that example to justify a major change in the RPKI validation 
procedure. (I’m tempted to cite the joke about a patient who complains to 
his doctor that when he tries to move his arm in a particular fashion it 
hurts. The doctor replies “Then don’t do that!”)

I recall someone noting during a previous meeting that the CAs on the 
receiving side of an INR transfer could issue new certificates, 
containing just the INRs being transferred, as a way to avoid the 
problem cited in this document. If these certificates are issued prior 
to the transfer being represented in the “common CA” certificate, they 
will be invalid, but they will not affect any other INRs in the path. 
When the transfer has been completed, the INRs can be represented in the 
(long term) certificates of the affected parties, and the certificates 
issued on a temporary basis for the transfer can be revoked (or allowed 
to expire). This suggestion does not seem to require changes to the 
existing path validation procedure, nor changes to RFCs. It does have 
the downside of adding more certificates to the RPKI repository system, 
for a (hopefully short) time. Absent stats on the frequency of transfers 
and the number of CAs that are (would be) involved, it’s impossible to 
determine if the impacts of this approach are significant.
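
As a rough illustration of the temporary-certificate idea, the following Python sketch models each certificate as a list of IPv4 prefixes and applies the all-or-nothing encompassment rule. The prefixes, the variable names, and the is_valid helper are hypothetical and chosen only for illustration; AS numbers, signatures, and revocation are ignored.

```python
from ipaddress import ip_network

def is_valid(cert_prefixes, parent_prefixes):
    """All-or-nothing check: every prefix in the certificate must fall
    inside some prefix held by the parent certificate."""
    parents = [ip_network(p) for p in parent_prefixes]
    return all(
        any(ip_network(c).subnet_of(p) for p in parents)
        for c in cert_prefixes
    )

# Hypothetical holdings on the receiving side of a transfer (IPv4 only).
parent_before = ["198.51.100.0/24"]                    # transfer not yet reflected
parent_after = ["198.51.100.0/24", "203.0.113.0/24"]   # parent reissued after transfer

long_term = ["198.51.100.0/25"]        # existing long-term certificate
temporary = ["203.0.113.0/24"]         # temporary certificate, transferred INR only
merged_early = long_term + temporary   # the INR added to the long-term cert too soon

print(is_valid(long_term, parent_before))     # True: unaffected by the pending transfer
print(is_valid(temporary, parent_before))     # False: invalid, but isolated
print(is_valid(merged_early, parent_before))  # False: existing INRs are dragged down too
print(is_valid(temporary, parent_after))      # True: valid once the parent is reissued
```

Under these assumptions, the temporary certificate is the only object that is invalid during the transition, which is the property the suggestion relies on.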

Later in Section 3 the document says: “Avoiding such situations requires 
that CA's adhere to a very specific ordering of certificate issuance.” 
It’s true that coordination of actions across CAs is needed when INRs 
are transferred. But coordination is required irrespective of the RPKI, 
e.g., to ensure that the INRs being transferred are not allocated by 
both the previous holder and the new holder (or their parent 
organizations). Thus the need for coordination is one of degree and 
extent, not a black-or-white distinction as suggested by the text.

The text goes on to suggest that there is only one order of events that 
will effect a transfer without breaking the RPKI. This is not quite 
true. The (currently expired) Transfer Authorization Object (TAO) draft 
described how transfers could be effected under the current certificate 
path validation procedures. That document noted where coordination is 
required, and it painted a less rigid picture than that presented in the 
validation revisited I-D. Also, the discussion in this section fails to 
distinguish between a transfer of “live” address space vs. unused space. 
The distinction, which was addressed in the TAO I-D, is important and 
should be part of any discussion that purports to provide an analysis of 
inherent limitations on INR transfer procedures. Moreover, an approach 
based on issuing temporary certificates, as noted above, represents 
another means of staging a transfer that may reduce constraints on the 
ordering of the steps in the process.

I note a very brief mention of the TAO idea in Section 4, along with a 
very quick dismissal. In part the text suggests that the TAO approach 
may be inadequate because it fails to “…mitigate to any meaningful 
extent the risks of failure to ensure strict INR consistency at all 
times.” This text sounds like it is again considering the larger topic 
of errors by CAs, independent of INR transfers. If so, then, as I noted 
above, an analysis of such errors needs to precede detailed discussion 
of proposed solutions.

Section 5 is new, specifying the proposed, revised validation procedure. 
Much of this text is redundant. For example, steps 1 and 2 (page 7) are 
always part of path validation. Steps 3 and 4 are already stated in 
[RFC6487], and need not be two separate steps. Steps 5 and 7 are always 
part of certificate path validation. Step 6 is the only RPKI-unique step 
and I don’t understand what it means. Presumably the text in the 
following paragraph (from page 8) is intended to establish the context 
for step 6, but that text says:

Validation of signed resource data using a signing key that is
certified in a resource certificate, coupled with a specific set of
number resources, consists of verifying that the digital signature of
the signed resource data is valid, using the public key that is
certified by the resource certificate, and also validating the resource
certificate in the context of the RPKI, using the path validation
process.

This 7-line sentence is unintelligible to me. For example, validation of 
signed anything is performed using a public key not a “signing” 
(private) key. Is signed resource data an oblique reference to 3779 
extensions in a certificate? The comment “using the path validation 
process” seems to make the description of the process a circular 
definition.

Suggested fixes for run-on sentences:

Section 1.

BEFORE

This document reviews the certificate validation procedure specified
in RFC6487 and highlights aspects of operational fragility in the
management of certificates in the RPKI in response to the movement of
resources across registries, and the associated actions of
Certification Authorities to maintain continuity of validation of
certification of resources during this movement.

AFTER

This document reviews the certificate validation procedure specified
in RFC6487, with respect to operational fragility in the context of RPKI
certificate management. It focuses on scenarios involving resource
transfers across registries and the actions of Certification Authorities
as needed to ensure continuity of validation of resource certificates
during such transfers.

Section 2.

BEFORE

As currently defined in section 7.2 of [RFC6487], validation of PKIX
certificates that conform to the RPKI profile relies on the use of a
path validation process where each certificate in the validation path
is required to meet the certificate validation criteria. This is a
recursively defined validation process where, in the context of an
ordered sequence of certificates, as defined by each pair of
certificates in this sequence having a common Issuer and Subject Name
respectively, a certificate is defined as valid if it satisfies basic
validation criteria relating to the syntactic correctness, currency
of validity dates and similar properties of the certificate itself,
as described in [RFC5280], and also that it satisfies certain
additional criteria with respect to the previous certificate in the
sequence (the Issuer part of the pair), and that this previous
certificate is itself a valid certificate using the same criteria.
This process is applied to all certificates in the sequence apart
from the initial sequence element, which is required to be a Trust
Anchor.

For RPKI certificates, the additional criteria relating to the
previous certificate in this sequence is that the certificate's
number resource set, as defined in [RFC3779], is "encompassed" by the
number resource set contained in the previous certificate.

AFTER

Path validation of resource certificates follows the basic procedure 
described in section 6.1 of [RFC5280], with some additional restrictions 
appropriate for the RPKI context, as described in section 7.2 of 
[RFC6487]. In particular, the RFC 3779 extensions present in a 
certificate MUST be “encompassed” by the RFC 3779 extensions in the 
parent certificate. (As is always the case for path validation, trust 
anchors are exempt from this requirement.)
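
To make the “encompassed” relationship concrete, here is a minimal Python sketch. It is only an approximation under stated assumptions: INRs are reduced to IP prefixes (AS numbers and address ranges are left out), and the function names are hypothetical rather than taken from any RFC or implementation.

```python
from ipaddress import collapse_addresses, ip_network

def _normalize(prefixes):
    """Parse prefixes and merge adjacent or overlapping ones, per address family."""
    nets = [ip_network(p) for p in prefixes]
    merged = []
    for version in (4, 6):
        merged.extend(collapse_addresses(n for n in nets if n.version == version))
    return merged

def encompasses(parent_prefixes, child_prefixes):
    """Return True if every child prefix lies inside the parent's resource set."""
    parents = _normalize(parent_prefixes)
    for child in _normalize(child_prefixes):
        if not any(
            child.version == p.version and child.subnet_of(p) for p in parents
        ):
            return False
    return True

print(encompasses(["10.0.0.0/8"], ["10.1.0.0/16"]))                  # True
print(encompasses(["10.0.0.0/8"], ["10.1.0.0/16", "192.0.2.0/24"]))  # False
```

Merging adjacent prefixes first means that a child holding 10.0.0.0/23 is treated as encompassed by a parent that lists 10.0.0.0/24 and 10.0.1.0/24 separately.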

BEFORE

Because [RFC6487] validation demands that all resources in a
certificate be valid under the parent (and recursively, to the root),
a digitally signed attestation, such as a Route Origin Authorization
(ROA) object [RFC6482], which refers only to a subset of
RFC3779-specified resources from that certificate validation chain can be
concluded to be invalid, but not by virtue of the relationship
between the RFC3779 extensions of the certificates on the putative
certificate validation path and the resources in the ROA, but by
other resources described in these certificates where the
"encompassing" relationship of the resources does not hold. Any such
invalidity along the certificate validation chain can cause this
outcome, not just at the immediate parent of the end entity
certificate that attests to the key used to sign the ROA.

AFTER

Resource certificate validation demands that all INRs in each 
certificate (other than a trust anchor) be encompassed by the parent 
certificate. For a Route Origin Authorization (ROA) [RFC6482] this means 
that the INRs in its EE certificate MUST be encompassed by the parent CA 
certificate, and by all superior certificates along the path to a trust 
anchor.

BEFORE

The underlying observation here is that this definition of
certificate validation treats a collection of resources as
inseparable, so that a single certificate containing a bundle of
number resources is semantically distinct from an equivalent set of
certificates where each certificate contains a single number
resource. This semantic distinction between the whole and the sum of
its parts is an artifice introduced by the particular choice of a
certificate validation procedure used by the RPKI, as distinct from
meeting any particular operational requirement, and the result is the
introduction of operational fragility into the handling of RPKI
certificates, particularly in the case where number resources are
moved between the corresponding registries, as described here.

AFTER

The underlying observation here is that this definition of
certificate validation treats a collection of resources as
inseparable. Thus, a single certificate containing a bundle of
INRs is semantically distinct from an equivalent set of
certificates where each certificate contains a single number
resource. Specifically, only the individual certificates in the set that 
fail to be encompassed by the parent would be invalid. This distinction 
is especially significant for CA certificates. If a CA certificate 
contains just one INR that is not encompassed by all of its superior CA 
certificates, the CA certificate is treated as invalid, and thus all of 
its subordinate certificates are invalid as well. There is no direct, 
operational requirement that mandates this aspect of resource 
certificate path validation. This aspect of path validation introduces 
operational fragility into management of resource certificates, 
particularly in the case where INRs are moved between the corresponding 
registries.
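
The cascade described above can be sketched in a few lines of Python. This is a simplified model, not the RFC 6487 algorithm: certificates are reduced to hypothetical (name, IPv4 prefix list) pairs ordered from the trust anchor down, and signature, validity-date, and revocation checks are omitted.

```python
from ipaddress import ip_network

def covered(prefix, holder_prefixes):
    """True if prefix lies inside one of the holder's prefixes (IPv4 only here)."""
    net = ip_network(prefix)
    return any(net.subnet_of(ip_network(h)) for h in holder_prefixes)

def strict_validate(path):
    """Return the names of the certificates that validate under the
    all-or-nothing rule. The first entry is the trust anchor."""
    valid = []
    parent = None
    for name, prefixes in path:
        if parent is not None and not all(covered(p, parent) for p in prefixes):
            break  # this certificate and everything below it is invalid
        valid.append(name)
        parent = prefixes
    return valid

# Hypothetical path: CA1 still lists 192.0.2.0/24, which its parent no longer holds.
path = [
    ("TA", ["10.0.0.0/8"]),
    ("CA1", ["10.0.0.0/16", "192.0.2.0/24"]),
    ("EE", ["10.0.1.0/24"]),
]
print(strict_validate(path))  # ['TA']
```

The EE certificate covers only 10.0.1.0/24, which is held at every tier, yet it is rejected because of the single over-claimed prefix in CA1.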

Section 3.

BEFORE

This constraint creates a degree of operational fragility in the
issuance of certificates, as all CA's are now required to exercise
extreme care in the issuance and reissuance of certificates to ensure
that at no time do they overclaim on the resources described in the
parent CA, as the consequences of an operational lapse or oversight
implies that all the subordinate certificates from the point of INR
mismatch are invalid. It would be preferred if the consequences of
such an operational lapse were limited in scope to the specific INRs
that formed the mismatch, rather than including the entire set of
INRs within the scope of damage from this point of mismatch downward
across the entire sub-tree of descendant certificates in the RPKI
certificate hierarchy.

AFTER

This constraint creates a degree of operational fragility in the
issuance of certificates. CAs are required to exercise extreme care in 
the issuance and reissuance of certificates, to ensure that the 
INRs in each certificate are encompassed by the CA’s own certificate. 
Failure to ensure this property in subordinate CA certificates would 
cause signed products of such subordinate certificates to be treated as 
invalid. It would be preferable if the consequences of such an 
operational lapse were limited in scope to the specific INRs that are 
not encompassed by superior CA certificates.
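
One possible reading of "limited in scope to the specific INRs" is sketched below in Python. This is an assumption made for illustration, not the procedure specified in the draft: each certificate keeps only those prefixes that are covered at every superior tier, and a prefix that merely overlaps the parent's holdings is dropped rather than trimmed. The names and the (name, IPv4 prefix list) model are hypothetical.

```python
from ipaddress import ip_network

def covered(prefix, holder_prefixes):
    """True if prefix lies inside one of the holder's prefixes (IPv4 only here)."""
    net = ip_network(prefix)
    return any(net.subnet_of(ip_network(h)) for h in holder_prefixes)

def effective_resources(path):
    """For each certificate, keep only the prefixes that every superior
    certificate on the path also accepts. The first entry is the trust anchor."""
    result = []
    accepted = None
    for name, prefixes in path:
        if accepted is None:
            kept = list(prefixes)  # trust anchor: taken as given
        else:
            kept = [p for p in prefixes if covered(p, accepted)]
        result.append((name, kept))
        accepted = kept
    return result

# Hypothetical path: CA1 over-claims 192.0.2.0/24.
path = [
    ("TA", ["10.0.0.0/8"]),
    ("CA1", ["10.0.0.0/16", "192.0.2.0/24"]),
    ("EE", ["10.0.1.0/24"]),
]
print(effective_resources(path))
# [('TA', ['10.0.0.0/8']), ('CA1', ['10.0.0.0/16']), ('EE', ['10.0.1.0/24'])]
```

Under this reading only 192.0.2.0/24 is lost; the EE certificate for 10.0.1.0/24 remains usable.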

BEFORE

The second operational consideration described here relates to the
situation where a registry withdraws a resource from the current
holder, and the resource to transferred to another registry, to be
registered to a new holder in that registry. The reason why this is
a consideration in operational deployments of the RPKI lies in the
movement of the "home" registry of number resources during cases of
mergers, acquisitions, business re-alignments, and resource transfers
and the desire to ensure that during this movement all other
resources can continue to be validated.

AFTER

A second operational concern arises during transfers of INRs. During a 
transfer, a registry withdraws a resource from the current
holder and transfers it to another registry, to be allocated to a new 
holder in that registry. If the INRs being transferred are in use by the 
holder in the "home" registry, then it is critical that routing not be 
disrupted during the transfer. Mergers, acquisitions, and business 
re-alignments all may trigger such transfers. (In contrast, if INRs are 
allocated but not in use in the "home" registry context, transfer of the 
INRs does not require that they be continuously valid during the process.)

BEFORE

If the original registry's certification actions are simply to issue
a new certificate for the current holder with a reduced resource set,
and to revoke the original certificate, then there is a distinct
possibility of encountering the situation illustrated by the example
in the previous section. This is a result of an operational process
for certificate issuance by the parent CA being de-coupled from the
certificate operations of child CA.

This de-coupled operation of CAs introduces a risk of unintended
third party damage: since a CA certificate can refer to holdings
which relate to two or more unrelated subordinate certificates, if
this CA certificate becomes invalid due to the reduction in the
resources allocated to this CA relating to one subordinate resource
set, all other subordinate certificates are invalid until the CA
certificate is reissued with a reduced resource set.

AFTER

The original registry's CA might issue a new certificate for the 
current holder with a reduced resource set, and revoke the original 
certificate. However, such action would cause the INRs being transferred 
to become invalid, until the recipient of these INRs receives a new, valid 
certificate containing them.

*(I’m unclear what the end of the first paragraph and most of the second 
paragraph are trying to say.)*

BEFORE

At the lower levels of the RPKI hierarchy the resource sets affected
by such movements of resources may not encompass significantly large
pools of resources. However, as one ascends through this
certification hierarchy towards the apex, the larger the resource set
that is going to be affected by a period of invalidity by virtue of
such uncoordinated certificate management actions. In the case of a
Regional Internet Registry (RIR) or National Internet Registry (NIR),
the potential risk arising from uncoordinated certification actions
relating to a transfer of resources is that the entire set of
subordinate certificates that refer to resources administered by the
RIR or the NIR cannot be validated during this period.

AFTER

At the lower levels of the RPKI hierarchy the resource sets affected
by transfer of INRs may not encompass significantly large
pools of resources. However, at higher tiers in the RPKI, the sets of 
INRs represented in CA certificates can be very large. In the case of a 
Regional Internet Registry (RIR) or National Internet Registry (NIR), 
the number of INRs represented in their certificates is very large. *(This 
is true for an NIR only if it acts as a CA vs. an RA. Will all NIRs 
operate as CAs or will some act only as RAs?)* Thus there is a 
potential for a very large number of subordinate CAs to be adversely 
affected if the INRs represented in these CA certificates fail to adhere 
to the subset requirements imposed by [RFC3779]. *(The situation seems 
more complex than suggested here. First, if one focuses only on 
over-claiming, as this document seems to, the current RIR situation does 
not seem to have this problem. That’s because each RIR is a TA. Any INRs 
in a TA certificate are not subject to 3779 restrictions, so they cannot 
over-claim. Also, I believe that each RIR issues a subordinate CA 
certificate below its TA certificate, and uses the inherit bit to 
represent the INRs. In that case, the subordinate CA certificate also 
cannot over-claim. This sentence needs to turn into a paragraph to more 
accurately discuss the potential for very wide scale problems at the RIR 
tier.)*

BEFORE

Avoiding such situations requires that CA's adhere to a very specific
ordering of certificate issuance. In this framework, the common
registry CA that describes (directly or indirectly) the resources
being shifted from one registry to the other, and also contains in
subordinate certificates (direct or indirect) the certificates for
both registries who are parties to the resource transfer has to
coordinate a specific sequence of actions.

This common registry CA has to first issue a new certificate towards
the "receiving" registry that adds to the RFC3779 extension resource
set the specific resource being transferred into this receiving
registry. The common registry CA then has to wait until all
registries in the subordinate certificate chain to the receiving
registry have also performed a similar issuance of new certificates,
and in each case a registry must await the issuance of the immediate
superior certificate with the augmented resource set before it, in
turn, can issue its own augmented certificate to its subordinate CA.
This is a "top down" issuance sequence.

AFTER

Avoiding such situations requires that CAs adhere to a very specific
ordering of certificate issuance. The common registry CA that is 
responsible (directly or indirectly) for the INRs being transferred 
must coordinate the sequence of actions that effect the transfer.

*(I didn’t rewrite the second paragraph above because the sequence of 
events that it describes is not the only way to orchestrate a transfer, 
as demonstrated in the TAO I-D.)*


BEFORE

It is possible for the common registry to issue a certificate to the
"sending" registry with the reduced resource set at any time, but it
should not revoke the previously issued certificate, nor overwrite
this previously issued certificate in its repository publication
point without specific coordination. Only when the common registry
is assured that the top down certificate issuance process to the
receiving registry CA chain has been completed can the common
registry commence the revocation of the original certificate for the
sending registry, However, it should not so until it is assured that
the immediate subordinate registry CA in the path to the sending
registry has issued a certificate with a reduced resource set, and so
on. This implies that on the sending side the certificate issuance
and revocation is a "bottom up" process.

AFTER

It is possible for the common registry to issue a certificate to the
"sending" registry with the reduced resource set at any time. However, 
it should not revoke or replace the previously issued certificate
without specific coordination. The common registry must verify that the 
certificate path to the recipient of the transferred INRs is in place 
before revoking and replacing the CA certificate of the source.

This implies that on the sending side the certificate issuance
and revocation is a "bottom up" process. *(The process also may be 
“bottom up” on the receiving side, with respect to requesting the INRs 
being transferred. However, the issuance of certificates with the new 
resources must be “top down.”)*

BEFORE

The underlying consideration here is that the operational
coordination of these certificate issuance and revocation actions to
effect a smooth resource transfer across registries is mandated by
the nature of the particular choice of certificate validation process
described in [RFC6487].

AFTER

The certificate path validation procedure described in [RFC6487] 
requires that transfers of INRs be coordinated to prevent even transient 
over-claiming.

Section 4.

BEFORE

Validation of signed resource data using a signing key that is
certified in a resource certificate, coupled with a specific set of
number resources, consists of verifying that the digital signature of
the signed resource data is valid, using the public key that is
certified by the resource certificate, and also validating the
resource certificate in the context of the RPKI, using the path
validation process.

AFTER

*(Nothing, since the text seems to add nothing to the description of the 
alternative path validation procedure.)*