Re: [sidr] WGLC for draft-ietf-sidr-algorithm-agility-03

Eric Osterweil <eosterweil@verisign.com> Mon, 14 November 2011 06:05 UTC

Hey Steve,

Thanks for the response.  I commented below:

On Nov 14, 2011, at 10:16 AM, Stephen Kent wrote:

> Eric,
> 
> In response to your message from last week.
> 
> Some candidate text dealing with the timeline document in section 2:
> 
> An additional document, the algorithm transition timeline, will be published as a BCP (?) to define the timeline for the algorithm suite transition. It will define dates for the phase transitions, consistent with the descriptions provided in Section 4. It is RECOMMENDED that the timeline document be developed by the entities that act as CAs, RPs, and repository operators in the RPKI, e.g., IANA, Internet Registries, and network operators. It is also RECOMMENDED that the timeline document describe procedures to track the progress of the transition and to amend the timeline, e.g., if problems arise in implementing later phases of the transition.

I really think we should address these issues in a single document.  It seems like splitting this off into a separate/as yet unwritten document is likely to cause some problems.  In particular, since that document does not yet exist, and it may not be written and adopted for some time, this draft will not be complete on its own.  I'm worried that it is hard to judge this document's readiness w/o these timeline issues worked out or even broached (as they may demand changes to this process).  Besides, isn't the corpus of drafts rather extensive already (w/o adding another)? :)

> 
> 
> You raised a question about the implications for CAs that do not transition to the new algorithm suite, motivated by the last paragraph of section 11. You posited that "... the implication is that if any CA doesn't keep up (so to speak) they are considered invalid and therefore would be un-routable?" That's not quite accurate. At the end of Phase 4 the Internet resources of any CAs that have not made the transition will be treated the same as resources that have not been protected via RPKI certificates and ROAs. Participation in the RPKI is voluntary, so there may always be unprotected resources in the public Internet. These are not un-routable, but they are subject to hijacking.

OK, I see.  But, aren't there second-order effects of this that we have to worry about?  For example, if I am an ISP whose CA performs the rollover properly, but my upstream's CA does not, then their CA's failure to keep up will cause my ISP to no longer be able to participate in BGPsec, right (because I'm no longer part of a contiguous BGPsec island)?  I realize this is a bit different than my original example, but upon thinking about the motivation for my comment, my point was more general.  It was that transitioning a CA to this state can have very undesirable effects.

> 
> You suggested that we codify how the community should deal with problems that motivate delaying a phase transition. We're not writing the timeline document now, but the text at the beginning of this message is an effort to make sure such considerations are addressed in that document.
> 
> You suggested that we add text, for each phase, saying "... what to do if its success requirements are not met (exceptions, error legs, etc)." Here's my proposed initial text to address your question:
> 
> Phase 0 is the start of the process, when a new algorithm suite has been selected and the timeline published. The only problem I envision that might arise at this stage, prior to Phase 1, is a discovery of a problem that makes Suite B unacceptable. If this situation arises, the algorithm document will have to be reissued with a new Suite B, and the timeline document will be reissued.

s/I envision/we envision/

> 
> Phase 1 requires all CAs to be able to issue certificates under Suite B. If a problem arises that makes this infeasible for a substantial number of CAs, the timeline document can be reissued, pushing back this date, and dates for subsequent milestones. CAs that are capable of issuing Suite B certificates may continue to do so, if requested by their child CAs. Since this phase does not require any RPs to process signed objects under Suite B, and since Suite B product SHOULD be stored at independent publication points, there is no adverse impact on RPs.

kewl, thnx.  One minor nit: can we rephrase one part for clarity?  Instead of "If a problem arises that makes this infeasible for a substantial number of CAs," can we just specify a little bit about how this is determined?  Maybe something like, "If <whoever the operational governing body we elect for timeline statements> deems a problem to have arisen that is significant enough to make this infeasible for a significant enough number of CAs..."

> 
> Phase 2 requires that CAs MUST publish all signed products under Suite B, as well as Suite A, and RPs MAY be prepared to validate these products using Suite B. If a problem arises that makes this infeasible for a substantial number of CAs, the timeline document can be reissued, pushing back this date and dates for subsequent milestones. (Since the processing requirement for RPs here is MAY, if RPs have problems with Suite B products, this does not require pushing back the Phase 2 milestone, but it does motivate delaying the start of Phase 3.) CAs that are capable of publishing products under Suite B may continue to do so. Phase 2, like Phase 1, does not require any RPs to process signed objects under Suite B, and since Suite B product SHOULD be stored at independent publication points, there is no adverse impact on RPs.

First, same minor nit as above.
Second, do we want to consider the case where we want to roll back (perhaps an alg B has become unsuitable for some reason and we need to choose a new alg B altogether)?  I'm not saying the above text should be yanked, maybe just augmented?

> 
> Phase 3 requires that RPs MUST be able to process Suite B signed products, and RPs are encouraged to validate signed products using Suite B. However, each RP is required to be able to fall back to using the Suite A product if the Suite B product set cannot be validated. As Section 4.6 notes, there are no CA behavior changes at this phase, so there is no requirement for CA rollback. If a substantial number of RPs are unable to process product sets signed with Suite B, this Phase could be delayed, and subsequent milestones pushed back. There is no rollback required here, as there is no change in CA behavior.

I think this reads well.  I just have the same rollback comment as above (again, as a possible addition, not replacement text).
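
For illustration only, the Phase 3 fallback rule quoted above (try the Suite B product set, fall back to Suite A if it cannot be validated) could be sketched roughly as follows. The function name, the dictionary layout, and the validator callables are all hypothetical, made up for this sketch; none of them come from the draft.

```python
def pick_valid_suite(product_sets, validators, preference=("B", "A")):
    """Return the first suite in `preference` whose product set validates.

    `product_sets` maps a suite label to that suite's signed products and
    `validators` maps a suite label to a callable returning True/False.
    Both structures are assumptions of this sketch, not RPKI data formats.
    """
    for suite in preference:
        products = product_sets.get(suite)
        validate = validators.get(suite)
        if products is None or validate is None:
            continue  # nothing published, or the RP cannot process this suite
        if validate(products):
            return suite
    return None  # neither product set could be validated


# Hypothetical RP whose Suite B validation fails, so it falls back to Suite A.
sets = {"A": ["roa-a"], "B": ["roa-b"]}
checks = {"A": lambda p: True, "B": lambda p: False}
print(pick_valid_suite(sets, checks))
```

A result of None here would correspond to the RP being unable to validate either product set, which the quoted text does not cover.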

> 
> Phase 4 begins the phase out of Suite A. At this phase products sets signed under the old algorithm suite may begin to disappear, i.e., CAs MAY choose to not publish them anymore. This phase should be delayed if it is determined that many RPs are not capable of processing the new algorithm suite. There is no rollback required if this phase is delayed. However, CAs should be reminded to not remove old algorithm suite A product sets if this phase is delayed.

Same rollback comment.

Minor typo:
s/products sets/product sets/
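
For illustration only, the "push back this date and the dates for subsequent milestones" behaviour that recurs in the proposed phase text could be sketched like this. The dates are placeholders invented for the example and the function name is hypothetical; the real dates would come from the timeline document, which does not exist yet.

```python
from datetime import date, timedelta


def push_back(timeline, phase, days):
    """Delay `phase` and every later milestone by `days`.

    `timeline` maps a phase number to its start date. Earlier phases are
    left untouched. Returns a new dict; the input is not modified.
    """
    delta = timedelta(days=days)
    return {p: (d + delta if p >= phase else d) for p, d in timeline.items()}


# Placeholder dates, purely hypothetical.
timeline = {
    0: date(2030, 1, 1),
    1: date(2030, 7, 1),
    2: date(2031, 1, 1),
    3: date(2031, 7, 1),
    4: date(2032, 1, 1),
}

# Suppose Phase 2 proves infeasible for many CAs and is delayed by 90 days.
revised = push_back(timeline, phase=2, days=90)
for phase in sorted(revised):
    print(phase, revised[phase].isoformat())
```

This only models the date arithmetic; who decides that a delay is warranted is exactly the open question raised in the nit above.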

> 
> Section 4.8 describes EOL for the old algorithm suite, i.e., it kills off all support for the old algorithm suite. It is described as a return to Phase 0. If we wait until Phase 0, then it may be a very, very long time before the old Suite A is killed off. We may want to revisit this, i.e., if we want to kill off the old algorithm suite sooner.
> 
> Your last comment deals with the question of top-down vs. laissez faire algorithm transition. I think your comment is that we cannot mandate adherence to the transition process and timeline. Use of the RPKI, both the issuance of signed products and their consumption, is optional. So, no, we cannot mandate adherence to this timeline. But, we can establish a transition process and timeline for all CAs and RPs that choose to follow it. The goal in establishing the process and timeline is to minimize the cost to the community, as a whole, for the transition. Externalization of cost is the appropriate term here (from an economics perspective) to describe what happens if one were to adopt the laissez faire approach. (Chaos might be an alternative term :-))

I think this is a very strange comment (and I note that you said it earlier in this email too).  "Use of the RPKI... is optional."  Is the goal of this system to protect the global routing infrastructure or not?  Unless we are talking about making this an experimental expedition, and are prepared to create a globally applicable system later?  If the goal is to secure the global routing system (note I am not saying universal deployment in the foreseeable future, just global applicability), then this is an operational non-starter.  I do not believe it is appropriate for us to try and re-legislate operational axioms in these drafts.  What we could be doing is grossly misaligning this standards work with operational invariants.

> 
> First, note that a CA cannot unilaterally decide to transition to a new algorithm. There is a requirement that the issuer of its certificate is prepared to accept a certificate request under the new algorithm suite (because of the PoP requirement).

I'm sorry, I don't understand.

> 
> The exponential growth arises if we have Suite A and B product sets at each tier. If we allow CAs to "do their own thing," there is the potential for four combinations:
> - a Suite A certificate issued under Suite A
> - a Suite A certificate issued under Suite B
> - a Suite B certificate issued under Suite A
> - a Suite B certificate issued under Suite B
> 
> If we allow all of these combinations, which may be needed to allow RPs to process signed products during the transition, then each tier has this potential 4-way branching, to accommodate RPs at different stages of algorithm capability. If you want more details, I suggest reviewing the slides from the SIDR WG meeting that I believe I cited previously. When Geoff Huston initially noted this problem, I disagreed, and it was not until I started to write the spec that I realized what he meant.

Indeed, I will take a look before coming back to this (to be sure I have not missed anything), thanks.
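
For illustration only, the four combinations listed above can be enumerated mechanically, and the growth with hierarchy depth can be estimated. The sketch takes the quoted 4-way branching figure at face value and treats every tier as branching independently, which is an assumption of this example and gives an upper bound rather than an exact count. The function names are hypothetical.

```python
from itertools import product

SUITES = ("A", "B")


def per_tier_combinations():
    """All (certificate suite, issuing suite) pairs for a single tier."""
    return list(product(SUITES, repeat=2))


def branching_upper_bound(tiers):
    """Upper bound on combinations if each tier branches independently."""
    return len(per_tier_combinations()) ** tiers


for cert_suite, issuer_suite in per_tier_combinations():
    print(f"a Suite {cert_suite} certificate issued under Suite {issuer_suite}")

for tiers in range(1, 5):
    print(tiers, branching_upper_bound(tiers))
```

Under those assumptions the bound is 4 raised to the number of tiers, which is the exponential growth the quoted paragraph refers to.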

> To avoid more flame wars, I will duck your question about my views on DNSSEC and accommodation of different algorithm suites :-).

Shall I take that as a retraction of your comment? ;)

Eric