Re: [sidr] WGLC for draft-ietf-sidr-algorithm-agility-03

Brian Dickson <brian.peter.dickson@gmail.com> Sun, 13 November 2011 22:14 UTC


Dear SIDR-WG,

Since my original message was pretty long and detailed, and it does
not appear anyone has made it through it yet, let me try to summarize
the main issues with which it is concerned.

The current proposal focuses on ending support for Algorithm A.

IMHO, it does so at the expense of the process by which support for
Algorithm B is introduced.

What I'm suggesting is that, instead of tying the two together, each
be handled separately.

And I'm suggesting that providing a path for early and widespread (but
not necessarily universal) adoption of "B", without regard to ending
support for "A", is more important than ending "A".

(Obviously, starting "B" and ending "A" can't be entirely independent,
since removing A before B is supported would be illogical in the
extreme.)

It may be premature, as well, to suggest the overall strategy before
getting significant input from ICANN/IANA folks who will be prime on
the trust anchor and root certificates of the system, and operational
processes and strategies for making changes to them.

ICANN/IANA's experience doing similar things in signing the DNSSEC
root, and the processes they used, may better inform the "agility"
document.

NB: The most difficult thing about ending "Algorithm A" is stopping its
use. This can be done independently, by having CAs delete certs
using "A", or by giving RPs a "knob" to say, "don't use Algorithm
'A'". I suggest both should be done. It's dead easy, IMHO.

The difficulty in introducing "Algorithm B" is much more substantial.
It involves new code for CAs, new code for RPs (multiple vendors,
multiple hardware platforms, many code streams, etc.). There are
interoperability requirements in three directions at least (CA-CA,
rtr-rtr, and rtr-CA). It also pretty much requires non-lab
experimentation, which means having certs issued which involve at
least one signature using "B". It also means that experimentation may
involve announcing BGPsec prefixes using such certs, before all RPs
understand "B".

There might be a place for describing use of an "alternative trust
anchor" to enable testing using production systems, without directly
impacting production certificates. New CA code supporting "B" (and
"A") could be deployed, and a chain of trust using a different trust
anchor and a mix of "A" and "B" certs (either/or at each delegation
point), would allow for exercising the code while maintaining the
"real" cert chain, e.g. as a final test before actually replacing
"A"-signed certs with "B"-signed certs. Trust anchors can be
established at any CA, and while the root "CA" is one place one would
expect this to be done, it isn't the only place it is possible (and
advisable) to do so.

Putting this into the phases is important, specifically so everyone
knows this activity is expected, and that all implementers are
prepared (in their implementations) to be sensible in their code.
Having those phases have flexibility in them, and having them
coordinated with feedback loops built in, is a much smarter way to go
than having deadlines. I would rather see full top-to-bottom support
for all-B certs, than worry about some certs that still have one or
more "A" signatures in them. Earlier transition to "B" at the root is
possible if flexible strategies are employed. Providing RPs with "no
A" knobs makes the CA-signing side of ending "A" cert signatures much
less of a security-critical issue.

The scalability issue (avoiding exponential growth in number of
published certs) means every cert should be signed with only one
algorithm.

This results in a single major requirement - RPs who want BGPsec to
work must be able to handle Algorithm "B" certs, before the root is
signed with "B". This is _very_ different from requiring top-down
deployment.

Coordinating this is very important, but IMHO, does not require two
"parallel" chains of certs (all "A", and all "B" respectively). This
would be a "flag day", or close enough to it to be ill advised.

Giving RPs code that supports "B", with knobs for "allow-B" and
"allow-A", means RPs will have sufficient control to limit exposure to
security or operational issues specific to "A" or "B". It also
allows deployment of new code without exposure to premature use of
prefixes used for testing. It also means a final wave of "B"
enablement, followed by a "B" root-signing, can be coordinated without
the kind of risk that a no-fall-back scenario (which the current
document requires) carries. A problem with a B-signed root can be fixed
by having the A-signed root re-deployed, either by a new A signature
or by rolling back to the old signatures.
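
As a rough illustration of the "allow-A" / "allow-B" knobs described
above, here is a minimal Python sketch. The class, field and function
names are hypothetical and are not taken from any draft or RP
implementation; it only models the accept/reject decision per signature
algorithm.

```python
from dataclasses import dataclass


@dataclass
class RpPolicy:
    """Hypothetical relying-party knobs; the names are illustrative only."""
    allow_a: bool = True
    allow_b: bool = False

    def accepts(self, cert_alg: str) -> bool:
        """Return True if a cert signed with cert_alg may be used."""
        if cert_alg == "A":
            return self.allow_a
        if cert_alg == "B":
            return self.allow_b
        return False  # unknown algorithms are never accepted


def chain_usable(policy: RpPolicy, chain_algs: list[str]) -> bool:
    """A chain is usable only if every signature in it is acceptable."""
    return all(policy.accepts(alg) for alg in chain_algs)


# New code deployed, "B" still disabled: mixed test chains are ignored.
cautious = RpPolicy(allow_a=True, allow_b=False)
# Final wave of "B" enablement: both algorithms are accepted.
transitional = RpPolicy(allow_a=True, allow_b=True)
# After "A" has been switched off at this RP.
b_only = RpPolicy(allow_a=False, allow_b=True)

mixed_chain = ["A", "B", "A"]
print(chain_usable(cautious, mixed_chain))      # False
print(chain_usable(transitional, mixed_chain))  # True
print(chain_usable(b_only, ["B", "B"]))         # True
```

With both knobs on, an RP can fall back to an A-signed root simply
because nothing in its policy has been removed yet.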

The original email below goes into some of the "how" and "why" to
demonstrate the feasibility of this approach, and offers suggestions for
how this can be incorporated into a newer version of the "agility"
document.

(If this were easy, my emails would be much shorter.)

Brian

On Wed, Nov 9, 2011 at 4:44 PM, Brian Dickson
<brian.peter.dickson@gmail.com> wrote:
> Rather than respond, point-by-point, I will top-reply, and try to
> clear this up in a structured manner.
>
> First, from the perspective of normative references:
> - most of the main SIDR documents reference each other, both
> generally, and in specific places.
> -- e.g. -rpki-algs-05 refers to -arch, -cp, -res-certs and -signed-object
> -- e.g. -cp explicitly delegates algorithm specification to -rpki-algs
> -- e.g. -res-certs section 9, in addition to referring to -cp, clearly
> indicates that versions of itself and -cp need to be re-issued in
> lock-step
> -- e.g. Changing (by updating/replacing) the RFC for -rpki-algs, would
> actually require issuing new versions of -cp, -res-certs, and
> -rpki-algs, as a "document set".
>
> I believe I would be accurate in saying that one of the primary purposes
> of having an algorithm-agility doc is to document what would need to
> happen if a new set of algorithms were to be published. And the scope of
> the agility needs to include interaction between the RPKI system and
> the consumers of that data, the RPs.
>
> Let us stay with one "current" CP.
>
> Again, the presumption is: one -cp doc, one -res-certs doc, one
> -rpki-algs doc; and the agility doc describing the necessary state
> changes to go from one controlling, unified set of docs, to another
> controlling, unified set of docs.
>
> It is all about the content of the "rpki-algs" document; everything
> (the top-down and exponential growth issues in particular) hinges on
> that.
>
> [terminology]
> Alg.suite = { algorithms }
> Let A denote Alg.Suite "A" (upper case means suite)
> let a denote algorithm (or alg-pair) "a" (lower case means algorithm)
> An example of the above would be:
> X = { n p q }
> [end terminology]
>
> [example cases]
> A = { a }
> B = { b }
> C = { a b }    (Algorithm suite C, includes algorithms "a" and "b")
> [end example cases]
>
> Note the following (Venn diagram) results:
> A ^ B = { }
> A ^ C = { a } = A
> C ^ B = { b } = B
> A v B = { a b } = C
> A v C = { a b } = C
> C v B = { a b } = C
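
The relations above can be checked mechanically. A minimal sketch using
Python frozensets, assuming "^" in the quoted text means intersection
and "v" means union (Python spells these "&" and "|"; Python's own "^"
operator is symmetric difference, so it is deliberately not used here):

```python
# Suites from the example cases: upper case is a suite, lower case an algorithm.
A = frozenset({"a"})
B = frozenset({"b"})
C = frozenset({"a", "b"})

# Intersections ("^" in the text).
assert A & B == frozenset()   # A ^ B = { }
assert A & C == A             # A ^ C = { a } = A
assert C & B == B             # C ^ B = { b } = B

# Unions ("v" in the text).
assert A | B == C             # A v B = { a b } = C
assert A | C == C             # A v C = { a b } = C
assert C | B == C             # C v B = { a b } = C
```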
>
> The current algorithm-agility document presumes that an algorithm
> update will always be of the form "A -> B".
> It is because of the fact that A ^ B = {}, that the duplicate certs
> problem arises, which gives rise to the exponential problem, which
> leads to top-down.
> With more than one suite, it becomes necessary to have multiple certs.
> A cert is only valid within one suite.
>
> If you consider suites A and C, note that a cert valid under A is
> _also_ valid under C - it does not require new key data; it may need
> to be re-issued.
>
> If, instead, the updates were done by two successive, albeit
> technically independent, updates, "A -> C", followed by "C -> B", the
> problem goes away.
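
One simplified way to read the argument above: a single step between
suites forces parallel (duplicate) certs only when the two suites share
no algorithm. The sketch below encodes that reading; the function names
are hypothetical and the model deliberately ignores everything except
suite membership.

```python
A = frozenset({"a"})
B = frozenset({"b"})
C = frozenset({"a", "b"})


def valid_under(cert_alg: str, suite: frozenset) -> bool:
    """A cert is valid under a suite if its algorithm is in that suite."""
    return cert_alg in suite


def stranded_algs(old: frozenset, new: frozenset) -> frozenset:
    """Algorithms whose certs stop being valid when moving old -> new."""
    return old - new


def step_needs_parallel_certs(old: frozenset, new: frozenset) -> bool:
    """True when no algorithm is valid on both sides of the step."""
    return old.isdisjoint(new)


direct = [(A, B)]
staged = [(A, C), (C, B)]

print([step_needs_parallel_certs(o, n) for o, n in direct])  # [True]
print([step_needs_parallel_certs(o, n) for o, n in staged])  # [False, False]
print(sorted(stranded_algs(A, C)))  # [] : "a" certs stay valid under C
print(sorted(stranded_algs(C, B)))  # ['a'] : only "a" certs must be gone by then
```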
>
> The timelines involved for "A -> C" would look like:
>   Process for RPKI CAs:
>
>     Phase 0   Phase 1   Phase 2   Phase 3
>   -----------x--------x--------x----------
>     ^        ^        ^        ^
>     |        |        |        |
>    (1)      (2a)     (3)      (4)
>
>   Process for RPKI RPs:
>
>     Phase 0      Phase 1
>   --------------x-----x-------------------
>     ^           ^     ^
>     |           |     |
>    (1)         (2b)  (3)
>
>   (1) RPKI's algorithm document updated.
>   (2a) CA Ready Algorithm C Date (all CAs can accept algorithms in set
> C, "a" or "b" specifically)
>   (2b) RP Ready Algorithm C Date (all RPs can validate algorithms in
> set C, "a" or "b" specifically)
>   (3) CA/RP Set Algorithm C Date (all RPs and CAs now are ready - on
> or after later of 2a/2b)
>   (4) CA Go Algorithm C Date (any given CA can now _choose_ to switch
> from using "a" to using "b")
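
The ordering of the milestones above can be written down as a small
check. This is only a sketch: the function and parameter names are
hypothetical, the dates are made up for illustration, and it assumes
(4) falls on or after (3), as the CA timeline suggests.

```python
from datetime import date


def earliest_set_date(ca_ready: date, rp_ready: date) -> date:
    """Milestone (3) can be no earlier than the later of (2a) and (2b)."""
    return max(ca_ready, rp_ready)


def milestones_ordered(doc_updated: date, ca_ready: date, rp_ready: date,
                       set_date: date, go_date: date) -> bool:
    """Check (1) <= (2a), (1) <= (2b), (3) >= max(2a, 2b), (4) >= (3)."""
    return (doc_updated <= ca_ready
            and doc_updated <= rp_ready
            and set_date >= earliest_set_date(ca_ready, rp_ready)
            and go_date >= set_date)


# Illustrative dates only; nothing here comes from the draft.
doc_updated = date(2012, 1, 1)   # (1)
ca_ready = date(2012, 6, 1)      # (2a)
rp_ready = date(2012, 9, 1)      # (2b)

set_date = earliest_set_date(ca_ready, rp_ready)
print(set_date)  # 2012-09-01
print(milestones_ordered(doc_updated, ca_ready, rp_ready,
                         set_date, date(2012, 10, 1)))  # True
```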
>
> The mechanics on re-issuing certificates are clear. In the last
> paragraph of "sidr-arch", section 4.2:
>
>   If a CA certificate is reissued with the same public key, it should
>   not be necessary to reissue (with an updated AIA URI) all
>   certificates signed by the certificate being reissued. Therefore, a
>   certification authority SHOULD use a persistent URI naming scheme for
>   issued certificates. That is, reissued certificates should use the
>   same publication point as previously issued certificates having the
>   same subject and public key, and should overwrite such certificates.
>
> So, if we presume that Algorithm suite C allows the choice of two
> algorithms, then a CA can switch from
> one algorithm to the other by (a) re-requesting its own CA cert using
> the PoP of its new public key,
> which is the public key for algorithm "b"; and (b) reissuing all of
> its certificates using the new key.
>
> Every certificate is signed by exactly one algorithm, and there is no
> problem with the algorithm being
> either "a" or "b". RPs understand both "a" and "b" before this
> happens. There is no requirement for
> keeping more than one certificate, so there is no exponential problem.
> (Each cert identifies algs by OID.)
>
> And furthermore, the re-issuing is done unilaterally by each CA,
> meaning each CA can choose to do so
> (or not!) any time after the "go" date. This can happen at any time,
> in any order, independently.
>
> Note very well: The most important aspect of this is, that _each_ CA,
> in this model, has the ability to roll back
> unilaterally, since both algorithms are valid. There are no timing or
> hierarchy dependencies to this.
>
> In fact, the only time there is a need for ensuring all CAs have done
> so, is when there is a "C -> B" update.
>
> All that needs to happen for "C -> B" is for every CA to have
> re-issued their certs using suite B (alg "b" only),
> by that date, and to stop accepting requests with public keys of alg
> "a" at that time.
>
> Again, at no time in this transition are duplicate certs needed
> (neither 2 nor 3, no exponentiation).
> And the re-issuing is done unilaterally by each CA, with no top-down
> requirement.
>
> You are right on this issue:
> - The RPs and PoP rules definitely mean that only one rpki-algs
> document and one CP can be "current" (phase 0) at any time, globally.
>
> However, I disagree that that single algorithm suite needs to contain
> only _one_ pair of algorithms. The wisdom of the WG,
> and of expert advice from the PKIX folks, should inform the contents
> of the rpki-algs, and conceivably this could contain
> more than one hash/keying algorithm pair. E.g. RSA/SHA 384 and 512,
> and also an EC algorithm, for a choice of 3 algs.
> As long as the alg choices are justified and mainstream, I don't see
> any problem with more than one.
>
> One other point about "C -> B" transitions: the transition avoids the
> top-down (or exponential) issue, if C ^ B = B,
> i.e. if C is strictly a superset of B. However, nothing in this rule
> places restrictions on the sizes of C or B.
> It is conceivable that more than one algorithm be retired during such
> a transition.
> It is also entirely possible that the post-retirement set of
> algorithms be larger than one.
> E.g. C = { a b c d e }, B = { b c e }.
> This creates more flexibility in terms of WG work, and more perceived
> stability operationally.
> CAs are then free to choose from multiple algorithms.
> Thus, zero day risks on one active algorithm don't require IETF
> response, as CAs can trivially switch to another active algorithm.
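
For the example C = { a b c d e }, B = { b c e }, the retired algorithms
and the CAs that still have to re-issue before the "C -> B" date can be
computed as below. This is a sketch only; the CA names and the mapping
of CA to algorithm are invented for illustration.

```python
C = frozenset("abcde")   # suite before the transition
B = frozenset("bce")     # suite after the transition


def cas_blocking_retirement(ca_algs: dict, old: frozenset, new: frozenset) -> list:
    """List CAs whose current algorithm is retired by the step old -> new.

    Raises ValueError if new is not a subset of old, since the rule in
    the text only covers the case where old is a superset of new.
    """
    if not new <= old:
        raise ValueError("new suite must be a subset of the old suite")
    retired = old - new
    return sorted(ca for ca, alg in ca_algs.items() if alg in retired)


# Hypothetical CAs and the single algorithm each currently signs with.
ca_algs = {"ca-1": "a", "ca-2": "b", "ca-3": "d", "ca-4": "e"}

print(sorted(C - B))                            # ['a', 'd']
print(cas_blocking_retirement(ca_algs, C, B))   # ['ca-1', 'ca-3']
```

Each listed CA can switch unilaterally to any algorithm still in B; no
other CA has to act at the same time.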
>
> I'd even go so far as suggesting pre-publishing "document sets" (-cp,
> -rpki-algs, -res-certs) for multiple future alg sets,
> in advance, well in advance, to give implementers and operators the
> longest possible lead time.
>
> Respectfully,
>
> Brian
>
> P.S. The multiple CP and CPS issue goes away in the above - so long as the
> rpki-algs supports multiple algs.
>
> On Wed, Nov 9, 2011 at 1:42 PM, Stephen Kent <kent@bbn.com> wrote:
>> At 1:27 AM -0500 11/8/11, Brian Dickson wrote:
>>
>> ...
>>
>> I do not support adoption of this document in its current form.
>>
>> The main reasons have to do with fundamental aspects which at a high
>> level have been addressed by my colleagues,
>>
>> so, this is a Verisign critique, provided by you, Eric, and Danny?
>>
>> Here's why:
>> - everybody is a CA. Both the "root" of the INR tree (ICANN/IANA),
>> plus the RIRs, etc., down to the publishers of EE certs.
>>
>> yes, essentially every actor in the RPKI is both a CA and an RP.
>>
>> - each CA publishes its policy via a CPS (it's a SHOULD, but
>> functionally a MUST for RPs to be able to understand what a CA
>> publishes.)
>>
>> small ISPs and orgs that have address space probably will not bother with a
>> CPS, which is why it is a SHOULD, not a MUST. In a typical PKI context, a
>> CPS primarily benefits the subjects to whom certs are issued; RPs also are
>> potential CPS consumers.  In the RPKI, a "leaf" CA issues certs to itself,
>> so a CPS is not of much interest for the first class of consumers. In the
>> RPKI one does not get to shop around to choose a CA, so RPs don't need much
>> from a CPS.
>>
>>
>> - Each CPS specifies the OID of the corresponding CP
>>
>> there is just one CP. not clear from your statement if that was clear.
>>
>> - Each CP refers to the corresponding policy for algorithms
>>
>> there is only one policy (CP) for the RPKI, and it specifies algs via a
>> reference to an alg spec. so, I am not sure what you have in mind here.
>>
>> - Algorithms themselves have OIDs and are referenced as such in certs
>>
>> yes.
>>
>> - Every cert also specifies the OID of the CP itself (which embodies
>> the rules for allowed algorithms)
>>
>> yes.
>>
>> So while the first revision of the CP insists on only one algorithm
>> for pub/private keys, and one algorithm for hashes, it explicitly
>> calls out that these are expected to change.
>>
>> yes.
>>
>> In changing allowed algorithms, it can reasonably be inferred that CPs
>> could be issued which increase the _number_ of allowed algorithms of
>> both types beyond one.
>>
>> there is only one CP.
>>
>> And similarly, the methodology demonstrated by key rollover has local
>> scope. There is no requirement that children do anything at all when a
>> parent executes a key roll. _This is by design_.
>>
>> yes, this is by design, but is irrelevant to the alg transition design,
>> which has global impact (on all RPs).
>>
>> So the analogous high-level design for agility SHOULD be as follows:
>> - new CP documents may be published, with new OIDs
>>
>> as I mentioned above, there is one CP for the RPKI. When you suggest multiple
>> CPs, are you thinking of them on a per CA basis, or RPKI-wide?
>>
>> - ONLY when a CA with a given CPS decides to change CP does that CA
>> need to execute a locally-significant key+alg roll
>>
>> see question above. also, unlike key roll, an alg roll affects ALL RPs,
>> which is why the analogy between the two procedures is bad. Also, note my
>> reply to Brian re the top-down deployment model that the WG adopted, to
>> avoid exponential growth in the repository system.
>>
>> - The CA would issue new certs with the new CP which itself lists
>> additional algorithms
>>
>> ibid.
>>
>> - The same procedure would be executed in multiple phases - issue new
>> child certs published under the old main cert; move them to the new
>> cert, rewriting/overwriting in the same location
>>
>> ibid.
>>
>> This could be handled gracefully by having two CPs - one CP having the
>> additional algorithm(s), and subsequently another CP with the new but
>> minus the old.
>>
>> not graceful re repository growth, and impact on RPs.
>>
>> This mechanism could be used to introduce new algorithms without
>> requiring retiring specific old algorithms. The two actions - adding
>> and removing - are in fact independent, beyond the requirement that
>> there be at least one algorithm (which goes without saying, really).
>> The only other requirement is that the issued certs have algorithms
>> consistent with the specified CP (OID) attached to the cert.
>>
>> there needs to be one alg that ALL RPs can deal with at all times.  Also,
>> unlike key roll, when a CA wants to have a new cert with a public key
>> using a new alg, its parent MUST be able to support that alg, because of
>> the PoP requirement.
>>
>> I may be completely off the mark, but this would seem to be much more
>> in line with the whole manner in which algorithms, policies, resource
>> objects, etc., have been separated out and linked by normative
>> reference.
>>
>> I do not agree.
>>
>> Perhaps we could get Geoff Huston to comment on my interpretation of
>> the CP/CPS/alg interaction and explicit/implicit rules?
>> Is it intended that CAs have a uniform hierarchy using exactly one
>> algorithm set, or is it intended that each CA be able to specify (via
>> CPS + CP)  the set of algorithms it supports, with the initial CP
>> document being the minimum acceptable algorithm set?
>>
>> This text suggests that you believe there is one CP per CA, vs. a
>> system-wide CP. The architecture is the latter.  Also, while I respect
>> Geoff, why is your question directed to him? I am a co-author of the CP,
>> the arch, and the key roll and the alg roll docs :-).
>> Steve
>