Re: [DNSOP] draft-fanf-dnsop-trust-anchor-witnesses-00.txt

Joe Abley <jabley@hopcount.ca> Fri, 14 February 2014 00:02 UTC

From: Joe Abley <jabley@hopcount.ca>
In-Reply-To: <alpine.LSU.2.00.1402132050440.18502@hermes-1.csi.cam.ac.uk>
Date: Thu, 13 Feb 2014 19:02:26 -0500
Message-Id: <79F80225-91C0-4185-9FB7-172E643DCE90@hopcount.ca>
References: <alpine.LSU.2.00.1402132050440.18502@hermes-1.csi.cam.ac.uk>
To: Tony Finch <dot@dotat.at>
Archived-At: http://mailarchive.ietf.org/arch/msg/dnsop/agjyDVvJ5xqa-1DaMDNHQIjDsVs
Cc: dnsop <dnsop@ietf.org>
Subject: Re: [DNSOP] draft-fanf-dnsop-trust-anchor-witnesses-00.txt
Precedence: list

Hi Tony,

On 2014-02-13, at 15:56, Tony Finch <dot@dotat.at> wrote:

> There was some discussion last month about dispersing trust in the root.
> http://www.ietf.org/mail-archive/web/dnsop/current/msg10977.html
> 
> This inspired me to write up a concrete proposal for the
> quorum-of-witnesses idea that I have vaguely suggested several
> times over the last few years.
> 
> All thoughts / suggestions / criticisms welcomed.

I agree with your thoughts in the draft that (to paraphrase) there has been remarkably little interest in working on (or implementing) the suggestions in draft-jabley-dnssec-trust-anchor.

The ability to bootstrap the trust anchor for the root zone without operator intervention (e.g. on embedded devices, or any computer system or application with a non-technical user) is weak as currently deployed. An approach which is robust and for which there is enthusiasm to implement is sorely needed.

Some additional thoughts on the actual text, below. This started off being brief, but then I got on a roll. Sorry about that.

One comment, in case it's not obvious: I once worked for ICANN, and I once had some responsibility for parts of the root zone DNSSEC system. I no longer work for ICANN, however, and anything you read below that looks like a fact should be treated with distrust and suspicion.

> Abstract
> 
>    At the moment the root DNSSEC key is a single point of trust and a
>    single point of failure for the whole system.  This memo describes a
>    mechanism for dispersing trust in the root key.  Witnesses vouch for
>    the root trust anchor by publishing WS records in the DNS.
>    Validators only update their root trust anchors if multiple witnesses
>    agree.  The root-witnesses.arpa zone enables a validator to bootstrap
>    trust when it has no working trust anchors other than its witnesses.

You're using "root key" and "root trust anchor" a bit loosely here; my presumption is that you're talking about the root zone KSKs, and a trust anchor set which corresponds to a set of active (non-revoked, published) KSKs in the root zone.

> 1.  Introduction
> 
>    At the moment the root DNSSEC key is a single point of trust and a
>    single point of failure for the whole system.  It has a number of
>    problems:
> 
>    o  Root trust anchor rollovers using [RFC5011] require validators to
>       be online while the rollover happens.  With the current root key
>       management plan, rollovers take a few weeks.  This is
>       uncomfortably long for emergency rollovers.

I don't believe the various systems that manage DNSSEC in the root zone have a way to execute an emergency KSK roll whilst still following RFC 5011 semantics. There's a single active key, for example, and no standby key. What was specified in the root zone was the use of 5011 timers for scheduled rollovers; a definitive compromise of the active KSK would trigger other action. It would be nice if this was more definitively specified (and better mechanisms to bootstrap trust would help with that).

>    o  Systems that are offline during a rollover have to use an out-of-
>       band mechanism to update their trust anchors, relying on non-DNS
>       sources of trust.  There is no clear specification or security
>       analysis for this process.

Well, there were a couple of attempts:

  draft-jabley-dnssec-trust-anchor-08
  draft-jabley-dnsop-validator-bootstrap-00 (expired)

>    o  The root key is a single point of failure with no standby, though
>       its storage and management is extremely resilient and trustworthy
>       (in stark contrast to the out-of-band trust anchor update keys).

I presume by that you mean the X.509 certs published on data.iana.org/root-anchors. The goal there was to facilitate endorsements by multiple independent actors, each following a documented and audited process to verify that what they were signing was accurate. The use of X.509 certificates anticipated vendors of operating systems and embedded devices already having certificate machinery for code-signing, and hence this seemed like an approach that had a low implementation cost.

Turns out the actual approach (do nothing) had an even lower implementation cost.

>    o  The concentration of trust in the root is politically
>       uncomfortable.

This may be true, depending on exactly what you mean, but I doubt it's universally true (perhaps not even widely true). I'm not sure I understand the motivation for including that statement in a technical specification, especially when you consider that the intended trust mechanisms were designed to be dispersed amongst multiple (e.g. vendor) authorities.

>    This memo describes a mechanism for dispersing trust in the root key.

"another mechanism" :-)

>    Witnesses vouch for the root trust anchor by publishing WS records in
>    the DNS.  Validators only update their root trust anchors if multiple
>    witnesses agree.
> 
>    There are some potential advantages:
> 
>    o  It can allow for a crash rollover of the root key, in the event
>       that it is lost or compromised, with validators recovering
>       automatically rather than having to be manually forced to fetch
>       and authenticate the replacement trust anchor.

crash -> emergency, presumably.

The existing mechanisms (again, lamentably light on implementation) would also facilitate this. The mechanisms in the validator-bootstrap draft I referred to earlier allow trust anchor retrieval with no ability to validate, authenticating the retrieved data using appropriate local methods (e.g. the key that signed the code is used to test the signature of the data the code retrieved).

>    o  It could allow a smaller root DNSKEY RRset by allowing the
>       witnesses to vouch for the root ZSK directly instead of via a KSK.
>       This saves the cost of high-assurance storage for the root KSK,
>       but requires more frequent communication between the root DNSSEC
>       key managers and the witnesses.

As I'm sure you appreciate, this advantage would only be realised if there was a change in the way that the work is divided up amongst the root zone partners. At present, there is a deliberate separation between KSK management and ZSK management. The intersection between the two is entirely conducted at key ceremonies which are widely scrutinised.

I suspect a transition to a single-key solution would be difficult from an administrative perspective, bordering on the non-actionable. Listing impossible things as advantages seems like a bit of a stretch. :-)

>    There are some limitations and disadvantages:
> 
>    o  It does not disperse trust in the root zone signing key or root
>       zone maintenance.

"root zone signing key" is a bit vague. I presume you mean the ZSK.

"root zone maintenance" is a bit vague. I presume you mean the process of applying changes to the root zone and signing it by the root zone maintainer, and not the various other pieces of the system?

>    o  A lot more co-ordination between organizations is necessary, for
>       the witnesses to get out-of-band authentication of new trust
>       anchors.

This should not be underestimated (or at least should be characterised). Presumably if we expect people (and devices) to trust a collection of witnesses to the authenticity of a particular copy of the KSK, we don't want that process to dilute the security of the process used to generate, store and exercise the KSK. Are we talking about additional ceremonies, potentially tens or hundreds of them, with each witness? Or are we talking about something that might fit with minimal change into the existing ceremonies?

I realise you're not trying to boil the ocean with this -00; this is not criticism, just ideas for future typing :-)

>    This mechanism can be used to automatically update any trust anchor,
>    though it is designed for and includes some special considerations
>    for the root trust anchor.  The root-witnesses.arpa zone is set up to
>    enable a validator to bootstrap trust when it has no working trust
>    anchors other than its witnesses.

I'm not sure I understand the benefit of making this a general mechanism. The only real application it has, in these heady days of DS RRSet proliferation, is in the root zone; it's hard to imagine it being easier to set up a plausible array of witnesses for an island with an otherwise insecure delegation than it is to sign the parent (or just choose a different parent).

> 3.  How validators use WS records
> 
> 3.1.  Trust anchor configuration
> 
>    A validator's configuration for a trust anchor consists of the
>    trust anchor owner name, and either a set of public keys or a set of
>    DS records, as described in [RFC4035] section 4.4.
> 
>    A trust anchor that is automatically updated is associated with the
>    witnesses that vouch for it.  It has a quorum value stating how many
>    witnesses must agree before the trust anchor is updated.

Where is that quorum parameter published? By the KSK maintainer, or by individual witnesses, or is it up to the validator operator to follow local policy?

Bad choices in this parameter seem likely to cause pain (too high and you need to worry about the currency of a lot of witness zones; too low and you are at risk of shenanigans by a small number of conspiring witnesses).

>    Each witness is a normal statically-configured trust anchor.  That
>    is, witnesses are not updated automatically except by out-of-band
>    configuration updates or software updates.  Each witness is
>    associated with one automatically updated trust anchor for which it
>    vouches.

I worry slightly that you've just shifted the problem that ICANN anticipated with vendors shipping a single trust anchor and made it larger, a problem of shipping many trust anchors from different places. This is the core problem we're trying to address, I think; if it's also a problem that needs to be solved for this proposal, we may have just found our own footprints on the beach.

> 3.2.  When to try a trust anchor update
> 
>    The validator SHALL keep track of the DNSKEY records from the DNS at
>    the trust anchor name. It only tracks the set of all records with
>    the SEP flag set, or the subset of SEP keys with algorithms supported
>    by the validator.  (This is so that ZSK rollovers do not trigger
>    trust anchor updates.)

So, the validator receives responses to the query ./IN/DNSKEY from somewhere (not necessarily directly from the authority servers) and discards any RRs in the set with SEP=0.
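
To make sure I'm reading that correctly, here is a minimal sketch in Python of the tracking rule as I understand it. The Dnskey tuple and the sep_keys() helper are my own illustrative names, not anything from the draft; the only hard fact used is that SEP is the low-order bit of the DNSKEY flags field (so 257 is a typical KSK and 256 a typical ZSK).

```python
from typing import NamedTuple

SEP_FLAG = 0x0001  # Secure Entry Point bit of the DNSKEY flags field


class Dnskey(NamedTuple):
    """Hypothetical in-memory form of one DNSKEY RR's RDATA."""
    flags: int
    protocol: int
    algorithm: int
    public_key: bytes


def sep_keys(rrset, supported_algorithms=None):
    """Return the tracked set: keys with SEP=1, optionally limited to
    algorithms the validator supports. Keys with SEP=0 are discarded."""
    return frozenset(
        key for key in rrset
        if key.flags & SEP_FLAG
        and (supported_algorithms is None
             or key.algorithm in supported_algorithms)
    )


# Placeholder key material, purely for illustration.
ksk = Dnskey(257, 3, 8, b"ksk-bytes")
zsk_old = Dnskey(256, 3, 8, b"zsk-old")
zsk_new = Dnskey(256, 3, 8, b"zsk-new")

before = sep_keys([ksk, zsk_old])
after = sep_keys([ksk, zsk_new])
update_needed = before != after  # False: only the ZSK changed
```

With that reading, a ZSK roll leaves the tracked set unchanged, which is the behaviour the parenthetical in the draft asks for.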

>    When the validator notices that this set has changed it SHOULD
>    attempt to update the trust anchor as described below.  During the
>    update process it SHOULD continue to serve clients and use the
>    existing trust anchor to validate responses.
> 
>    The validator MAY track the DNSKEY records persistently in order to
>    make restarts faster.  If so, it SHOULD discard any saved DNSKEY
>    records after their RRSIG expiry time.  If it does not, it SHOULD
>    perform an update attempt at restart.

persistently -> regularly?

>    When starting, a validator can find that its existing trust anchor
>    does not work, perhaps because a key rollover happened while it was
>    offline.  In this situation it cannot serve clients until the update
>    process completes successfully.

What about queries received from clients with CD=1?

>    A broken trust anchor is not expected to happen during normal
>    operations, since validation ought to work at every point in a key
>    rollover.  However, if some disaster occurs and the trust anchor
>    private key is lost or compromised, there might be a disruptive crash
>    key rollover.
> 
>    When it sees a crash rollover, a validator will not be able to
>    validate the new DNSKEY RRset, so will discard it and retry the query
>    in an attempt to obtain a working version.  If this problem persists
>    the validator MAY attempt to update the trust anchor using an invalid
>    DNSKEY response.
> 
> 3.3.  Trust anchor update process
> 
>    Trust anchor updates are performed with respect to a DNSKEY RRset
>    from the trust anchor owner name.  This allows the validator to
>    ensure that a successful update will lead to a working configuration.

I presume you're talking about the trust anchor for the root zone here, and not a trust anchor for any witness zone. "trust anchor owner name" is a bit of a weird phrase, but that's probably just me.

>    The validator queries for the WS RRset at each of the trust anchor's
>    witnesses.  The witnesses SHOULD be queried in a random order, so
>    that the validator avoids relying too much on a subset of the
>    witnesses.  The query process SHOULD stop when a quorum has been
>    achieved for one or more WS RRs.  The queries MAY be performed
>    concurrently to improve performance (though it doesn't make sense to
>    use a level of concurrency greater than the quorum size).

How does the validator obtain a list of witness zones? That's manually configured? How are additions, removals and changes to the list of witness zones managed?
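
For the sake of argument, suppose the witness list is manually configured. The query loop in the quoted paragraph might then look roughly like the following Python sketch. find_trusted_ws() and the fetch_ws callback are hypothetical names; fetch_ws stands in for "resolve and DNSSEC-validate the WS RRset at this witness", which is not shown. The sketch queries serially and silently collapses duplicate WS RRs, where the draft could equally be read as requiring the whole response to be rejected.

```python
import random
from collections import Counter


def find_trusted_ws(witnesses, fetch_ws, quorum, rng=random):
    """Query witnesses in random order and stop as soon as at least one
    WS RR has been vouched for by `quorum` distinct witnesses.

    witnesses: iterable of witness zone names (assumed locally configured)
    fetch_ws:  callable returning the validated WS RRs for one witness
               (hypothetical; resolution and validation are out of scope)
    """
    counts = Counter()
    order = list(witnesses)
    rng.shuffle(order)  # avoid always leaning on the same subset
    for name in order:
        # set(): one witness counts at most once towards any given WS RR
        for ws in set(fetch_ws(name)):
            counts[ws] += 1
        trusted = {ws for ws, n in counts.items() if n >= quorum}
        if trusted:
            return trusted  # quorum reached, stop querying
    return set()  # quorum not reached: keep the existing trust anchor


# Toy data: three witnesses vouch for "A", one for "B".
answers = {"w1": ["A"], "w2": ["A", "A"], "w3": ["B"], "w4": ["A"]}
result = find_trusted_ws(answers, answers.__getitem__, quorum=3)
```

Everything interesting (where the list in `witnesses` comes from, and how it changes over time) is outside the function, which is rather my point.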

>    The witness queries follow normal DNS resolution and DNSSEC
>    validation rules.  The response from a witness MUST validate as
>    secure using that witness's trust anchor.  (The special arrangements
>    for the root trust anchor witnesses described in Section 5.1 ensure
>    that the requirements in this paragraph can be satisfied even when
>    the root trust anchor is broken.)
> 
>    The validator MUST ensure there are no duplicate WS RRs in the
>    response from a witness.  Duplicate RRs are not allowed (see
>    [RFC2181] section 5), but it is particularly important to prevent
>    duplicate WS RRs so that a witness cannot count more than once
>    towards a quorum.
> 
>    The validator SHOULD ignore a WS RR if it does not contain a valid
>    digest of a DNSKEY record with the SEP flag set.  This ensures that
>    the validator does not count a quorum of useless WS RRs.
> 
>    For each usable WS RR that the validator receives from a witness, it
>    keeps a count of the number of responses that contained that WS RR.
> 
>    A WS RR can be trusted when this count reaches the required quorum.

You mean a particular DNSKEY RR, retrieved without authentication from some arbitrary source, can be trusted when sufficient validated WS responses that match the DNSKEY RR under scrutiny have been retrieved?

Surely any WS RRSet can be trusted exactly as much as the locally-configured trust anchor for the zone that contains it, once you validate the WS RRSet's signature.
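
Spelling out my reading: if a WS RR has the same RDATA layout as a DS RR (which the "changing their RR TYPE fields" text further down suggests, though that is my assumption), the matching step would be the usual DS digest computation from RFC 4034 section 5.1.4, taken over the owner name plus the DNSKEY RDATA. A rough Python sketch, with helper names of my own invention and with the key tag and algorithm field comparisons omitted:

```python
import hashlib
import struct

# DS digest types: 1 = SHA-1, 2 = SHA-256
DIGESTS = {1: hashlib.sha1, 2: hashlib.sha256}


def name_to_wire(name: str) -> bytes:
    """Canonical (lowercased) wire form of a domain name; "." is the root."""
    labels = [l for l in name.lower().rstrip(".").split(".") if l]
    return b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"


def dnskey_digest(owner, flags, protocol, algorithm, public_key, digest_type=2):
    """digest(owner name | flags | protocol | algorithm | public key)."""
    rdata = struct.pack("!HBB", flags, protocol, algorithm) + public_key
    return DIGESTS[digest_type](name_to_wire(owner) + rdata).digest()


def ws_matches(ws_digest_type, ws_digest, owner,
               flags, protocol, algorithm, public_key):
    """True if the (assumed DS-shaped) WS digest vouches for this DNSKEY.
    Keys without the SEP flag and unknown digest types never match."""
    if not flags & 0x0001 or ws_digest_type not in DIGESTS:
        return False
    expected = dnskey_digest(owner, flags, protocol, algorithm,
                             public_key, ws_digest_type)
    return ws_digest == expected


# Placeholder key bytes; a real root KSK is obviously longer.
key = b"\x01\x02\x03"
digest = dnskey_digest(".", 257, 3, 8, key)
vouched = ws_matches(2, digest, ".", 257, 3, 8, key)  # True
```

The DNSKEY RR itself still arrives unauthenticated; it is only the match against a quorum of validated WS RRs that makes it usable.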

>    If the trust anchor that is being updated is configured with DS RRs,
>    then the validator converts the trusted WS RRs into DS RRs by
>    changing their RR TYPE fields and uses those for the new
>    configuration.  If the trust anchor is configured with public keys,
>    then the new keys are taken from the DNSKEY RRs that are
>    authenticated by the trusted WS RRs.
> 
> 
> 4.  How witnesses publish WS records
> 
>    The administrative arrangements for publishing WS records in a
>    witness zone are analogous to publishing DS records in a parent zone.
> 
>    There MUST be an out-of-band (non-DNS) communications channel between
>    the witnesses and the owner of the zone for which they vouch.  This
>    is used to authenticate WS RRset changes.

If this process is to be trusted as secure, sufficient to allow unattended operation, etc., then presumably these processes need to be carefully specified and audited.

> 4.2.  Lifecycle of witness zones
> 
>    A trust anchor SHOULD have many witness zones, in order to provide
>    resilience as well as dispersal of trust.
> 
>    Each witness zone is tied to a fixed witness trust anchor.  The zone
>    lasts as long as its trust anchor.  This SHOULD be at least 10 years,
>    since old software and configurations cannot function after too many
>    of their witnesses have been retired.

"10 years" seems like a bit of a hard-coded parameter. The root zone KSK was expected to be rolled every 5 years. There is enthusiasm in some circles for rolling the root zone KSK much more frequently than that. If there are reasons to roll the root zone KSK that often, then presumably the same reasons dictate that witness zones' KSKs should roll similarly?

>    Witnesses are continually retired.  It is expected that some
>    witnesses will have to retire early, for instance, if their keys are
>    lost or compromised, or if their host organization is no longer able
>    to maintain them.  This is OK since there are plenty of other
>    witnesses.
> 
>    New witnesses are continually introduced.  Validators configured with
>    an up-to-date set of witnesses will have a decent lifetime.  Given an
>    average witness lifetime of W years, a pool of P witnesses, and a
>    quorum size of Q, we expect P/W witness retirements per year.  A
>    validator configuration will last until there are Q witnesses left,
>    that is, until there have been P-Q retirements, which takes V=(P-
>    Q)*(W/P) years.  For example, if P=30, W=10, and Q=6, then V=8.
> 
>    A witness organization may run multiple witness zones on a rolling
>    replacement schedule in order to avoid a hiatus when a zone is
>    retired.  Validators SHOULD be configured to use only one witness
>    zone from each witness organization, to avoid trusting one
>    organization too much.

It's not obvious from your description how anybody should (a) obtain a list of witness zones, or (b) retrieve authenticated, local trust anchors for sufficient witness zones such that a quorum can reliably be achieved. These seem like important gaps to fill.
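
As a sanity check on the lifetime arithmetic in the quoted text, here is the formula V=(P-Q)*(W/P) as a few lines of Python. The function name is mine, and the model assumes (as the draft appears to) that retirements happen at a steady rate of P/W per year.

```python
def validator_lifetime_years(pool, witness_lifetime, quorum):
    """Years until only `quorum` of the configured witnesses remain,
    assuming a steady pool/witness_lifetime retirements per year."""
    retirements_per_year = pool / witness_lifetime
    return (pool - quorum) / retirements_per_year


draft_example = validator_lifetime_years(pool=30, witness_lifetime=10, quorum=6)
# draft_example == 8.0, matching the V=8 in the draft
```

Raising Q to 15 with the same pool drops the estimate to 5 years, which is the other side of the quorum-size tradeoff mentioned earlier.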

> 5.  The root trust anchor
> 
> 5.1.  Locating root trust anchor witness zones
> 
>    There is a bootstrapping problem when a validator has an out-of-date
>    root trust anchor: it needs to find the name servers for the witness
>    zones in order to be able to get the WS records that vouch for the
>    new root trust anchor; however it is unable to validate the responses
>    it gets while resolving the name server addresses.  This section
>    describes how to minimize this bootstrapping problem.
> 
>    All root witness zones SHALL be delegated from a single parent zone,
>    called root-witnesses.arpa.  This zone is to be maintained by IANA.
>    A delegation in this zone indicates that there are out-of-band
>    arrangements between the root DNSSEC key managers (XXX do they have a
>    better name?) and the witness organization allowing the witness
>    organization to meaningfully vouch for changes to the root DNSSEC
>    trust anchor.

The Root Zone Maintainer role is currently carried out by Verisign. They manage the ZSKs, they edit the zone to apply authorised changes, they sign the root zone, and they distribute it to root servers.

The IANA Functions Operator role is currently carried out by ICANN. They manage the KSKs, publish trust anchors, and exercise the KSK during ceremonies to sign enough DNSKEY RRSets provided by the Root Zone Maintainer for the root zone to be re-signed and published for the next N months.

Changes to the root zone are authorised by US DoC NTIA.

Collectively, the Root Zone Maintainer, the IANA Functions Operator and NTIA are referred to as the Root Zone Partners.

(I think. I no longer have to worry about these details on a daily basis. The IANA Functions Contract and the Cooperative Agreement are public, I think; you'll probably find them on ntia.doc.gov if you're interested. All those phrases are surely in those documents, somewhere.)

>    The root-witnesses.arpa zone SHOULD NOT be signed.  Leaving the zone
>    unsigned prevents the risk that validators will use some higher-level
>    trust anchor to validate responses from a witness zone rather than
>    the witness trust anchor itself.  In particular we want to avoid a
>    compromised root key being used to vouch for itself.  The purpose of
>    the root-witnesses.arpa zone is to contain delegation NS RRs and glue
>    address records for the witness zones, and these records are never
>    signed.  The signed parts of delegations are the DS RRsets; omitting
>    those prevents unsafe witness validation, but also leaves almost
>    nothing in the root-witnesses.arpa zone to sign.
> 
>    Each witness zone's name servers have names inside that witness zone
>    so that they can be validated by the witness trust anchor without
>    depending on any other part of the DNS.

I'm not sure I understand the purpose of this. Are you suggesting that if I send a query to address X but find out in the response from the server at that address that I should really have been talking to address Y, I should apply the mind bleach and start again?

Surely it's the WS RRSets we want to validate. How we get them is largely irrelevant. They either validate against a locally-configured trust anchor or they don't. If someone pilosov/kapela's a route for a real witness server and sneakily sends me a perfectly good response which validates, then giddy up, I'm using it.

>    The root-witnesses.arpa zone SHALL be served by the root name
>    servers.  This is so that a bootstrapping validating resolver can
>    find its witnesses using just its root hints, and get a direct
>    referral to the right witness zone name servers, again without
>    depending on any other part of the DNS.

There's some enthusiasm for paring back the root servers even further than they are, and having them serve just the root and ROOT-SERVERS.NET zones (the latter so that priming responses can be fattened with lovely tasty glue).

To clarify your intent, are you suggesting the root servers should serve ROOT-WITNESSES.ARPA simply because there are a lot of them, i.e. the requirement is that those zones be served competently? If not, where does the requirement come from?

> Appendix A.  Questions
> 
>    Should this scheme be extended to be more like DLV?

I think the answer to that is no, regardless of what scheme you're talking about. :-)

Joe
