Re: [DNSOP] On trust anchors, roots of trust, humans and indirection

Michael StJohns <msj@nthpermutation.com> Thu, 29 March 2018 23:55 UTC

To: Tony Finch <dot@dotat.at>
Cc: "dnsop@ietf.org" <dnsop@ietf.org>
References: <a9bd794f-41bc-9593-db0d-5424c84431a3@nthpermutation.com> <alpine.DEB.2.11.1803281105310.10477@grey.csi.cam.ac.uk>
From: Michael StJohns <msj@nthpermutation.com>
Message-ID: <cfc66d01-c8ce-b605-8074-8400b377f414@nthpermutation.com>
Date: Thu, 29 Mar 2018 19:55:41 -0400
In-Reply-To: <alpine.DEB.2.11.1803281105310.10477@grey.csi.cam.ac.uk>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/id8d3fSqgrIXC4n5WX00QOaMX_A>

Apologies for the top post.

Thanks for the commentary.  My guess is that we're starting from 
different assumptions.

There are three questions I have about your solution - 1) Do you expect 
it to be usable each time a device boots?  2) If (1), how long in years 
do you expect it to be usable?  3) If (1), do you expect that any query 
will change your default trust state (e.g. the selection of witnesses)?

Joe's problem AFAICT simply stated is that 5011 doesn't necessarily work 
for a device that's been on the shelves for 3-4 years.  My counter is 
that anyone who expects a device to come into service after four years 
on the shelf without requiring some intervention is somewhat optimistic.

If the answer to (1) is "No, this is only used for the first 
bootstrapping", then I would suggest one of two alternatives - one of 
which is similar to your proposal.  The first is that the device when 
first connected immediately downloads new firmware (which would include 
the most recent DNS trust anchor).  That becomes a don't care for DNSOP, 
but is really the correct way to deal with consumer devices which are 
the ones most likely to be placed on the network without 
(knowledgeable) human intervention - at least for the maintenance 
lifetime.   The second is that, absent knowledge of the DNS root of 
trust, a client queries ALL of the roots (A, B, C, etc) in the kickstart 
file and accepts a SEP DNSKEY  that appears in the root zone served by 
the majority of them.
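
A minimal sketch of that second alternative, assuming the answers have 
already been collected from each root (the query step is left out). The 
function name, the server labels and the key identifiers are made up for 
illustration and are not real root data.

```python
from collections import Counter

def majority_sep_keys(answers):
    """Return the SEP DNSKEY identifiers served by a strict majority of roots.

    `answers` maps a root server label to the set of SEP DNSKEY identifiers
    that server returned, or to None if it could not be reached.
    Unreachable servers still count in the total, so they weigh against
    a majority rather than being ignored.
    """
    total = len(answers)
    votes = Counter()
    for keys in answers.values():
        votes.update(set(keys or ()))  # at most one vote per key per server
    return sorted(key for key, n in votes.items() if 2 * n > total)

# Hypothetical example: five roots queried, one unreachable, one odd one out.
answers = {
    "a": {"key-1"},
    "b": {"key-1", "key-2"},
    "c": {"key-1"},
    "d": None,
    "e": {"key-bogus"},
}
print(majority_sep_keys(answers))
```

The sketch returns a list rather than a single key because more than one 
SEP key can be served by a majority at the same time, for instance during 
a rollover.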

If the answer to (1) is "Yes, this is used every time to figure out what 
the trust anchors are" I would suggest that you then have simply moved 
the TA management problem one level up and will need to maintain state 
for each of the witnesses for as long as the answer to (2).  (I.e., if 
you can't answer the question of "how does the system continue to 
operate if N of the witnesses have gone dark and/or been replaced by 
other witnesses?" then you don't have a viable system.)

If the answer to (3) is yes, then this looks a lot like 5011.  If no, 
then you have to have a maintenance item for the set of witnesses if not 
a protocol.

The problem with the generic "splitting trust" model, is that you have 
an initial set of pseudo-trusted entities that you combine to gain final 
trust.   In the DNS root model, it is doubtful that you can get a 
sufficiently static set of entities for even a 5-8 year period of time 
except for the roots.  And even then, you can only assume that the 
address mappings of the roots are static, and not any key pairs that 
might identify the end points.

Trust models have bit rot.  It's just the nature of the beast.  In 
figuring out how to build a system that resists bit rot, you need to 
answer the three questions I asked, and then figure out the various 
probabilities of any given witness going dark (e.g. moving addresses, 
changing keys, shutting down, being compromised) and figuring out the 
probability of a given client getting a "good" trust anchor at various 
points in its lifetime given changes in witnesses - e.g. a 100% 
probability at inception vs a 20% probability at year 12.  RFC5011 
considered the requirements as stated in RFC4986 and provided a system 
that was designed to be relatively bit-rot resistant (on a system basis) 
for a design life in excess of 40 years given reasonable attention to 
administration.  No system related to the DNS can be 100% bit-rot 
resistant for all clients given the one-way nature of the DNS data flows.

More in line.

On 3/28/2018 6:36 AM, Tony Finch wrote:
> Michael StJohns <msj@nthpermutation.com> wrote:
> Interesting thoughts, thanks. I have a slightly different starting point,
> which doesn't disagree with your argument, but leads to somewhat different
> consequences.
>
>> Proposition 1 (P1):  The initial selection of a root of trust (ROT) on behalf
>> of a validator ALWAYS involves a human in the loop.  It may not be obvious
>> which human(s), but it is always the case someone (not a computer) decided.
>> The selector may be the person configuring the validator or the set of people
>> who compile the code with the validator, or linux distribution manager, but
>> the initial selection always involves a judgement call of some sort by a
>> human.  In many cases, this judgement call is based on external
>> information (like widespread publication of the ROT information or multiple
>> third party endorsements (e.g. reputation evaluation)).
> I think it should be possible to automate this judgment call, given a
> suitable distributed publication/endorsement mechanism. This is the point
> of my trust anchor witnesses draft. The HITL doesn't select the trust
> anchor directly, but instead selects the witnesses.

I don't think this invalidates P1 - a human chooses.   The policy 
specified by the human simply replaces "Take the one DS of the root 
in the file" with "Go ask N entities, pick the answer that M give 
you".  In either case, the client executes the policy engine to get a 
result.

>
>> Proposition 4 (P4):  The compromise of a singleton ROT (or more generally of
>> all ROTs) leading to the "no trust" condition, requires repeating the "initial
>> root of trust selection process". From the point of view of the validator,
>> this is almost always a manual action either directly to the validator (manual
>> configuration update, manual firmware update), or indirectly through a
>> validator's control point (e.g. pushed by a NOC).
> With multiple trust anchor witnesses, a validator can survive the
> compromise of a witness (or a witness ceasing operations, or multiple
> witness failures) if it requires a large enough quorum when setting up or
> recovering a trust anchor, and enough working witnesses remain. No need
> for a HITL in these cases.
>
> Loss of all witnesses should be extremely unlikely!

This is where you fall down.  This is motherhood rather than analysis.

The question needs to read:  "Given a set of N <describe the witnesses>, 
and a policy which requires M of them to be applicable, and given a 
probability of p that any given witness goes dark and stays dark in any 
given month after the start, what is the cumulative probability after X 
years, that M witnesses are available?  After Y years?  Is this 
probability > Pa - the acceptable probability (as agreed to over beer at 
IETF 102)?" Seriously, if you've got a system that has a 1% probability 
of failing after 10 years then that may not be acceptable - but I don't 
know what that probability is nor what people would find acceptable.
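
A rough sketch of that calculation, under simplifying assumptions that 
are not part of anyone's proposal: witnesses fail independently, a witness 
that goes dark never comes back, and no new witnesses are added. The 
values used for N, M and p are invented for illustration and are not 
estimates of anything.

```python
from math import comb

def p_quorum_available(n, m, p_month, years):
    """Probability that at least m of n witnesses are still up after `years`.

    Each witness independently goes dark with probability p_month in any
    given month and stays dark, so it is still up after t months with
    probability (1 - p_month) ** t. The number of survivors is then
    binomial, and we sum the tail from m to n.
    """
    months = 12 * years
    alive = (1.0 - p_month) ** months
    return sum(
        comb(n, k) * alive**k * (1.0 - alive) ** (n - k)
        for k in range(m, n + 1)
    )

# Illustrative numbers only: 13 witnesses, quorum of 7, 0.5% monthly loss.
for years in (0, 1, 5, 10, 20):
    prob = p_quorum_available(n=13, m=7, p_month=0.005, years=years)
    print(f"after {years:2d} years: {prob:.4f}")
```

The output for each X or Y is what would be compared against Pa. 
Correlated failures or witness replacement would need a different model.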

I'm happy to have this conversation - I think it's a possible approach, 
but you need to get the assumptions correct, and then you need to 
compare the output against the requirements to see if this meets them.

>
>> Corollary 3 (C3): If P4, C1 and P1 are true, simply moving the ROT from the
>> DNS Root Trust Anchor set to one or more CA ROTs does not mitigate against ROT
>> compromise, it only moves the responsibility for mitigating the problem from
>> the DNSSEC system to the CA system.
> Right.
>
> My idea is different because witnesses are not individually trusted: only
> a quorum is enough to establish trust. A compromised witness is basically
> equivalent to an unavailable witness (unless the compromise is as big as
> the quorum!)
>
> The aim is to disperse trust, not to move it around.

I got it.  And as a point solution where you own both client and 
witnesses, it's not a bad one.   But this is an infrastructure system 
that has to work under a lot of really critical assumptions.  5011 was 
designed to be no worse for a generally live client than DNS in general 
- it has no external dependencies (e.g. CAs) and can be used anywhere 
DNS is available. Firewalls are a nasty part of the problem and any 
solution that augments or replaces 5011 needs to work around most if not 
all firewall restrictions.

Again - thanks for the commentary - Mike

>
> Tony.
