[VoT] Vectors of Trust: A Strawman

"Richer, Justin P." <jricher@mitre.org> Thu, 02 October 2014 21:10 UTC

From: "Richer, Justin P." <jricher@mitre.org>
To: "vot@ietf.org" <vot@ietf.org>
Date: Thu, 02 Oct 2014 21:10:07 +0000
Message-ID: <42FAEF9E-37A4-4F3A-84C0-63C1FBC22EC6@mitre.org>
Archived-At: http://mailarchive.ietf.org/arch/msg/vot/rsAmmJTRE9RaOV416_bOSXcmdXc

Hi all,

In Utrecht, many of us had a conversation about "Level of Assurance", its limitations, and where we thought we could take it. There was a pretty consistent notion that LoA as currently defined provides a handy shortcut but is usually insufficient and more often than not is grossly misused.

With that in mind, I would like to kick off this mailing list with a strawman proposal for an alternative: Vectors of Trust. The core idea here is to factor out different aspects of what goes into the LoA calculation today and present them as orthogonal vectors. These vectors would need to be communicated in various protocols and environments. We would also need to have a way of describing (and perhaps cataloguing) certain sets of vector values into a single reference to be used in things like trust frameworks.

The first question is which vectors there should be, and embedded in that is how many. Too few vectors and we end up with the oversimplified and usually misused problem that LoA has today. Too many vectors and we end up with an unnavigable set of values where everybody's niche concern is a special snowflake. We need to stay allergic to both cases and aim for something "in the middle". My own gut instinct says we should shoot for 3 to 5 vectors. I'm going to propose 4 here and treat each of them as a linear scale. As a side note, it's an open question whether all of these really are linear scales, but I think it's important that we define comparison metrics on each vector element as well as on the overall values.

So what vectors should we have? The LoA definition in NIST SP 800-63 gives us a few to work from as starting points, as does some work that's been done in the UK. These were captured from the discussion in the Utrecht meeting:

Identity proofing:

This covers how sure the IdP is of various attributes about the user on the way in, and how much they're willing to vouch for those attributes.

0: self-asserted / anonymous / randomized / no idea
1: consistent over time / pseudonymous
2: proofed in person
3: contractually bound

Credential strength:

This covers how strongly bound the primary credential is to an individual presenter, and how easily spoofed or stolen it is.

0: no credential / public
1: shared secret / password
2: proof of key possession
3: multiple factors [and probably higher definitions here as well -- but we want to avoid all the little bespoke 2FA companies getting to define a special snowflake "credential strength" value or else this becomes useless]

Assertion presentation:

This covers how much information the federation protocol leaks over the network to various parties, and how much of that information could be tampered with in transit without the RP noticing.

0: no protection / leaky
1: signed & verifiable through browser
2: signed & verifiable through back channel
3: target-encrypted to RP's own key

Operational management:

This covers a variety of information about the identity provider and its host organization. What's the OpSec policy of the IdP? Is there disaster recovery in place? How mature is the hosting organization? What kind of incident response can be expected? How strongly bound is a particular attribute to a particular credential [though maybe this fits under identity proofing]? You'll note that already this feels a lot less linear and a lot less defined than the other three.

Next we need a way to combine and communicate these vectors across the wire. I propose a simple method of assigning each category a marker letter and each value within the category a numeric value, and combining them with a separator, such as:

pseudonymous, multi-factor, strong assertion = P1:C3:A2

It's compact and unambiguous, though it doesn't make use of any existing data-structure language like JSON. But I think there does need to be a simple and parseable means to convey this info. This way, clients that need information on a particular category can pull it out and ignore the others. We can present structures like this (especially if they serialize into strings like this example) as part of the assertions in a variety of protocols. Say you've got an OpenID Connect access token: you can add a "vot" member to the payload with "P1:C3:A2" as its value. Clients can parse or ignore this as they see fit.
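To make the string form concrete, here is a minimal sketch of a parser and serializer for it. The grammar is an assumption read off the single example above (one uppercase category letter, a non-negative integer, components joined by ":"), and the function names, the rejection of duplicate categories, and the sample payload are all hypothetical rather than part of the proposal.

```python
import re

# Assumed grammar: one uppercase category letter followed by a non-negative
# integer, with components joined by ":" (as in "P1:C3:A2").
_COMPONENT = re.compile(r"[A-Z][0-9]+")


def parse_vot(text):
    """Parse a string such as "P1:C3:A2" into {"P": 1, "C": 3, "A": 2}."""
    vector = {}
    for part in text.split(":"):
        if not _COMPONENT.fullmatch(part):
            raise ValueError(f"malformed component: {part!r}")
        category, value = part[0], int(part[1:])
        if category in vector:
            # Assumption: a category may appear at most once per vector.
            raise ValueError(f"duplicate category: {category}")
        vector[category] = value
    return vector


def format_vot(vector):
    """Serialize a dict back into the colon-separated form, keeping dict order."""
    return ":".join(f"{category}{value}" for category, value in vector.items())


# Hypothetical token payload carrying the vector in a "vot" member.
payload = {"iss": "https://idp.example.org/", "vot": "P1:C3:A2"}
vector = parse_vot(payload["vot"])

# A client that only cares about credential strength reads "C" and ignores the rest.
credential_strength = vector.get("C")
```

A client that doesn't understand the "vot" member at all can simply skip it, which is the "parse or ignore" behavior described above.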

Next, I think we need a good way to label and categorize these combined vectors, especially as exemplary use cases. A handful of examples from my own deployment experience:

P3:C1:A1 - OpenID 2.0 account provided to current employees and protected by domain password

P1:C3:A2 - OpenID Connect account from dynamically introduced IdP bound to a medical record in an out-of-band process

P3:C3:A3 - Cross-domain OpenID Connect pilot (with contracts connecting the two orgs to enforce the semantics)

It is probably a mistake to try and smush these back into a linear scale like LoA, since that's exactly what we're trying to get away from. However, it would probably be a good and enlightening exercise to map the existing LoA definitions into these vectors.
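One way to see why these don't collapse into a single scale is to compare vectors component by component. The sketch below assumes each category really is a linear scale where a higher number is stronger (which is itself an open question), and it treats a missing category as not meeting a requirement. The function names and the third sample vector are hypothetical.

```python
def meets(offered, required):
    """True if `offered` is at least `required` in every category `required` names."""
    return all(
        category in offered and offered[category] >= level
        for category, level in required.items()
    )


def compare(a, b):
    """Return "equal", "stronger", "weaker", or "incomparable" for a relative to b."""
    a_meets_b, b_meets_a = meets(a, b), meets(b, a)
    if a_meets_b and b_meets_a:
        return "equal"
    if a_meets_b:
        return "stronger"
    if b_meets_a:
        return "weaker"
    return "incomparable"


pilot = {"P": 3, "C": 3, "A": 3}       # P3:C3:A3 from the examples above
medical = {"P": 1, "C": 3, "A": 2}     # P1:C3:A2 from the examples above
hypothetical = {"P": 3, "C": 1, "A": 1}  # made-up vector for illustration

pilot_vs_medical = compare(pilot, medical)
hypothetical_vs_medical = compare(hypothetical, medical)
```

Here the pilot vector is at least as strong as the medical one in every category, but the hypothetical vector and the medical one each win on different categories, so neither outranks the other. A single number can't express that second case.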

Finally, I think we need to discuss the notion of the assessment and trustmarks that would be powering the trust in all of these values. After all, if they're just self-asserted by the IdP, that doesn't really help anyone. However, if we had a discovery mechanism whereby a trustmark provider could host a machine-readable definition of which vectors a particular IdP has been proven able to claim for any transaction, I think we'd have a good leg up on the problem. First off, it would need to be discoverable from the IdP, so I'm proposing we add a component to the discovery document. An OIDC example:

{
  "iss": "https://idp.example.org/",
  "trustmark": "https://trustmark-provider.org/csp-123412",
  ...
}

The URI given in this document would contain information about which vectors the IdP is allowed to claim, according to the trustmark provider:

{
  "iss": "https://idp.example.org/",
  "P": [3],
  "C": [1, 2, 3],
  "A": [2, 3],
  "O": [1, 4]
}

Paranoid clients and clients dealing with many IdPs across an ecosystem would be able to grab the service discovery doc (step 1, and they probably need it anyway) and then check with a trusted third party, the trustmark provider, whether the IdP is on the up-and-up. If a client gets a claim for a vector value that's not attested to by a known/trusted trustmark provider, it can call shenanigans.
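The client-side check could look roughly like the following sketch. It assumes the client has already fetched and JSON-decoded the trustmark document from a provider it trusts (fetching and trust decisions are left out), and that the claimed vector has already been parsed into a dict. The function names and the rule that the document's "iss" must match the IdP's issuer are assumptions, not part of the proposal.

```python
def unattested_components(claimed, trustmark_doc):
    """Return the claimed components that the trustmark document does not list."""
    problems = {}
    for category, value in claimed.items():
        allowed = trustmark_doc.get(category, [])
        if value not in allowed:
            problems[category] = value
    return problems


def claim_is_attested(claimed, issuer, trustmark_doc):
    """True if the document is about this issuer and covers every claimed value."""
    # Assumption: the "iss" in the trustmark document names the IdP it covers.
    if trustmark_doc.get("iss") != issuer:
        return False
    return not unattested_components(claimed, trustmark_doc)


# The trustmark document from the example above, already decoded.
trustmark_doc = {
    "iss": "https://idp.example.org/",
    "P": [3],
    "C": [1, 2, 3],
    "A": [2, 3],
    "O": [1, 4],
}

issuer = "https://idp.example.org/"
ok = claim_is_attested({"P": 3, "C": 3, "A": 3}, issuer, trustmark_doc)
shenanigans = unattested_components({"P": 1, "C": 3, "A": 2}, trustmark_doc)
```

With this document, a claim of P3:C3:A3 checks out, while a claim of P1:C3:A2 gets flagged on the "P" component, since the trustmark provider only attests to P3 for this IdP.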

This is far from comprehensive, but I think this is a fairly decent strawman to start with. So let's burn this down and see what's left. Be sure to invite your friends to the bonfire!

-- Justin
