[Iot-onboarding] Using Threshold techniques in onboarding.

Phillip Hallam-Baker <phill@hallambaker.com> Tue, 17 November 2020 00:44 UTC

From: Phillip Hallam-Baker <phill@hallambaker.com>
Date: Mon, 16 Nov 2020 19:44:20 -0500
Message-ID: <CAMm+Lwhc3-EkZiVCXwb3nt2QT0fEUha=eMN9EbH7rkJhJ9T1BQ@mail.gmail.com>
To: iot-onboarding@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/iot-onboarding/KQRm2NJfszAVw41SwXycEAgyh4I>
Subject: [Iot-onboarding] Using Threshold techniques in onboarding.
Precedence: list

I have running code and a spec for one way to skin this particular cat. The
draft is not quite ready but it's close:
https://tools.ietf.org/id/draft-hallambaker-mesh-architecture-15.html

Let's consider what we want to do when we onboard a device:

1) Establish initial communication with the device.

2) Configure the device to operate under control of the owner, including
the ability to connect to the owner's network and authenticate commands
from the owner.

3) Establish a trust context that is verifiably independent of the
manufacturer and supply chain.

4) Reconfigure the device as necessary, including the ability to switch wifi
networks, reset device config without needing to repeat onboarding, etc.,
without the need to climb a ladder at 3am because the smoke alarm decided
to throw away its configuration data on a whim.

Onboarding must become a one-time operation. No excuses. Onboarding
configuration requires user effort, which is precious. Configurations MUST
survive a hard reset of the device. The only time the configuration should
be forgotten is when the owner requires it.

TLS configuration was raised in SECDispatch. I believe DANE is the wrong
tool for the same reason CAs are the wrong tool: PKIX PKI is really not
designed for IoT. In particular, PKIX certificates are bound to DNS names,
which requires that they expire within a short timeframe. Moving from the
WebPKI to DNSSEC doesn't help because DNS names are rented, not owned.

The expiry issue aside, enterprises and IETF participants have DNS domains
of their own. Consumers do not. And so binding to DNS names...


So one approach I have proposed in the past is to use Strong Internet
Names. These are simply a fingerprint of a public key that is mapped to a
reserved portion of DNS namespace.

So let's say that Alice's coffee-pot has a TLS key with the fingerprint
MB5S-R4AJ-3FBT-7NHO-T26Z-2E6Y-WFH4. We can use that to establish a domain
name that allows us to authenticate the TLS cert without reference to any
external trust provider:

coffee.mm--MB5S-R4AJ-3FBT-7NHO-T26Z-2E6Y-WFH4

That is not the sort of identifier I ever want to throw in a user's face.
But it is perfectly OK as an internal, buried-in-the-guts identifier. That
is what I started with four years ago... I have moved on a bit since.
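
As a rough illustration of the strong name idea, here is a minimal Python
sketch. It is not the Mesh fingerprint algorithm: the choice of SHA-512,
the Base32 truncation to seven groups of four characters, and the function
names strong_name and matches are all assumptions made for the example.
Only the "label.mm--FINGERPRINT" shape is taken from the example above.

```python
import base64
import hashlib


def strong_name(label: str, public_key: bytes, groups: int = 7) -> str:
    """Build a name of the form label.mm--XXXX-XXXX-... from a key.

    Assumption: the fingerprint is a truncated Base32 SHA-512 digest of
    the raw key bytes. The real Mesh encoding differs in its details.
    """
    digest = hashlib.sha512(public_key).digest()
    text = base64.b32encode(digest).decode("ascii").rstrip("=")
    chunks = [text[i:i + 4] for i in range(0, groups * 4, 4)]
    return f"{label}.mm--{'-'.join(chunks)}"


def matches(name: str, presented_key: bytes) -> bool:
    """Check a presented key against the fingerprint embedded in a name.

    No CA or DNSSEC lookup is involved: the name itself carries the
    information needed to authenticate the key.
    """
    label = name.split(".", 1)[0]
    return strong_name(label, presented_key) == name


if __name__ == "__main__":
    key = b"placeholder bytes standing in for an encoded TLS public key"
    name = strong_name("coffee", key)
    print(name)
    print(matches(name, key))
```

The point of the sketch is that a client holding the name can verify the
key the device presents without consulting any external trust provider.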


In principle, we could just issue strong name certificates when a device is
manufactured. And that would be perfectly adequate for most users' actual
needs. Right up until some manufacturer is found to have kept track of all
the keys ever issued and suffers a breach. This approach is not
acceptable for most enterprises and particularly not for HIPAA / SCI type
applications.

The solution is to take advantage of the fact that all Diffie-Hellman keys,
including elliptic curve keys, support threshold key generation.

The device ships with an ECDH {public, private} keypair {x.P, x}. When the
device is onboarded, the administrator generates a second keypair {y.P, y}
and passes to the device, by means of a secure channel, the private value y
together with a certificate for the composite key {z.P, z}, where z = x+y.

Only the device has the means to calculate x+y. But the administrator can
calculate z.P because z.P = x.P + y.P.
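
The key combination described above can be sketched in a few lines of
Python. This is a minimal illustration, not the Mesh implementation: the
curve (secp256k1), the plain affine arithmetic, and the names point_add,
scalar_mult and onboard are assumptions chosen to keep the example
self-contained. The arithmetic is not constant-time and must not be used
in production.

```python
import secrets

# secp256k1 domain parameters (an illustrative choice; the post does not
# name a curve).
P_FIELD = 2**256 - 2**32 - 977
N_ORDER = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P_FIELD == 0:
        return None  # a + (-a)
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P_FIELD, -1, P_FIELD) % P_FIELD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_FIELD, -1, P_FIELD) % P_FIELD
    x3 = (lam * lam - x1 - x2) % P_FIELD
    y3 = (lam * (x1 - x3) - y1) % P_FIELD
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k.point by double-and-add."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def onboard():
    """Return (z.P as seen by the administrator, z.P as seen by the device)."""
    x = secrets.randbelow(N_ORDER - 1) + 1   # installed at manufacture
    x_pub = scalar_mult(x, G)                # x.P, known to the administrator
    y = secrets.randbelow(N_ORDER - 1) + 1   # chosen by the administrator
    y_pub = scalar_mult(y, G)                # y.P
    z_pub_admin = point_add(x_pub, y_pub)    # x.P + y.P, computed without x
    z = (x + y) % N_ORDER                    # only the device holds x and y
    return z_pub_admin, scalar_mult(z, G)


if __name__ == "__main__":
    admin_view, device_view = onboard()
    print(admin_view == device_view)
```

The administrator never learns x and the manufacturer never learns y, yet
both sides arrive at the same public key z.P.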


Using this approach means that the user is completely insulated from any
malice or incompetence on the part of the manufacturer (or subsequent
supply chain compromise) provided only that the administration device
chooses a strong value of y. Conversely, if the administration device is
faulty, the user is still protected unless the manufacturer also defects.

Threshold is really, really powerful. You can have exactly as much fault
tolerance as you like; all you have to do is introduce another party to
share the key with.
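
To show how adding parties generalizes, here is a minimal Python sketch
using ordinary finite-field Diffie-Hellman rather than an elliptic curve,
since the same additive trick applies to any Diffie-Hellman key. The
modulus, generator, and the names new_share, combine_public and
combine_private are toy assumptions for illustration only; the group is far
too small for real use.

```python
import secrets
from functools import reduce

# Toy parameters: a 127-bit Mersenne prime and a small base.
P = 2**127 - 1
G = 3


def new_share():
    """Return one party's (private, public) contribution."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def combine_public(publics):
    """Anyone can form the composite public key from the public shares."""
    return reduce(lambda a, b: a * b % P, publics, 1)


def combine_private(privates):
    """Only a holder of every private share can form the composite key."""
    return sum(privates) % (P - 1)


if __name__ == "__main__":
    # Three contributors, e.g. manufacturer, administrator, and one more.
    shares = [new_share() for _ in range(3)]
    z = combine_private([priv for priv, _ in shares])
    z_pub = combine_public([pub for _, pub in shares])
    print(pow(G, z, P) == z_pub)
```

An attacker has to obtain every private share to recover the composite
private key, so each additional independent contributor is one more party
that must fail or defect.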


So what is the user experience like in practice? Well, I have not yet fully
considered the enterprise case, where we have to distinguish between the
owner and the user and consider insider threats. But I do have
three onboarding scenarios for the user case:

1) Witness value comparison (with optional PIN code)

If the device to be onboarded has a keyboard, Alice enters her Mesh
callsign (alice@example.com) on the device to be connected. The device
spits out a witness value which should appear on the administration device.
If the witness values match exactly, everything is correct and the
connection request can be accepted. Alternatively, an out-of-band PIN code
may be used as an authentication mechanism.

2) QR code presented on administration device.

If the device to be connected has either a camera or some means of
accepting a short-range communication, and a means of placing the device
into an onboarding mode, the Mesh callsign and PIN code can be presented in
this form, allowing the device to connect without the need to enter data.

3) Static QR code printed on the connecting device.

The connection can also be made by means of a static QR code printed on the
device itself. This must provide some means of bootstrapping a local
communication channel between the devices to complete the connection
process and provision the initial network configuration.


One of the principles of the Mesh is autonomy. Users should be in control
of their digital environment. So what I am currently working on is a
mechanism to allow users to switch Mesh Service Providers without switching
costs. This will allow the Mesh callsign to simply become @alice, which will
be a lifelong name assigned to Alice that never needs to change unless she
wants to change it. She can switch her provider (e.g. if she moves from a
Comcast-served area to a Verizon one) but she doesn't need to change her
callsign. All her existing devices will adjust automatically.
