[Trans] WGLC comments on draft-ietf-trans-6962-bis-24

Richard Barnes <rlb@ipv.sx> Mon, 16 January 2017 22:23 UTC

To: trans@ietf.org
Cc: Eric Rescorla <ekr@rtfm.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/trans/gO_DFW3v9FmBCOek_hifZ6KL368>

Hey all,

Sorry I’ve missed the WGLC deadline.  I wanted to summarize for this list
some discussions that have been going on within the Firefox team.
Basically, we’ve started to look at what it would take to deploy CT inside
of Firefox (including code, policy, etc.) and have come away with some
concerns that the system as currently specified can’t be deployed in a way
that actually provides the desired guarantees, in particular protection
against equivocation by logs.  For what it’s worth, these comments apply
roughly equally to RFC 6962 and 6962bis.

tl;dr:

- It is not practical to build a publicly verifiable log system that
incorporates SCTs
- The existing tools for public verifiability need to be made more
efficient in order to work on the scale envisioned

I recognize that it’s very late in the day to be raising these issues, but
we believe they go to the heart of the value proposition of CT and so it’s
important they be addressed before 6962bis goes to RFC. I speak for both
myself and my colleagues here at Mozilla in saying that we are more than
willing to put in the time to help get to consensus on these topics.

Details below.

Thanks,
--Richard


===

Fundamentally, CT is a public ledger system: every certificate is supposed
to be entered into a log, and relying parties (RPs) only accept
certificates that are logged. In order for this to work properly, RPs need
to be able to verify that there is public consensus about the state of the
log. Otherwise, logs can “equivocate”: represent to the RP that a given
certificate was published when it in fact was not. Unfortunately, in
practice RPs are not doing this verification (for reasons discussed below),
and so CT reduces to a countersignature scheme in which the RP trusts the
log not to equivocate. There are two primary challenges here, as detailed
below.


# SCTs, and thus immediate issuance, are incompatible with public
verifiability

It has been known for quite some time that the public verifiability piece
of CT introduces latency in certificate issuance. In order to allow for
immediate certificate issuance, logs instead issue SCTs, which are just
promises to incorporate the certificate into the log; effectively the SCT
is a countersignature on the certificate. However, if an RP accepts a
certificate + SCT, then it is vulnerable to collusion between a CA which
issues a bogus cert and a log which issues a bogus SCT but never
incorporates the cert into the log.

We are unaware of any way to efficiently address this issue without
introducing either privacy problems or latency. In order to validate
inclusion in the log, the RP needs to validate that other entities (e.g.,
the software manufacturer) have the same view of the log. The RP either
downloads the whole log (which is inefficient), queries for the specific
certificate in question (which has privacy problems), or retrieves a
checkpoint which vouches for some batch of certificates (which introduces
batch latency).

There seem to be two major ways to address this issue:

1. Accept issuance latency: An RP will only accept a cert as valid when
accompanied by proof that it has been incorporated into the public record

2. Accept some window of vulnerability to equivocation during which SCTs
are accepted and then retrospectively checked. The RP would provisionally
accept a certificate that claimed to have been very recently issued and
then check for log presence a few minutes later (once an inclusion proof
should be available)

Unfortunately, there’s not really an effective way to accomplish the latter
with high reliability and without bad privacy problems.  Going back to the
server and asking for an inclusion proof is safe from a privacy
perspective, but there’s a significant risk of failure given how often
servers are multi-homed.  And asking anyone but the server leaks browsing
history.  It’s theoretically possible that some private information
retrieval scheme could save us, but that would be a big new chunk of work,
and unlikely to deploy in the near term.

Note that it is not possible to simply waive public verifiability for
certificates which claim to be recently issued: because this attack depends
on the log and the CA colluding, they can just issue certificates with
recent timestamps.


# CT’s public verifiability mechanisms are too inefficient to be deployed
at scale

If this WG is going to meet its charter goals, CT needs to have a working
public verifiability system. What that means in practice is that it’s
efficient for the RP to acquire whatever information it needs to validate
that a certificate is in the public record. In the current system, this
basically means:

  - Acquire an inclusion proof [hopefully provided by the site the RP is
connecting to].
  - Acquire the STH that the inclusion proof chains back to and validate
that the STH was publicly logged.
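
For concreteness, the check in the first step is the standard Merkle
audit-path verification, following the RFC 6962 hashing conventions (0x00
prefix for leaves, 0x01 for interior nodes); the function names in this
minimal Python sketch are mine:

```python
import hashlib

def leaf_hash(entry: bytes) -> bytes:
    # RFC 6962 leaf hash: the 0x00 prefix domain-separates leaves from nodes
    return hashlib.sha256(b"\x00" + entry).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962 interior-node hash, prefixed with 0x01
    return hashlib.sha256(b"\x01" + left + right).digest()

def verify_inclusion(leaf_index: int, tree_size: int, leaf: bytes,
                     audit_path: list, root: bytes) -> bool:
    """Recompute the tree head from a leaf and its audit path and
    compare it against the root carried in the STH."""
    fn, sn = leaf_index, tree_size - 1
    r = leaf
    for p in audit_path:
        if sn == 0:
            return False          # path is longer than the tree is deep
        if (fn & 1) or (fn == sn):
            r = node_hash(p, r)   # sibling is on the left
            if not (fn & 1):
                # skip levels where our node is the rightmost, unpaired one
                while fn != 0 and not (fn & 1):
                    fn >>= 1
                    sn >>= 1
        else:
            r = node_hash(r, p)   # sibling is on the right
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root
```

An RP that already holds a trusted STH can run this locally; the privacy
question is entirely about how it came to trust that STH.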

Clearly, in order to be efficient, multiple certs must chain back up to the
same STH; this is also a privacy requirement, because otherwise retrieving
a given STH leaks which certificate you are verifying.  For similar privacy
reasons, clients need to proactively download and validate every STH they
might encounter, to avoid making queries for STHs (which leak browsing
history).  So, what this means is that the RP needs to periodically
retrieve:

  - All the STHs that any certificate might chain to
  - The consistency proofs between those STHs
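
The second item is the RFC 6962 consistency proof: given two STHs, the RP
checks that the larger tree is an append-only extension of the smaller one.
A rough Python sketch of that verifier (helper names mine, same SHA-256
node hashing as the inclusion check):

```python
import hashlib

def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962 interior-node hash, prefixed with 0x01
    return hashlib.sha256(b"\x01" + left + right).digest()

def verify_consistency(size1: int, size2: int, root1: bytes, root2: bytes,
                       proof: list) -> bool:
    """Check that the tree (size2, root2) extends the tree (size1, root1)."""
    if size1 == size2:
        return root1 == root2 and not proof
    if not proof:
        return False
    path = list(proof)
    if size1 & (size1 - 1) == 0:
        # size1 is a power of two: the old root is a complete subtree and
        # is omitted from the proof, so prepend it ourselves
        path = [root1] + path
    fn, sn = size1 - 1, size2 - 1
    while fn & 1:                 # align on the subtree containing the old root
        fn >>= 1
        sn >>= 1
    fr = sr = path[0]             # recompute both the old and the new root
    for c in path[1:]:
        if sn == 0:
            return False
        if (fn & 1) or (fn == sn):
            fr = node_hash(c, fr)
            sr = node_hash(c, sr)
            if not (fn & 1):
                while fn != 0 and not (fn & 1):
                    fn >>= 1
                    sn >>= 1
        else:
            sr = node_hash(sr, c)  # nodes beyond the old tree only affect root2
        fn >>= 1
        sn >>= 1
    return fr == root1 and sr == root2 and sn == 0
```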

The good news is that if the RP does this, then it will be in a position to
verify that any certificate with an inclusion proof has been publicly
logged; it will be protected from equivocation.  The bad news is that this
scheme generates so much data it cannot be deployed.

To get an idea of the scale here, I looked at all of the submissions to the
Google Pilot log over December 2016.  Let’s assume that the log creates a
new STH for every 2048 certificates it receives, in order to minimize
issuance latency; on average, it takes around 8 minutes for Pilot to
receive 2048 certificates.  At this rate, Pilot produces around 6000 STHs
per month.  The good news is that at this rate, an RP can easily store all
of the STHs it needs: ~192kB of hashes per month, ~2.3MB per year.

The bad news is that the RP has to download an inordinate amount of
information to verify these STHs. In addition to the STHs themselves, it
will need to download around 6000 consistency proofs over the course of a
month.  Each proof is around 20 hashes, so at the end of the day this is
~125k hashes (4MB of data) that an RP has to download every month, 48MB for
the year. That’s a pretty big chunk of data.
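
For reference, the arithmetic behind those estimates, as a few lines of
Python (32-byte SHA-256 hashes assumed; the ~125k-hash/4MB figures above
round these results up slightly):

```python
HASH_BYTES = 32          # SHA-256 output size
STHS_PER_MONTH = 6000    # one STH per 2048 certs, roughly every 8 minutes
PROOF_HASHES = 20        # approximate consistency-proof length, per above

sth_storage = STHS_PER_MONTH * HASH_BYTES        # 192,000 B ~= 192kB/month
monthly_hashes = STHS_PER_MONTH * PROOF_HASHES   # 120,000 hashes/month
monthly_download = monthly_hashes * HASH_BYTES   # ~3.8MB/month, ~46MB/year
```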

There are no doubt several plausible alternative data structures, but just
to give a sense of what’s possible, consider the following design:  Replace
the global Merkle tree with a series of time-windowed trees, one for each
batch.  Then glue these together with a conventional Haber-Stornetta hash
chain.  (See my cartoon at <https://ipv.sx/tmp/ct-hs.pdf>.)  This has the
same number of STHs as the design above, but because the consistency proofs
are trivial (you just validate that STH_n includes the hash of STH_{n-1}),
the total download size for the month is just ~6k hashes (192kB of data).
This scheme also saves you a few bytes on inclusion proofs, since you only
have to go to the batch level.
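
To make the saving concrete, here is a sketch of that chained-checkpoint
verification; note that the checkpoint encoding below is invented for
illustration, not taken from any draft:

```python
import hashlib

def checkpoint(batch_root: bytes, prev_checkpoint: bytes) -> bytes:
    # Hypothetical STH encoding: commit to this batch's Merkle root and to
    # the hash of the previous checkpoint (Haber-Stornetta chaining).
    return hashlib.sha256(prev_checkpoint + batch_root).digest()

def verify_chain(checkpoints: list, batch_roots: list,
                 genesis: bytes = b"\x00" * 32) -> bool:
    """O(1) work per STH: each link just re-hashes the previous checkpoint,
    so the whole month's 'consistency proof' is the STH list itself."""
    prev = genesis
    for cp, root in zip(checkpoints, batch_roots):
        if checkpoint(root, prev) != cp:
            return False
        prev = cp
    return True
```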

There are also intermediate designs that preserve the overall Merkle tree
structure at the cost of a bit more data, and probably a lot of other
designs we haven’t thought of (such as the “segmented” scheme that was in
the pre-I-D versions of CT).

In any case, if we claim that CT represents a system that is actually
publicly verifiable, then we need to get rid of SCTs and come up with a way
to push log state to RPs more efficiently.