[Cfrg] Side inputs to signature systems, take 2

"D. J. Bernstein" <djb@cr.yp.to> Sat, 23 April 2016 13:00 UTC

Date: Sat, 23 Apr 2016 12:59:54 -0000
Message-ID: <20160423125954.25865.qmail@cr.yp.to>
From: "D. J. Bernstein" <djb@cr.yp.to>
To: cfrg@irtf.org
Mail-Followup-To: cfrg@irtf.org
In-Reply-To: <878u080w22.fsf@alice.fifthhorseman.net>
Archived-At: <http://mailarchive.ietf.org/arch/msg/cfrg/pv7X8hrBtMGfmlCMHsLhJDXPHso>
Subject: [Cfrg] Side inputs to signature systems, take 2

Here's the standard API for signature systems:

   * keygen inputs: nothing; outputs: public key, secret key;
   * sign inputs: message (string), secret key; outputs: signature;
   * verification inputs: message, signature, public key; outputs: yes/no.
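
As a rough illustration of the shape of this API, here is a minimal sketch using a Lamport one-time signature built from SHA-256. The choice of scheme is an assumption made only so that the example runs with the Python standard library; it is not a scheme discussed in this message, and each key pair must sign at most one message. The point is only the three functions and their inputs and outputs.

```python
import hashlib
import secrets

N = 256  # number of bits in a SHA-256 digest


def _h(data):
    return hashlib.sha256(data).digest()


def _bits(message):
    """Hash the message and return its 256 digest bits, most significant first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(N)]


def keygen():
    """Inputs: nothing. Outputs: public key, secret key."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(N)]
    pk = [(_h(a), _h(b)) for a, b in sk]
    return pk, sk


def sign(message, sk):
    """Inputs: message (byte string), secret key. Outputs: signature."""
    # Reveal one secret per digest bit. This is why the key is one-time only.
    return [sk[i][bit] for i, bit in enumerate(_bits(message))]


def verify(message, signature, pk):
    """Inputs: message, signature, public key. Outputs: yes/no."""
    if len(signature) != N:
        return False
    return all(_h(signature[i]) == pk[i][bit]
               for i, bit in enumerate(_bits(message)))
```

Note that nothing besides the message and the key enters sign or verify: there is no side input.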

This API is overwhelmingly popular in the cryptographic literature, the
security literature, and the software ecosystem. It is shared by the
vast majority of

   * specifications of signature systems,
   * papers analyzing the security of signature systems,
   * theoretical proofs regarding signature systems,
   * cryptographic libraries, and
   * applications of signatures at higher layers of software.

This sharing is tremendously helpful---compare it to all the horror
stories of mismatched APIs slowing everybody down, making the auditor's
job hellishly difficult, and producing real-world security problems. The
simple fact that the signature API is so popular means that any change
to the API comes with a heavy presumption of being a _very bad idea_.

I'm not saying that it's impossible to rebut this presumption. For
example, I recommend merging the message and signature into a signed
message---I'm acutely aware of how many system layers need to be adapted
to make this work, but I don't see any other plausible way to eliminate
the very common "use the message without noticing it's a forgery" error
pattern. Stateful signatures are a deeper example. In both of these
cases I see clear, convincing reasons for a new API, outweighing the
massive costs and dangers---but changing the API is obviously an uphill
battle against a default position of amply justified skepticism.
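
A minimal sketch of what a signed-message API could look like follows. The names sign_message and open_signed_message are hypothetical, and the underlying "signature" is an HMAC stand-in chosen only so that the example runs with the standard library: it is symmetric, so the same key plays both roles here, which a real signature system would not allow. What the sketch is meant to show is that the caller never receives the message unless verification has succeeded.

```python
import hashlib
import hmac

SIG_LEN = 32  # length of the stand-in signature (HMAC-SHA-256 output)


def _toy_sign(message, key):
    # Stand-in only: HMAC is a MAC, not a public-key signature.
    return hmac.new(key, message, hashlib.sha256).digest()


def sign_message(message, secret_key):
    """Return one byte string: signature followed by message."""
    return _toy_sign(message, secret_key) + message


def open_signed_message(signed_message, public_key):
    """Return the message only if the signature verifies; otherwise raise."""
    if len(signed_message) < SIG_LEN:
        raise ValueError("signed message is too short")
    signature = signed_message[:SIG_LEN]
    message = signed_message[SIG_LEN:]
    if not hmac.compare_digest(_toy_sign(message, public_key), signature):
        raise ValueError("forgery: signature does not verify")
    return message
```

With this shape there is no separate yes/no result for a caller to forget to check; a forgery produces an exception instead of a usable message.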

This brings me to the idea of modifying signature specifications to add
"contexts" (whatever exactly those are!) to signing and verification,
with the hope of stopping cross-protocol attacks. There's a severe lack
of justification for this idea:

   * The security analysis is almost entirely missing. It's easy to come
     up with examples where the idea seems to work, but slightly more
     thought shows many more examples where the idea fails, because it's
     not directly aimed at the core cross-protocol issue.

   * The syntax/layering/API analysis is entirely missing. In every case
     where the idea is effective, the same benefit is also achieved by a
     much more traditional fix that _doesn't_ change the signature API,
     and this traditional fix is much easier from a systems perspective.

As an example, let's imagine that the PGP protocol adopts some sort of
signature-plus-context system, and labels itself with context "PGP".
Sounds like a great way to separate PGP from other protocols, right?

Now let's consider two protocols built on top of PGP (copied verbatim
from my first "Side inputs to signature systems" message):

   * Protocol A: There are two messages, "Put the troops through a
     surprise drill" and "Don't put the troops through a surprise
     drill". For efficiency these messages are compressed to "1" and
     "0" respectively. For security, make sure to check that the
     compressed message is signed with the General's PGP key.

   * Protocol B: There are two messages, "Launch the missiles" and "Don't
     launch the missiles". For efficiency these messages are compressed
     to "1" and "0". For security, make sure to check that the
     compressed message is signed with the General's PGP key.

There's a gigantic cross-protocol vulnerability here, and the "PGP"
context string completely fails to stop this vulnerability.
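
To make the failure concrete, here is a minimal sketch. The function hash_input is a hypothetical stand-in for however a signature-plus-context system might mix the context into what gets signed; it is not the construction from any draft. Since both protocols sit on top of PGP, both see the same context and the same message bytes.

```python
import hashlib


def hash_input(context, message):
    # Hypothetical context mixing: one length byte, the context, the message.
    return hashlib.sha512(bytes([len(context)]) + context + message).digest()


PROTOCOL_A = {b"1": "Put the troops through a surprise drill",
              b"0": "Don't put the troops through a surprise drill"}
PROTOCOL_B = {b"1": "Launch the missiles",
              b"0": "Don't launch the missiles"}

# The General signs "1" for protocol A; PGP supplies the context "PGP".
signed_under_a = hash_input(b"PGP", b"1")
# Protocol B's verifier computes the hash input for the bytes it received.
checked_under_b = hash_input(b"PGP", b"1")

# Identical hash inputs mean a signature made for A also verifies for B.
same_signature_accepted = signed_under_a == checked_under_b
```

The two values are equal, so any signature the General produces on "1" for a drill is also a valid signature on "1" for a launch.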

If you think that the messages in protocol A have to be separated
somehow from the messages signed by the same key in protocol B, then of
course you're right---but how exactly is this supposed to map to the
vague concept of a "context"? Was it wrong for the PGP protocol to
declare "PGP" as the context? Is A supposed to assign a new "A" context,
ignoring the lower-level protocol that's actually doing the signatures?
How is the PGP software supposed to handle this?

What happens when the semantics of the A message _do_ take information
from the lower-level protocol? Should the higher-level protocol A change
the context to some tree structure such as "A within PGP"? Maybe nobody
is actually using the side data available from PGP, but there are many
other examples of semantics being built through chains of protocols.

Even worse, what if the people writing the A software and the B software
don't even realize the need to coordinate "context"? Assigning semantics
to strings is something that programmers are doing all the time---and,
whenever these strings are transmitted through an untrusted network, the
new semantics create a new opportunity for cross-protocol attacks. By
saying "Protocol A" and "Protocol B" I've been painting a picture of
programmers who _realize_ that they're building new protocols, but this
is not actually the normal situation; it is a rare exception.

At this point I expect "Aha, we know the solution!" comments from fans
of dozens of different formats for "self-describing strings." Random
examples that come to mind include the VAX/VMS file format, the original
MacOS file-type system, the Internet mail-message attachment format,
XML, etc., etc., etc. The dream of each of these formats is to provide a
grand unified description of semantics, resolving all possible
misinterpretations. This means that all we have to do is specify that
signed messages are encoded as XML, right?

Unfortunately---even if we ignore the question of how well defined and
understood these formats are---the reality is that programmers are
defining new data semantics so frequently that they usually don't even
define new types _inside their own programs_, never mind going through
the hassle of registering types in any of these grand unified systems.
The universe of semantics defined by software ends up vastly broader
than contemplated in any of these formats. Each format ends up resolving
misinterpretations only within an extremely limited semantic scope.

But wait, there's more! The notion of having a file describe its own
semantics might seem reasonable in contexts where each file is handled
independently---for example, viewed within a stateless file viewer. But
this notion disintegrates as soon as files begin to communicate through
a shared state. By changing this shared state, a signed file can change
how a subsequent signed file is interpreted---but then the signatures
need to include the state or some other strict-sequencing mechanism, to
prevent an attacker from changing the semantics of the second file. The
protocol _after_ modification of the shared state is not the same as the
protocol _before_ modification of the shared state.

It's easy to say that cross-protocol attacks are created by messages
being taken out of context. But what _is_ the context? This is the core
issue, and solving it is vastly more complex than simply sticking "PGP"
somewhere into a signature hash. As I wrote before (without elaborating
on what the word "protocol" means):

   The standard fix for this type of problem is to encode more
   information into the signed messages so that they can't be taken out
   of context. Of course, to figure out how much information is enough,
   you have to work at the protocol layer, understanding all the
   different contexts in all of the high-level protocols that might sign
   messages under the same key---and the protocol designers need to talk
   to each other! Any encoding, no matter how verbose, can be ruined by
   another protocol assigning new meanings to some of the encoded
   messages.

I think that's enough about semantics. Let me now turn to the questions
of syntax, layering, and API.

Show me any definition of semantics in a signature-plus-context system---

   * what the "context" says about the semantics of the message, and
   * how the "message" then encodes data

---and I'll show you an equivalent definition of semantics for the
_standard_ signature API, the simpler API that doesn't have "context":
simply encode the same "context" information and the data together as
the "message" to be signed.
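
As a sketch of what "encode the context information and the data together" could look like, here is one possible encoding. The 4-byte length prefixes and the label strings are assumptions made for illustration, not a format proposed in this message. The only property that matters is that the encoding is unambiguous, so that different (context, data) pairs never produce the same string to be signed.

```python
import struct


def encode_fields(*fields):
    """Prefix each field with its 4-byte big-endian length, then concatenate."""
    return b"".join(struct.pack(">I", len(f)) + f for f in fields)


def decode_fields(blob):
    """Inverse of encode_fields; raises ValueError on malformed input."""
    fields = []
    offset = 0
    while offset < len(blob):
        if offset + 4 > len(blob):
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        if offset + length > len(blob):
            raise ValueError("truncated field")
        fields.append(blob[offset:offset + length])
        offset += length
    return fields


def message_to_sign(context, data):
    """The single byte string handed to the standard sign(message, secret key)."""
    return encode_fields(context, data)


# Hypothetical labels for the two protocols from the earlier example.
drill = message_to_sign(b"protocol A: drill order", b"1")
launch = message_to_sign(b"protocol B: launch order", b"1")
```

The result is an ordinary message, so it passes through an unmodified signature library. Whether the labels are adequate is still a question of semantics that the protocol designers have to settle among themselves.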

This use of the _standard_ signature system provides all of the same
protection against cross-protocol attacks that can possibly be provided
by the signature-plus-context system. What creates the protection is
_not_ the change of signature concept to include a separate "context";
what creates the protection is defining the semantics of the string
being communicated.

The huge advantage of this syntax choice is that it's compatible with
the signature API already used in specifications and papers and proofs
and libraries and the rest of the system. It preserves the familiar
signature layer that we already have. It minimizes the number of inputs
for signing and the complexity of the task facing the signature-system
auditor.

I hope I've now made sufficiently clear that I'm objecting---and why I'm
objecting---to the use of "contexts" and other side inputs in CFRG's
specifications of signature systems.

Daniel Kahn Gillmor writes:
> The context label is explicitly in place in Ed448 in the current draft,
> but not in Ed25519.

My objection applies equally to all signature systems with side inputs.

> > * Are cryptographic libraries supposed to change their signature
> >   APIs to allow an extra side input?
> A library that implements a signature scheme that has a context string
> needs to provide an API for it, yes.

This is considerable deployment hassle---and it's only the beginning of
the massive software upgrade that you'd need to actually make "context"
work. Remember that, e.g., PGP would need to accept "context" arguments
from any application processing PGP-signed messages, that those
applications would need to similarly allow "context" from higher-level
callers, etc. And then what happens if the applications, for whatever
reasons, think that they need to use traditional signature systems? Is
the "context" thrown away, allowing cross-protocol attacks? Or is it
encoded into the message, which is exactly what I'm saying should always
be done, and which doesn't need or want "context" in the signature spec?

Encoding the same information into traditional messages does just as
well in stopping cross-protocol attacks as a "context" side input can
possibly do. The API divergence is unnecessary and highly undesirable.

> We're actually doing exactly this in TLS 1.3, but doing it in an
> ad-hoc way

Do you have justification for using the loaded word "ad-hoc" here? It
should be obvious that part of the job of TLS is to adequately define
the semantics of its own signatures. It's not at all surprising that
these semantics are specific to TLS.

> > * What happens when someone wants to use more of these inputs?
> I'm not sure what you mean here.  You mean more context labels?  the
> goal here is one distinct context label per domain.

What's the "domain" for a PGP-signed message, interpreted by protocol X,
further interpreted by protocol Y, as meaning "Immediately sell all of
the stocks that I listed in my previous message"? Surely one needs to
include not just static protocol information but also something dynamic
such as the hash of the stock list or of the previous message.
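
A minimal sketch of including such dynamic information in the signed message itself follows. The label string, the field layout, and the function names are all hypothetical; the sketch only shows that the hash of the previous message can travel inside an ordinary message, with no side input to the signature system.

```python
import hashlib
import hmac
import struct


def encode_fields(*fields):
    """Prefix each field with its 4-byte big-endian length, then concatenate."""
    return b"".join(struct.pack(">I", len(f)) + f for f in fields)


def sell_order(previous_message):
    """Build the byte string to sign: a static label plus a dynamic binding."""
    previous_hash = hashlib.sha256(previous_message).digest()
    return encode_fields(b"example stock protocol: sell all listed", previous_hash)


def order_applies_to(order, candidate_previous_message):
    """Check that a (verified) order refers to exactly this previous message."""
    return hmac.compare_digest(order, sell_order(candidate_previous_message))


stock_list = b"ACME, INITECH"
order = sell_order(stock_list)
```

An attacker who replays the signed order after a different stock list has been sent gains nothing, because the order names the list it refers to.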

> > * More to the point, how is this supposed to be better than having
> >   the application sign a more informative message, using the
> >   traditional concept of a signature system?
> "more informative" assumes that you know exactly how any bytestring is
> going to be interpreted in every other context.

This is a tough question of semantics, as I've been trying to explain.
Complicating the syntax---splitting "message" into "message" plus a
separate "context"---is of no value in tackling this question of
semantics. Meanwhile the costs of the new syntax are very clear.

Benjamin Kaduk writes:
> I don't really care if the application protocol makes its message being
> signed more informative (i.e., specific to the particular protocol message
> at hand) by manually putting that data at the front of its signing input
> and ignoring the context input, or by using the context input.  I just
> want it to happen, since that improves the overall safety of the internet.

Everyone agrees on trying to eliminate all possible confusion in the
semantics of signed messages, but the right way to do this is to
directly tackle the question of semantics, not to frivolously screw
around with the standard syntax for a signature system. You're saying
that you don't care about the syntax/layering/API issues; I _do_ care
and am very much opposed to changing the API.

---Dan