[TLS] The case for a single stream of data

Colm MacCárthaigh <colm@allcosts.net> Fri, 05 May 2017 16:28 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/MeyMl1-GuyRBFuWcPh5ul2fuwHk>

I wanted to start a separate thread on this, just to make some small
aspects of replay mitigation clear, because I'd like to make a case for TLS
providing a single stream, which is what people seem to be doing anyway.

Let's look at the DKG attack. There are two forms of the attack; the first
is as follows:

"Client sends a request with a 0-RTT section. The attacker lets the server
receive it, but suppresses the server responses, so the client downgrades
and retries as a 1-RTT request over a new connection. Repeating the
request".

In this case, server-side signaling such as the X-header trick doesn't work
at all. But thankfully this attack is equivalent to an ordinary socket
interference attack. E.g. if an attacker today suppressed a server response
to an HTTP request, then the client would run its retry logic. It's the
same, and I think everything is compatible with today's behavior.

Next is the more interesting form of the attack:

"Client sends a request with a 0-RTT section. For some reason the server
can't reach the strike register or single use cache, and falls back to
1-RTT. Server accepts the request over 1-RTT.  Then a short time later, the
attacker replays the original 0-RTT section."

In this case, server-side signaling to the application (such as the neat X-
header trick) also doesn't work, and is not backwards compatible or secure
by default. It doesn't work because the server application can't be made
idempotent from "outside" the application, so any signaling is
insufficient; this is equivalent to the Exactly-Once message delivery
problem in distributed systems. Since a request might be retried as in case
1, it needs an application-level idempotency key, or a delay-and-retry
strategy (but replay will break this). There's some detail on all this in
the review. The end result is that a server-side application that was never
designed to receive duplicates may suddenly be getting exactly one
duplicate (that's all the attack allows, if servers reject duplicate
0-RTT).
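The idempotency-key idea can be sketched roughly like this (all names and
the store are hypothetical, invented for illustration, not taken from any
real framework): the one duplicate the attack allows becomes a cache hit
instead of a second side effect.

```python
# Hypothetical sketch of an application-level idempotency key. The one
# duplicate the attack can produce becomes a cache hit instead of a
# second side effect. Names are illustrative.

class IdempotentHandler:
    def __init__(self):
        self._seen = {}  # idempotency key -> cached response

    def handle(self, key, request, apply_fn):
        # Replays and retries carry the same key, so the side effect
        # inside apply_fn runs at most once per key.
        if key in self._seen:
            return self._seen[key]
        response = apply_fn(request)
        self._seen[key] = response
        return response
```

With this, a retried-and-then-replayed request applies its side effect
exactly once; without the key, nothing outside the application can provide
that guarantee.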

What is actually needed here, I think, is client-side signaling. Careful
clients need to be made aware of the original 0-RTT failure.

So, for example, an SDK that writes to an eventually consistent data store
may treat any 0-RTT failure as a hard failure, and *not* proceed to sending
the request over 1-RTT. Instead it might wait its retry period, do a poll,
and only then retry the request. If the TLS implementation signals the
original 0-RTT failure to the client as if it were a connection error,
everything is backwards compatible again. Well, mostly: to be properly
defensive, the client's retry time or polling interval needs to be greater
than the potential replay window, because only then can it reason about
whether the original request succeeded or not. If there is a strict maximum
replay window, then this behavior is enforceable in a TLS implementation:
by delaying the original failure notification to the client application by
that amount.
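As a rough sketch of that careful-client behavior (the connection API and
status strings here are invented for illustration; this is not s2n's or
anyone's real API, and the 10-second cap is the assumed maximum replay
window):

```python
import time

MAX_REPLAY_WINDOW = 10.0  # assumed hard cap on replay, in seconds

def careful_request(connect, request, sleep=time.sleep):
    """Treat a rejected 0-RTT section as a connection failure, and only
    retry after the replay window has passed, so the retry can't race a
    replayed copy of the original section. Illustrative sketch only."""
    conn = connect()
    if conn.send_early_data(request) == "accepted":
        return conn.read_response()
    # 0-RTT was rejected: the original section may still be replayed
    # for up to MAX_REPLAY_WINDOW seconds, so wait it out first.
    sleep(MAX_REPLAY_WINDOW)
    conn = connect()
    return conn.send_1rtt(request)
```

The point is only the ordering: the delay happens before any retry, so by
the time the client acts again, the replay window for the original section
has closed.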

Of course browsers won't do this, and that's ok. Browsers have decided that
aggressive retries are best for their application space. But careful clients
/need/ this; and it's not just about backwards compatibility. It is a
fundamental first-principles requirement for anything that uses an
eventually consistent data store. We can say "don't use 0-RTT", but that's
not practical, for reasons also in the review.

So if we want to fully mitigate DKG attacks, I think it is useful to hard-
cap the replay window: say that it MUST be at most 10 seconds. Then, worst
case, a client that needs to be careful can wait 10 seconds. Note that the
TLS implementation can do this on the client's behalf, by inserting a
delay. For these kinds of applications, that means 0-RTT delivers speed
most of the time, but may occasionally slow things down by 10 seconds. I
think that's an ok trade-off to make for backwards compatibility.

But it also has implications for middle-boxes: a TLS reverse proxy needs to
either not use 0-RTT on the "backend" side, or it needs to use it in a very
careful way: accepting 0-RTT from the original client only if the backend
also accepts a 0-RTT section from the proxy. This is to avoid the case
where the client can't reason about the potential for a replay between the
proxy and the backend. It's doable, but gnarly, and it slows 0-RTT
acceptance down to the round trip between the client and the backend, via
the proxy.
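The careful proxy rule might be sketched like this (the connection object
and status strings are made up for illustration): the proxy's acceptance
decision is chained to the backend's, so the client's end-to-end reasoning
about replay still holds.

```python
def proxy_accept_early_data(client_section, backend_conn):
    """Sketch of the careful reverse-proxy rule: only confirm 0-RTT
    acceptance to the original client if the backend also accepted the
    forwarded 0-RTT section; otherwise force the client onto the 1-RTT
    path. Illustrative names only."""
    if backend_conn.send_early_data(client_section) == "accepted":
        # Safe: no hidden proxy-to-backend replay risk the client
        # can't account for.
        return "accept"
    # Fall back so the client treats this as an ordinary 0-RTT
    # rejection and applies its own careful-retry logic.
    return "reject"
```

This is what makes acceptance cost a full client-to-backend round trip:
the proxy can't answer the client until the backend has answered it.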

That's one reason why the review suggests something else too: just lock
careful applications out, but in a mechanistic way rather than a "good
intentions" way, by having TLS implementations *intentionally* duplicate
0-RTT sections.
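That intentional-duplication idea could look roughly like this inside a TLS
implementation (the delivery callback, the duplication rate, and all names
are assumptions for illustration):

```python
import random

def deliver_early_data(section, deliver, rng=random.random, dup_rate=0.01):
    """Sketch of the mechanistic lock-out: intentionally duplicate a
    small fraction of 0-RTT sections, so an application that can't
    tolerate a duplicate breaks during normal operation rather than
    only under attack. Illustrative only."""
    deliver(section)
    if rng() < dup_rate:
        deliver(section)  # the deliberate duplicate
```

The design point is that the duplicate arrives through the same code path
as the original, so there is no "good intentions" flag for an application
to ignore.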

O.k., so all of the above might be a bit hairy, but what I want to take
away from it at this stage is that splitting the early_data and
application_data at the application level isn't particularly helpful; the
server side can't really use this information anyway, because of the
Exactly-Once problem. Client-side signaling does help though, and a simple
safe-by-default mechanism there is to behave as if the connection has
failed, but after writing the first section of data. E.g. in s2n this
would be ...

conn = s2n_connect();  /* Client makes a connection */

r = s2n_write(conn, "GET / HTTP/1.1 ...");
/* Client writes some data; we stuff it into the 0-RTT section and send it.
   This write succeeds. From the client's perspective, it may or may not
   have been received; that's normal. */

/* At this point the 0-RTT data is rejected by the server, and so it might
   be replayable ... iff the server-side strike register or cache had a
   problem.

   A pedantically correct TLS library might then pause here for 10 seconds,
   or, if it's non-blocking, set a timer so that nothing can happen on conn
   for the next 10 seconds. Browsers could turn this behavior off, since
   they retry aggressively anyway. But it's a secure default that is
   backwards compatible. */

r = s2n_read()/s2n_write()/s2n_shutdown();
/* At this point, s2n returns failure. It's as if the connection failed.
   The client can implement its retry strategy, if any, safely; the request
   won't be replayed at this point. */

conn = s2n_connect();  /* Client makes a new connection for a retry. */


A slightly higher-level API is probably more realistic, because there's
potential to optimize for connection re-use. There's really no need to tear
down the whole connection and start over; it's safe to proceed to 1-RTT
once the delay has expired. A higher-level API would fix that, but this is
just the "safe by default" API I'm outlining. Again, all I want to take
away is that all of this is doable safely with a single stream.


-- 
Colm