[TLS] A closer look at ROBOT, BB Attacks, timing attacks in general, and what we can do in TLS
Colm MacCárthaigh <colm@allcosts.net> Thu, 14 December 2017 22:20 UTC
Return-Path: <colm@allcosts.net>
X-Original-To: tls@ietfa.amsl.com
Delivered-To: tls@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A1335124D68 for <tls@ietfa.amsl.com>; Thu, 14 Dec 2017 14:20:59 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.898
X-Spam-Level:
X-Spam-Status: No, score=-1.898 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=allcosts-net.20150623.gappssmtp.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id oJFt74DoT9AR for <tls@ietfa.amsl.com>; Thu, 14 Dec 2017 14:20:56 -0800 (PST)
Received: from mail-yb0-x230.google.com (mail-yb0-x230.google.com [IPv6:2607:f8b0:4002:c09::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 06E45126D73 for <tls@ietf.org>; Thu, 14 Dec 2017 14:20:55 -0800 (PST)
Received: by mail-yb0-x230.google.com with SMTP id s1so4705775ybm.7 for <tls@ietf.org>; Thu, 14 Dec 2017 14:20:55 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=allcosts-net.20150623.gappssmtp.com; s=20150623; h=mime-version:from:date:message-id:subject:to; bh=JhLmB1lY07jiwhjIfcBCU7lUSwrbRdAhq1XOEq+LVLo=; b=PR/WQkqZsU7h/eB+xG/ivodxCAfezAznGKnwLJKB3PYi+qOqz/1IdQdf+E6JwdcN1R Mvz/1LYv+OQOH4lxyarxfOrQOi30D3RdbbISrECHRgy3DvGEMi+QHFKl/eP5H4Ay6454 dPNM27KGHU0N+28lym+z6Gkl+KTFKXqDgsvUA3XqcA6n9yXGfikBdbaYna50r9SFylfN k4FwvzOIO8x7uSZX+OY8PM2XvvX2qlsNae4XWK+YPi1+BIzC91CP8DrkWaWJJt9pMQce sJ9RWPltLP936p2Ri0Ni1TB3XLDkMAtz9p15SRxiNzuyOj9IquBWhOfpqPkupsoXFCmy vn/g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=JhLmB1lY07jiwhjIfcBCU7lUSwrbRdAhq1XOEq+LVLo=; b=UevFnnQohEx8rybnDIMWW8sXz4m3bbJICrw0Ug4tsHJSD8/7mYbTBLhrV4UE1hpC3Z DN+6A7H5VkhaRQKBgdSpb8/tOCMPyhjLmdMPrfctNYlmuFGjHoGj/Nzv00RVvSXOsd7h K5Ow4wPA+kB02G4cpSTUDvEle4PWE7DXMtLiJaE3NuGZDAMFqO8oTcmvedMKbOeAFucO vfshPFllyd33bCTl5TJlcCWn95eXK3PPI4Xp42yUMwJnmZNbDqNqxfVCUWayiv4M9ZNH 1ODs1qHDafEL5zel8OW09pTIbP0g4InT2eDVFFSI+dKKz+AgArZaqS/Sh8sMcJJLvQPa /hJQ==
X-Gm-Message-State: AKGB3mKHtPLXz2FYGI1OasMjBok6m0Oo27eCJj027fcxNVoLebn82i70 Y/sdM6OstahVEyQQPPjRDYvhTdgIo/RVFjhonVWf1cep
X-Google-Smtp-Source: ACJfBoupgBaXG09F0WVG9FAdpXuT1IJHFIHbve/z/bqmu/CMwLnT6I1r3wI7g7uFDOC8VNrVczbnu5J49SrgS350WbE=
X-Received: by 10.129.91.193 with SMTP id p184mr5342599ywb.272.1513290054647; Thu, 14 Dec 2017 14:20:54 -0800 (PST)
MIME-Version: 1.0
Received: by 10.129.50.70 with HTTP; Thu, 14 Dec 2017 14:20:54 -0800 (PST)
From: Colm MacCárthaigh <colm@allcosts.net>
Date: Thu, 14 Dec 2017 14:20:54 -0800
Message-ID: <CAAF6GDeeo2xjv1Xu7SFXVZ_zM=XUVJHT=eqH4_-G3+4UHsfvgg@mail.gmail.com>
To: "tls@ietf.org" <tls@ietf.org>
Content-Type: multipart/alternative; boundary="001a114c8ee2265dcc0560544c50"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/t6SKfh49fb4kRET2krZ6UoaEefs>
Subject: [TLS] A closer look at ROBOT, BB Attacks, timing attacks in general, and what we can do in TLS
X-BeenThere: tls@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "This is the mailing list for the Transport Layer Security working group of the IETF." <tls.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tls>, <mailto:tls-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/tls/>
List-Post: <mailto:tls@ietf.org>
List-Help: <mailto:tls-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tls>, <mailto:tls-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 14 Dec 2017 22:21:00 -0000
TLS folks, A few weeks ago the s2n team got a mail from US CERT asking us to take a look for any Bleichenbacher attack issues and get to back to them. We didn't have any issues (thankfully!), but it was a good opportunity for us to review how we defend against BB and other related attacks. The notice was of course based on the excellent work of Hanno Böck, Juraj Soorovsky, and Craig Young. Based on the notice, we also went looking in some other code-bases and implementations we have access to, and did find issues, both classic BB98 style attacks, and small timing side channels. We've worked with vendors and code owners in each case and fixes are in place, new releases made, and so on. Based on our experiences with all of this over the last few weeks, I'd like to summarize and throw out a few suggestions for making TLS stacks more defensive and robust against problems of this class. One or two may be even worth considering as small additions to the forthcoming TLS RFC, and I'd love to get feedback on that. *First: the basic specific defense against BB98* In s2n we went with the boring basic per-the-TLS defense against BB98. We pre-generate a random PMS: https://github.com/awslabs/s2n/blob/master/tls/s2n_client_key_exchange.c#L64 then we over-write this PMS only if RSA succeeds: https://github.com/awslabs/s2n/blob/55a641dc29d17780620c16713854f1c9fd31f7ce/crypto/s2n_rsa.c#L165 since it's important not to leak timing, we use a special "constant_time_copy_or_dont" routine to do the over-write: https://github.com/awslabs/s2n/blob/master/utils/s2n_safety.c#L70 we then allow the handshake to proceed, and fail it later: https://github.com/awslabs/s2n/blob/088240d081953131deefb27a70fa6f728156f0cf/tls/s2n_client_finished.c#L35 So far everything is standard, though we did notice in reviewing some other implementations that things can be unnecessarily complicated around handling the first step. I'm including the code links here just as an example that literally following the RFC guidance is fairly doable, but it helps a lot to have a constant-time copy-or-not kind of routine to make it much easier. *Second: hide all alerts in suspicious error cases* Next, when the handshake does fail, we do two non-standard things. The first is that we don't return an alert message, we just close the connection. This is intentional, and even though it violates the written standards, we believe it's better to optimize for frustrating attackers and not leaking information Vs making occasional debugging by TLS experts easier. I'd be interested in the thoughts of others here. We did find other implementations that just close() a connection, and we've never noticed inter-operability problems. Should we tolerate this in the RFC? *Third: mask timing side-channels with a massive delay* The second non-standard thing we do is that in all error cases, s2n behaves as if something suspicious is going on and in case timing is involved, we add a random delay. It's well known that random delays are only partially effective against timing attacks, but we add a very very big one. We wait a random amount of time between a minimum of 10 seconds, and a maximum of 30 seconds. With the delay being granular to nanoseconds. This puts a delay of 20 seconds on average on every potential attacker measurement (though obviously they can parallelize) and adds something around 2^60 bits of entropy to a measurement. The effectiveness of this technique depends on the size and distribution of the timing channel being leaked, as well the distribution of measurement latency generally, but suppose the timing leak is 10 microseconds, then the delay likely increases attacker difficulty by a factor of probably hundreds of billions. I'd be interested in thoughts on this too, as a general recommendation as "something TLS stacks can/should do is that when there's an error, mask it with a big random delay". The main "downside" of this delay is that it can tie up resources - but we're not as concerned with that. Firstly, an attacker can always cause a connection to stall for 30 seconds, and even ordinary network conditions like packet loss can trigger that. That's why we chose 30 seconds as our upper bound. Second, connections are cheap, especially in asynchronous/epoll driven models, or any model that isn't one-connection per process. We prefer to have the defense in depth. It's worth noting that this protection also helps with other timing attacks, such as Lucky13 (read https://aws.amazon.com/blogs/security/s2n-and-lucky-13/ for my write up on how the earlier version of this, with microsecond granularity, wasn't enough to defeat Lucky13, but still made it millions of times harder for an attacker) or even AES timing issues. The remainder of this mail doesn't concern changes that could made in the TLS draft, but rather just other information that may be useful for implementors. Sorry if I'm abusing the purpose of the list a little, but hopefully the last item in particular will make it worth the read. *Fourth: regression tests* We noticed that many of implementations we encountered had no regression tests. In s2n we do have plenty of tests, but actually did not have a specific regression test against BB. We didn't want to add one until the Robot attack was public, but are now adding one. *Fifth: formal constant-time verification* Regression tests on their own do not defend against timing side channel regressions, and we noticed in the commit history of some of the code we reviewed that ones had snuck in over time. For example some non-s2n code we encountered added some debug logging in the case where the padding is mismatched, which created a very small timing difference with the cost it takes to print the logging message. Here I'd like to share two pieces of work we've been doing over the last year. The first is formal verification of our code's constant time properties with ctverif and SMACK. Our integration is at: https://github.com/awslabs/s2n/tree/master/tests/ctverif We've been able to instrument our most timing sensitive routines and verify that they really do use the same number of instructions, regardless of inputs. The annotations are very readable and easy to deal with and don't mess up the code base, in fact the code I linked to earlier: https://github.com/awslabs/s2n/blob/master/utils/s2n_safety.c#L70 is annotated, that's what the S2N_PUBLIC_INPUT macro is for. These constant-time tests run on every commit and build of s2n, and defend against regressions in the code and even compiler. For those of you who may be at RWC2018/HACS, I'd love to show you more if you're interested in integrating or doing anything similar! *Sixth: fuzzy constant-time verification for code-balancing* In building the s2n formal constant-time verification, we noticed that there are more complicated aspects of TLS processing; such as CBC verification and BB98 mitigation that need to be in constant-time, but where the code also spans many functions and even messages. The solutions here typically are not small routines that are rigorously constant time, but instead rely on a degree of code-balancing: taking code paths that are "similar enough" in timing so as not to be measurable. At first, the reaction against code-balancing can often be finger-wagging, but our our survey of TLS libraries is that most are using code-balancing in some capacity. In fact, we can take the OpenSSL LuckyMinus20 vulnerability and regression as evidence that trying to be rigorously constant time on large complicated algorithms that were not designed to be constant time can seriously backfire by introducing logic errors in the very hard-to-follow constant-time adaptations. Because code-balancing is useful in these cases, and certainly used in many real world implementations, we wanted to make its effect more measurable, and more than simply good intentions. Our automated reasoning group (and in particular Daniel Schwartz-Narbonne and Konstantinos Athanasiou) has built and integrated a new tool called sidewinder. The PR integrating it into s2n is here: https://github.com/awslabs/s2n/pull/664 The great thing about Sidewinder is that it can cope with an analysis over larger and more inter-related code units, and it can produce bounds for how much may the execution paths differ by, for both instructions and memory accesses.I believe it would have prevented some of the regressions we found in our BB attack focused review of third party code bases over the last few weeks. It's another tool I'd be excited and happy to share with folks at RWC/HACS. -- Colm
- [TLS] A closer look at ROBOT, BB Attacks, timing … Colm MacCárthaigh
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Watson Ladd
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Colm MacCárthaigh
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Hanno Böck
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Colm MacCárthaigh
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Yoav Nir
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Nikos Mavrogiannopoulos
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Andrei Popov
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Kathleen Moriarty
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Hanno Böck
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Andrei Popov
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Eric Rescorla
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Watson Ladd
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Eric Rescorla
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Eric Rescorla
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Andrei Popov
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Tim Hollebeek
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Andrei Popov
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Tim Hollebeek
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Martin Rex
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Martin Rex
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Peter Gutmann
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Ilari Liusvaara
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Hubert Kario
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Hubert Kario
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Colm MacCárthaigh
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Hubert Kario
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Colm MacCárthaigh
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Hubert Kario
- Re: [TLS] A closer look at ROBOT, BB Attacks, tim… Colm MacCárthaigh