[tcpinc] Cutting tcpcrypt latency and state complexity
Bob Briscoe <bob.briscoe@bt.com> Mon, 27 October 2014 23:43 UTC
Message-ID: <201410272341.s9RNfsD8017263@bagheera.jungle.bt.co.uk>
Date: Mon, 27 Oct 2014 23:41:55 +0000
To: Andrea Bittau <bittau@cs.stanford.edu>, David Mazieres expires 2014-12-28 PST <mazieres-3uy2t3qwjnnucer4x2aruwdufi@temporary-address.scs.stanford.edu>, dm@uun.org, Mark Handley <m.handley@cs.ucl.ac.uk>, dabo@cs.stanford.edu, mike@shiftleft.org, sqs@cs.stanford.edu, draft-bittau-tcpinc@tools.ietf.org
From: Bob Briscoe <bob.briscoe@bt.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/tcpinc/MPpUjJXzqI4Jvw6q3nh32B4TN4A
Cc: tcpinc@ietf.org
Subject: [tcpinc] Cutting tcpcrypt latency and state complexity
tcpcrypt coauthors (Andrea, Dan, Mike, Mark, David, Quinn)

Thank you very much for draft-bittau-tcpinc-tcpcrypt-00. Very clearly and comprehensively specified.

Currently the opportunistic encryption defined in tcpinc-tcpcrypt-00 adds handshaking latency to each connection. This is likely to make opportunistic encryption unacceptable to everyone except the most paranoid individuals - who want privacy whatever the performance cost.

The proposals below reduce the tcpcrypt latency (before it can send encrypted user-data) from two round trips to one for a new session and from one round to zero for a resumed session. They also remove the HELLO-SENT and S-MODE states and a number of state transitions, which I think considerably simplifies the otherwise complex state machine of tcpcrypt.

Indeed, by separating out the grunt work of framing options within the payload, traversing middleboxes and reliable ordered option delivery, it is possible that tcpcrypt's dependency on the TCP state machine will be greatly reduced or possibly eliminated.

However, this separation makes tcpcrypt depend on other work (draft-briscoe-tcpm-inner-space-01). I am trying to get it adopted onto the agenda of the IETF tcpm WG as quickly as possible. But this breaks one of my own rules, "Avoid one research item depending on another". For now, let's just break my rules and see where we get to...

1) Root Causes of Tcpcrypt's Latency Problems

1.1) Options in Payload

The first underlying problem is that the keying material in the tcpcrypt INIT1 and INIT2 options makes them too big to fit in the TCP header. So during the tcpcrypt setup phase, tcpcrypt places these two options in the payload. But a client can't put control messages in the payload until it knows it's talking to a server that understands tcpcrypt. Otherwise, on completion of the 3WHS, a legacy server would send the tcpcrypt options to the app.
This forces a tcpcrypt client to consume a round to check that the server supports tcpcrypt before it can use INIT1 & INIT2. This means tcpcrypt still needs another round after the 3WHS before it can start sending data - it can't finish the tcpcrypt handshake within the 3WHS.

1.2) No way to signal a boundary between set-up and user-data

The second underlying problem is that tcpcrypt has no boundary within the TCP payload, which it needs so that it can transition from tcpcrypt setup options to encrypted data within the same segment. With such a boundary we will not need the rule that says "Implementations MUST NOT include application data in TCP segments during setup", which adds an extra round.

TLS has always been placed within the TCP payload, and it's always had its record structure and the 'Finished' message to signal this boundary - as the end of setup and start of ciphered user-data. So in theory TLS is already structured to make it easy to send the same messages but in fewer rounds.

This is why /in theory/ it was possible to cut latency in the three low latency variants of TLS that I know of, False Start, Snap Start and MinimaLT. We've written a summary of each and drawn timing diagrams of TLS with and without False Start in Section III of this Survey paper:
<http://riteproject.eu/?attachment_id=735>

However, as we explain in the survey, the theory didn't work out in practice. For instance the different message timings of False Start interacted non-deterministically with a number of SSL termination boxes used on the edge of data centres. This prevented widespread deployment of False Start. But in the paper we explain an exception that can allow it to be deployed in certain scenarios (e.g. for SPDY).

Tcpcrypt is fortunate in not having to deal with such legacy middleboxes. So it's worth restructuring it now, while its design is still fluid.

1.3) Unnecessary round to resume a tcpcrypt session

Tcpcrypt consumes a round trip to resume a session.
I can't see any particular reason why it needs this round, other than complacency because the traditional TCP 3WHS consumes a round anyway, so why try to start any faster? I believe it's easy to cut this round of latency out. Then, now that TCP Fast Open is available, tcpcrypt could resume sending ciphered user-data in zero rounds.

2) Solution

2.1) How to stop a legacy TCP passing TCP Options within the Payload of a SYN to the app

The approach in draft-briscoe-tcpm-inner-space (or draft-touch-tcpm-tcp-syn-ext-opt)
* makes space for extra TCP options beyond the Data Offset of a SYN and sets a boundary between these extra options and any payload data;
* ensures legacy TCP servers don't pass these extra TCP options to the app.

Similarly, the approach in draft-briscoe-tcpm-inner-space (or draft-ietf-tcpm-tcp-edo) makes space for extra TCP options in segments after the first SYN.

In the rest of this email, I'll use only Inner Space, for all the following reasons:
- to be concrete;
- because it should traverse most middleboxes, including connection splitters, resegmentation, and option strippers;
- because it gives you a nice reliable ordered property for TCP Options, a problem that tcpcrypt would otherwise have to solve itself;
- because I've been using tcpcrypt as one of the main use-cases for Inner Space, although it's also general enough for other TCP options;
- because the alternative combination of tcp-syn-ext-opt and tcp-edo creates extra space on the SYN and on data segments, but neither addresses the SYN/ACK in a way that's robust to middleboxes.

2.2) How to Reduce Tcpcrypt Latency and Complexity

The design below assumes a tcpcrypt+ spec that either REQUIRES Inner Space or uses a similar approach in a tcpcrypt-specific implementation. Then the SYN can include large options like INIT1; and encrypted user-data can be included in the same segment as options like INIT2, NEXTK1 or NEXTK2.
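To make the boundary idea concrete, here is a minimal Python sketch of carrying an options block and user-data in one payload. It is only an illustration: the 2-byte length prefix and the names pack_segment and unpack_segment are hypothetical, and this is not the Inner Space wire format, which is defined in the draft itself.

```python
import struct
from typing import Tuple

# Hypothetical framing: a 2-byte big-endian length gives the size of the
# options block; everything after that block is user-data.
# This is NOT the Inner Space encoding, just an illustration of a boundary.
LEN_FMT = "!H"
LEN_SIZE = struct.calcsize(LEN_FMT)


def pack_segment(options: bytes, data: bytes) -> bytes:
    """Place options and user-data in one payload, separated by a length."""
    if len(options) > 0xFFFF:
        raise ValueError("options block too large for a 16-bit length")
    return struct.pack(LEN_FMT, len(options)) + options + data


def unpack_segment(payload: bytes) -> Tuple[bytes, bytes]:
    """Split a payload back into (options, user-data)."""
    if len(payload) < LEN_SIZE:
        raise ValueError("payload too short to hold the length prefix")
    (opt_len,) = struct.unpack(LEN_FMT, payload[:LEN_SIZE])
    end = LEN_SIZE + opt_len
    if end > len(payload):
        raise ValueError("options block is truncated")
    return payload[LEN_SIZE:end], payload[end:]
```

With an explicit boundary like this, a receiver can process an option such as INIT2 and then hand only the remaining bytes to decryption, which is what lets setup and ciphered user-data share a segment.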
The proposed changes are in two main parts:
- New Session
- Resumed Session

2.2.1 New session

Recap from draft-bittau-tcpinc-tcpcrypt-00

    C -> S: HELLO
    S -> C: PKCONF, pub-cipher-list
    C -> S: INIT1, sym-cipher-list, N_C, pub-cipher, PK_C
    S -> C: INIT2, sym-cipher, KX_S

With Inner Space, there's no need for the HELLO or PKCONF - the whole round is redundant. HELLO can be implicit in INIT1. And INIT1 can be extended (I'll call it INIT1+) to offer a pub-cipher-list as well as a public key for the first (preferred) pub-cipher in that list, optimistically hoping that the server supports it. Optimistic choice of cipher suite is not a new idea - for instance False Start uses the same approach.

If the server accepts the client's choice of pub-cipher, it can complete the INIT2 within the first round.

    C -> S: INIT1+, pub-cipher-list, sym-cipher-list, N_C, PK_C
    S -> C: INIT2+, sym-cipher, KX_S; MAC<m>; data<...>

In this case, tcpcrypt can be ready to encrypt data after one round or even after half a round. Obviously, the server won't usually have any data to send until the client has sent an encrypted request (after this first round).

If the server does not accept the first pub-cipher in the client's pub-cipher-list, the server can take over the task of sending INIT1. Then the client can respond with INIT2, followed directly by encrypted data in the same packet.
    C -> S: INIT1+, pub-cipher-list, sym-cipher-list, N_C, PK_C
    S -> C: INIT1+, pub-cipher, sym-cipher, N2_C, PK2_C
    C -> S: INIT2+, KX_S; MAC<m>; data<...>

INIT1+ will have to be qualified by any one of the following:
- regular
- app-support
- app-mandatory
because these modifiers could previously be applied to HELLO, which is now implicit in INIT1+.

Similarly, because INIT2+ subsumes PKCONF, it will have to be qualified with either of:
- regular
- app-support

2.2.1.1 New DDoS Attack and a Defence

An army of clients could flood S with INIT1+'s containing an unacceptable pub-cipher at the head of the list, trying to make S take the heavier role of verifier rather than encrypter for the asymmetric key setup crypto.

If under stress, S can strictly prioritise those requests with an acceptable pub-cipher at the head of the list, and queue up requests with unacceptable pub-cipher-lists. Then, assuming a client does not want to risk being mistaken for attack traffic, it is worth its while to pick a pub-cipher that the server will accept.

One could reintroduce PKCONF to the protocol, to allow S to return a PKCONF. But why reintroduce complexity to the protocol, when the above approach is sufficient?

2.2.2 Session Resume

Recap from draft-bittau-tcpinc-tcpcrypt-00

    A -> B: NEXTK1, SID[i]
    B -> A: NEXTK2

With Inner Space, there's now no need to wait for one round before A or B can encrypt user-data. They can send encrypted data straight away:

    A -> B: NEXTK1, SID[i]; MAC<m>; data<...>
    B -> A: NEXTK2; MAC<m>; data<...>

If B declines state re-use (equivalent to example 5 in draft-bittau-tcpinc-tcpcrypt-00), it can discard the encrypted data and return an INIT1+.
Then A can re-transmit the data with the newly negotiated keys, as follows:

    A -> B: NEXTK1, SID[i]; MAC<m>; data<...>
    B -> A: INIT1+, pub-cipher, sym-cipher, N2_C, PK2_C
    A -> B: INIT2+, KX_S; MAC<m>; data<...>

3) Next Steps

Either
* tcpcrypt could mandate use of Inner Space as it stands
* or tcpcrypt could repurpose Inner Space, but as a tcpcrypt-specific design.

IMO, the first alternative makes sense while the second would be silly - requiring implementation of the same capability in two different ways in the same stack, with two lots of debugging to do, etc.

However, I should add that the ideas behind Inner Space have only been around since the end of Jul'14 and tcp-edo is still young (born Apr'14) even though it's the oldest. So I doubt there will be an implementation until the designs settle into some degree of stability.

Focusing on Inner Space in particular, I'm trying to be ambitious. I believe it will be able to encrypt not just the payload of a SYN, but also the tcpcrypt options in the same payload that control the encryption of itself. That needs:
a) zero latency key agreement (as above for a session resume),
b) TCP Option processing rules to trigger a second pass to look for more TCP Options once the payload is decrypted.

Even if I have to give up on a) for an initial connection, the inner-space draft already defines b) and it's also still usable when a connection resumes with TCP Fast Open.

Cheers

Bob

________________________________________________________________
Bob Briscoe, BT