Re: [TLS] ESNIKeys over complex

Ilari Liusvaara <> Sat, 08 December 2018 16:38 UTC

On Tue, Nov 20, 2018 at 09:45:51PM +0000, Stephen Farrell wrote:
> I'm fine that such changes don't get done for a while (so
> I or my student get time to try make stuff work:-) and
> it might in any case take a while to figure out how to
> handle the multi-CDN use-case discussed in Bangkok which
> would I guess also affect this structure some, but I wanted
> to send this to the list while it's fresh for me.

For an even nastier combination, combine zone apex (so no CNAMEs)
and multi-CDN. Due to atomicity constraints (the unit of atomicity is
the RRset), plain includes will not work, even if the site is only
using one CDN at a time.

A nasty hack would be to include valid address prefixes (with the usual
more-specific rules for matching) in include directives. The client
could then match the includes with the addresses they belong to.
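
As a rough sketch of that matching (the include list layout, the
pick_include name and the example names are assumptions for
illustration, nothing like this is in the draft), the client would pick
the include whose prefix covers the address it got, preferring the
longest prefix:

```python
import ipaddress

def pick_include(address, includes):
    """Return the include target whose prefix most specifically covers address.

    `includes` is a hypothetical list of (prefix, target_name) pairs.
    """
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, target in includes:
        net = ipaddress.ip_network(prefix)
        # Only compare within the same address family.
        if addr.version != net.version or addr not in net:
            continue
        # Usual more-specific rule: the longest matching prefix wins.
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, target)
    return best[1] if best else None

# Hypothetical include directives for a site using two CDNs.
includes = [
    ("192.0.2.0/24", "esni.cdn-a.example"),
    ("192.0.2.128/25", "esni.cdn-b.example"),
    ("2001:db8::/32", "esni.cdn-a.example"),
]
```

An address that matches no prefix gives None, in which case the client
has no ESNI keys it can safely use with that address.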

In the proposed include syntax, there is an issue that the base64
encoding prevents a recursive server from returning the referenced
ESNI records in one query. The include would have to be specified as
a DNS wire-format name field for DNS recursors to be able to perform
such a performance optimization.
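
For reference, a minimal sketch of what the uncompressed wire-format
name looks like (length-prefixed labels terminated by a zero octet).
The helper names are made up, and this ignores name compression and
non-ASCII labels:

```python
def name_to_wire(name):
    """Encode a domain name as uncompressed DNS wire format."""
    if name in ("", "."):
        return b"\x00"  # the root name
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("label must be 1..63 octets")
        out.append(len(raw))  # one length octet, then the label itself
        out += raw
    out.append(0)  # zero-length root label terminates the name
    if len(out) > 255:
        raise ValueError("name longer than 255 octets")
    return bytes(out)

def wire_to_name(wire):
    """Decode an uncompressed wire-format name back to presentation form."""
    labels = []
    pos = 0
    while wire[pos] != 0:
        n = wire[pos]
        labels.append(wire[pos + 1:pos + 1 + n].decode("ascii"))
        pos += 1 + n
    return ".".join(labels) + "."
```

A recursor that knows the field is a name can parse it this way and
chase the target. Inside a base64 blob it is just opaque bytes.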

For CNAME multi-CDNs, it should be enough to ensure that the ESNI
and address records share owner name and class, as this should
suffice to ensure that address and ESNI always come from the same CDN.

The problems with using CDNs with zone apex are well-known, and there
are attempts at solving those. Any such solution, if practical, would
also solve the "address CDN" issue for ESNI.

The "recovery" issue seems to be Capital-H Hard. In TLS 1.3, a
really nasty hack would be to restart the handshake after the server
Finished without resetting encryption (since encryption is not reset,
the client_hello and server_hello would be encrypted, with keys
known by the client and fronting server). However, I do not think that
would work in DTLS 1.3 (as it assumes that keys can not change during
an epoch, and there is space for only one handshake epoch). And even
that hack is likely too much. And of course fallbacks would be a
horrible idea here.

Then it seems to me that other ways to do recovery are even worse,
as they would dink even more with the internals of TLS 1.3, which is
not something to do lightly (the "extreme care" remark from RFC8446
definitely applies here). This seems to apply even if somebody invents
a snappy shortcut using some more exotic cryptographic primitive.

In summary, I do not think "recovery" will work. This is not the
first DNS record with which one can take down a site for a long time,
but perhaps the first where one can do that on a mass scale...

While thinking about the previous, I ran into some issues with the
split mode. Firstly, if the fronting server does not encrypt the
client_hello when transmitting it to the backend server, a passive
attacker can match incoming connections with backend servers. This
reduces the anonymity set to a single backend server (a lot smaller
set).

And secondly, even if the server encrypts the client_hello, but does
not use a tunnel to the backend, if the server does not have client
hello replay filtering (and such filtering is hard on typical fronting
servers), replay attacks and some very simple traffic analysis can
discover the backend server (again reducing the anonymity set by a lot).

This means that the fronting server should have an encrypted tunnel
with the backend server (and there is likely double encryption).

Then there is a future compatibility issue: If one has a PQ IND-CPA
KEM with sufficiently small size (over a dozen of those in NISTPQC),
one can extend base TLS 1.3 to post-quantum in a straightforward manner
(the client generates a public key, sticks it into key_share, the
server encapsulates a random session key to that public key, sticks the
ciphertext into its key_share and the client decrypts the session key
with the private key. PFS is possible if key generation is cheap enough
for the client to do that per-connection). However, such an extension
is not compatible with the present ESNI design.
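
To make the message flow concrete, here is a toy sketch. Loud caveat:
the "KEM" below is a classical Diffie-Hellman style stand-in in a toy
group, it is neither post-quantum nor secure, and the function names
are made up. Only the shape of the exchange matters:

```python
import hashlib
import secrets

# Toy group parameters, for illustration only (NOT a real PQ KEM).
P = 2**255 - 19
G = 2

def keygen():
    """Client side: generate a (private key, public key) pair."""
    sk = secrets.randbelow(P - 2) + 1
    return sk, pow(G, sk, P)

def encapsulate(pk):
    """Server side: produce (ciphertext, session key) for a public key."""
    r = secrets.randbelow(P - 2) + 1
    ct = pow(G, r, P)
    ss = hashlib.sha256(pow(pk, r, P).to_bytes(32, "big")).digest()
    return ct, ss

def decapsulate(sk, ct):
    """Client side: recover the session key from the ciphertext."""
    return hashlib.sha256(pow(ct, sk, P).to_bytes(32, "big")).digest()

# client_hello: key_share carries a fresh public key (fresh per
# connection, which is what gives PFS).
client_sk, client_key_share = keygen()

# server_hello: key_share carries the KEM ciphertext.
server_key_share, server_secret = encapsulate(client_key_share)

# Client decapsulates and both sides now hold the same session key.
client_secret = decapsulate(client_sk, server_key_share)
```

Note that the client never produces a ciphertext here: its key_share
holds only a public key.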

The problem is that key_share in client_hello carries a public key, and
not a ciphertext. And besides, one can not encrypt messages with an
IND-CPA KEM (one needs at least an IND-CCA KEM). But even if the key
exchange algorithm was IND-CCA, it still would not help because of the
first problem. There are some ways to both perform key agreement and
encrypt using the same PQ key, but all of them are way too slow and
have way too little analysis.

Of course, there does not seem to be a straightforward way to fix this:
Decoupling the keys would require some way to ensure one can not
copy-paste ESNI between client hellos (a nasty hack that could work
would be to hash client random and client key_share and include that
into key derivation for the ESNI).
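
A minimal sketch of that hack, assuming HKDF-SHA256 and a made-up label
and key length (none of this is from the draft, and esni_shared_secret
stands for whatever secret the decoupled ESNI key agreement yields):

```python
import hashlib
import hmac

def hkdf_extract(salt, ikm):
    """HKDF-Extract (RFC 5869) with SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand(prk, info, length):
    """HKDF-Expand (RFC 5869) with SHA-256."""
    out = b""
    block = b""
    counter = 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]

def esni_key(esni_shared_secret, client_random, client_key_share):
    """Derive an ESNI key bound to this particular client hello."""
    if len(client_random) != 32:
        raise ValueError("client random is 32 octets in TLS 1.3")
    # Hash client random and client key_share; the fixed-length random
    # makes the concatenation unambiguous.
    binding = hashlib.sha256(client_random + client_key_share).digest()
    prk = hkdf_extract(binding, esni_shared_secret)
    # Label and 16-octet output length are illustrative assumptions.
    return hkdf_expand(prk, b"esni key (illustrative label)", 16)
```

An ESNI value pasted into a client hello with a different random or
key_share then decrypts under a different key, so it fails.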

Except the above might run into trouble in PSK mode. And reading the
editor's draft: What prevents using pure-PSK "resumption" with
ESNI (that would be pretty stupid)? The binding between ESNI and
connection does not apply in a straightforward way in that case.