Re: [TLS] Update on TLS 1.3 Middlebox Issues

Ilari Liusvaara <> Sat, 07 October 2017 09:57 UTC

On Fri, Oct 06, 2017 at 01:16:37PM -0700, Eric Rescorla wrote:
> Hi folks,
> In Prague I mentioned that we were seeing evidence of increased
> failures with TLS 1.3 which we believed were due to middleboxes. In
> the meantime, several of us have done experiments on this, and I
> wanted to provide an update.
> The high-order bit is that *negotiating* TLS 1.3 seems to cause
> increased failures with a variety of middleboxes (it’s generally safe
> to offer TLS 1.3 to servers which don’t support it). The measured
> incremental error rates vary quite a bit, ranging from minimal
> (Facebook) to ~1.5% (Firefox) and ~3.4% (Chrome). Each of us is using
> a slightly different methodology (organic versus forced traffic) and
> different populations (mobile, desktop, enterprise, etc), but it does
> seem like there is a nontrivial failure rate. At this point, we have
> two options:
> - Fall back to TLS 1.2 (as we have unfortunately done for previous releases)
> - Try to make small adaptations to TLS 1.3 to make it work better with
> middleboxes.

What do you think is an acceptable failure rate? That is, a rate such
that if we can't get below it, we shouldn't bother with adaptation?

> The Chrome team has been working on angle #2 and has been having
> success with an approach of trying to make TLS 1.3 connections look
> more like TLS 1.2. Their current experiments get them down to about 1%
> incremental failures and they are currently measuring some changes
> they hope will shave that down more. These changes are a bit annoying
> but basically superficial; they do not affect the cryptography.
> Separately, Firefox and Facebook have been experimenting with the new
> content type described in PR#1051 (Google’s and Facebook’s results
> conflict, so this is a bit of a mystery). We hope to have results from
> both sets of experiments by end of October, at which point we should
> be able to discuss the best way forward as a group.

Have there been any attempts at figuring out what exactly the
middleboxes are intolerant of?

Here are some candidates that come to mind:

1) Handshake seemingly cutting short (quite annoying to fix)
2) Server version seemingly ahead of client version (annoying to fix)
3) Second byte of ServerVersion is >3 (annoying to fix).
4) Unknown extension negotiated (very annoying to fix).
5) Two missing fields in TLS 1.3 ServerHello throwing off parser (not
   difficult to fix)
6) First byte of ServerVersion != 3 (not a real issue: the test is
   non-representative, since as per -21 the final TLS 1.3 will have
   ServerVersion 3,4).
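
To make the version and extension candidates concrete, here is a
minimal sketch of the checks a middlebox might be applying to a
ServerHello. The function name, the argument layout and the idea that a
middlebox keeps a set of "known" extension numbers are all assumptions
for illustration; it says nothing about what any real middlebox does.
Candidates 1 and 5 depend on the message flow and field layout, so
they are not modelled here.

```python
from typing import List, Set, Tuple

Version = Tuple[int, int]  # (first byte, second byte), e.g. (3, 3) for TLS 1.2


def suspect_triggers(client_version: Version,
                     server_version: Version,
                     negotiated_exts: Set[int],
                     known_exts: Set[int]) -> List[int]:
    """Return which numbered candidates a ServerHello would trip.

    Hypothetical helper: the checks mirror candidates 2, 3, 4 and 6
    from the list above, nothing more.
    """
    hits = []
    if server_version > client_version:   # 2) server ahead of client
        hits.append(2)
    if server_version[1] > 3:             # 3) second byte > 3
        hits.append(3)
    if negotiated_exts - known_exts:      # 4) unknown extension negotiated
        hits.append(4)
    if server_version[0] != 3:            # 6) first byte != 3
        hits.append(6)
    return hits


# Extension numbers 0 (server_name), 16 (ALPN) and 35 (session ticket)
# stand in for what the middlebox knows; 40 is a placeholder for one
# it does not know.
KNOWN = {0, 16, 35}
tls12 = suspect_triggers((3, 3), (3, 3), {16}, KNOWN)
final13 = suspect_triggers((3, 3), (3, 4), {40}, KNOWN)
# A draft-style test version with first byte 0x7f (assumed value).
draft = suspect_triggers((3, 3), (0x7F, 0x15), set(), KNOWN)
```

Comparing which candidates fire for each ServerHello variant against
the measured failure rates would narrow down which check is actually
responsible.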

I guess the reason Google's and Facebook's results conflict is that
they tend to hit different kinds of middleboxes: Google probably hits
more enterprise middleboxes (more strict) and Facebook more carrier
middleboxes (less strict).

This could explain the conflict, because if a carrier middlebox loses
track of the handshake without a parse error (which is what Facebook's
hack causes), it will probably let things through. Enterprise
middleboxes, on the other hand, tend to inspect the handshake more
deeply and will probably error out if the handshake isn't as expected.
Fooling the latter might not be easy.
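
As a toy model of that hypothesis (not a description of any real
device), the two kinds of middlebox could be sketched like this. The
function names, the "pass"/"drop" verdicts and the content type value
99 are made up for illustration; only 20 to 23 are the standard TLS
record content types.

```python
HANDSHAKE = 22                    # TLS record content type: handshake
EXPECTED_TYPES = {20, 21, 22, 23}  # ccs, alert, handshake, application data


def lenient_middlebox(content_type: int, parses_ok: bool) -> str:
    """Assumed carrier-style box: only looks at handshake records."""
    if content_type != HANDSHAKE:
        return "pass"  # not tracked, so no parse error can occur
    return "pass" if parses_ok else "drop"


def strict_middlebox(content_type: int, parses_ok: bool) -> str:
    """Assumed enterprise-style box: rejects anything unexpected."""
    if content_type not in EXPECTED_TYPES:
        return "drop"
    return "pass" if parses_ok else "drop"


NEW_TYPE = 99  # hypothetical stand-in for a new content type

# A ServerHello the box cannot parse, sent as a normal handshake record:
plain_lenient = lenient_middlebox(HANDSHAKE, parses_ok=False)
plain_strict = strict_middlebox(HANDSHAKE, parses_ok=False)
# The same message carried under an unknown content type:
hidden_lenient = lenient_middlebox(NEW_TYPE, parses_ok=False)
hidden_strict = strict_middlebox(NEW_TYPE, parses_ok=False)
```

In this model hiding the handshake helps only against the lenient box,
which is the kind of split that would make two populations report
conflicting results.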

Then there is the question of what on earth those carrier middleboxes
are trying to do. Why do they seemingly try to parse the ServerHello?
Looking at the extension list, the only candidates that seem even
remotely sensible to me are session resumption, certificate types and
ALPN (and the latter two are much more dubious than the first).

For the stricter enterprise middleboxes, parsing the ServerHello makes
some sense, even up to the point of failing on unknown extensions.

If one wants to do variant tests, I would try to test:

Variant 0) Stock TLS 1.2 (for baseline)
Variant 1) TLS 1.2 with an unknown (no-effect) extension negotiated.
Variant 2) TLS 1.2 ServerHello, with KeyShare extension and TLS 1.3
           ciphersuite, with rest of handshake from TLS 1.3.
Variant 3) Variant 2, but with ServerVersion of 3,x (x>3).
Variant 4) Variant 3, with the two dummy fields removed and server
           version of 3.4 (essentially final TLS 1.3 as of current