Benjamin Kaduk's Discuss on draft-ietf-quic-tls-33: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <> Tue, 05 January 2021 04:53 UTC


Benjamin Kaduk has entered the following ballot position for
draft-ietf-quic-tls-33: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to
for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:


(1) Rather a "discuss-discuss", but we seem to be requiring some changes
to TLS 1.3 that are arguably out of charter.  In particular, in Section
8.3 we see that clients are forbidden from sending EndOfEarlyData and it
(accordingly) does not appear in the handshake transcript.  The
reasoning for this is fairly sound; we explicitly index our application
data streams and any truncation will be filled in as a normal part of
the recovery process, so the attack that EndOfEarlyData exists to
prevent intrinsically cannot happen.  However, the only reason we'd be
required to send it in the first place is if the server sends the
"early_data" extension in EncryptedExtensions ... and we already have a
bit of unpleasantness relating to the "early_data" extension, in that we
have to use a sentinel value for max_early_data_size in NewSessionTicket
to indicate that the ticket is good for 0-RTT, with the actual maximum
amount of data allowed indicated elsewhere.  TLS extensions are cheap,
so a new "quic_early_data" flag extension valid in CH, EE, and NST would
keep us from conflating TLS and QUIC 0-RTT semantics, thus solving both
problems at the same time.  On the other hand, that would be requiring
implementations to churn just for process cleanliness, so we might also
consider other alternatives, such as finessing the language and/or
document metadata for how this specification uses TLS 1.3.
(There are a couple other places in the COMMENT where we might suffer
from scope creep regarding TLS behavior as well, but I did not mark them
as DISCUSS since they are not changing existing specified behavior.)

(2) Let's check whether the quic_transport_parameters TLS extension
should be marked as Recommended or not.  The document currently says
"Yes", and the live registry says 'N'.  That said, the earliest mention I
can see of using 'N' in the archives is in
which seems merely to state what IANA did when they changed the
codepoint (since there were issues with the initially selected value
'46'), rather than to record a reasoned decision.

The perhaps haphazard nature of that change notwithstanding, in my
opinion the 'N' actually is correct, since the extension is not
appropriate for general use *of TLS* (indeed, we require that TLS
implementations that support this document abort the connection if it is
used for non-QUIC connections).


I've noted some potential editorial improvements in a local copy of the
markdown and have made a pull request with them, at

The length of the remaining comments notwithstanding, this document is
generally quite well done and a sizeable chunk of my comments are just
relating to subtleties of TLS and our interaction with it; I plan to
change my position to Yes once the Discuss points have been
appropriately discussed.  My apologies in advance to the chairs who will
have to open GitHub issues for all of these!

We may want to make a pass through the document to normalize the way in
which we discuss 0-RTT.  For example, Section 4.5 mentions that
resumption "can be used without also enabling 0-RTT" as if 0-RTT should
be off by default, but much of the previous discussion involving 0-RTT
treats it as an expected and normal part of the protocol.  (For what
little it's worth, my own personal preference leans towards how RFC 8446
treats 0-RTT data, as an optional and potentially dangerous thing that
should be off by default and only enabled after careful consideration,
though in some ways it actually seems significantly safer in QUIC than
TLS.  Regardless, I do not see any value in re-litigating the question
at this point; I'm just hoping that the document can be consistent about
it.)

Section 2.1

   The 0-RTT handshake is only possible if the client and server have
   previously communicated.  In the 1-RTT handshake, the client is

Pedantically, RFC 8446 does allow the exchange of 0-RTT data for
externally provisioned PSKs (non-resumption), provided that the
necessary parameters are provisioned along with the PSK.

Section 3

Figure 3 shows "TLS Alerts" as being carried over QUIC Transport, but
per §4.8 TLS alerts are translated into QUIC connection errors and are
not sent natively.

   *  The TLS component provides a series of updates to the QUIC
      component, including (a) new packet protection keys to install (b)
      state changes such as handshake completion, the server
      certificate, etc.

I think that if we're going to talk about passing the server certificate
between TLS and QUIC components, we should be very clear about where/how
certificate validation occurs.  For example, it would be pretty
disastrous if TLS passed the certificate to QUIC expecting that QUIC
would do any validation of the peer identity, but QUIC assumed that TLS
would only provide a validated certificate.  Perhaps in §4.1 when we
mention the potential for "additional functions [...] to configure TLS",
we might mention "including certificate validation", if appropriate?

Section 4

   QUIC carries TLS handshake data in CRYPTO frames, each of which
   consists of a contiguous block of handshake data identified by an
   offset and length.  Those frames are packaged into QUIC packets and
   encrypted under the current TLS encryption level.  [...]

I'm not sure I understand the benefit of specifically calling this a
"TLS encryption level".  While it's true that the keys are being
provided by TLS, the cryptographic mechanisms being used are applied by
QUIC.  Furthermore, there's a 1:1 correspondence between the QUIC and
TLS encryption levels, so perhaps it is acceptable to not have a
qualifier at all (for just "under the current encryption level").

   One important difference between TLS records (used with TCP) and QUIC
   CRYPTO frames is that in QUIC multiple frames may appear in the same
   QUIC packet as long as they are associated with the same packet
   number space.  [...]

I'm a bit confused as to what analogy is being made here.  It seems to
be equating TLS records and QUIC frames, but of course multiple DTLS
records can appear in a single UDP datagram, so the difference between
QUIC and DTLS does not seem quite so large as depicted here.  Of course,
if the analogy is supposed to be between TLS handshake messages and QUIC
frames, or TLS records and QUIC packets, then things would be different,
but I'm not sure what this is intended to say.

Section 4.1.3

   Once the handshake is complete, TLS becomes passive.  TLS can still
   receive data from its peer and respond in kind, but it will not need
   to send more data unless specifically requested - either by an
   application or QUIC.  [...]

(pedantic note) In some sense this is a forward-looking statement *about
TLS*, which is not something the QUIC WG is chartered to work on.  That
said, because of how TLS feature negotiation is done, any new kind of
data spontaneously emitted by TLS would need to be negotiated with an
extension in the handshake, but I don't think we are attempting to
specifically lock down the set of extensions in the TLS handshake for
QUIC connections, so in theory a change to the TLS implementation
defaults could result in QUIC connections that have TLS spontaneously
emit data.  On the gripping hand, I don't really see much need for a
text change here.

Section 4.1.5

Why does the server call Get Handshake twice between receiving the
client Initial and the client's second Handshake flight?  IIUC the
interface says that additional handshake data will only be supplied from
TLS in response to input handshake data (and thus, not in response to
TLS generating new keys, which is what the figure suggests to be a
proximal trigger for the second Get Handshake).  We should probably be
careful that the interfaces we document are capable of describing the
case where TLS provides both new keys and handshake bytes, and the
handshake bytes are split across encryption levels (as is the case for
the server's first flight), which may be the motivation for the two Get
Handshake calls depicted here.

Similarly, it's my understanding that the client should still call Get
Handshake after receiving the Initial (but would receive only keys and
not output handshake data at that point).  I'm not sure whether the
figure should indicate that (ineffectual) Get Handshake call.

Should the figure indicate the Handshake Confirmed operation in addition
to Handshake Complete?

Section 4.3

   The TLS implementation does not need to ensure that the ClientHello
   is sufficiently large.  QUIC PADDING frames are added to increase the
   size of the packet as necessary.

Should we reference Section 8.1 of [QUIC-TRANSPORT] for "sufficiently
large"?

Section 4.4

   A client MUST authenticate the identity of the server.  This
   typically involves verification that the identity of the server is
   included in a certificate and that the certificate is issued by a
   trusted entity (see for example [RFC2818]).

I assume that, per normal TLS semantics, this requirement can be met by
PSK authentication as well as certificate authentication.  PSK
authentication has often been an aspect of TLS that has not received
substantial attention, so we may want to preemptively pay some attention
to the PSK case.

   A server MUST NOT use post-handshake client authentication (as
   defined in Section 4.6.2 of [TLS13]), because the multiplexing

Do we want to say anything about not sending the "post_handshake_auth"
extension (defined in the numerologically similar section 4.2.6 of RFC
8446)?  Strictly speaking we don't need to say anything, but efficient
implementations would satisfy the existing QUIC requirement by not
sending the extension and relying on the TLS stack to reject
post-handshake authentication since the extension had not been offered.

Section 4.8

   The alert level of all TLS alerts is "fatal"; a TLS stack MUST NOT
   generate alerts at the "warning" level.

This seems to be making a normative restriction on the operation of TLS,
which is not in the QUIC WG charter.  Perhaps we should instead
constrain the QUIC implementation to not accept such alerts (and what to
do if they are received) and note that the only closure alerts in RFC
8446 are "close_notify" and "user_cancelled", which are replaced by
equivalent QUIC-level functionality.

Section 4.9

   An endpoint cannot discard keys for a given encryption level unless
   it has both received and acknowledged all CRYPTO frames for that
   encryption level and when all CRYPTO frames for that encryption level
   have been acknowledged by its peer.  However, this does not guarantee

(nit/editorial) I believe that the "both" is meant to apply to "received
and acknowledged" and "when all CRYPTO frames [...] have been
acknowledged", but the current formulation does not have parallel
structure between those clauses, so it ends up reading more naturally as
saying that the "both" refers to "received" and "acknowledged".  I don't
have a simple suggestion for fixing it, though (it might require a
significant restructuring to fix), so I'm leaving it here in my comments
rather than incorporating a fix into my editorial PR.  (Any drastic
rewording might consider the rest of the paragraph as well, as it seems
to leave the reader without a clear sense for when it definitely is safe
to discard old keys.)

   Though an endpoint might retain older keys, new data MUST be sent at
   the highest currently-available encryption level.  Only ACK frames
   and retransmissions of data in CRYPTO frames are sent at a previous
   encryption level.  These packets MAY also include PADDING frames.

Is there anything useful to say about whether it is advisable to make
retransmission packets have the same size as the original transmission?
("No" is a fine answer.)

Section 4.9.3

   0-RTT keys to allow decrypting reordered packets without requiring
   their contents to be retransmitted with 1-RTT keys.  After receiving
   a 1-RTT packet, servers MUST discard 0-RTT keys within a short time;
   the RECOMMENDED time period is three times the Probe Timeout (PTO,
   see [QUIC-RECOVERY]).  A server MAY discard 0-RTT keys earlier if it

Just to check my understanding, the two endpoints will not necessarily
agree on the value of PTO at any given point in time, so this is a
heuristic and not some form of synchronized behavior where the server
replicates the client's loss-detection algorithm precisely?

Section 5

   *  Retry packets use AEAD_AES_128_GCM to provide protection against
      accidental modification or insertion by off-path adversaries; see
      Section 5.8.

[As noted by others, please check the terminology here against
[QUIC-TRANSPORT]; I think the latter uses "attacker" rather than
"adversary" and I haven't internalized the new on-path, limited on-path,
off-path terminology yet, either.]

   *  All other packets have strong cryptographic protections for
      confidentiality and integrity, using keys and algorithms
      negotiated by TLS.

(side note) the handshake keys are "strong" in a certain sense
(unpredictability) but weak in the sense that they are not authenticated
until the handshake has completed.  It may not be necessary to attempt
to express this subtlety at this point in the document, though.

Section 5.1

I think we should say something about the "Context" argument to
HKDF-Expand-Label here.  (I assume it is going to be the same
Transcript-Hash as used for TLS 1.3, but it is not currently specified,
as far as I can see.  I do note that we have some ongoing discussion in
the context of draft-ietf-emu-eap-tls13 that *exporters*, at least,
might better have been specified to use the full transcript including
client Finished, but I am not convinced that we should consider such
drastic changes for our usage, especially since it would neuter 0.5-RTT
unless we add a lot more complexity.)  We may also want to say that
the Length input is determined by the ciphersuite's requirements.
Alternately we could specify in terms of Derive-Secret(), but there's
not really a clear incentive to do so.

Section 5.2

   This secret is determined by using HKDF-Extract (see Section 2.2 of
   [HKDF]) with a salt of 0x38762cf7f55934b34d179ae6a4c80cadccbb7f0a and
   a IKM of the Destination Connection ID field.  This produces an
   intermediate pseudorandom key (PRK) that is used to derive two
   separate secrets for sending and receiving.

[I was going to say something about all-zeros providing the same level
of cryptographic protection, but then I saw the discussion at and the note further
down that new versions of QUIC should pick different salt values.]
I assume this holds for the final RFC version as compared to the I-Ds,
and suggests that a new
(random) salt should be chosen for the final RFC version, but I didn't
see an open issue in github for it, so I'll mention it here just in case
that helps.  (The retry key+nonce in §5.8 seem to be in a similar boat,
and the examples in the appendix would need to be adjusted as well, of
course.)

   initial_secret = HKDF-Extract(initial_salt,
                                 client_dst_connection_id)

   client_initial_secret = HKDF-Expand-Label(initial_secret,
                                             "client in", "",
                                             Hash.length)
   server_initial_secret = HKDF-Expand-Label(initial_secret,
                                             "server in", "",
                                             Hash.length)

(editorial) I wonder a bit if the rhetoric would be more clear if we did
not call the "initial_secret" a "secret", since it is just an
intermediate value used for constructing secrets and not used directly
to construct keys, itself.  (The idea would be to hew to the TLS 1.3
terminology where specific named Secrets are used to drive
Derive-Secret() calls but other cryptographic inputs are not named with
the word "secret".)
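
As a rough illustration of the derivation quoted above, here is a
minimal Python sketch of HKDF-Extract followed by HKDF-Expand-Label.
It assumes SHA-256 as the hash, a zero-length Context (the "" in the
quoted pseudocode), and an output Length equal to the hash length.  The
helper names (hkdf_extract, hkdf_expand, hkdf_label, hkdf_expand_label,
initial_secrets) are invented for this example and do not come from the
draft.

```python
import hashlib
import hmac

# Version-specific salt quoted from Section 5.2 of the draft.
INITIAL_SALT = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869, Section 2.2), assuming SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869, Section 2.3), assuming SHA-256."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def hkdf_label(label: str, context: bytes, length: int) -> bytes:
    """Serialize the TLS 1.3 HkdfLabel structure (RFC 8446, Section 7.1)."""
    full_label = b"tls13 " + label.encode("ascii")
    return (length.to_bytes(2, "big")
            + bytes([len(full_label)]) + full_label
            + bytes([len(context)]) + context)


def hkdf_expand_label(secret: bytes, label: str, context: bytes,
                      length: int) -> bytes:
    """HKDF-Expand-Label as defined for TLS 1.3."""
    return hkdf_expand(secret, hkdf_label(label, context, length), length)


def initial_secrets(client_dst_connection_id: bytes):
    """Return (client_initial_secret, server_initial_secret)."""
    initial_secret = hkdf_extract(INITIAL_SALT, client_dst_connection_id)
    hash_length = hashlib.sha256().digest_size
    client = hkdf_expand_label(initial_secret, "client in", b"", hash_length)
    server = hkdf_expand_label(initial_secret, "server in", b"", hash_length)
    return client, server
```

Note that initial_secret only ever feeds HKDF-Expand-Label here and is
never used to derive keys directly, which is the distinction the
editorial remark above is drawing.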

Section 5.3

   QUIC can use any of the cipher suites defined in [TLS13] with the
   exception of TLS_AES_128_CCM_8_SHA256.  [...]

It's a little interesting to use the RFC as the authority on
ciphersuites rather than the registry, but I guess the header protection
scheme requirement makes it more reasonable.  Also, for what it's worth,
there are MAC-only TLS 1.3 ciphersuites in the IANA registry.  I don't
know whether you want to say anything (to forbid?) such things from
being used as QUIC AEADs.

Section 5.4.x

Are we really sure that we want to use normative pseudocode?  I thought
(but cannot currently substantiate) that we typically preferred
normative prose and example pseudocode, since pseudocode somewhat
intrinsically has not-fully-specified semantics.  As some specific (but
not comprehensive!) examples, which bytes of the mask are used when the
packet number is encoded in fewer than 4 bytes, and the key/plaintext
inputs to AES-ECB seem to only be specified in the pseudocode.
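
To make the mask question concrete, the following Python sketch shows
one reading of the draft's pseudocode: the low bits of mask[0] cover
the first byte, and mask[1] through mask[pn_length] cover the packet
number, so a shorter packet number simply leaves the trailing mask
bytes unused.  It assumes the 5-byte mask has already been computed
from the ciphertext sample; the function names are hypothetical and
the AES-ECB or ChaCha20 step is deliberately left out.

```python
def apply_header_protection(packet: bytearray, pn_offset: int,
                            mask: bytes) -> None:
    """Sender side: mask the first byte and packet number in place."""
    # The packet number length must be read before the first byte is masked.
    pn_length = (packet[0] & 0x03) + 1
    if packet[0] & 0x80:
        packet[0] ^= mask[0] & 0x0F  # long header: protect 4 low bits
    else:
        packet[0] ^= mask[0] & 0x1F  # short header: protect 5 low bits
    for i in range(pn_length):
        packet[pn_offset + i] ^= mask[1 + i]


def remove_header_protection(packet: bytearray, pn_offset: int,
                             mask: bytes) -> int:
    """Receiver side: unmask in place and return the packet number length."""
    if packet[0] & 0x80:
        packet[0] ^= mask[0] & 0x0F
    else:
        packet[0] ^= mask[0] & 0x1F
    # Only now are the packet number length bits readable.
    pn_length = (packet[0] & 0x03) + 1
    for i in range(pn_length):
        packet[pn_offset + i] ^= mask[1 + i]
    return pn_length
```

The ordering difference between the two functions (length read before
masking on send, after unmasking on receive) is exactly the sort of
detail that currently lives only in the pseudocode.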

Section 5.5

   Once an endpoint successfully receives a packet with a given packet
   number, it MUST discard all packets in the same packet number space
   with higher packet numbers if they cannot be successfully unprotected
   with either the same key, or - if there is a key update - the next
   packet protection key (see Section 6).  [...]

Pedantically, "the next packet protection key" seems to imply
specifically the one next key, not "any subsequent key".  A strict
reading of this text would thus require discarding all packets received
after a second key update, because this clause applies to the very first
packet successfully received, and the third key used is neither the
"same" nor the "next" key, from that reference point.  (I note that in §6.4
we merely talk about "either the same or newer packet protection keys".)

Section 5.6

   Of the frames defined in [QUIC-TRANSPORT], the STREAM, RESET_STREAM,
   and CONNECTION_CLOSE frames are potentially unsafe for use with 0-RTT
   as they carry application data.  [...]

I guess STOP_SENDING is not listed as unsafe because it would be an
error to send it in 0-RTT anyway?

Section 5.7

   The requirement for the server to wait for the client Finished
   message creates a dependency on that message being delivered.  A
   client can avoid the potential for head-of-line blocking that this
   implies by sending its 1-RTT packets coalesced with a Handshake
   packet containing a copy of the CRYPTO frame that carries the
   Finished message, until one of the Handshake packets is acknowledged.
   This enables immediate server processing for those packets.

This mostly only helps for unauthenticated clients, since for
authenticated clients the Finished isn't much good without the
Certificate+CertificateVerify, which are probably too big to include
with every packet.

Section 5.8

   The secret key and the nonce are values derived by calling HKDF-
   Expand-Label using
   0xd9c9943e6101fd200021506bcc02814c73030f25c79d71ce876eca876e6fca8e as
   the secret, with labels being "quic key" and "quic iv" (Section 5.1).

HKDF-Expand-Label also takes a "Context" argument; what value is used
for that?  (Presumably the output Length argument is set to the lengths
of the stated fields.)

   Retry Pseudo-Packet {
     ODCID Length (8),
     Original Destination Connection ID (0..160),
     Header Form (1) = 1,
     Fixed Bit (1) = 1,
     Long Packet Type (2) = 3,
     Type-Specific Bits (4),
     Version (32),
     DCID Len (8),
     Destination Connection ID (0..160),
     SCID Len (8),
     Source Connection ID (0..160),
     Retry Token (..),
   }

Should we say that the four bits before Version are Unused (not
Type-Specific Bits), as they are in the Retry packet itself?  (I assume
that the arbitrary value in those bits does need to be preserved,
though.)
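
For reference, a small Python sketch of how the pseudo-packet can be
assembled: take the Retry packet as transmitted, drop the trailing
integrity tag, and prepend the ODCID length and the ODCID.  Under this
construction the four bits before Version are copied verbatim from the
wire, whatever their value.  The function name and the 16-byte tag
constant are assumptions made for the example.

```python
RETRY_TAG_LENGTH = 16  # bytes; tag size of AEAD_AES_128_GCM


def build_retry_pseudo_packet(odcid: bytes, retry_packet: bytes) -> bytes:
    """Return the AEAD associated data used for the Retry Integrity Tag.

    `retry_packet` is the full Retry packet as sent, tag included.
    """
    if len(odcid) > 20:
        raise ValueError("connection IDs are at most 160 bits")
    if len(retry_packet) <= RETRY_TAG_LENGTH:
        raise ValueError("Retry packet too short to contain a tag")
    # ODCID Length (8 bits), then the ODCID, then the packet minus its tag.
    return bytes([len(odcid)]) + odcid + retry_packet[:-RETRY_TAG_LENGTH]
```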

Section 6

I suggest noting that, in contrast to TLS, key updates are synchronized
between traffic directions; read keys are updated at the same time that
write keys are (and vice versa).

Section 6.3

   The process of creating new packet protection keys for receiving
   packets could reveal that a key update has occurred.  An endpoint MAY
   perform this process as part of packet processing, but this creates a
   timing signal that can be used by an attacker to learn when key
   updates happen and thus the value of the Key Phase bit in certain
   packets.  Endpoints MAY instead defer the creation of the next set of
   receive packet protection keys until some time after a key update
   completes, up to three times the PTO; see Section 6.5.

(editorial) I think this paragraph (and following) might need a rewrite,
since the phrase "new keys" in the context of a key update could be
ambiguous.  In particular, if we are considering the process of a key
update from generation M to generation N, this text only seems to make
sense if the "new keys" whose generation is being discussed is
generation O (i.e., ones that are not being used to send traffic yet by
anybody), but it is easy to misread "new keys" as being "the ones
installed/activated for use as part of the key update process".  Perhaps
the guidance that endpoints are to maintain both current and next keys
in normal operation should be moved earlier, with the context that you
need to be prepared to handle any valid incoming packets (which includes
those using both current and next keys, since the peer can initiate a
key update) without a timing channel for when key generation is
occurring.  (There may be some similar text in §9.5 that would be
updated in accordance with any changes made here.)

   Once generated, the next set of packet protection keys SHOULD be
   retained, even if the packet that was received was subsequently
   discarded.  [...]

What is "the packet" that was received?

Section 6.5

   An endpoint MAY allow a period of approximately the Probe Timeout
   (PTO; see [QUIC-RECOVERY]) after receiving a packet that uses the new
   key generation before it creates the next set of packet protection
   keys.  [...]

(editorial) Similarly to the previous section, I think the action being
described here might be more properly described as "promoting the 'next'
keys to be the 'current' keys (and thus starting the precomputation for
the subsequent 'next' keys)".  Surely this is not advocating waiting a
PTO before generating the keys you need to process a packet you just
received ... right?

Section 6.6

   The usage limits defined in TLS 1.3 exist for protection against
   attacks on confidentiality and apply to successful applications of
   AEAD protection.  The integrity protections in authenticated
   encryption also depend on limiting the number of attempts to forge
   packets.  TLS achieves this by closing connections after any record
   fails an authentication check.  In comparison, QUIC ignores any
   packet that cannot be authenticated, allowing multiple forgery

QUIC seems very analogous to DTLS in this regard, and DTLS 1.3 also has
similar text about AEAD limits for integrity.  Perhaps a brief mention
that they are essentially the same is useful?

   For AEAD_AES_128_GCM and AEAD_AES_256_GCM, the confidentiality limit
   is 2^23 encrypted packets; see Appendix B.1.  For
   AEAD_CHACHA20_POLY1305, the confidentiality limit is greater than the
   number of possible packets (2^62) and so can be disregarded.  For
   AEAD_AES_128_CCM, the confidentiality limit is 2^21.5 encrypted
   packets; see Appendix B.2.  Applying a limit reduces the probability
   that an attacker can distinguish the AEAD in use from a random
   permutation; see [AEBounds], [ROBUST], and [GCM-MU].
   Note:  These limits were originally calculated using assumptions
      about the limits on TLS record size.  The maximum size of a TLS
      record is 2^14 bytes.  In comparison, QUIC packets can be up to
      2^16 bytes.  However, it is expected that QUIC packets will
      generally be smaller than TLS records.  Where packets might be
      larger than 2^14 bytes in length, smaller limits might be needed.

(This seems to say that we're just reusing the TLS numbers even though
in theory we do allow larger packets.  But if that's the case, why do
the actual numbers we present differ from the ones given for DTLS 1.3?
The actual text in Appendix B suggests that we are not actually reusing
the TLS numbers (since we use different estimates for l, and
furthermore, that [GCM-MU] allows larger limits than [AEBounds] as used
by TLS 1.3).)

   Any TLS cipher suite that is specified for use with QUIC MUST define
   limits on the use of the associated AEAD function that preserves
   margins for confidentiality and integrity.  That is, limits MUST be
   specified for the number of packets that can be authenticated and for
   the number of packets that can fail authentication.  Providing a
   reference to any analysis upon which values are based - and any
   assumptions used in that analysis - allows limits to be adapted to
   varying usage conditions.

DTLS 1.3 imposes essentially the same requirement.  IIUC it would be
acceptable to use the same limit for DTLS 1.3 and for QUIC, and we
should perhaps say that if a cipher is allowed for one it is allowed for
the other (with the same limit), though on second thought, we really do
want the analysis to take into account the different assumptions about
the number of blocks in a packet.

Section 9

In various places (e.g., §4.1.4) we recommend to buffer (or "retain")
data that cannot be processed yet.  Perhaps it goes without saying, but
such buffering needs limits in place in order to avoid DoS.
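
One way such a limit might look in practice is sketched below in
Python: a buffer for not-yet-processable packets that caps both the
number of packets and the total bytes held, and drops anything beyond
that.  The class name and the default limits are purely illustrative
and are not taken from the draft.

```python
from collections import deque
from typing import Deque, List


class BoundedPacketBuffer:
    """Holds packets that cannot be processed yet, up to fixed limits."""

    def __init__(self, max_packets: int = 16, max_bytes: int = 24000) -> None:
        self.max_packets = max_packets
        self.max_bytes = max_bytes
        self._queue: Deque[bytes] = deque()
        self._buffered_bytes = 0

    def add(self, packet: bytes) -> bool:
        """Buffer a packet; return False (dropping it) once a limit is hit."""
        if len(self._queue) >= self.max_packets:
            return False
        if self._buffered_bytes + len(packet) > self.max_bytes:
            return False
        self._queue.append(packet)
        self._buffered_bytes += len(packet)
        return True

    def drain(self) -> List[bytes]:
        """Return buffered packets in arrival order and reset the buffer."""
        packets = list(self._queue)
        self._queue.clear()
        self._buffered_bytes = 0
        return packets
```

Dropping is acceptable here because the peer's loss recovery will
retransmit anything that mattered.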

It may be too banal to mention again here that the Initial secret/keys
are not particularly secure, but I'll mention it just in case we want
to.

Section 9.6

   The initial secrets use a key that is specific to the negotiated QUIC
   version.  New QUIC versions SHOULD define a new salt value used in
   calculating initial secrets.

Also for the Retry Integrity Tag key/nonce secret?

Appendix A.1, A.5

I think that RFC 8446 has us write just "" instead of _ to indicate a
zero-length Context.  I don't see the usage of _ explained anywhere.

Appendix A.2

   The client sends an Initial packet.  The unprotected payload of this
   packet contains the following CRYPTO frame, plus enough PADDING
   frames to make a 1162 byte payload:
   The unprotected header includes the connection ID and a 4-byte packet
   number encoding for a packet number of 2:


If I'm reading this correctly, the variable-length integer encoding of
the packet Length is 0x449e which would indicate a 1182-byte payload
(including packet number), not 1162.
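
The decoding itself can be checked with a few lines of Python that
follow the variable-length integer encoding in [QUIC-TRANSPORT] (the
two most significant bits of the first byte select a 1-, 2-, 4-, or
8-byte encoding).  The function name is made up for this sketch.  As a
possible reconciliation, and this is an assumption rather than
something the draft text states at this point, the 20-byte difference
would be consistent with a 4-byte packet number plus a 16-byte AEAD tag
being counted in the Length field.

```python
from typing import Tuple


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer.

    Returns (value, number_of_bytes_consumed).
    """
    if not data:
        raise ValueError("empty input")
    length = 1 << (data[0] >> 6)  # 1, 2, 4, or 8 bytes
    if len(data) < length:
        raise ValueError("truncated variable-length integer")
    value = data[0] & 0x3F  # strip the two length bits
    for byte in data[1:length]:
        value = (value << 8) | byte
    return value, length


value, consumed = decode_varint(bytes.fromhex("449e"))
print(value, consumed)  # prints: 1182 2
```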

Appendix A.4

Should we mention specifically the Retry Token value?

Appendix B.1.1, B.1.2

I don't think that Theorem 4.3 of [GCM-MU] is the right reference for
both formulae.  (I actually had trouble matching up any of the formulae
in the paper with the ones here due to different variable names, and ran
out of time to dig further.)

Appendix B.2

   TLS [TLS13] and [AEBounds] do not specify limits on usage for
   AEAD_AES_128_CCM.  However, any AEAD that is used with QUIC requires

DTLS 1.3 does, though.

   This produces a relation that combines both encryption and decryption
   attempts with the same limit as that produced by the theorem for
   confidentiality alone.  For a target advantage of 2^-57, this results

   v + q <= 2^34.5 / l

   By setting "q = v", values for both confidentiality and integrity
   limits can be produced.  Endpoints that limit packets to 2^11 bytes
   therefore have both confidentiality and integrity limits of 2^26.5
   packets.  Endpoints that do not restrict packet size have a limit of

DTLS currently has text that does a dedicated computation for "q" and
then substitutes that established value into this last formula to
determine the limit for "v".  Should DTLS switch to using an analysis
more like the one presented here?  (Or vice versa, of course.)
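
For anyone comparing the two approaches, the arithmetic behind the
quoted CCM numbers can be reproduced with the short Python sketch
below.  It assumes that "l" counts 16-byte AES blocks per packet and
that the budget v + q <= 2^34.5 / l is split evenly by setting q = v;
the function name is hypothetical.  Working in log2 keeps the values
exact.

```python
import math

AES_BLOCK_BYTES = 16


def ccm_packet_limit_log2(max_packet_bytes: int,
                          bound_log2: float = 34.5) -> float:
    """Return log2 of the per-direction packet limit when q = v."""
    l_blocks = max_packet_bytes // AES_BLOCK_BYTES
    total_log2 = bound_log2 - math.log2(l_blocks)  # log2 of (v + q)
    return total_log2 - 1  # q = v, so each gets half the budget


print(ccm_packet_limit_log2(2 ** 11))  # prints: 26.5
print(ccm_packet_limit_log2(2 ** 16))  # prints: 21.5
```

The second value matches the 2^21.5 confidentiality limit quoted for
AEAD_AES_128_CCM in Section 6.6.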