[TLS] draft-ietf-tls-dtls13-42 responses to feedback

Eric Rescorla <ekr@rtfm.com> Thu, 22 April 2021 17:32 UTC

From: Eric Rescorla <ekr@rtfm.com>
Date: Thu, 22 Apr 2021 10:31:42 -0700
Message-ID: <CABcZeBO2L1gp-hoh983THrXaUwtbOX_nyfhZVZNrHnXPSZWBmQ@mail.gmail.com>
To: "<tls@ietf.org>" <tls@ietf.org>, IESG <iesg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/d1osAkPh_DmVLxAsYF99y5IT0nA>

I have posted draft-ietf-tls-dtls13-42, addressing the IESG
feedback. Thanks to everyone who provided reviews. Here is a
description of how I handled the comments. If there is somebody whose
feedback I missed, please let me know.


**** Erik Kline

> [ section 4.4 ]
> * "respectively" -> "respectively."
> * Could a DTLS implementation packetize to a min-MTU for an IP version
>   and avoid all pMTU issues?  Such a strategy would probably be poor for
>   IPv4 but might be acceptable for IPv6 communications.

Maybe, but I think we probably don't need to say much.

> [ section 4.5.3 ]
> * "MUST NOT used" -> "MUST NOT be used"


> [ section 5.8.4 ]
> * "NOT have send" -> "NOT send", I think


> [ section 6 ]
> * "which are needed to encrypt to decrypt"?


**** Francesca Palombini

> section 2. Conventions and Terminology
> FP: Please spell out that network byte order (most significant byte
> first) is used throughout the document.


>     Once the client has transmitted the ClientHello message, it expects
>     to see a HelloRetryRequest or a ServerHello from the server.
>     However, if the server's message is lost, the client knows that
>     either the ClientHello or the response from the server has been lost
>     and retransmits. When the server receives the retransmission, it
>     knows to retransmit.
> FP: It would be good to mention retransmission max times here.

DTLS actually doesn't have an overall timeout. This is left to
the discretion of the implementation. It does have a maximum
backoff, but backoff isn't mentioned at all.

>              |                |   /+----------------+\
>              | 31 < OCT < 64 -+--> |DTLS Ciphertext |
>              |                |    |(header bits    |
>              |      else      |    | start with 001)|
>              |       |        |   /+-------+--------+\
>         The value for the "DTLS-OK" column is "Y". IANA is requested to
>         reserve the content type range 32-63 so that content types in this
>         range are not allocated.
> FP: IANA is asked to reserve 32-63, but I could not see any explanation
> for that. I would like to see it justified in section 4.1 or in the
> respective IANA section.
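
One way to read the quoted demultiplexing fragment: the range
31 < OCT < 64 is the same set of first octets as "header bits start
with 001", so a content type allocated in 32-63 could not be told apart
from a DTLS ciphertext header. A minimal Python sketch, assuming only
what the quoted figure shows (the function name is made up for
illustration):

```python
def is_dtls13_ciphertext(first_octet: int) -> bool:
    """Return True if the first octet of a record falls in the 32-63 range.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not 0 <= first_octet <= 0xFF:
        raise ValueError("first_octet must be a single byte value")
    # 32..63 is exactly the set of bytes whose top three bits are 001.
    return (first_octet >> 5) == 0b001


# The range check and the bit check agree for every possible byte value.
assert all(is_dtls13_ciphertext(b) == (31 < b < 64) for b in range(256))
```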


>     fragmentation, clients of the DTLS record layer SHOULD attempt to
>     size records so that they fit within any PMTU estimates obtained from
>     the record layer.
> FP: First time PMTU appears, please expand and add reference.


>     performing PMTU discovery, whether via [RFC1191] or [RFC4821]
>     mechanisms. In particular:
> FP: I think this is missing a reference to RFC 8201 since IPv6 is
> mentioned below.


>     Any TLS cipher suite that is specified for use with DTLS MUST define
>     limits on the use of the associated AEAD function that preserves
>     margins for both confidentiality and integrity. That is, limits MUST
>     be specified for the number of packets that can be authenticated and
>     for the number of packets that can fail authentication before a key
>     update is required. Providing a reference to any analysis upon which
>     values are based - and any assumptions used in that analysis - allows
>     limits to be adapted to varying usage conditions.
> FP: This seems important enough that it should be highlighted for the
> experts reviewing the registration. I see that
> has a number of notes, maybe that would be enough, or maybe add it (as an
> update?) to RFC 8447?


> zero
> length vector (i.e., a single zero byte length field).
> FP: I suggest using TLS 1.3 terminology of "zero-length vector (i.e., a
> zero-valued single byte length field)"
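
The suggested wording matches the usual TLS presentation-language
encoding, where a vector with a one-byte length prefix and no contents
serializes to a single zero byte. A minimal Python sketch;
encode_vector_u8 is a made-up helper name, not something from the
draft:

```python
def encode_vector_u8(body: bytes) -> bytes:
    """Encode `body` as a vector with a one-byte length prefix.

    Hypothetical helper for illustration only.
    """
    if len(body) > 255:
        raise ValueError("body does not fit a one-byte length field")
    # Length prefix first, then the contents (if any).
    return bytes([len(body)]) + body


# A zero-length vector is just the zero-valued length byte.
assert encode_vector_u8(b"") == b"\x00"
```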


>     flow shown in Figure 11 if the client does not send the ACK message
> FP: s/11/12


**** Martin Duke

> Sec 2. It might be useful to introduce the term "epoch" in the glossary,
> for those who read this front to back.


> Sec 4.2.3: "The encrypted sequence number is computed by XORing the
> leading bytes of the Mask with the sequence number. Decryption is
> accomplished by the same process."
> The text is unclear if the XOR is applied to the expanded sequence number
> or merely the 1-2 octets on the wire. I presume it's the latter, but this
> should be clarified.
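
To make the reviewer's reading concrete, here is a minimal Python
sketch in which the XOR covers only the 1-2 sequence number octets that
appear on the wire. The function name is hypothetical and the mask
bytes are made-up test values, not output derived from a real sn_key:

```python
def xor_sequence_number(wire_seq: bytes, mask: bytes) -> bytes:
    """XOR the on-the-wire sequence number with the leading mask bytes."""
    if len(wire_seq) not in (1, 2):
        raise ValueError("expected 1 or 2 sequence number octets")
    if len(mask) < len(wire_seq):
        raise ValueError("mask is shorter than the sequence number")
    # zip() stops at the shorter input, so only the leading mask bytes are used.
    return bytes(s ^ m for s, m in zip(wire_seq, mask))


mask = bytes.fromhex("a1b2c3d4")  # made-up mask bytes
wire = bytes.fromhex("0102")      # two-octet sequence number as sent
protected = xor_sequence_number(wire, mask)
# Applying the same operation again recovers the original octets.
assert xor_sequence_number(protected, mask) == wire
```

Because XOR is its own inverse, the same function serves for both
directions, which is what "Decryption is accomplished by the same
process" describes.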


> Sec 4.2.3: It's implied here that the sn_key rotates with the epoch. As
> this is different from QUIC, it's probably worth spelling out.


> Sec 5.1 is a bit vague about the amplification limit; why not at least
> RECOMMEND 3, as we've converged on this elsewhere?


> Sec 5.1. Reading between the lines, it's clear that the cookie can't be
> used as address verification across connections in the way that a NEW_TOKEN
> token is. It would be good to spell this out for clients -- use the
> resumption token or whatever instead.

Added some text.

> Sec 7.2 "As noted above, the receipt of any record responding to a given
> flight MUST be taken as an implicit acknowledgement for the entire flight."
> I think this should be s/entire flight/entire previous flight?

Added some text.

> Sec 7.2 "Upon receipt of an ACK that leaves it with only some messages
> from a flight having been acknowledged an implementation SHOULD retransmit
> the unacknowledged messages or fragments."
> This language appears inconsistent with Figure 12, where Record 1 has not
> been acknowledged but is also not retransmitted. It appears there is an
> implied handling of empty ACKs that isn't written down anywhere in Sec 7.2

This is just a bug in the diagram. Good catch. Fixed.
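
For reference, the quoted Sec 7.2 rule (retransmit whatever a partial
ACK leaves unacknowledged) amounts to a filter over the flight. A
minimal Python sketch; the function name and the record numbers are
hypothetical, and it does not try to model any special handling of
empty ACKs:

```python
from typing import Iterable, List, Set


def records_to_retransmit(flight: Iterable[int], acked: Set[int]) -> List[int]:
    """Return the record numbers in `flight` that the ACK did not cover.

    The order of the original flight is preserved.
    """
    return [rec for rec in flight if rec not in acked]


# A three-record flight where only the last record was acknowledged.
assert records_to_retransmit([0, 1, 2], {2}) == [0, 1]
```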

> Sec 9. Should there be any flow control limits on new_connection_id? Or
> should receivers be free to simply drop CIDs they can't handle? It might be
> good to specify.

Added some text.

> Finally, a really weird one. Reading this document and references to
> connection ID prompted me to think how QUIC-LB could apply to DTLS. The
> result is here: https://github.com/quicwg/load-balancers/pull/106/files.
> Please note the rather unfortunate third-to-last paragraph. I'm happy to
> take the answer that this use case doesn't matter, since I made it up
> today. But if it does, it would be very helpful if (1) DTLS 1.3 clients
> MUST include a connection_id extension in their ClientHello, even if zero
> length, and/or (2) this draft updated 4.1.4 of 8446 to allow the server to
> include connection_id in HelloRetryRequest even if the client didn't offer
> it. Thoughts?


> 5.2 s/select(HandshakeType)/select(msg_type). Though with pseudocode your
> mileage may vary as to what's clearer.


> 5.7 s/consitute/constitute


> Sec 5.7 In table 1, why include one ACK in the diagram but not the other?
> It's clear from the note, but the figure is a weird omission.

I don't think I understand why we did this, so I just removed it.

**** Lars Eggert

Indicated minor changes made.

**** Zaheduzzaman Sarker

> This was very well written document. Thanks for this.
> Minor observations below-
> * Section 3.1 :
>   - Once the client has transmitted the ClientHello message, it expects
>     to see a HelloRetryRequest or a ServerHello from the server. However, if
>     the server's message is lost, the client knows that either the ClientHello
>     or the response from the server has been lost and retransmits.
> is this supposed to mean when the timer expires the client knows either
> the ClientHello or the response from the server has been lost? the current
> text does not imply that - the server's message is lost is an
> interpretation of timer expired event.


>   -  The server also maintains a retransmission timer and retransmits
>      when that timer expires.
> The way it is written following the previous paragraph, almost made me
> feel that the server is also maintaining a timer for the client hello. It
> would be nicer if some text explains the usage of timers at the server to
> break the continuous read from previous paragraph.


> * Section 3.3: I would add a reference to section 4.4.


> * Section 4.5.2: I assume the silent discard of invalid records will not
> impact the timers, is that a valid assumption? if yes, then it would be
> good if this is clarified in the text.

This is correct, but I don't quite see why one would think it does, as they
don't even get to the point where it would impact the timer. Anyway,
added some text.

> * Section 5.8.1:
>     Because DTLS clients send the first message (ClientHello), they start
>     in the PREPARING state. DTLS servers start in the WAITING state, but with
>     empty buffers and no retransmit timer
> This is repeated twice in the section, is there any reason for that?


**** John Scudder


> Section 3.1:
> I found the explanatory text to be confusing. You start with a figure
> illustrating a lost HelloRetryRequest. Then you tell me the server
> maintains a rexmit timer:
>    The server also maintains a retransmission timer and retransmits when
>    that timer expires.
> But then you immediately tell me that it actually doesn’t:
>    Note that timeout and retransmission do not apply to the
>    HelloRetryRequest since this would require creating state on the
>    server.  The HelloRetryRequest is designed to be small enough that it
>    will not itself be fragmented, thus avoiding concerns about
>    interleaving multiple HelloRetryRequests.
> I presume that if I added some more words to this, your intent is that
> the server maintains a retransmission timer *for messages other than
> HelloRetryRequest*. As written, it gave me some whiplash.


> Section 4.2.1:
>    In general,
>    implementations SHOULD discard records from earlier epochs, but if
>    packet loss causes noticeable problems implementations MAY choose to
>    retain keying material from previous epochs for up to the default MSL
>    specified for TCP [RFC0793] to allow for packet reordering.
> It seems to me as though “if packet loss causes noticeable problems” is
> saying either too much, or not enough. Not enough: problems for whom?
> Noticeable by whom? How is this determined? Do you really mean I’m supposed
> to work this out dynamically as the text sort-of implies? Too much: if
> you’re not going to answer the foregoing, maybe don’t taunt me, and omit
> the clause entirely? Or, possibly a less vague rewrite could be in the
> nature of “if providing service to an application that is especially
> sensitive to packet loss”.


> Section 2:
> “The reader is also as to be familiar” s/as/assumed/


> Section 11:
>    Although the cookie must allow the server to produce the right
>    handshake transcript, they
> “It” not “they” (agreement in number)


> and
>    DTLS with connection IDs allow for endpoint addresses to
>    change during the association;
> “allows” not “allow” (agreement in number)


**** Eric Vyncke

> -- Section 3 --
> s/TLS cannot be used directly in datagram environments/TLS cannot be used
> directly over a datagram transport/ ?
> Bullet 2) s/to enable reassembly in the correct order/to enable
> reordering/ ?


> -- Section 3.1 --
> Should there be a hint to a maximum retry count ?

I'm not sure what we would put here given the diversity of environments,
so we opted not to.

> -- Section 3.3 --
> I understand the motivation (and no need to reply), but, sigh...
> implementing frag/reassembly above the transport layer...

Indeed. If it helps, think of DTLS as the transport layer :)

**** Robert Wilton

> 1) Although it is clear from the metadata, it might be helpful if the
> introduction also stated that it obsoletes DTLS 1.2.


> 2) This document is a set of deltas against TLS 1.3.  Given that it
> talks about the DTLS 1.1/1.2 documents being deltas in the
> introduction, I would have also included that information for this
> document in the introduction rather than in the Terminology and
> Considerations section.  Initially, having read the introduction I had
> assumed that it was not going to be deltas.


**** Bernard Aboba

> Summary: The timeout and retransmission scheme looks workable for common
> cases, but could use some refinement to make it more robust.
> Technical Comments
> 4.5.2. Handling Invalid Records
> Unlike TLS, DTLS is resilient in the face of invalid records (e.g.,
> invalid formatting, length, MAC, etc.). In general, invalid records
> SHOULD be silently discarded, thus preserving the association;
> however, an error MAY be logged for diagnostic purposes.
> [BA] How does silent discard of invalid records interact with
> retransmission timers?

It doesn't. How could it? But I added some text anyway.

> Implementations which choose to generate an alert instead, MUST
> generate error alerts to avoid attacks where the attacker repeatedly
> probes the implementation to see how it responds to various types of
> error. Note that if DTLS is run over UDP, then any implementation
> which does this will be extremely susceptible to denial-of-service
> (DoS) attacks because UDP forgery is so easy. Thus, this practice is
> NOT RECOMMENDED for such transports, both to increase the reliability
> of DTLS service and to avoid the risk of spoofing attacks sending
> traffic to unrelated third parties.
> [BA] "this practice" refers to "generate an alert instead", correct?

Yes. Addressed.

> 5.8.2. Timer Values
> Though timer values are the choice of the implementation, mishandling
> of the timer can lead to serious congestion problems, for example if
> [BA] Saying "timer values are the choice of the implementation" seems
> odd, because it is followed by normative language. I would delete this
> and start the sentence with "Mishandling...".

It has been deleted.

> many instances of a DTLS time out early and retransmit too quickly on
> a congested link. Implementations SHOULD use an initial timer value
> of 100 msec (the minimum defined in RFC 6298 [RFC6298]) and double
> the value at each retransmission, up to no less than 60 seconds (the
> RFC 6298 maximum). Application specific profiles, such as those used
> for the Internet of Things environment, may recommend longer timer
> values. Note that a 100 msec timer is recommended rather than the
> 3-second RFC 6298 default in order to improve latency for time-
> sensitive applications. Because DTLS only uses retransmission for
> handshake and not dataflow, the effect on congestion should be
> minimal.
> Implementations SHOULD retain the current timer value until a message
> is transmitted and acknowledged without having to be retransmitted,
> at which time the value may be reset to the initial value.
> [BA] Is it always possible to distinguish a retransmission from a late
> arrival of an original packet? This seems like it could result in
> wrongly resetting the timer in some situations.

The intent of this text is that you didn't retransmit at all.
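
The quoted timer rules can be sketched as a small state holder: start
at 100 msec, double on each retransmission up to a cap, and reset only
after a message is acknowledged without having been retransmitted. This
is a minimal Python sketch under stated assumptions: the class and
method names are hypothetical, and the cap is set to exactly 60
seconds, the smallest cap the quoted text allows.

```python
INITIAL_TIMEOUT_MS = 100   # initial value from the quoted text
MAX_TIMEOUT_MS = 60_000    # assumption: cap of exactly 60 seconds


class RetransmitTimer:
    """Tracks the current retransmission timer value in milliseconds."""

    def __init__(self) -> None:
        self.value_ms = INITIAL_TIMEOUT_MS

    def on_timeout(self) -> int:
        """Timer expired and the flight is retransmitted: double, up to the cap."""
        self.value_ms = min(self.value_ms * 2, MAX_TIMEOUT_MS)
        return self.value_ms

    def on_acknowledged(self, was_retransmitted: bool) -> int:
        """Reset only if the message needed no retransmission at all."""
        if not was_retransmitted:
            self.value_ms = INITIAL_TIMEOUT_MS
        return self.value_ms


timer = RetransmitTimer()
schedule = [timer.on_timeout() for _ in range(11)]
# Doubling from 100 ms reaches 51200 ms, then is clamped to the cap.
assert schedule[8] == 51_200 and schedule[9] == 60_000
```

Keeping the "was it retransmitted" decision on the sender side matches
the reply above: the sender knows whether it retransmitted, so it does
not need to classify incoming packets as late or duplicate.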

> 5.8.3. Large Flight Sizes
> DTLS does not have any built-in congestion control or rate control;
> in general this is not an issue because messages tend to be small.
> However, in principle, some messages - especially Certificate - can
> be quite large. If all the messages in a large flight are sent at
> once, this can result in network congestion. A better strategy is to
> send out only part of the flight, sending more when messages are
> acknowledged. DTLS offers a number of mechanisms for minimizing the
> size of the certificate message, including the cached information
> extension [RFC7924] and certificate compression [RFC8879].
> [BA] How does the implementation know how much of the flight to send?
> Not sure how prevalent large certs are for DTLS (e.g. compared with the
> self-signed certs of WebRTC),
> but in EAP-TLS deployments, large certs have caused problems.
> The EAP-TLS cert document draft-ietf-emu-eaptlscert cites some additional
> mechanisms for reducing certificate sizes, such as draft-ietf-tls-ctls
> and [RFC6066] which defines the "client_certificate_url"
> extension which allows TLS clients to send a sequence of Uniform
> Resource Locators (URLs) instead of the client certificate.

We added some text.

> 5.11. Alert Messages
> Note that Alert messages are not retransmitted at all, even when they
> occur in the context of a handshake. However, a DTLS implementation
> which would ordinarily issue an alert SHOULD generate a new alert
> message if the offending record is received again (e.g., as a
> retransmitted handshake message). Implementations SHOULD detect when
> a peer is persistently sending bad messages and terminate the local
> connection state after such misbehavior is detected. Note that
> alerts are not reliably transmitted; implementation SHOULD NOT depend
> on receiving alerts in order to signal errors or connection closure.
> [BA] For the fatal alert case, it does seem like retransmission would
> be a good idea; otherwise the peer can be left hanging.

This has been the practice since DTLS 1.0, and there's no way to
ack them, so I don't think we should change now.

> Section 7.1
> "Disruptions" such as reordering do not affect timers, correct?

No. The timers are only on the sender side, so they kind of

> ACKs SHOULD NOT be sent for these flights unless generating the
> responding flight takes significant time.
> What is "significant time"?


> Editorial Comments (NITs)
> Section 2
> The reader is also as to be familiar with
> [BA] "as" -> "assumed"


> Section 3
> The basic design philosophy of DTLS is to construct "TLS over
> datagram transport". Datagram transport does not require nor provide
> reliable or in-order delivery of data. The DTLS protocol preserves
> this property for application data. Applications such as media
> streaming, Internet telephony, and online gaming use datagram
> transport for communication due to the delay-sensitive nature of
> transported data. The behavior of such applications is unchanged
> when the DTLS protocol is used to secure communication, since the
> DTLS protocol does not compensate for lost or reordered data traffic.
> [BA] While low-latency streaming and gaming does use DTLS to protect data
> (e.g. for protection of WebRTC data channel), telephony and RTC Audio/Video
> uses key derivation only, and SRTP for protection of data. So you might
> want to make a distinction.


> Section 3.1
> Note that timeout and retransmission do not apply to the
> HelloRetryRequest since this would require creating state on the
> server. The HelloRetryRequest is designed to be small enough that it
> will not itself be fragmented, thus avoiding concerns about
> interleaving multiple HelloRetryRequests.
> [BA] I would add "For more detail on timeouts and retransmission,
> see Section 5.8."


> 4.3. Transport Layer Mapping
> DTLS messages MAY be fragmented into multiple DTLS records. Each
> DTLS record MUST fit within a single datagram. In order to avoid IP
> fragmentation, clients of the DTLS record layer SHOULD attempt to
> size records so that they fit within any PMTU estimates obtained from
> the record layer.
> [BA] You might reference PMTU considerations described in Section 4.4.


>     Post-handshake client authentication
> Messages of each category can be sent independently, and reliability
> is established via independent state machines each of which behaves
> as described in Section 5.8.1. For example, if a server sends a
> NewSessionTicket and a CertificateRequest message, two independent
> state machines will be created.
> As explained in the corresponding sections, sending multiple
> instances of messages of a given category without having completed
> earlier transmissions is allowed for some categories, but not for
> others. Specifically, a server MAY send multiple NewSessionTicket
> messages at once without awaiting ACKs for earlier NewSessionTicket
> first. Likewise, a server MAY send multiple CertificateRequest
> messages at once without having completed earlier client
> authentication requests before. In contrast, implementations MUST
> NOT have send KeyUpdate, NewConnectionId or RequestConnectionId
> [BA] "send" -> "sent"


>     Example of Handshake with Timeout and Retransmission
> The following is an example of a handshake with lost packets and
> retransmissions. Note that the client sends an empty ACK message
> because it can only acknowledge Record 1 sent by the server once it
> has processed messages in Record 0 needed to establish epoch 2 keys,
> which are needed to encrypt to decrypt messages found in Record 1.
> [BA] "encrypt to decrypt" -> "encrypt or decrypt"?


> Section 7.3
> In the first case the use of the ACK message is optional because the
> peer will retransmit in any case and therefore the ACK just allows
> for selective retransmission, as opposed to the whole flight
> retransmission in previous versions of DTLS. For instance in the
> flow shown in Figure 11 if the client does not send the ACK message
> [BA] Figure 11 is the DTLS State Machine. Are you referring to another


> The use of the ACK for the second case is mandatory for the proper
> functioning of the protocol. For instance, the ACK message sent by
> the client in Figure 13, acknowledges receipt and processing of
> record 4 (containing the NewSessionTicket message) and if it is not
> sent the server will continue retransmission of the NewSessionTicket
> indefinitely until its transmission cap is reached.
> [BA] Do you mean "maximum retransmission timeout value"?