[TLS] AD Evaluation of draft-ietf-tls-dtls13-39
Benjamin Kaduk <kaduk@mit.edu> Fri, 13 November 2020 23:51 UTC
Date: Fri, 13 Nov 2020 15:51:34 -0800
From: Benjamin Kaduk <kaduk@mit.edu>
To: draft-ietf-tls-dtls13.all@ietf.org
Cc: tls@ietf.org
Message-ID: <20201113235134.GW39170@kduck.mit.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/FJM6OHfvLJP_pF5uUcR86pzrdYo>

Hi all,

Sorry this took longer than planned to get to -- running my pubreq queue in order took longer than expected.

I made a pull request with editorial/nit-level stuff at https://github.com/tlswg/dtls13-spec/pull/160 (though some editorial issues remain mentioned here where there is a lot of flexibility in how to resolve them).

I think there are probably some DTLS-specific "implementation pitfalls" that might merit a section akin to RFC 8446's Appendix C.3. I also mention in the per-section comments a few places where we should say a bit more about how we diverge from RFC 8446, and a few places where being more explicit about separate read and write epochs would be helpful.

Section 1

   1.2 (see Appendix D of [TLS13] for details). While backwards compatibility with DTLS 1.0 is possible the use of DTLS 1.0 is not recommended as explained in Section 3.1.2 of RFC 7525 [RFC7525].

I guess we might want to reference draft-ietf-tls-oldversions-deprecate by the time we hit the RFC Editor's queue.

Section 2

   - connection: A transport-layer connection between two endpoints.

Can there even be a datagram "connection"? Regardless, we should define "association" since we use that as well.

   - handshake: An initial negotiation between client and server that establishes the parameters of their transactions.

This is the only place in the document where we use the word "transaction", which makes me suspect that it is not the best word choice.

   - session: An association between a client and a server resulting from a handshake.

This seems technically true, but could be confusing if we want to have an analogy DTLS Session:TLS Session::DTLS Association:TLS Connection. (Per the previous comments, I am not 100% sure that we are trying to have that analogy, though.)

Section 3.4

   DTLS optionally supports record replay detection. The technique used is the same as in IPsec AH/ESP, by maintaining a bitmap window of

Do we want a reference for the IPsec usage?
(We do reference https://tools.ietf.org/html/rfc4303#section-3.4.3 from §4.5.1 when we talk about the mechanics of the replay window.)

   Applications may conceivably detect duplicate packets and accordingly modify their data transmission strategy.

The text here doesn't give me a clear impression of whether the application is supposed to use the DTLS sequence numbers for this detection, or their own (application layer) information. (I didn't think that most DTLS implementations exposed an API to give the application the record sequence number.)

Section 4

   ProtocolVersion legacy_record_version;

We should probably say what value(s) are allowed here, akin to the RFC 8446 "MUST be set to 0x0303 for all records [...] other than an initial ClientHello".

   Fixed Bits: The three high bits of the first byte of the DTLSCiphertext header are set to 001.

Do we want to say something about why we have fixed bits and all and how the values were chosen (perhaps a reference to RFC 7983)?

   If a connection ID is negotiated, then it MUST be contained in all datagrams. Sending implementations MUST NOT mix records from multiple DTLS associations in the same datagram. If the second or later record has a connection ID which does not correspond to the same association used for previous records, the rest of the datagram MUST be discarded.

I'm failing to come up with a reason why you would want to use different CIDs (for the same association) for records in a single datagram; should we recommend using a single CID value per datagram explicitly?

   The entire header value shown in Figure 4 (but prior to record number encryption) is used as as the additional data value for the AEAD

Maybe forward-reference §4.2.3 for the record-number encryption?

   The entire header value shown in Figure 4 (but prior to record number encryption) is used as as the additional data value for the AEAD function. For instance, if the minimal variant is used, the AAD is 2 octets long.
   Note that this design is different from the additional data calculation for DTLS 1.2 and for DTLS 1.2 with Connection ID.

In light of the ongoing discussion for the DTLS 1.2 connection ID, I just want to walk through a few points here and confirm that we are in good shape.

- (D)TLS 1.3 always requires AEAD ciphers
- AEAD ciphers implicitly authenticate the AAD length
- the connection ID is the only implicitly variable-length field in the record header/AAD (the size of the sequence number and length fields is indicated by bits in the first byte)
- in light of the above three points, the length of the connection ID is also implicitly authenticated by the AEAD
- the AAD explicitly flags whether or not a CID is present at all
- the AEAD implicitly authenticates the length of the ciphertext, so it is okay to omit the length from the AAD when the L bit is 0.

Section 4.1

   - If the first byte is alert(21), handshake(22), or ack(proposed, 26), the record MUST be interpreted as a DTLSPlaintext record.

   - If the first byte is any other value, then receivers MUST check to see if the leading bits of the first byte are 001. If so, the implementation MUST process the record as DTLSCiphertext; the true content type will be inside the protected portion.

   - Otherwise, the record MUST be rejected as if it had failed deprotection, as described in Section 4.5.2.

I can imagine a reading of this last point that says that something that implements DTLS 1.3 MUST treat a tls12_cid ContentType as an error ... even if that software also supports DTLS 1.2. I could imagine a phrasing like "not a valid DTLS 1.3 record" that would not have this property, but I am not sure whether that is the right approach. In particular, when CIDs are in use, we do not necessarily have any external context to indicate whether we should be expecting DTLS 1.3 or DTLS 1.2 (or either) on a given listening socket.

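
To make the concern concrete, here is a minimal sketch of the three quoted Section 4.1 rules read literally. The function name and return strings are hypothetical, and the value 25 for tls12_cid is an assumption taken from the DTLS 1.2 connection ID work rather than from the text quoted here.

```python
# Content types named in the quoted Section 4.1 text.
ALERT = 21
HANDSHAKE = 22
ACK = 26          # "proposed" value in the draft under review
TLS12_CID = 25    # assumption: the DTLS 1.2 connection ID content type


def classify_record(first_byte: int) -> str:
    """Classify a record by its first byte, following the three rules literally."""
    if first_byte in (ALERT, HANDSHAKE, ACK):
        return "plaintext"      # DTLSPlaintext
    if first_byte >> 5 == 0b001:
        return "ciphertext"     # DTLSCiphertext: top three bits are 001
    return "reject"             # treated like a failed deprotection
```

Under this literal reading, classify_record(TLS12_CID) yields "reject", which is the outcome a combined DTLS 1.2 + 1.3 stack would presumably want to avoid by consulting its 1.2 record layer first.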
(I guess a mixed 1.2+1.3 implementation would also get Application Data, but 5-tuple would in theory be a distinguisher for when to accept that.)

Section 4.2.3

   In DTLS 1.3, when records are encrypted, record sequence numbers are also encrypted. The basic pattern is that the underlying encryption algorithm used with the AEAD algorithm is used to generate a mask which is then XORed with the sequence number.

I have mixed feelings about breaking through the AEAD abstraction to assume that there is an "underlying encryption algorithm" to use. Furthermore, we currently only cover AES and ChaCha20, though the ciphersuites registry lists TLS 1.3 ciphers with SM4 as block cipher, as well as MAC-only ciphersuites. (Presumably the lack of sequence number encryption is considered a feature for MAC-only ciphersuites.) We may even want to limit our specification of sequence number encryption to specific named ciphersuite codepoints, instead of having the vague attempt at applying to future AES-using or ChaCha20-using ciphers.

It seems that we are in effect imposing a new requirement for a TLS 1.3 cipher to get listed with DTLS-OK=Y, namely, that it has a mechanism for generating a mask for encrypting sequence numbers. I think we should make this new requirement more explicit, e.g., with an update to the IANA registry. It would also be useful if we could give guidance to future ciphersuite authors for how one could do this in general, though that would probably not be normative guidance.

   When the AEAD is based on ChaCha20, then the mask is generated by treating the first 4 bytes of the ciphertext as the block counter and the next 12 bytes as the nonce, passing them to the ChaCha20 block function (Section 2.3 of [CHACHA]):

I note that this is effectively random nonce selection since we treat the encryption function as a PRF.
The sn_key is only regenerated when the traffic keys are refreshed, so we may have higher risk of nonce-reuse here than for the "real" packet encryption since the sequential nonce values used for packet encryption have a lower collision probability (though the value of what would be exposed on nonce reuse for sequence number encryption is lower). That said, we are only using the cipher portion and do not have an AEAD tag, so the really nasty consequences of nonce reuse are not in scope. Nonetheless, perhaps updating the guidance on how often to rekey is in order (or at least running the numbers).

   The encrypted sequence number is computed by XORing the leading bytes of the Mask with the sequence number. Decryption is accomplished by the same process.

(This is fine. There is maybe some aesthetic complaint about "sometimes bit 0 of the Mask applies to bit 32 of the sequence number and sometimes it applies to bit 40 of the sequence number" but I am pretty sure it doesn't matter.)

Section 4.4

   If PMTU estimates are available from the underlying transport protocol, they should be made available to upper layer protocols. In particular:

I feel like we should be saying something about subtracting the length of the DTLSCiphertext header (form in use).

   Note that DTLS does not defend against spoofed ICMP messages; implementations SHOULD ignore any such messages that indicate PMTUs below the IPv4 and IPv6 minimums of 576 and 1280 bytes respectively

I want to say there is a standard reference for this but am failing to come up with the RFC number right now.

   The DTLS record layer SHOULD allow the upper layer protocol to discover the amount of record expansion expected by the DTLS processing.

This might be better closer to the "If PMTU estimates are available" paragraph.

   If there is a transport protocol indication (either via ICMP or via a refusal to send the datagram as in Section 14 of [RFC4340]), then the DTLS record layer MUST inform the upper layer protocol of the error.

indication of what?

   - If the DTLS record layer informs the DTLS handshake layer that a message is too big, it SHOULD immediately attempt to fragment it, using any existing information about the PMTU.

(editorial) too many "it"s here referring to different things.

Section 4.5.1

We need to have an anti-replay window per epoch; the text here should mention that explicitly (we currently just have text talking about "record counter for a session MUST be initialized to zero when that session is established", but a session is at a broader scope than epoch).

   serve as a timing channel for the record number. Note that decompressing the records number is still a potential timing channel for the record number, though a less powerful one than whether it was deprotected.

Just to confirm: we're using "decompress" here for the process of going from 8-or-16-bit sequence number to 48-bit sequence number?

Section 4.5.3

As mentioned above, we might mention any reduced limits due to sequence-number protection (e.g., with ChaCha20) here, if they exist.

   For AEAD_AES_128_GCM, AEAD_AES_256_GCM, and AEAD_CHACHA20_POLY1305, the limit on the number of records that fail authentication is 2^36. Note that the analysis in [AEBounds] supports a higher limit for the AEAD_AES_128_GCM and AEAD_AES_256_GCM, but this specification recommends a lower limit. For AEAD_AES_128_CCM, the limit on the number of records that fail authentication is 2^23.5; see Appendix B.

We might get asked to provide references for the AEADs (so, RFCs 5116 and 6655) here and in the following paragraph.

Section 5.1

   message, as well as the cookie extension, is defined in TLS 1.3. The HelloRetryRequest message contains a stateless cookie generated using the technique of [RFC2522]. The client MUST retransmit the

I note that RFC 2522 recommends using a cryptographic hash such as MD5, which is probably not the exact advice we want to be giving.

   ClientHello with the cookie added as an extension.
   The server then

Does "MUST retransmit" imply that the other HRR functionality (such as group negotiation) cannot be used? (I note that a bit further on we say "MUST create a new ClientHello ... [following] Section 4.1.2 of [TLS13]", which seems to be enough of a normative requirement that the "MUST" may not be needed here.)

   The cookie extension is defined in Section 4.2.2 of [TLS13]. When

Do we want to add any discussion of what is stored in the cookie (other than the RFC 2522-like address+ports and the ClientHello1 hash that [TLS13] mentions), as mentioned in the thread at https://mailarchive.ietf.org/arch/msg/tls/QbteFvnk1H2K9OjfHGosuG9e9Rk/ ? I am somewhat amenable to the stance that it's more appropriately done in 8446bis.

   that the exchange is performed, however. In addition, the server MAY choose not to do a cookie exchange when a session is resumed or, more generically, when the DTLS handshake uses a PSK-based key exchange.

We could potentially say something about associating the IP address information with the session/PSK (presumably alongside discussion of connection IDs and mobility).

Section 5.2

   Note: In DTLS 1.2 the message_seq was reset to zero in case of a rehandshake (i.e., renegotiation). On the surface, a rehandshake in DTLS 1.2 shares similarities with a post-handshake message exchange in DTLS 1.3. However, in DTLS 1.3 the message_seq is not reset to allow distinguishing a retransmission from a previously sent post-handshake message from a newly sent post-handshake message.

Just to confirm: this means we are limited to 2**16 handshake messages per association (including NST and post-handshake auth)?

Section 5.3

[Discussing ServerHello here for want of a better location.]

We specify a ClientHello.legacy_version = {254,253}, but we seem to be inheriting the unmodified TLS 1.3 ServerHello, complete with ServerHello.legacy_version = 0x0303.
That seems problematic, since the legacy DTLS 1.2 ServerHello would use the expected {254,253} like the ClientHello.

Similarly, we should probably specify whether the sentinel downgrade-protection Random values are used as-is from TLS 1.3, or if we have new ones for DTLS.

[end ServerHello topics]

   In DTLS 1.3, the client indicates its version preferences in the "supported_versions" extension (see Section 4.2.1 of [TLS13]) and the legacy_version field MUST be set to {254, 253}, which was the version number for DTLS 1.2. The version fields for DTLS 1.0 and DTLS 1.2 are 0xfeff and 0xfefd (to match the wire versions) but the version field for DTLS 1.3 is 0x0304.

It seems like reusing 0x0304 will make implementations more complex for little gain -- it's common to want to, e.g., compare (D)TLS versions to see which are greater. OpenSSL does that with macros like:

   /*
    * DTLS version numbers are strange because they're inverted. Except for
    * DTLS1_BAD_VER, which should be considered "lower" than the rest.
    */
   # define dtls_ver_ordinal(v1) (((v1) == DTLS1_BAD_VER) ? 0xff00 : (v1))
   # define DTLS_VERSION_GT(v1, v2) (dtls_ver_ordinal(v1) < dtls_ver_ordinal(v2))
   # define DTLS_VERSION_GE(v1, v2) (dtls_ver_ordinal(v1) <= dtls_ver_ordinal(v2))
   # define DTLS_VERSION_LT(v1, v2) (dtls_ver_ordinal(v1) > dtls_ver_ordinal(v2))
   # define DTLS_VERSION_LE(v1, v2) (dtls_ver_ordinal(v1) >= dtls_ver_ordinal(v2))

which would have to grow another case of indirection to handle this, whereas making TLS1_3_VERSION and DTLS1_3_VERSION have the same numerical value doesn't seem to help the code at all.

   cipher_suites: Same as for TLS 1.3.

Is it too banal to say that only cipher suites with DTLS-OK=Y are permitted [to be used]?

Section 5.4

   When a DTLS implementation receives a handshake message fragment, it MUST buffer it until it has the entire handshake message. DTLS implementations MUST be able to handle overlapping fragment ranges.
   This allows senders to retransmit handshake messages with smaller fragment sizes if the PMTU estimate changes.

Lots to say here:

- "MUST buffer fragments" conflicts with an earlier "MAY discard if message_seq is greater than next_receive_seq".
- "MUST buffer" has DoS considerations.
- does "handle overlapping fragment ranges" include verifying that the overlapping content is identical? If not, do we say anything about which one to use?

Section 5.5

   the deletion attacks that EndOfEarlyData prevents in TLS. Servers SHOULD aggressively age out the epoch 1 keys upon receiving the first epoch 2 record and SHOULD NOT accept epoch 1 data after the first

I'm not disagreeing with the sentiment, but it seems like following the "SHOULD age out" recommendation could lead to a stall of application data if the flight with client's Finished has to get retransmitted or throttled by congestion control.

Section 5.6

Numbering the flights like this with absolute identifiers could be quite useful, but the current formulation leaves a bit to be desired, since we don't have much consistency in numbering across the various scenarios. If we are going to have to fall back to "client's second flight" to refer to the given scenario in question, then perhaps it is not worth giving different numbers to client vs server flight.

   Figure 6: Message flights for a full DTLS Handshake (with cookie exchange)

I'd consider (but possibly not actually end up) noting that flights 2 and 3 are skipped when the cookie exchange is not needed.

It's also a bit surprising to see pre_shared_key as an important/noteworthy extension in the sample full (i.e., non-resumption) handshake alongside key_share.

   Figure 8: Message flights for the Zero-RTT handshake

Why do we include psk_key_exchange_modes for the zero-RTT example but not the other ones? I don't think it's particularly more notable for 0-RTT than other handshakes.

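
Returning to the Section 5.4 question about overlapping fragment ranges: one possible receiver policy (an assumption for illustration, not something the draft text quoted above specifies) is to require that overlapping bytes be identical and to reject the fragment otherwise. A minimal sketch, with a hypothetical Reassembler class:

```python
class Reassembler:
    """Buffers fragments of one handshake message of known total length."""

    def __init__(self, total_length: int):
        self.data = bytearray(total_length)
        self.have = [False] * total_length  # which byte positions are filled

    def add(self, offset: int, fragment: bytes) -> None:
        end = offset + len(fragment)
        if offset < 0 or end > len(self.data):
            raise ValueError("fragment outside message bounds")
        # First pass: check every overlapping byte before modifying anything.
        for i, b in enumerate(fragment, start=offset):
            if self.have[i] and self.data[i] != b:
                raise ValueError("overlap disagrees with buffered bytes")
        # Second pass: store the fragment.
        for i, b in enumerate(fragment, start=offset):
            self.data[i] = b
            self.have[i] = True

    def complete(self) -> bool:
        return all(self.have)
```

A real implementation would also bound total_length and the number of in-progress messages, given the DoS consideration noted above for "MUST buffer".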
   Note: The application data sent by the client is not included in the timeout and retransmission calculation.

This note also appears a little out of place here, since we don't really get into timeout and retransmission until the next section.

Section 5.7.1

The state machine says "receive record, send ACK"; does that hold for all records? (I guess maybe it does, for all records that do not complete a flight.)

   In the PREPARING state, the implementation does whatever computations are necessary to prepare the next flight of messages. It then buffers them up for transmission (emptying the buffer first) and enters the SENDING state.

What is meant by "emptying the buffer first"? Surely I need to keep the messages I am sending buffered in case I have to retransmit them, and if I am in PREPARING I have to have finished sending my previous flight (if any) first...

   There are four ways to exit the WAITING state:

   1. The retransmit timer expires: the implementation transitions to the SENDING state, where it retransmits the flight, resets the retransmit timer, and returns to the WAITING state.

Should there be a fifth way, involving failing the handshake by hitting a retransmission cap?

   4. The implementation receives some or all next flight of messages: if this is the final flight of messages, the implementation transitions to FINISHED. If the implementation needs to send a new flight, it transitions to the PREPARING state. Partial reads (whether partial messages or only some of the messages in the flight) may also trigger the implementation to send an ACK, as described in Section 7.1.

I don't understand the "some or" part; shouldn't the state machine need a complete next flight in order to transition states?

   In addition, for at least twice the default MSL defined for [RFC0793], when in the FINISHED state, the server MUST respond to retransmission of the client's second flight with a retransmit of its ACK.

"second flight" does not seem to allow for all the various possibilities for handshake structure we have, with HRR, resumption, etc.

   the first side's Finished message. Implementations MUST either discard or buffer all application data records for the new epoch until they have received the Finished message for that epoch.

On my first read I equated "that epoch" with "the new epoch", which doesn't make sense. Perhaps "received the Finished message for the current epoch" or even just "for epoch 3", since the post-handshake auth Finished messages are not limited to one per epoch?

Section 5.7.2

   many instances of a DTLS time out early and retransmit too quickly on a congested link. Implementations SHOULD use an initial timer value of 100 msec (the minimum defined in RFC 6298 [RFC6298]) and double the value at each retransmission, up to no less than the RFC 6298 maximum of 60 seconds. Application specific profiles, such as those

The wording here is a bit amusing, as "up to no less than the ... maximum" is facially nonsensical, but the RFC 6298 setting is in fact the floor for the implementation-defined maximum. I don't have a clever wording suggestion, though.

   sensitive applications. Because DTLS only uses retransmission for handshake and not dataflow, the effect on congestion should be minimal.

Perhaps a cautionary note about large certificate chains is in order, though?

   Implementations SHOULD retain the current timer value until a transmission without loss occurs, at which time the value may be

Does "transmission without loss" mean a full flight? I'm not seeing a way to read this that lets us continue to transmit with a given value of the timer, as opposed to resetting it to the initial value or having to back off due to loss.

   reset to the initial value. After a long period of idleness, no less than 10 times the current timer value, implementations may reset the timer to the initial value.

Should this be a 2119 MAY?

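
The reading of the Section 5.7.2 timer text described above (100 msec initial value, doubling per retransmission, with 60 seconds acting as the floor for an implementation-chosen cap) can be sketched as follows. The function name and the cap_ms parameter are hypothetical; this is one interpretation of the quoted text, not the draft's own algorithm.

```python
INITIAL_TIMEOUT_MS = 100   # initial timer value from the quoted text
RFC6298_MAX_MS = 60_000    # the 60-second RFC 6298 value from the quoted text


def backoff_schedule(retransmissions, cap_ms=RFC6298_MAX_MS):
    """Return the timer value (in ms) in effect for each of the given sends."""
    if cap_ms < RFC6298_MAX_MS:
        # "up to no less than": the implementation's cap must be >= 60 s.
        raise ValueError("cap must not be below 60 seconds")
    schedule = []
    timeout = INITIAL_TIMEOUT_MS
    for _ in range(retransmissions):
        schedule.append(timeout)
        timeout = min(timeout * 2, cap_ms)  # double, but never exceed the cap
    return schedule
```

With the default cap, the value reaches 60 seconds on the eleventh entry; an implementation choosing a larger cap (say 120 seconds) simply keeps doubling for longer.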
Section 5.10

Do we want to say anything about the (non-)utility of "close_notify" given that we don't deal with in-order or reliable transport? Or that a DTLS implementation just drops packets instead of sending "bad_record_mac"?

Section 5.11

   Note: it is not always possible to distinguish which association a given record is from. For instance, if the client performs a handshake, abandons the connection, and then immediately starts a new handshake, it may not be possible to tell which connection a given protected record is for. In these cases, trial decryption MAY be necessary, though implementations could use CIDs.

This doesn't seem like a normative MAY but rather a statement of fact.

Section 6

Figure 11 seems to show that the initial ServerHello has message_seq=1, but §5.2 says that "[t]he first message each side transmits in each association always has message_seq = 0". Which one is it? (A change here would affect all the server's messages except the final ACK.)

Also in Figure 11, the client has to send an empty ACK because Record 1 could only be ACK'd in epoch 2, but the client doesn't have the epoch 2 keys yet. We should at least forward-reference §7.1 and acknowledge (pun intended) that the empty ACK is correct in this case even if we don't go into the details of why it is correct yet.

Section 6.1

   Using these reserved epoch values a receiver knows what cipher state has been used to encrypt and integrity protect a message. Implementations that receive a payload with an epoch value for which no corresponding cipher state can be determined MUST generate a "unexpected_message" alert. For example, if a client incorrectly uses epoch value 5 when sending early application data in a 0-RTT exchange. A server will not be able to compute the appropriate keys and will therefore have to respond with an alert.

Why would such erroneous epoch=5 usage fall into "unexpected_message" territory as opposed to the normal silent discard of records that fail deprotection (per §4.5.2)?

Figure 12 has "[HelloRetryRequest]" in brackets, but HRR is not protected under the application traffic keys. IMO it would be appropriate to just list "HelloRetryRequest" (without a previous "ServerHello") since we have a disclaimer at the top of the document that we are pretending it is a separate message for purposes of documentation.

I think it would also be useful to actually show the KeyUpdate message(s) in Figure 12, especially since we admit asymmetric KeyUpdate. Discussion of separately tracking send and receive epoch would also be appropriate...

Section 7

   During the handshake, ACKs only cover the current outstanding flight (this is possible because DTLS is generally a lockstep protocol). Thus, an ACK from the server would not cover both the ClientHello and the client's Certificate. Implementations can accomplish this by clearing their ACK list upon receiving the start of the next flight.

I wonder if it is helpful to mention here that ACK is not needed if the endpoint can proceed to start sending the next flight (since sending messages in the next flight implicitly ACKs the entire previous flight).

   Implementations SHOULD simply use the highest current sending epoch, which will generally be the highest available. After the handshake,

Will there generally be parity between send and receive epochs? I am not sure that such parity would be needed for largely asymmetric traffic flows.

Section 7.1

   1. Handshake flights other than the client's final flight

From context this means "initial handshake flights" but we don't really have a specific definition that "handshake flight" excludes post-handshake messages. Perhaps it's worth a few more words to clarify.

   through the responding flight.
   A notable example for this is the case of post-handshake client authentication in constrained environments, where generating the CertificateVerify message can take considerable time on the client. All other flights MUST be ACKed.

(non-post-handshake client authentication would also take the same amount of time, right?)

Section 7.3

   retransmission in previous versions of DTLS. For instance in the flow shown in Figure 11 if the client does not send the ACK message when it received and processed record 1 indicating loss of record 0, the entire flight would be retransmitted. When DTLS 1.3 is used in

I don't think that the client can "process" record 1 per se, since the contents are encrypted in the handshake keys and the client doesn't have the ServerHello yet. So shouldn't this just be "when it received record 1"?

   The use of the ACK for the second case is mandatory for the proper functioning of the protocol. For instance, the ACK message sent by the client in Figure 12, acknowledges receipt and processing of record 2 (containing the NewSessionTicket message) and if it is not

The records in Figure 12 are not labeled (anymore?).

   sent the server will continue retransmission of the NewSessionTicket indefinitely.

s/indefinitely/until its retransmission cap is reached/?

Section 8

Discussion of asymmetric KeyUpdate and the need to separately track send and receive epoch would be welcome here.

Section 9

Some discussion of what types of events might trigger a peer to start using "spare" CIDs (and, presumably, stop using the old ones) could be helpful.

There's no reason we might want to put Extensions in either NewConnectionId or RequestConnectionId, right?

   opaque ConnectionId<0..2^8-1>;

Are we keeping with draft-ietf-tls-dtls-connection-id and forbidding zero-length connection IDs? If so, that should be indicated in the presentation language.

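
For illustration, the difference the ConnectionId question makes to a parser is small. The sketch below decodes a TLS-presentation-language vector with a one-byte length prefix (as `opaque ConnectionId<0..2^8-1>` implies) and optionally rejects the empty case; the function name and the allow_empty flag are hypothetical, and whether empty CIDs should be rejected is exactly the open question above.

```python
from typing import Tuple


def parse_connection_id(buf: bytes, allow_empty: bool = True) -> Tuple[bytes, bytes]:
    """Decode one ConnectionId from buf; return (cid, remaining bytes)."""
    if len(buf) < 1:
        raise ValueError("missing length byte")
    length = buf[0]                      # one-byte length prefix (max 2^8-1)
    if len(buf) < 1 + length:
        raise ValueError("truncated ConnectionId")
    if length == 0 and not allow_empty:
        # Corresponds to a <1..2^8-1> bound in the presentation language.
        raise ValueError("zero-length ConnectionId not permitted")
    return buf[1:1 + length], buf[1 + length:]
```

With allow_empty=False the parser behaves as if the declaration were `opaque ConnectionId<1..2^8-1>`.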
   Endpoints SHOULD respond to RequestConnectionId by sending a NewConnectionId with usage "cid_spare" containing num_cid CIDs soon as possible. Endpoints MUST NOT send a RequestConnectionId message when an existing request is still unfulfilled; this implies that endpoints needs to request new CIDs well in advance. An endpoint MAY ignore requests, which it considers excessive (though they MUST be acknowledged as usual).

This seems like it sets up a deadlock scenario. Consider peers A and B; A sends RequestConnectionId with num_cids=200 and B considers this request excessive, so ACKs it but sends no NewConnectionId in response. A is prohibited from sending another RequestConnectionId to indicate that it really needs new CIDs, since the requested 200 have not arrived yet; thus, B could end up not sending any NewConnectionIds even though A is essentially blocking on getting them. Unless "unfulfilled" is supposed to mean "not ACKed"?

Section 11

It is probably worth giving a few reminders about cookie generation/requirements and cookie (non-)reuse. RFC 2522 has some useful considerations, but we don't currently incorporate them by reference. Concrete recommendations for cookie key rotation intervals might even be in order.

Section 5.7.3 has a discussion about the need for distinct state machines to be running concurrently, and interplay between the handshake layer and the record layer state machine that is needed in order to dispatch records across state machines. Getting this right seems security relevant, so a reminder in this section might be in order.

TLS 1.3 places fairly stringent requirements on non-replayability of 0-RTT data; this would typically be done at the session ticket or ClientHello level. DTLS has optional per-record replay protection.
As far as I can tell, the two are orthogonal; that is, there is no inherent need to use DTLS replay protection with 0-RTT data (but there is still need to prevent replay of ClientHellos, and thus the sets of data that follow them). It may be worth reiterating that the two mechanisms play different roles, and the one cannot replace the other.

If there is a large amount of skew between the send and receive epochs, the implementation will have to decide whether to keep around application traffic secrets or generate the various traffic keys and discard the oldest traffic secrets.

   With the exception of order protection and non-replayability, the security guarantees for DTLS 1.3 are the same as TLS 1.3. While TLS always provides order protection and non-replayability, DTLS does not provide order protection and may not provide replay protection.

We might also mention non-reliability of application data here (though that is to large extent a property of the underlying transport).

   If implementations process out-of-epoch records as recommended in Section 8, then this creates a denial of service risk since an adversary could inject records with fake epoch values, forcing the recipient to compute the next-generation application_traffic_secret using the HKDF-Expand-Label construct to only find out that the message was does not pass the AEAD cipher processing. [...]

I think this text is stale, since we no longer do implicit key update on seeing a new epoch -- we require explicit KeyUpdate+ACK before the new keys are used. An implementation can still process records under an old epoch, of course, but those keys would generally already be around and not require additional computation to produce.

   The security and privacy properties of the CID for DTLS 1.3 builds on top of what is described in [I-D.ietf-tls-dtls-connection-id]. There are, however, several improvements: [...]
   - With multi-homing, an adversary is able to correlate the communication interaction over the two paths, which adds further privacy concerns. In order to prevent this, implementations

(editorial) What is described here is not an "improvement"; this is a description of the issue that we provide an improvement in relation to. Reordering this bullet point to have the benefit/improvement be in the first sentence would improve parallelism of writing structure. Perhaps "the ability to use multiple CIDs allows for improved privacy properties in multi-homed scenarios. When only a single CID is in use on multiple paths from such a host, an adversary is able to correlate [...]"?

   - Switching CID based on certain events, or even regularly, helps against tracking by on-path adversaries but the sequence numbers can still allow linkability. For this reason this specification defines an algorithm for encrypting sequence numbers, see

(editorial) similarly, this could become "the mechanism for encrypting sequence numbers (Section 4.2.3) prevents trivial tracking by on-path adversaries that attempt to correlate the pattern of sequence numbers received on different paths; such tracking could occur even when different CIDs are used on each path, in the absence of sequence number encryption".

   encrypted. This may improve correlation of packets from a single connection across different network paths.

I feel like the small width of the epoch field mitigates this somewhat (though not fully).

Section 12. Changes to DTLS 1.2

This section is about changes *since* DTLS 1.2 (and I propose some wording tweaks in my editorial PR). But I think we should also consider whether we do need a section on "changes to DTLS 1.2", or rather "changes affecting DTLS 1.2 implementations", along the lines of https://tools.ietf.org/html/rfc8446#section-1.3 ("Updates Affecting TLS 1.2").

Section 13

[I made some comments earlier that are expected to lead to new text in the IANA Considerations.]
   IANA is requested to allocate a new value in the "TLS ContentType" registry for the ACK message, defined in Section 7, with content type 26. The value for the "DTLS-OK" column is "Y". IANA is requested to reserve the content type range 32-63 so that content types in this range are not allocated.

With 0-25 already allocated or "requires coordination" and 64-255 at "requires coordination", this leaves us with like 6 more content types. We go through them slowly, so I'm not particularly concerned; just pointing it out.

Section 14.2

We say "contains a stateless cookie generated using the technique of [RFC2522]" which seems to require usage/understanding of RFC 2522, thus promoting it to normative. (That seems silly, so I recommend rewording rather than recategorizing the reference.)

Appendix A

   %%## Record Layer
   %%## Handshake Protocol
   %%## ACKs
   %%## Connection ID Management

We seem to be missing the magic "%%%" markers in the markdown source that allow this automation to work. But we also seem to be wrapping at least some of the figures in "~~~" and I don't know how those interact, so I didn't try to fix this myself.

Thanks,

Ben