[TLS] Re: DTLS 1.3 ACK sorting, and when to clear the ACK buffer
Ilari Liusvaara <ilariliusvaara@welho.com> Mon, 04 November 2024 17:19 UTC
Date: Mon, 04 Nov 2024 19:18:56 +0200
From: Ilari Liusvaara <ilariliusvaara@welho.com>
To: "<tls@ietf.org>" <tls@ietf.org>
Message-ID: <ZykCAFMIEHhZRNZg@LK-Perkele-VII2.locald>
References: <CAF8qwaCKtjis=h6npApRAtT=AbHtDHO333zAXKPMowH_V5K8NQ@mail.gmail.com> <ZyeaTZctp5mX-ck3@LK-Perkele-VII2.locald> <CAF8qwaDkEbwFHd7teVY4c-EZ7JO5eBigehUEF21ZA3VPRbwJNQ@mail.gmail.com>
In-Reply-To: <CAF8qwaDkEbwFHd7teVY4c-EZ7JO5eBigehUEF21ZA3VPRbwJNQ@mail.gmail.com>
List-Id: "This is the mailing list for the Transport Layer Security working group of the IETF." <tls.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/q9gZ0G0Hxh8Fx4fi0EiMUNc8Ts0>
On Mon, Nov 04, 2024 at 01:52:06PM +0000, David Benjamin wrote:
> On Sun, Nov 3, 2024 at 3:50 PM Ilari Liusvaara <ilariliusvaara@welho.com>
> wrote:
>
> > On Sun, Nov 03, 2024 at 12:49:59PM +0000, David Benjamin wrote:
> >
>
> The spec also recommends you keep your older epochs around for a spell in
> case of packet reordering. That can also cause you to see the older epoch
> even after the handshake has progressed past it. I'm not sure if there's
> any benefit to doing this specifically during the handshake, although it
> might let you see an older ACK. Seeing that older ACK may be unnecessary if
> you do the epoch-aware implicit ACK you describe, but neither that nor
> epoch management in general is described in the document. (I think the
> very, very badly needed rfc9147bis should fix the latter at least. Adding
> your extra implicit ACK case seems reasonable too.)

Yes, it is reasonable to keep older epochs for a bit for possible
reordered application data. But since this is merely a SHOULD, things
should still work even if all handshake records from an older epoch are
ignored.

There might be an edge case where, if the client ignores all handshake
records from an older epoch, the server has to do an epoch 0 implicit ACK
in order to avoid a deadlock.

> > There is subtle edge case where this can cause outright failures:
> >
> > - Sender that implements only linear tracking.
> > - Very large flight that gets split into lots of records.
> > - Some of those records get evicted from ACK buffer before being ACKed.
> > - Flight has no response, or response is blocked.
> >
> > In this scenario, no data will get through, and the sender will just
> > re-transmit the flight forever.
> >
> > To counter this, if ACK buffer fills with unacknowledged records, one
> > should immediately send ACK. If the first record in transmission was
> > received and that ACK makes it through, it will cause forward progress.
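For illustration only, the scenario quoted above can be played through in a
toy model. This is a rough sketch under loud assumptions, not taken from
any implementation: records are identified by their position in the flight
(real DTLS 1.3 retransmissions get fresh record numbers), nothing is lost
on the wire, the receiver's ACK buffer keeps only the most recently
received entries, and the "linear tracking" sender remembers nothing but
the length of the contiguous ACKed prefix. The names run_round and
rounds_to_finish are made up for this example.

```python
from collections import deque


def run_round(num_records, ack_capacity, acked_prefix, flush_when_full):
    """Send records acked_prefix..num_records-1 once; return the new prefix.

    Hypothetical model: the receiver's ACK buffer holds at most
    ack_capacity entries and evicts the oldest one when it overflows.
    """
    ack_buffer = deque(maxlen=ack_capacity)  # oldest entry dropped when full
    acked = set()  # everything the sender sees ACKed in this round

    for record in range(acked_prefix, num_records):
        ack_buffer.append(record)
        if flush_when_full and len(ack_buffer) == ack_capacity:
            # Buffer is full: send an ACK right away and start over.
            acked.update(ack_buffer)
            ack_buffer.clear()

    # The ACK timer fires after the round and ACKs whatever is buffered.
    acked.update(ack_buffer)

    # Linear tracking: the sender only advances a contiguous prefix.
    while acked_prefix in acked:
        acked_prefix += 1
    return acked_prefix


def rounds_to_finish(num_records, ack_capacity, flush_when_full, max_rounds=50):
    """Return the number of rounds needed, or None if no round ever finishes."""
    prefix = 0
    for round_no in range(1, max_rounds + 1):
        prefix = run_round(num_records, ack_capacity, prefix, flush_when_full)
        if prefix == num_records:
            return round_no
    return None
```

In this model, a 100-record flight against an 8-entry buffer never
completes without the flush (the first record is always evicted before the
ACK goes out), while flushing on a full buffer lets the prefix advance.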
> >
>
> I'm not sure that will actually prevent forward progress, though I may be
> misunderstanding your example. In the worst case, you will manage to ACK,
> say, the last 32K of that flight. The peer will then retransmit all but the
> last 32K, you'll ACK the last 32K of that, and so on until the whole flight
> gets ACKed. This is not amazing, but it's still forward progress. And given
> each record number covers about an MTU's worth of handshake data, you don't
> need much ACK buffer to avoid this or make its effects minimal. Though I
> agree flushing the ACK buffer when full is a sensible implementation
> strategy (though also not mentioned by the specification).

Maybe such a sender would be too simplistic, and all senders should
implement full tracking. But with such a simplistic sender, that case
would not make forward progress and would livelock.

However, when sending, one should be mindful of implementations with
limited tracking (on the receive side):

- Do not have the fragment sequence jump back.
- Do not have multi-message records unless all but the last are
  guaranteed to be complete.

What is the current maximum for the number of messages in one flight? 6
(ServerHello, EncryptedExtensions, CertificateRequest, Certificate,
CertificateVerify, Finished)?

> > Above is one case where one wants last records sent (highest RSN), not
> > last records received.
> >
>
> I'm not sure I follow. In that example, there are more unACKed records than
> fit in the buffer at all, so neither eviction algorithm will ACK
> everything. I'm not seeing how prioritizing the highest RSN improves
> things. More generally, the last records received are the one that you
> haven't ACKed yet, so when there aren't eviction problems, those are the
> ones to prioritize.

Yeah, it is probably too marginal to be worthwhile.

> > There is in fact another subtle edge case which requires ACKing stuff
> > from previous flight:
> >
> > - Sender sends flight that has no response or response is blocked.
> > - The complete flight comes through, but the last ACK is lost.
> > - Sender re-transmits the (possibly partial) flight.
> >
> > The receiver considers flight complete, but sender does not. Getting
> > things unstuck requires ACKing stuff from previous flight.
> >
> > However, this does not require keeping the records from previous flight
> > in the list.
> >
>
> Well, there's two ways to get it unstuck. You could either explicitly ACK
> the previous flight, or just start sending your reply, which implicitly
> ACKs that flight.

If the blocking flight has a response, then an explicit ACK is required to
avoid deadlock (to get out of the state where both sides have a flight in
progress). But this can only happen post-handshake.

> > When the old fragment is in the *current* flight, the
> > peer may have lost an earlier ACK and not realize they can stop
> > retransmitting. It's only old fragments in *previous* flights that are
> > unnecessary to ACK, but the specification does not suggest to distinguish
> > them. (Distinguishing them would require extra state in the record layer
> > to store a low watermark for the flight, and that seems a waste. There's
> > no real harm in adding that record to the ACK buffer.)
>
> The specification *does* suggest distinguishing them, but in a roundabout
> way. Section 7.1 says (emphasis mine):
>
> > When an implementation detects a disruption *in the receipt of the
> > current incoming flight*, it SHOULD generate an ACK that covers the
> > messages from *that flight* which it has received and processed so far.
>
> That text suggests that you should only be ACKing the current incoming
> flight, which in turn suggests you shouldn't ACK flights from previous
> incoming flights, which in turn means tracking the boundary of the current
> and previous incoming flight. That has one attractive property.
> During the handshake (more on post-handshake later), if you also fix the
> buffer-clearing behavior, it means the ACK-send timer and retransmit timer
> are never both active. If you've received all of flight N-1, sent N, but
> received none of N+1, the ACK buffer has been cleared and there's no ACK
> timer from N+1. If you've received part of N+1, that implicitly ACKs N so
> you start the ACK timer and shut off the transmit timer.

However, because peers can disagree on whether a flight is complete or
not, one cannot stop sending ACKs for the current flight until the next
flight starts. However, one may still flush the ACK buffer on a complete
flight.

> But all this breaks down with post-handshake messages, so this is not
> actually a useful invariant for your implementation. Moreover, how to
> interpret this 7.1 text with post-handshake messages is a little
> interesting. Section 5.8.4 just says you "duplicate" the state machine. But
> if we have a bunch multi-flight post-handshake transactions (right now
> post-handshake auth is the only one), we shouldn't have a design that asks
> the implementation track, indefinitely, which sequence numbers were part of
> a current vs past incoming flight, because that means memory usage grows
> indefinitely with each post-handshake transaction.

Yes, post-handshake is when things get interesting, as things might not
be in lockstep anymore. With suitable extensions (like the proposed
Extended Key Update), both sides can end up with a flight in progress at
the same time.

With regard to state-keeping, I think that it would be enough to track
the following:

- The first message sequence in the current flight.
- Current flight complete flag.
- First message sequence in next flight (if current flight complete).

... Which fits in 5 bytes (+ padding). This lets one recognize if a record
belongs to a past (ignore), current (process) or next (next becomes
current, and process) flight.
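A minimal sketch of that state, for illustration only. The names
(FlightTracker, classify, mark_complete, seq_lt) are hypothetical and not
from any existing implementation. It assumes the 16-bit handshake
message_seq and compares values modulo 2**16 in the serial-number style,
which is one way to do the comparison, not something the specification
prescribes.

```python
from dataclasses import dataclass


def seq_lt(a: int, b: int) -> bool:
    """True if 16-bit sequence a is before b, comparing modulo 2**16."""
    return a != b and ((b - a) & 0xFFFF) < 0x8000


@dataclass
class FlightTracker:
    """Two 16-bit sequences plus one flag: 5 bytes of state (+ padding)."""

    current_first: int = 0          # first message sequence of current flight
    current_complete: bool = False  # whole current flight has been received
    next_first: int = 0             # only meaningful if current_complete

    def mark_complete(self, next_first: int) -> None:
        """Record that the current incoming flight is fully received."""
        self.current_complete = True
        self.next_first = next_first & 0xFFFF

    def classify(self, message_seq: int) -> str:
        """Classify an incoming message as 'past', 'current' or 'next'."""
        if seq_lt(message_seq, self.current_first):
            return "past"  # earlier flight: ignore
        if self.current_complete and not seq_lt(message_seq, self.next_first):
            # First sighting of the next flight: it becomes the current one.
            self.current_first = self.next_first
            self.current_complete = False
            return "next"
        return "current"  # current flight, including retransmissions
```

A caller would ignore "past" messages, process (and ACK) "current" ones,
and treat "next" as the start of a new incoming flight.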
Using sequence number comparisons would even handle the case where the
message sequence wraps around (which I do not think is allowed).


-Ilari