[TLS] Re: DTLS 1.3 ACK sorting, and when to clear the ACK buffer

David Benjamin <davidben@chromium.org> Tue, 05 November 2024 11:48 UTC

References: <CAF8qwaCKtjis=h6npApRAtT=AbHtDHO333zAXKPMowH_V5K8NQ@mail.gmail.com> <ZyeaTZctp5mX-ck3@LK-Perkele-VII2.locald> <CAF8qwaDkEbwFHd7teVY4c-EZ7JO5eBigehUEF21ZA3VPRbwJNQ@mail.gmail.com> <ZykCAFMIEHhZRNZg@LK-Perkele-VII2.locald>
In-Reply-To: <ZykCAFMIEHhZRNZg@LK-Perkele-VII2.locald>
From: David Benjamin <davidben@chromium.org>
Date: Tue, 05 Nov 2024 11:47:44 +0000
Message-ID: <CAF8qwaB7pSUpNS5Y5U53Yx69_1V24rq-FJbXyV7yNYi-7UDTmg@mail.gmail.com>
To: Ilari Liusvaara <ilariliusvaara@welho.com>
CC: "<tls@ietf.org>" <tls@ietf.org>
Subject: [TLS] Re: DTLS 1.3 ACK sorting, and when to clear the ACK buffer
List-Id: "This is the mailing list for the Transport Layer Security working group of the IETF." <tls.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/hujpj35d7HRzhP1iQi_rnvIPIY0>

On Mon, Nov 4, 2024 at 5:19 PM Ilari Liusvaara <ilariliusvaara@welho.com>
wrote:

> > > There is subtle edge case where this can cause outright failures:
> > >
> > > - Sender that implements only linear tracking.
> > > - Very large flight that gets split into lots of records.
> > > - Some of those records get evicted from ACK buffer before being ACKed.
> > > - Flight has no response, or response is blocked.
> > >
> > > In this scenario, no data will get through, and the sender will just
> > > re-transmit the flight forever.
> > >
> > > To counter this, if ACK buffer fills with unacknowledged records, one
> > > should immediately send ACK. If the first record in transmission was
> > > received and that ACK makes it through, it will cause forward progress.
> > >
> >
> > I'm not sure that will actually prevent forward progress, though I may be
> > misunderstanding your example. In the worst case, you will manage to ACK,
> > say, the last 32K of that flight. The peer will then retransmit all but the
> > last 32K, you'll ACK the last 32K of that, and so on until the whole flight
> > gets ACKed. This is not amazing, but it's still forward progress. And given
> > each record number covers about an MTU's worth of handshake data, you don't
> > need much ACK buffer to avoid this or make its effects minimal. Though I
> > agree flushing the ACK buffer when full is a sensible implementation
> > strategy (though also not mentioned by the specification).
>
> Maybe such sender would be too simplistic, and all senders should
> implement full tracking. But with such simplistic sender, that case
> would not make forward progress and would livelock.
>
> However, when sending, one should be mindful of implementations with
> limited tracking (on receive side):
>
> - Do not have fragment sequence jump back.
> - Do not have multi-message records unless all but the last are
>   guaranteed to be complete.
>
> What is the current maximum for number of messages in one flight?
> 6 (ServerHello, EncryptedExtensions, CertificateRequest, Certificate,
> CertificateVerify, Finished)?
>

Ah, when you say simplistic sender and linear tracking, do you mean a
sender that only tracks the acknowledged prefix of the flight? I.e. if it
sends records 1, 2, 3, 4 and I ACK only a later record, not the first, it
cannot record that out-of-order ACK and ignores it? Yeah, I think I agree
that would indeed get stuck in this case. Maybe not the best plan for a
sender given that records can get reordered, but ah well. For non-final
flights, it's fine because
even if no ACKs get through, the peer will send the next flight which
implicitly ACKs everything. But for final flights, we do require that you
eventually manage to ACK the whole flight. There isn't a succinct way for
the peer to say "yeah, I got all that".
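
To make the difference between the two kinds of sender concrete, here is a
minimal sketch. Everything in it is hypothetical: the class names, the
modelling of ACKs as sets of fragment indices (rather than real record
numbers, which change on retransmission), and the assumption that the
receiver's ACK buffer only ever retains the newest few entries and that no
packets are lost.

```python
class LinearSender:
    """Hypothetical sender that only remembers an acknowledged prefix."""

    def __init__(self, fragments):
        self.fragments = list(fragments)
        self.acked_prefix = 0  # number of leading fragments known to be ACKed

    def on_ack(self, acked):
        # Advance only while the next fragment is covered by the ACK;
        # anything acknowledged beyond the first gap is ignored.
        while (self.acked_prefix < len(self.fragments)
               and self.fragments[self.acked_prefix] in acked):
            self.acked_prefix += 1

    def pending(self):
        return self.fragments[self.acked_prefix:]


class FullSender:
    """Hypothetical sender that tracks every fragment individually."""

    def __init__(self, fragments):
        self.unacked = list(fragments)

    def on_ack(self, acked):
        self.unacked = [f for f in self.unacked if f not in acked]

    def pending(self):
        return list(self.unacked)


def rounds_to_finish(sender, ack_buffer_size, max_rounds=10):
    """Retransmit until everything is ACKed; None means no progress was made.

    ack_buffer_size must be positive. Each round, the receiver is assumed to
    ACK only the last ack_buffer_size fragments it saw.
    """
    for round_number in range(1, max_rounds + 1):
        sent = sender.pending()
        sender.on_ack(set(sent[-ack_buffer_size:]))
        if not sender.pending():
            return round_number
    return None


print(rounds_to_finish(FullSender(range(6)), ack_buffer_size=2))
print(rounds_to_finish(LinearSender(range(6)), ack_buffer_size=2))
```

Under these assumptions the fully tracking sender finishes after a few
rounds of shrinking retransmissions, while the prefix-only sender never
advances, because the ACKs it receives never include its first fragment.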

On the receive side, I think simplistic receivers are fine? If the
simplistic receiver drops some fragments because they're in a funny order,
they shouldn't ACK it, so the sender will know to include it in the
retransmit.
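
As a rough illustration of why a simplistic receiver is safe, the sketch
below models a receiver that keeps a fragment only if it extends the
contiguous data it already holds. The class and method names are
hypothetical, and record numbers are plain integers here rather than
(epoch, sequence) pairs.

```python
class InOrderReassembler:
    """Hypothetical receiver that only accepts fragments in order."""

    def __init__(self, total_length):
        self.total_length = total_length
        self.received = 0      # contiguous bytes held so far
        self.ack_buffer = []   # record numbers whose contents we hold

    def on_fragment(self, record_number, offset, length):
        if offset + length <= self.received:
            # Duplicate of data we already hold: safe to ACK again.
            self.ack_buffer.append(record_number)
            return True
        if offset != self.received:
            # Out of order: dropped and NOT ACKed, so the sender
            # will include it in a later retransmission.
            return False
        self.received += length
        self.ack_buffer.append(record_number)
        return True

    def complete(self):
        return self.received == self.total_length


r = InOrderReassembler(total_length=300)
r.on_fragment(record_number=2, offset=100, length=100)  # arrives early, dropped
r.on_fragment(record_number=1, offset=0, length=100)    # accepted
r.on_fragment(record_number=3, offset=200, length=100)  # still a gap, dropped
# The sender sees only record 1 ACKed and retransmits the rest.
r.on_fragment(record_number=4, offset=100, length=100)
r.on_fragment(record_number=5, offset=200, length=100)
print(r.ack_buffer, r.complete())
```

The only rule that matters for correctness in this sketch is that a record
enters the ACK buffer only when its data was actually kept.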

6 sounds right. We use 7, but that's because we still support NPN and
Channel ID in (D)TLS 1.2, so the maximum for us becomes 1.2's
ClientKeyExchange, Certificate, CertificateVerify, NextProto, ChannelID,
ChangeCipherSpec, Finished. (ChangeCipherSpec isn't a message, but it takes
up a slot in the outgoing queue in our implementation right now.)

> > > There is in fact another subtle edge case which requires ACKing stuff
> > > from previous flight:
> > >
> > > - Sender sends flight that has no response or response is blocked.
> > > - The complete flight comes through, but the last ACK is lost.
> > > - Sender re-transmits the (possibly partial) flight.
> > >
> > > The receiver considers flight complete, but sender does not. Getting
> > > things unstuck requires ACKing stuff from previous flight.
> > >
> > > However, this does not require keeping the records from previous flight
> > > in the list.
> > >
> >
> > Well, there's two ways to get it unstuck. You could either explicitly ACK
> > the previous flight, or just start sending your reply, which implicitly
> > ACKs that flight.
>
> If the blocking flight has response, then explicit ACK is required to
> avoid deadlock (to get out of state where both sides have flight in
> progress). But this can only happen post-handshake.
>

Oh I see. The concern is a sender that is only willing to have one outgoing
flight at a time? So if both sides, say, send a post-handshake
CertificateRequest, neither will be willing to move on and send a
responding Certificate until things have cleared. I agree that this
situation is indeed a problem.
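
The deadlock can be reproduced with a toy model. The sketch below is only
an illustration under stated assumptions: `Endpoint`, `try_send` and
`both_finish` are hypothetical names, each flight is a single message,
nothing is lost, and an explicit ACK is modelled as instantly freeing the
sender's outgoing slot.

```python
class Endpoint:
    """Toy endpoint that allows at most one unacknowledged outgoing flight."""

    def __init__(self, explicit_acks):
        self.explicit_acks = explicit_acks
        self.outgoing = None   # the single in-flight message, or None
        self.queued = []       # messages waiting for the outgoing slot

    def try_send(self, peer):
        """Send the next queued message if the slot is free; report success."""
        if self.outgoing is not None or not self.queued:
            return False
        self.outgoing = self.queued.pop(0)
        peer.receive(self, self.outgoing)
        return True

    def receive(self, sender, message):
        if message == "CertificateRequest":
            self.queued.append("Certificate")   # we now owe a response
        elif message == "Certificate" and self.outgoing == "CertificateRequest":
            self.outgoing = None                # response implicitly ACKs our request
        if self.explicit_acks:
            sender.outgoing = None              # models an ACK that is not lost


def both_finish(explicit_acks, max_steps=10):
    """Both sides request a certificate at once; True if all replies get sent."""
    a, b = Endpoint(explicit_acks), Endpoint(explicit_acks)
    a.queued.append("CertificateRequest")
    b.queued.append("CertificateRequest")
    for _ in range(max_steps):
        sent_a = a.try_send(b)
        sent_b = b.try_send(a)
        if not (sent_a or sent_b):
            break  # neither side can make progress any more
    return not (a.queued or b.queued)


print(both_finish(explicit_acks=False))
print(both_finish(explicit_acks=True))
```

With only implicit ACKs, each side holds its CertificateRequest in the
outgoing slot while its Certificate response waits behind it, so neither
response is ever sent. With explicit ACKs the slot frees up and both
transactions complete.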

I'm not sure what's the best way out of that mess. The spec says you
"duplicate" the state machine, whatever that means, but having only one
outgoing flight is indeed a very sensible restriction; it's one I've been
considering for our implementation. The specification is incredibly
ambiguous around how parallel post-handshake transactions work! I still
need to figure out how we want to handle post-handshake messages. I also
don't intend to implement any multi-flight post-handshake transactions
because, well, we don't need them but, more importantly, I cannot make
sense of how to make them work well at all. (See all the threads I've
started here.)

Either way, I hope we can spend time on the badly-needed rfc9147bis to
improve this.


> > > When the old fragment is in the *current* flight, the
> > > peer may have lost an earlier ACK and not realize they can stop
> > > retransmitting. It's only old fragments in *previous* flights that are
> > > unnecessary to ACK, but the specification does not suggest to distinguish
> > > them. (Distinguishing them would require extra state in the record layer
> > > to store a low watermark for the flight, and that seems a waste. There's
> > > no real harm in adding that record to the ACK buffer.)
> >
> > The specification *does* suggest distinguishing them, but in a roundabout
> > way. Section 7.1 says (emphasis mine):
> >
> > > When an implementation detects a disruption *in the receipt of the
> > > current incoming flight*, it SHOULD generate an ACK that covers the
> > > messages from *that flight* which it has received and processed so far.
> >
> > That text suggests that you should only be ACKing the current incoming
> > flight, which in turn suggests you shouldn't ACK flights from previous
> > incoming flights, which in turn means tracking the boundary of the current
> > and previous incoming flight. That has one attractive property. During the
> > handshake (more on post-handshake later), if you also fix the
> > buffer-clearing behavior, it means the ACK-send timer and retransmit timer
> > are never both active. If you've received all of flight N-1, sent N, but
> > received none of N+1, the ACK buffer has been cleared and there's no ACK
> > timer from N+1. If you've received part of N+1, that implicitly ACKs N so
> > you start the ACK timer and shut off the transmit timer.
>
> However, because peers can disagree on if flight is complete or not,
> one can not stop sending ACKs for current flight until the next flight
> starts. However, one may still flush the ACK buffer on complete flight.
>
>
> > But all this breaks down with post-handshake messages, so this is not
> > actually a useful invariant for your implementation. Moreover, how to
> > interpret this 7.1 text with post-handshake messages is a little
> > interesting. Section 5.8.4 just says you "duplicate" the state machine. But
> > if we have a bunch multi-flight post-handshake transactions (right now
> > post-handshake auth is the only one), we shouldn't have a design that asks
> > the implementation track, indefinitely, which sequence numbers were part of
> > a current vs past incoming flight, because that means memory usage grows
> > indefinitely with each post-handshake transaction.
>
> Yes, post-handshake is when things get interesting, as things might not
> be in lockstep anymore. With suitable extensions (like the proposed
> Extended Key Update), both sides can end up with flight in progress at
> the same time.
>
> With regards to state-keeping, i think that it would be enough to
> track the following:
>
> - The first message sequence in the current flight.
> - Current flight complete flag.
> - First message sequence in next flight (if current flight complete).
>
> ... Which fits in 5 bytes (+ padding).
>
> This lets one recognize if record belongs to past (ignore), current
> (process) or next (next becomes current, and process) flight.
>

Hmm, I'm not sure if that's sufficient. While the sender might choose to
restrict themselves to only one outgoing flight at a time for simplicity, a
receiver can't assume that the sender has done so. The sender might have
decided to send 10 NewSessionTickets in parallel. On the receiver side, you
might receive them, maybe even in order, and then ACK them in sequence. But
perhaps the very first ACK got lost, so the sender retransmits the first
NewSessionTicket. It is expecting you to ACK it again, or it will keep
retransmitting.

That means that, post-handshake, the receiver needs to ACK messages
basically arbitrarily far back. That is fine because you still only need to
track the current sequence number and ACK any past ones without processing
anything. But it means the notion of "current flight" is everlasting. Every
post-handshake transaction leaves its final flight around as the "current
flight".

But now suppose you want to distinguish "current flight" from "past flight"
and only ACK current flights. *Now* you need per-transaction state to know
which old sequence numbers are from current vs past flights, in every
post-handshake transaction. This seems problematic, so I think we should
make sure the protocol works if you ACK all past sequence numbers
indiscriminately. But if that's simpler, maybe we should just say that's
the ACK policy and stop being fussy about only ACKing one flight at a time.
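
A minimal sketch of that indiscriminate policy follows. It is not taken
from any implementation: `PostHandshakeReceiver` and its methods are
hypothetical, each record is assumed to carry exactly one complete
handshake message, and record numbers are plain integers.

```python
class PostHandshakeReceiver:
    """Hypothetical receiver whose only state is the next expected message_seq."""

    def __init__(self):
        self.next_message_seq = 0
        self.ack_buffer = []  # record numbers to acknowledge

    def on_record(self, record_number, message_seq):
        if message_seq > self.next_message_seq:
            # From the future: this simplistic receiver neither buffers
            # nor ACKs it, so the sender will retransmit.
            return "drop"
        self.ack_buffer.append(record_number)
        if message_seq < self.next_message_seq:
            # Already processed, however long ago: ACK again, do nothing else.
            return "ack-only"
        self.next_message_seq += 1
        return "process"

    def flush_acks(self):
        acks, self.ack_buffer = self.ack_buffer, []
        return acks


receiver = PostHandshakeReceiver()
# Ten NewSessionTickets arrive in order, one per record.
for n in range(10):
    receiver.on_record(record_number=n, message_seq=n)
receiver.flush_acks()  # suppose the ACK covering the first ticket is lost

# The sender retransmits the first ticket in a fresh record.
print(receiver.on_record(record_number=10, message_seq=0))
print(receiver.flush_acks())
```

The state here is a single counter no matter how many post-handshake
transactions have happened, which is the property that per-transaction
flight tracking would give up.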


> Using sequence number comparisons would even handle the case where
> message sequence wraps around (which I do not think is allowed).
>

I do not think that is allowed. If it is, we should fix it so it is not. :-)

David