[Masque] QUIC proxying and stateless reset

Martin Thomson <mt@lowentropy.net> Thu, 17 November 2022 00:39 UTC

Message-Id: <9b01d7e4-3ad4-4baf-9e94-6f80c9f33451@betaapp.fastmail.com>
Date: Thu, 17 Nov 2022 11:39:14 +1100
From: Martin Thomson <mt@lowentropy.net>
To: masque@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/masque/HbkmutqVdRYcFofFz0e43ZSF2No>

I was going to post about this in the context of draft-pauly-masque-quic-proxy, but then I realized that maybe this is a bigger topic that relates to all use of QUIC via a proxy.

# Intro

Stateless reset exists to handle the case where one party to a connection dies (or routing no longer works, or...). The failed endpoint needs a way of telling its peer to stop sending it stuff. It's called stateless because the endpoint might not retain per-connection state. Of course, it's a bit of a misnomer as the endpoint needs to retain enough state to generate the reset, so it isn't completely stateless.
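
To make the "enough state to generate the reset" point concrete, here is a minimal sketch. It assumes the endpoint keeps one long-lived key and derives each 16-byte token from the CID with HMAC-SHA256; that derivation is just one possible choice, and the function names are hypothetical. The packet layout (first two bits 01, random filler, token in the last 16 bytes, 21 bytes minimum) follows my reading of RFC 9000.

```python
import hashlib
import hmac
import os

TOKEN_LEN = 16   # stateless reset tokens are 16 bytes
MIN_RESET = 21   # smallest datagram that is checked for a token


def reset_token(static_key: bytes, cid: bytes) -> bytes:
    """Derive a token from a long-lived key and a connection ID (assumed scheme)."""
    return hmac.new(static_key, cid, hashlib.sha256).digest()[:TOKEN_LEN]


def build_stateless_reset(token: bytes, size: int = 43) -> bytes:
    """Build a datagram that looks like a short-header packet and ends in the token."""
    if size < MIN_RESET:
        raise ValueError("stateless reset is too small")
    filler = bytearray(os.urandom(size - TOKEN_LEN))
    filler[0] = (filler[0] & 0x3F) | 0x40  # header form bit 0, fixed bit 1
    return bytes(filler) + token


def is_stateless_reset(datagram: bytes, known_tokens) -> bool:
    """Compare the trailing 16 bytes against the tokens the peer gave us."""
    if len(datagram) < MIN_RESET:
        return False
    tail = datagram[-TOKEN_LEN:]
    return any(hmac.compare_digest(tail, t) for t in known_tokens)
```

The only thing that has to survive a crash here is the key: given a CID from an incoming packet, the endpoint can recompute the token without any per-connection state.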

In a three-party system, there are three entities that are in this position. This creates a new design problem. Though not particularly challenging, it is worth exploring a little.

# Requirements

We want the client, proxy, and server each to be able to send a message that will cause the other actors in the system to stop.

* The client needs to tell the server to stop.
* Similarly, the server needs to tell the client to stop.
* Finally, the proxy needs to tell both client AND server to stop.

When endpoints tell each other to stop, the proxy doesn't really need to be involved, as it doesn't originate packets. The last requirement, however, affects all three parties.

Obviously, clients and servers could keep the proxy out of the loop, operating end-to-end. That works, but is not resilient to loss of state at the proxy.

# Clients

Clients that die are probably the least interesting.

A dead client will be evident to the proxy by virtue of having lost its connection to the proxy (i.e., its control channel). This means that a client doesn't strictly need to tell the proxy about the stateless reset token (SRT) that it uses. However, a dead control channel might take a long time to become evident if it isn't used that much.

A client does need to tell both the proxy and the server that it isn't coming back. For this, the proxy might benefit from knowing what SRT the client might generate. When the client registers a CID, it can tell the proxy the corresponding stateless reset token. The proxy can remember this.

The proxy needs to be able to inform the server about a dead client either way. The proxy might just forward the SRT from the client, but we'll see later that this isn't practical.
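
As a rough illustration of the proxy remembering client SRTs, here is a sketch. The class and method names are hypothetical and do not come from draft-pauly-masque-quic-proxy; it only assumes that the client hands over a 16-byte SRT with each CID it registers and that the proxy checks the trailing 16 bytes of datagrams arriving from the client side.

```python
import hmac
from typing import Dict, Optional

TOKEN_LEN = 16
MIN_RESET = 21


class ClientRegistrations:
    """Hypothetical proxy-side table: client CID -> SRT the client advertised."""

    def __init__(self) -> None:
        self._srt_by_cid: Dict[bytes, bytes] = {}

    def register(self, client_cid: bytes, srt: bytes) -> None:
        if len(srt) != TOKEN_LEN:
            raise ValueError("SRT must be 16 bytes")
        self._srt_by_cid[client_cid] = srt

    def dead_client_cid(self, datagram: bytes) -> Optional[bytes]:
        """Return the registered CID whose SRT ends this datagram, if any."""
        if len(datagram) < MIN_RESET:
            return None
        tail = datagram[-TOKEN_LEN:]
        for cid, srt in self._srt_by_cid.items():
            if hmac.compare_digest(tail, srt):
                return cid
        return None

    def forget(self, client_cid: bytes) -> None:
        """Drop the registration once the client is known to be gone."""
        self._srt_by_cid.pop(client_cid, None)
```

On a match the proxy knows which client CID is dead and can clean up, then signal the server by whatever means it has.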

# Servers

Servers are not quite the same as clients, but they mostly are. The server stateless reset won't be routable unless the proxy knows about it. So the client needs to tell the proxy about the SRT that corresponds to the server CID. The proxy could, again, just forward any SRT it receives to the client, using its state.

# Proxies

Now that the proxy knows the SRT from both endpoints, it can kill the connection in both directions. Mission accomplished, right? No. The proxy can lose state too.

We could solve this by adding an extra SRT to every CID, exclusively for signaling that the proxy is dead, but that means changing QUIC. This is best solved with another mapping.

The proxy needs to take packets it receives, using CIDs that it chose, and produce an SRT from those. This means that instead of forwarding the client SRT, the proxy should generate an SRT alongside any CID that it tells the client to pass to the server (an SRT that the server might use). Similarly, instead of forwarding the server SRT, the proxy should generate an SRT alongside the CID that it tells the client to use.

Now we have this (arrows indicate flow of packets):

Client [CID, SRT]_c <----- Proxy [CID, SRT]_pc <----- Server

The client creates []_c for receiving packets, sends those to the proxy, receives []_pc, then sends the server []_pc in a NEW_CONNECTION_ID frame.

To the earlier point, the client could omit the SRT from []_c and rely on the death of the control channel to inform the proxy. For the purposes of symmetric operation and timely detection of failures, I would prefer to keep that signal.

Client -----> [CID, SRT]_ps Proxy -----> [CID, SRT]_s Server

The client receives []_s from the server in NEW_CONNECTION_ID, sends that to the proxy and receives []_ps for use in sending packets.
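
A sketch of the mapping in one direction might look like the following. Everything here is an assumption for illustration: the class name, the fixed 8-byte proxy CIDs (short headers carry no CID length), the HMAC-based SRT derivation, and the idea that the proxy rewrites the CID when forwarding. The part that matters is that the SRT for a proxy-chosen CID can be recomputed from the CID and a static key alone, so a proxy that has lost its table can still reset the sender.

```python
import hashlib
import hmac
import os

TOKEN_LEN = 16
CID_LEN = 8      # assumed fixed length for proxy-chosen CIDs
MIN_RESET = 21


class MappingProxy:
    """Hypothetical proxy issuing [CID, SRT]_ps in exchange for [CID, SRT]_s."""

    def __init__(self, static_key: bytes) -> None:
        self.key = static_key
        self.routes = {}  # proxy CID -> (endpoint CID, endpoint SRT); lost on crash

    def srt_for(self, proxy_cid: bytes) -> bytes:
        return hmac.new(self.key, proxy_cid, hashlib.sha256).digest()[:TOKEN_LEN]

    def allocate(self, endpoint_cid: bytes, endpoint_srt: bytes):
        """Pick a proxy CID for an endpoint CID and return it with its SRT."""
        proxy_cid = os.urandom(CID_LEN)
        self.routes[proxy_cid] = (endpoint_cid, endpoint_srt)
        return proxy_cid, self.srt_for(proxy_cid)

    def handle(self, datagram: bytes):
        """Forward with the CID rewritten, or answer with a stateless reset."""
        proxy_cid = datagram[1:1 + CID_LEN]
        route = self.routes.get(proxy_cid)
        if route is not None:
            endpoint_cid, _ = route
            return "forward", datagram[:1] + endpoint_cid + datagram[1 + CID_LEN:]
        size = len(datagram) - 1  # keep the reset smaller than its trigger
        if size < MIN_RESET:
            return "drop", b""
        filler = bytearray(os.urandom(size - TOKEN_LEN))
        filler[0] = (filler[0] & 0x3F) | 0x40
        return "reset", bytes(filler) + self.srt_for(proxy_cid)
```

A second instance built with the same key and an empty table stands in for a restarted proxy: a packet carrying a previously issued proxy CID then produces a reset ending in the same SRT the client was given earlier. The same construction applies to []_pc in the other direction.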

# Multiple Paths

Obviously, if the path via the proxy is just one path between client and server, this gives the proxy the ability to kill the entire connection across all paths. That's not ideal, so endpoints might choose to regard a reset on one path as not sufficient cause to terminate the entire connection.
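
One possible endpoint policy, sketched with hypothetical names: treat a stateless reset as killing only the path it arrived on, and close the connection only when no usable path is left. This is an illustration of the idea, not a description of any multipath QUIC specification.

```python
class MultipathConnection:
    """Hypothetical connection that tracks which paths are still usable."""

    def __init__(self, path_ids) -> None:
        self.active = set(path_ids)

    def on_stateless_reset(self, path_id) -> bool:
        """Abandon the path; return True only if the whole connection must close."""
        self.active.discard(path_id)
        return not self.active
```

With a direct path and a path via the proxy, a reset on the proxied path leaves the direct path running.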

# With Tunnels

If the proxy dies, a tunnel will die, meaning that the client is informed. The server is not. This means that it might still be beneficial to have the client advertise CIDs - or at least SRTs - chosen by the proxy. That would allow a crashed proxy to tell servers to stop.

A tunnel also means that the client->proxy signal really isn't needed as the outer connection will have its own SRT. However, the proxy might benefit from a way to indicate to the server that a client is dead. That also suggests that there is some value in having clients use a CID chosen by - or at least known to - the proxy.

The server->proxy signal isn't really needed here. The death of a server can be indicated through the tunnel. The proxy might benefit from knowledge of the server's SRT, but I don't see a driving need. This means that when tunnelling, clients can safely drop the flow where the proxy learns the server CID/SRT. Obviously, there is no need for the proxy to generate a new CID/SRT for the client to use in that direction.
