[Masque] QUIC proxying and stateless reset

Martin Thomson <mt@lowentropy.net> Thu, 17 November 2022 00:39 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/masque/HbkmutqVdRYcFofFz0e43ZSF2No>

I was going to post about this in the context of draft-pauly-masque-quic-proxy, but then I realized that maybe this is a bigger topic that relates to all use of QUIC via a proxy.

# Intro

Stateless reset exists to handle the case where one party to a connection dies (or routing no longer works, or...).  The failed endpoint needs a way of telling their peer to stop sending them stuff.  It's called stateless because they might not retain per-connection state.  Of course, it's a bit of a misnomer as they need to retain enough state to generate the reset, so they aren't completely stateless.
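To make "enough state" concrete: RFC 9000 (Section 10.3.2) describes deriving the token from a static key and the connection ID with a pseudorandom function.  A minimal sketch of that idea follows.  The names `STATIC_KEY` and `stateless_reset_token` are invented for this example, and HMAC-SHA256 is only one possible choice of function.

```python
import hashlib
import hmac

# Hypothetical static key.  In practice this would be a long-lived secret
# that survives a restart; it is the only state the endpoint has to keep.
STATIC_KEY = b"example-static-key-not-for-real-use"


def stateless_reset_token(static_key: bytes, cid: bytes) -> bytes:
    """Derive a 16-byte stateless reset token from a static key and a CID."""
    return hmac.new(static_key, cid, hashlib.sha256).digest()[:16]


# The same key and CID always give the same token, so no per-connection
# state is needed to regenerate it after a crash.
token = stateless_reset_token(STATIC_KEY, bytes.fromhex("0123456789abcdef"))
```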

In a three-party system, there are three entities that are in this position.  This creates a new design problem.  Though not particularly challenging, it is worth exploring a little.

# Requirements

We want the client, proxy, and server each to be able to send a message that will cause the other actors in the system to stop.

* The client needs to tell the server to stop.
* Similarly, the server needs to tell the client to stop.
* Finally, the proxy needs to tell both client AND server to stop.

When endpoints tell each other to stop, the proxy doesn't really need to be involved.  The proxy doesn't originate packets itself, but this last requirement affects all three parties.

Obviously, clients and servers could keep the proxy out of the loop, operating end-to-end.  That works, but is not resilient to loss of state at the proxy.

# Clients

Clients that die are probably the least interesting.  

A dead client will be evident to the proxy by virtue of having lost its connection to the proxy (i.e., its control channel).  This means that a client doesn't strictly need to tell the proxy about the stateless reset token (SRT) that it uses.  However, a dead control channel might take a long time to become evident if it isn't used that much.

A client does need to tell both the proxy and the server that it isn't coming back.  For this, the proxy might benefit from knowing what SRT the client might generate.  When the client registers a CID, it can tell the proxy the corresponding stateless reset token.  The proxy can remember this.
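A rough sketch of what "the proxy can remember this" might look like.  `ProxyState` and its methods are hypothetical names for illustration only; nothing here comes from a draft or an implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ProxyState:
    # Maps a client-chosen CID to the SRT the client registered with it.
    client_srt_by_cid: Dict[bytes, bytes] = field(default_factory=dict)

    def register_client_cid(self, cid: bytes, srt: bytes) -> None:
        """Record the SRT that the client supplied when registering a CID."""
        if len(srt) != 16:
            raise ValueError("stateless reset tokens are 16 bytes")
        self.client_srt_by_cid[cid] = srt

    def client_srt(self, cid: bytes) -> Optional[bytes]:
        """Look up the SRT for a registered CID, or None if unknown."""
        return self.client_srt_by_cid.get(cid)
```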

The proxy needs to be able to inform the server about a dead client either way.  The proxy might just forward the SRT from the client, but we'll see later that this isn't practical.

# Servers

Servers are not the same as clients, except that they mostly are.  A stateless reset from the server won't be routable unless the proxy knows about it.  So the client needs to tell the proxy about the SRT that corresponds to the server CID.  The proxy could, again, just forward any SRT it receives to the client, using its state.
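For illustration, here is roughly how a proxy could recognize a server's stateless reset once it knows the SRT.  A stateless reset ends in the 16-byte token and is at least 21 bytes long (RFC 9000, Section 10.3), so the proxy can compare the tail of each datagram against the tokens it has been told about.  `match_stateless_reset` and the `flows_by_srt` table are hypothetical names for this sketch.

```python
import hmac
from typing import Dict, Optional


def match_stateless_reset(datagram: bytes,
                          flows_by_srt: Dict[bytes, str]) -> Optional[str]:
    """Return the flow whose registered server SRT matches the datagram tail."""
    if len(datagram) < 21:
        # Too short to be a stateless reset.
        return None
    tail = datagram[-16:]
    matched = None
    for srt, flow in flows_by_srt.items():
        # compare_digest avoids leaking token bytes through timing.
        if hmac.compare_digest(tail, srt):
            matched = flow
    return matched
```

A real proxy would want something better than a linear scan, but that is beside the point here.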

# Proxies

Now that the proxy knows the SRT from both endpoints, it can kill the connection in both directions.  Mission accomplished, right?  No.  The proxy can lose state too.

We could solve this by adding an extra SRT to every CID, exclusively for signaling that the proxy is dead, but that means changing QUIC.  This is best solved with another mapping.

The proxy needs to take packets it receives, using CIDs that it chose, and produce an SRT from those.  This means that instead of forwarding the client SRT, the proxy should generate an SRT alongside any CID that it tells the client to pass to the server (an SRT that the server might use).  Similarly, instead of forwarding the server SRT, the proxy should generate an SRT alongside the CID that it tells the client to use.

Now we have this (arrows indicate flow of packets):

Client [CID, SRT]_c <----- Proxy [CID, SRT]_pc <----- Server

The client creates []_c for receiving packets, sends those to the proxy, receives []_pc, then sends the server []_pc in a NEW_CONNECTION_ID frame.

To the earlier point, the client could omit the SRT from []_c and rely on the death of the control channel to inform the proxy.  For the purposes of symmetric operation and timely detection of failures, I would prefer to keep that signal.

Client -----> [CID, SRT]_ps Proxy -----> [CID, SRT]_s Server

The client receives []_s from the server in NEW_CONNECTION_ID, sends that to the proxy and receives []_ps for use in sending packets.
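A sketch of the two mappings, assuming the proxy derives each SRT from a static key and the CID it chose (HMAC-SHA256 here, purely as an example).  The `Proxy` class, its method names, and the 8-byte CID length are all made up for illustration.  The point is that a proxy that has lost its tables can still compute the right SRT for any CID it issued.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple

Pair = Tuple[bytes, bytes]  # (CID, SRT)


class Proxy:
    def __init__(self, static_key: bytes) -> None:
        self.static_key = static_key                 # survives a restart
        self.toward_client: Dict[bytes, Pair] = {}   # CID_pc -> [CID, SRT]_c
        self.toward_server: Dict[bytes, Pair] = {}   # CID_ps -> [CID, SRT]_s

    def srt_for(self, cid: bytes) -> bytes:
        """Derive the SRT for a proxy-chosen CID from the static key alone."""
        return hmac.new(self.static_key, cid, hashlib.sha256).digest()[:16]

    def _new_pair(self) -> Pair:
        cid = os.urandom(8)
        return cid, self.srt_for(cid)

    def register_client_pair(self, pair_c: Pair) -> Pair:
        """Client sends []_c; proxy returns []_pc for the client to give the server."""
        pair_pc = self._new_pair()
        self.toward_client[pair_pc[0]] = pair_c
        return pair_pc

    def register_server_pair(self, pair_s: Pair) -> Pair:
        """Client sends []_s; proxy returns []_ps for the client to send with."""
        pair_ps = self._new_pair()
        self.toward_server[pair_ps[0]] = pair_s
        return pair_ps
```

After a restart, a fresh `Proxy` built with the same key has empty tables, but `srt_for()` on a CID from an incoming packet still yields the token that the sender of that packet is watching for.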

# Multiple Paths

Obviously, if the path via the proxy is just one path between client and server, this gives the proxy the ability to kill the entire connection across all paths.  That's not ideal, so endpoints might choose to regard a reset on one path as not sufficient cause to terminate the entire connection.
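One way an endpoint might apply that policy, as a toy sketch.  `Connection` and its fields are hypothetical, and this is deliberately different from the usual single-path behaviour, where a stateless reset ends the connection.

```python
from typing import Iterable, Set


class Connection:
    def __init__(self, paths: Iterable[str]) -> None:
        self.active_paths: Set[str] = set(paths)
        self.closed = False

    def on_stateless_reset(self, path: str) -> None:
        """Drop only the path that saw the reset; close when none remain."""
        self.active_paths.discard(path)
        if not self.active_paths:
            self.closed = True
```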

# With Tunnels

If the proxy dies, a tunnel will die, meaning that the client is informed.  The server is not.  This means that it might still be beneficial to have the client advertise CIDs - or at least SRTs - chosen by the proxy.  That would allow a crashed proxy to tell servers to stop.

A tunnel also means that the client->proxy signal really isn't needed as the outer connection will have its own SRT.  However, the proxy might benefit from a way to indicate to the server that a client is dead.  That also suggests that there is some value in having clients use a CID chosen by - or at least known to - the proxy.

The server->proxy signal isn't really needed here.  The death of a server can be indicated through the tunnel.  The proxy might benefit from knowledge of the server's SRT, but I don't see a driving need.  This means that when tunnelling, clients can safely drop the flow where the proxy learns the server CID/SRT.  Obviously, there is no need for the proxy to generate a new CID/SRT for the client to use in that direction.