Re: QUIC idle timeouts and path idle timeouts
Christian Huitema <huitema@huitema.net> Mon, 02 September 2024 01:48 UTC
Subject: Re: QUIC idle timeouts and path idle timeouts
To: Ian Swett <ianswett=40google.com@dmarc.ietf.org>, Martin Thomson <mt@lowentropy.net>
From: Christian Huitema <huitema@huitema.net>
CC: quic@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/DYEhsiiiU4woxQlYQpQP0wVjyN0>

As Ian writes, the problem is on the server, not at the client. If the client wakes up with something to send after a long silence, it can decide to just resume the connection. But the server can't. If the client connection is dropped, the server is stuck.

The current solution is to use keep-alives, but that is painful for both clients and servers. For clients, it means waking up the radio and draining the battery each time any of the 17 messenger applications on the phone sends a keep-alive. For servers, it means getting messages from every client every 15 seconds, even if those clients only have messages to receive every 15 minutes, which increases CPU load and power consumption. Not great.

There are a few alternatives. The client could use protocols like PCP or UPnP-IGD to open a port in the local NAT. That's fine if the local router supports it. It can work very well if the network supports IPv6 and the client just needs to set a pinhole in the local firewall. But it will not work if the local ISP is using some combination of IPv4 and carrier-grade NAT, unless the CGNAT supports PCP and the client has a plausible way to discover the address of the CGNAT. Maybe the IETF could work on that, but I am not holding my breath.

Another way to reduce the impact on the client is to make sure that all applications doing keep-alives send them at exactly the same time. If they do, the radio wakes up only once, sends a train of messages, and maybe waits for the ACKs. Not perfect, but at least it preserves the battery a bit. Of course, that solution does not help the server at all.

Yet another solution that has been tried is to have a system-level process do the keep-alives on behalf of all applications in the box. I won't go into the details, but we could maybe do a variant of that with MASQUE. Have the client use MASQUE for all outgoing connections, connecting to a MASQUE server outside the CGNAT. Then the client only needs to keep the MASQUE session alive -- one keep-alive instead of N. The end-to-end QUIC sessions could use IPv6 and long idle timers. Maybe something we could actually ship!
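To make the coalescing idea concrete, here is a minimal sketch in Python. The shared 15-second period, the function name, and the rounding scheme are all illustrative assumptions; in practice the applications (or the OS) would have to agree on the period out of band:

    import time

    # Hypothetical coalescing scheme: every application rounds its next
    # keep-alive deadline up to a shared wall-clock boundary. If all the
    # apps on the device use the same period, their timers fire together
    # and the radio wakes once per period instead of once per app.
    KEEPALIVE_PERIOD = 15.0  # seconds; must stay below the NAT binding timeout

    def next_coalesced_keepalive(now: float, period: float = KEEPALIVE_PERIOD) -> float:
        # Round up to the next multiple of the shared period.
        return (int(now // period) + 1) * period

    # Two apps computing deadlines a few seconds apart within the same
    # period still agree on the same wake-up instant.
    now = time.time()
    d1 = next_coalesced_keepalive(now)
    d2 = next_coalesced_keepalive(now + 3.0)  # d1 == d2 unless a boundary fell in between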
-- Christian Huitema

On 9/1/2024 1:00 PM, Ian Swett wrote:
> This is a real problem, but I'm unsure what the best way to approach it is.
>
> I think you're suggesting that a large server operator could try to infer
> NAT timeouts for clients of different IP prefixes and communicate that to
> the client as a suggested keepalive/ping timeout? I'm curious about how to
> infer NAT timeouts. Our servers detect a dead connection, but I'm not sure
> how to tell what the reason is, and more specifically whether it was the
> NAT timeout. Sometimes devices just drop off the network.
>
> As you may know, Chrome will send a PING as a keepalive after 15 seconds of
> idle, but only if there are outstanding requests (i.e., hanging GETs). The
> number was chosen somewhat arbitrarily and is certainly not optimal, but it
> did fix some use cases where hanging GETs were otherwise failing.
>
> Thanks, Ian
>
> On Wed, Jul 24, 2024 at 5:55 PM Martin Thomson <mt@lowentropy.net> wrote:
>
>> The intent of the idle timeout was to have that reflect *endpoint*
>> policy. That is, it is independent of path.
>>
>> It's certainly very interesting to consider what you might do about paths
>> and keep-alives (or not). But that's a separable problem. Having a way
>> for endpoints to share their information about timeouts might work, but I
>> worry that it will lead to wasteful keepalive traffic. How would we
>> ensure that keepalives are not wasteful?
>>
>> Is there a better way, such as a quick connection continuation?
>>
>> On Wed, Jul 24, 2024, at 11:24, Lucas Pardue wrote:
>>> Hi folks,
>>>
>>> Wearing no hats.
>>>
>>> There's been some chatter this week during IETF about selecting QUIC
>>> idle timeouts in the face of Internet paths that might have shorter
>>> timeouts, such as NAT.
>>>
>>> This isn't necessarily a new topic; there's past work that's been done
>>> on measurements and attempts to capture that in IETF documents. For
>>> example, Lars highlighted a study of home gateway characteristics from
>>> 2010 [1]. Then there's RFC 4787 [2], and our very own RFC 9308 [3].
>>>
>>> There's likely other work that has happened in the meantime and
>>> provided further insights.
>>>
>>> All the discussion got me wondering whether there might be room for a
>>> QUIC extension that could hint at the path timeout to the peer. For
>>> instance, as a server operator, I might have a wide view of network
>>> characteristics that a client doesn't. Sending keepalive pings from the
>>> server is possible, but it might not be in the client's interest to
>>> force it to ACK them, especially if there are power-saving
>>> considerations that would be hard for the server to know. Instead, a
>>> hint to the peer would allow it to decide what to do. That could allow
>>> us to maintain large QUIC idle timeouts, as befits the application use
>>> case, but adapt to the needs of the path for improved connection
>>> reliability.
>>>
>>> Such an extension could provide a hint for each and every path, and
>>> would therefore be a benefit to multipath, which has some additional
>>> path idle timeout considerations [4].
>>>
>>> Thoughts?
>>>
>>> [1] - https://dl.acm.org/doi/10.1145/1879141.1879174
>>> [2] - https://www.rfc-editor.org/rfc/rfc4787.html
>>> [3] - https://www.rfc-editor.org/rfc/rfc9308.html#section-3.2
>>> [4] - https://www.ietf.org/archive/id/draft-ietf-quic-multipath-10.html#name-idle-timeout
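PS: if we did pursue Lucas's hint, the wire encoding would be the easy part. Here is a sketch in Python, assuming the hint is carried as a QUIC transport parameter; the parameter name and the codepoint are made up, since nothing like this is registered:

    def encode_varint(v: int) -> bytes:
        # QUIC variable-length integer (RFC 9000, Section 16): the top
        # two bits of the first byte encode the total length.
        if v < 1 << 6:
            return v.to_bytes(1, "big")
        if v < 1 << 14:
            return (v | (1 << 14)).to_bytes(2, "big")
        if v < 1 << 30:
            return (v | (2 << 30)).to_bytes(4, "big")
        return (v | (3 << 62)).to_bytes(8, "big")

    # Hypothetical codepoint; a real extension would register its own.
    PATH_IDLE_HINT = 0x3f5a

    def encode_path_idle_hint(timeout_ms: int) -> bytes:
        # Transport parameters are (id, length, value), each a varint.
        value = encode_varint(timeout_ms)
        return encode_varint(PATH_IDLE_HINT) + encode_varint(len(value)) + value

    # A 15-second hint (15000 ms) is five bytes on the wire.
    assert encode_path_idle_hint(15_000) == bytes([0x7F, 0x5A, 0x02, 0x7A, 0x98])

The hard questions are the ones Martin raised: who acts on the hint, and how do we keep the resulting keep-alives from being wasteful. The encoding does not answer those.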