Re: statement regarding keepalives

Joe Touch <> Thu, 16 August 2018 14:09 UTC

List-Id: IETF Transport and Services Area Mailing List <>

Hi, Kent,

I think the recommendations miss a few aspects of my suggestions:

- there is NEVER a good reason to assume that keepalives should happen at the “highest level” of anything;
keepalives are needed *at EVERY level* where endpoint state needs to be actively (rather than passively) maintained

- I agree it’s not helpful to assume that layers can coordinate on keepalives, but they don’t need to; keepalives at lower levels simply wouldn’t be triggered if there is sufficient traffic at those layers driven by upper layer keepalives. Specifically, this means that there is NEVER a good reason to avoid implementing keepalives at a layer where they are needed just because of potential interaction with higher level keepalives. Such interaction is resolved automatically.
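
To illustrate the "resolved automatically" point, here is a minimal sketch, assuming each layer runs its own idle timer and treats any traffic it carries as activity. The class `LayerKeepalive`, the function `simulate`, and the timeout values are hypothetical names and numbers chosen for illustration, not part of any real stack.

```python
class LayerKeepalive:
    """Idle timer for one protocol layer (hypothetical helper, not a real API)."""

    def __init__(self, name, idle_timeout):
        self.name = name
        self.idle_timeout = idle_timeout  # seconds of silence before a probe is due
        self.last_activity = 0
        self.probes_sent = 0

    def on_traffic(self, now):
        # Any packet seen at this layer, whatever its origin, refreshes the timer.
        self.last_activity = now

    def tick(self, now):
        # Send a probe only if this layer has been silent for a full timeout.
        if now - self.last_activity >= self.idle_timeout:
            self.probes_sent += 1
            self.last_activity = now
            return True
        return False


def simulate(duration, app_timeout, tcp_timeout):
    """Run both layers for `duration` seconds with no user data at all."""
    app = LayerKeepalive("app", app_timeout)
    tcp = LayerKeepalive("tcp", tcp_timeout)
    for now in range(1, duration + 1):
        if app.tick(now):
            # An application keepalive travels in a TCP segment,
            # so the TCP layer sees it as ordinary traffic.
            tcp.on_traffic(now)
        tcp.tick(now)
    return app.probes_sent, tcp.probes_sent


# Upper layer probes more often than the lower layer's timeout:
# the lower layer never needs to send its own keepalive.
print(simulate(600, app_timeout=30, tcp_timeout=75))

# Upper layer probes rarely: the lower layer fills in by itself.
print(simulate(600, app_timeout=300, tcp_timeout=75))
```

Neither layer knows the other's timer exists; the lower layer goes quiet only because it observes enough traffic, which is all the coordination that is needed.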

So the point, IMO, is that:
	- EACH layer that needs keepalives MUST implement them for itself
	- there is NEVER a reason to disable or suppress keepalives at any layer to “reduce traffic” due to keepalives at higher layers
	- although keepalives can be useful for state that decays when that state matters, keep in mind that not all state decays and not all such state matters
		it’s often still a surprise to many that TCP connections aren’t “cleaned up” when not in use; they’re cleaned up ONLY when old state is in the way of new state
		That’s a feature, not a bug.
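
For the TCP case, keepalives are off by default and have to be requested per socket. The following is a minimal sketch, assuming Python's standard `socket` module; the helper name `enable_tcp_keepalive` and the default timer values are assumptions for illustration, and the three tuning options are platform specific, so the sketch sets each one only where the platform exposes it.

```python
import socket


def enable_tcp_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalives for one socket (hypothetical helper).

    idle, interval, and count are illustrative values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The tuning knobs are not available on every platform,
    # so look each one up and skip those that are missing.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for opt_name, value in tuning:
        opt = getattr(socket, opt_name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)

    # Report whether the kernel now has keepalives enabled on this socket.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(enable_tcp_keepalive(s))
s.close()
```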

As others have pointed out, there’s also no reason to jump to the conclusion that short, restarted connections are better - or worse - than keepalives. The difference depends on the amount of effort required to maintain state vs re-establishing it (including the need to recycle connection identifiers).
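
As a rough way to see that tradeoff, here is a minimal sketch, assuming the effort of each strategy can be reduced to a single number in arbitrary units. The function `cheaper_strategy` and all of the figures are hypothetical; a real comparison would also weigh latency and the recycling of connection identifiers, which this toy model ignores.

```python
def cheaper_strategy(idle_seconds, keepalive_interval, keepalive_cost, reconnect_cost):
    """Compare maintaining state across an idle gap with re-establishing it.

    Costs are in arbitrary, caller-defined units (hypothetical model).
    """
    keepalives_needed = idle_seconds // keepalive_interval
    maintain_cost = keepalives_needed * keepalive_cost
    if maintain_cost < reconnect_cost:
        return "keepalive"
    if maintain_cost > reconnect_cost:
        return "reconnect"
    return "either"


# Short idle gap: a handful of probes is cheaper than a new handshake.
print(cheaper_strategy(300, keepalive_interval=30, keepalive_cost=1, reconnect_cost=20))

# Day-long idle gap: thousands of probes cost more than reconnecting once.
print(cheaper_strategy(86400, keepalive_interval=30, keepalive_cost=1, reconnect_cost=20))
```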


> On Aug 15, 2018, at 10:35 AM, Kent Watsen <> wrote:
> Below is an updated version of some text that we might roll into
> a statement or an I-D of some sort.  Kindly review and provide 
> suggestions for improvement, or support for the text as is, if
> that is the case.  ;)
> This update accommodates comments from:
>  - Wesley Eddy & David Black
>     - removed "layers of functionality" verbiage
>     - moved footnote into the body of the document (this had
>       a cascading effect, and why it looks so different now)
>  - Joe Touch
>     - keepalives should occur at *all* layers that benefit
>     - keepalives at a layer should be suppressed in the 
>       presence of sufficient traffic from higher layers
>     - keepalives at a layer should not be interpreted as
>       implying state at any other layer
> This update does not accommodate comments from:
>  - Mikael Abrahamsson & Tom Herbert
>     - no statement added to promote TCP keepalives
>        * note: I believe this to be unnecessary because 
>          the current text doesn't ever say to not use TCP.
>     - no statement added for tuning params (e.g., timeouts).
>        * note: we could add this, but it will increase the
>          scope of the document - do we want to do this?
> Cheers!
> Kent
> ===== START =====
> # Connection Strategies for Long-lived Connections
> A networked device may have an ongoing need to interact with a remote
> device. Sometimes the need arises from wanting to push data to the
> remote device, and sometimes the need arises from wanting to check if
> there is any data the remote device may have pending to deliver to
> it.
> There are two fundamental network connection strategies that can be
> used to accomplish this goal: 1) a single long-lived connection and
> 2) a sequence of short-lived connections.
> A single long-lived connection is most common, as it is
> straightforward to implement and directly answers the question of 
> if the "connection" is established. However, long-lived connections
> require more system resources, which may affect scalability, and
> require the initiator of the connection to periodically test the
> aliveness of the remote device, discussed further in the next 
> section.
> A sequence of short-lived connections is less common, as there is an
> additional implementation effort, as well as concerns such as: 1) the
> delay of the remote device needing to wait until the connection is
> reestablished in order to deliver pending data, and 2) the additional
> latency incurred from starting new connections, especially when
> cryptography is involved. However, short-lived connections do not
> require keepalives and are arguably more secure, as each device is
> forced to re-authenticate the other and reload all related
> access-control policies on each connection.
> For networking sessions that are primarily quiet, and the use case
> can cope with the additional latency of waiting for and starting new
> connections, it is RECOMMENDED to use a sequence of short-lived
> connections, instead of maintaining a single long-lived connection
> using aliveness checks.
> # Keepalives for Persistent Connections
> When the initiator of a networking session needs to maintain a
> long-lived connection, it is necessary for it to periodically test
> the aliveness of the remote device. In such cases, it is RECOMMENDED
> that the aliveness check happens at the highest protocol layer
> possible that is meaningful to the application, in order to maximize
> the depth of the aliveness check.
> For example, for an HTTPS connection to a simple webserver,
> HTTP-level keepalives would test more layers of functionality than
> TLS-level keepalives. However, for a webserver that is accessed via a
> load-balancer that terminates TLS connections, TLS-level aliveness
> checks may be the most meaningful check that can be performed.
> More generally, it is RECOMMENDED that applications be able to
> perform the aliveness checks at all protocol levels that benefit, but
> suppress the aliveness checks at lower protocol layers from occurring
> when there is sufficient activity at higher protocol layers.
> Keepalives at a layer SHOULD NOT be interpreted as implying state at
> any other layer.
> In order to ensure aliveness checks can occur at any given protocol
> layer, it is RECOMMENDED that protocol designers always include an
> aliveness check mechanism in the protocol and, for client/server
> protocols, that the aliveness check can be initiated from either
> device, as sometimes the "server" is the initiator of the underlying
> networking connection (e.g., RFC 8071).
> Some protocol stacks have a secure transport protocol layer (e.g.,
> TLS, SSH, DTLS) that sits on top of a cleartext protocol layer (e.g.,
> TCP, UDP). In such cases, it is RECOMMENDED that the aliveness check
> occurs within the protection envelope afforded by the secure transport
> protocol layer; the aliveness checks SHOULD NOT occur via the
> underlying cleartext protocol layer, as an adversary can block
> aliveness check messages in either direction and send fake aliveness
> check messages in either direction.