Re: statement regarding keepalives

Mikael Abrahamsson <> Fri, 20 July 2018 11:47 UTC

Date: Fri, 20 Jul 2018 13:47:45 +0200 (CEST)
From: Mikael Abrahamsson <>
To: Kent Watsen <>
Subject: Re: statement regarding keepalives
Organization: People's Front Against WWW
List-Id: IETF Transport and Services Area Mailing List <>


While I agree with the sentiment here, I have personally been in positions 
where application programmers were unable to (in a timely manner) modify 
whatever was running, to implement a keepalive protocol. In that case, 
turning on TCP keepalives was a very easy thing to do that immediately 
would yield operational benefits.
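
As a rough illustration of how small that change can be, here is a minimal 
Python sketch that turns on TCP keepalives for a socket and leaves the 
timers at the operating system's defaults. The helper name 
enable_tcp_keepalive is made up for this example; SO_KEEPALIVE is the 
standard socket option.

```python
import socket


def enable_tcp_keepalive(sock: socket.socket) -> None:
    """Turn on TCP keepalives, keeping the OS default timers.

    Hypothetical helper for illustration; only SO_KEEPALIVE is standard.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


# Usage: enable keepalives on a freshly created TCP socket.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_tcp_keepalive(sock)
enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

Note that OS default timers are often long (commonly two hours of idle 
time before the first probe), so this alone may not detect failures 
quickly.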

So I'd like the text to recommend doing the check as "high up" in the 
stack as possible, but without discouraging people from turning on TCP 
keepalives "because the IETF doesn't recommend that", in which case they 
do nothing at all and the problem just persists.

Also, should we talk about recommendations for what these timers should 
be? In my experience, values from tens of seconds up to 5-10 minutes 
typically make sense for Internet use. Shorter than that might interrupt 
the connection prematurely; longer than that means it takes too long to 
detect a problem. Of course it's up to the application/environment to 
choose the best value for each use case, but some text on this might be 
worthwhile to have there?
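
To make the timer discussion concrete, the sketch below sets per-socket 
keepalive timers and estimates how long a dead peer would take to detect. 
The function names and the example values (60 s idle, 10 s interval, 5 
probes) are assumptions chosen for illustration, not recommendations from 
the draft statement. TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are 
platform-specific (available on Linux, for instance), so each one is 
applied only if the platform exposes it.

```python
import socket


def detection_time(idle_s: int, interval_s: int, probes: int) -> int:
    """Approximate seconds until a silent peer is declared dead.

    The first probe goes out after idle_s of silence; each unanswered
    probe is then given interval_s before the next step.
    """
    return idle_s + interval_s * probes


def tune_keepalive(sock: socket.socket, idle_s: int = 60,
                   interval_s: int = 10, probes: int = 5) -> int:
    """Enable keepalives and apply timers where the platform allows it.

    Hypothetical helper; the default values are illustrative assumptions.
    Returns the approximate detection time in seconds.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    settings = (
        ("TCP_KEEPIDLE", idle_s),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_s),  # gap between probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    )
    for name, value in settings:
        option = getattr(socket, name, None)  # absent on some platforms
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return detection_time(idle_s, interval_s, probes)


# Usage: 60 s idle + 5 probes at 10 s intervals gives roughly 110 s.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
seconds = tune_keepalive(sock)
sock.close()
```

With these example values a dead peer is noticed in just under two 
minutes, which falls inside the "tens of seconds up to 5-10 minutes" 
range suggested above.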

On Fri, 13 Jul 2018, Kent Watsen wrote:

> The folks working with the BBF asked the NETMOD WG to consider modifying draft-ietf-netconf-netconf-client-server to support TCP keepalives [1].  However, it is unclear what IETF's position is on the use of keepalives, especially with regards to keepalives provided in protocol stacks (e.g., <some-app> over HTTP over TLS over TCP).
> After some discussion with Transport ADs (Spencer and Mirja) and the TLS ADs (Eric and Ben), the following draft statement has been crafted.  Spencer and Mirja have requested TSVAREA critique it before, perhaps, developing a consensus document around it in TSVWG.
> It would be greatly appreciated if folks here could review and provide comments on the draft statement below.  The scope of the statement can be increased or reduced as deemed appropriate.
> [1]
> Thanks,
> Kent (and Mahesh) // NETCONF chairs
> ===== STATEMENT =====
> When the initiator of a networking session needs to maintain a persistent connection [1], it is necessary for it to periodically test the aliveness of the remote peer.  In such cases, it is RECOMMENDED that the aliveness check happens at the highest protocol layer possible that is most meaningful to the application, to maximize the depth of the aliveness check.
> E.g., for an HTTPS connection to a simple webserver, HTTP-level keepalives would test more aliveness than TLS-level keepalives.  However, for a webserver that is accessed via a load-balancer that terminates TLS connections, TLS-level aliveness checks may be the most meaningful check that could be performed.
> In order to ensure aliveness checks can always occur at the highest protocol layer, it is RECOMMENDED that protocol designers always include an aliveness check mechanism in the protocol and, for client/server protocols, that the aliveness check can be initiated from either peer, as sometimes the "server" is the initiator of the underlying networking connection (e.g., RFC 8071).
> Some protocol stacks have a secure transport protocol layer (e.g., TLS, SSH, DTLS) that sits on top of a cleartext protocol layer (e.g., TCP, UDP).  In such cases, it is RECOMMENDED that the aliveness check occurs within the protection envelope afforded by the secure transport protocol layer.  In such cases, the aliveness checks SHOULD NOT occur via the cleartext protocol layer, as an adversary can block aliveness check messages in either direction and send fake aliveness check messages in either direction.
> [1] Reasons may vary for why the initiator of a networking session feels compelled to maintain a persistent connection.  If the session is primarily quiet, and the use case can cope with the additional latency of starting a new connection, it is RECOMMENDED to use short-lived connections, instead of maintaining a long-lived persistent connection using aliveness checks.

Mikael Abrahamsson    email: