Re: statement regarding keepalives

Tom Herbert <> Fri, 17 August 2018 21:13 UTC

From: Tom Herbert <>
Date: Fri, 17 Aug 2018 14:13:11 -0700
Message-ID: <>
Subject: Re: statement regarding keepalives
To: Joe Touch <>
Cc: Benjamin Kaduk <>
List-Id: IETF Transport and Services Area Mailing List <>

On Fri, Aug 17, 2018 at 1:31 PM, Joe Touch <> wrote:
> On 2018-08-17 11:43, Tom Herbert wrote:
> > The purpose of an application keepalive is not to do favors for TCP,
> > it's to verify the end to end liveness between application end points.
> > This is at a much higher layer, verifying liveness of the TCP
> > connection is a side effect.
>
> Sure - that's fine and not what I'm concerned about.
> I don't want the text to say that higher level protocols or apps should try
> to do favors to keepalive lower level protocols - because it doesn't
> necessarily work.
> However, if that 1GB goes out in 10 seconds, then TCP would have sent its
> own keepalives just fine. It didn't need the app's help.
> So the app didn't help at all; at best, it does nothing and at worst it
> hurts.
>
> > Consider that someone sets an application keepalive to 35 second
> > interval and the TCP keepalive timer is 30 seconds. When the
> > connection goes idle TCP keepalive will fire at thirty seconds, and
> > five seconds later the application keepalive fires. So every
> > thirty-five seconds two keepalives are done at two layers. This is not
> > good as it wastes network resources and power.
>
> Agreed.
>
> > In this case, the
> > application keepalive is sufficient
>
> In this *implementation* it *might* be sufficient, in others, it might not.
> There's simply no way for the layers to know.
>
> > and the TCP keepalive shouldn't be
> > used.
>
> If you KNOW that the app keepalive will cause the TCP transmission, sure -
> but how do you KNOW that? You don't and can't. Even if you write to the TCP
> socket, all you know when the socket returns is that the data was copied to
> the kernel. You don't know for sure that you've triggered a TCP packet.

Actually, you do know that information. Application keepalives are
request/response messages sent in TCP data. When a response to a
keepalive request is received over the TCP connection, that is proof
that the keepalive was sent. If the application keepalive was sent on
the socket and no response is received before the application timer
expires, then the application declares the connection dead and most
likely will just close the socket and try to reconnect. Whether an
application keepalive request, or its response, is stuck in a TCP send
buffer (e.g. peer rcv window is zero) or the peer host has completely
disappeared is irrelevant. To the application it's all the same: a
connection to a peer application has failed and action needs to be
taken.
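
The request/response check described above can be sketched roughly as
follows. This is only an illustration under stated assumptions: the
PING/PONG wire format, the function name peer_is_alive, and the
socketpair-based demo peers are all hypothetical and not taken from any
real protocol or from this thread.

```python
import socket
import threading
import time

# Hypothetical wire format, chosen only for this sketch.
PING = b"PING\n"
PONG = b"PONG\n"


def peer_is_alive(sock: socket.socket, timeout: float) -> bool:
    """Send one keepalive request and wait up to `timeout` seconds for the reply.

    Every failure mode (send error, timeout, EOF, wrong reply) yields False.
    The application does not distinguish a request stuck in a send buffer
    from a peer that has disappeared.
    """
    deadline = time.monotonic() + timeout
    try:
        sock.settimeout(timeout)
        sock.sendall(PING)
        reply = b""
        while len(reply) < len(PONG):
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False           # application timer expired
            sock.settimeout(remaining)
            chunk = sock.recv(len(PONG) - len(reply))
            if not chunk:
                return False           # peer closed the connection
            reply += chunk
        return reply == PONG
    except OSError:                    # covers timeouts and connection errors
        return False


def echo_peer(sock: socket.socket) -> None:
    """Hypothetical peer application: answers a single PING with a PONG."""
    if sock.recv(len(PING)) == PING:
        sock.sendall(PONG)


# Responsive peer: the reply proves the request made it end to end.
a, b = socket.socketpair()
responder = threading.Thread(target=echo_peer, args=(b,))
responder.start()
alive = peer_is_alive(a, 1.0)
responder.join()

# Silent peer: no reply before the timer expires, so the application
# would close the socket and reconnect.
c, d = socket.socketpair()
dead = peer_is_alive(c, 0.2)

for s in (a, b, c, d):
    s.close()
print(alive, dead)
```

Note that the function never inspects TCP state; it only observes
whether an application-level reply arrived in time.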


> Besides, your "keepalives" might end up causing TCP to send packets it never
> needed to send in the first place - even IF you think you're doing it a
> favor.
>
> > This is an example of the problems in running two control loops
> > at different layers with overlapping functionality,
>
> The problem is trying to infer overlap in functionality. If you realize that
> these are independent control loops *and leave them alone* you're fine.

Independence of control loops does not mean they can't conflict.
Multiple layers performing keepalives is just one example, and
probably one of the less insidious ones. For a good example, look at
how link layer retransmissions can conflict with TCP algorithms to
produce really bad results.
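
The two-layer keepalive case from earlier in the thread (application
keepalive every 35 seconds, TCP keepalive after 30 seconds idle) can be
sketched as a toy timeline. The model below is a simplification and its
assumptions are mine: the connection is otherwise idle, every
application keepalive is actually transmitted at once and restarts the
TCP idle timer, and an acknowledged TCP probe restarts that timer too.
Whether an application can rely on the first of those assumptions is
exactly what is disputed in this thread. The function name simulate is
hypothetical.

```python
def simulate(app_interval: int, tcp_idle: int, duration: int):
    """Return (time, layer) keepalive events on an otherwise idle connection.

    Toy model, not a TCP implementation: an application keepalive counts as
    traffic and restarts the TCP idle timer; a TCP probe restarts only the
    TCP idle timer.
    """
    events = []
    next_app = app_interval
    next_tcp = tcp_idle
    while min(next_app, next_tcp) <= duration:
        if next_tcp < next_app:
            events.append((next_tcp, "tcp"))
            next_tcp += tcp_idle              # probe acknowledged, idle timer restarts
        else:
            events.append((next_app, "app"))
            next_tcp = next_app + tcp_idle    # application data restarts TCP idle timer
            next_app += app_interval
    return events


# Application timer longer than the TCP idle time: both layers fire.
both_layers = simulate(app_interval=35, tcp_idle=30, duration=105)

# Application timer shorter than the TCP idle time: in this model the
# TCP keepalive never gets a chance to fire.
app_only = simulate(app_interval=25, tcp_idle=30, duration=100)

print(both_layers)
print(app_only)
```

In this model the 35/30 configuration produces a TCP probe followed
five seconds later by an application keepalive in every 35-second
cycle, which is the duplicated work described above.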


> It's only in trying to optimize them as overlapping that a problem is
> created.
>
> > if the
> > ramifications of doing so aren't understood it can lead to undesirable
> > interactions and behavior.
>
> Agreed - so don't. Admit that there are inefficiencies *regardless of how
> hard you try to do otherwise* and leave them alone, IMO.
> If the app needs an app-level keepalive, do it.
> If the app wants TCP to be kept alive, let IT do it and leave it alone.
> Don't try to couple the two because you can't, and whatever you think you
> might gain you could easily lose. Leaving the two alone and separate is
> sufficient and robust.
> Joe