Re: statement regarding keepalives

Tom Herbert <tom@herbertland.com> Thu, 16 August 2018 14:38 UTC

In-Reply-To: <CF1F342A-4947-43A7-B84D-9B9DA1A7F1EF@edvina.net>
References: <D3326DE0-3F31-4045-B945-82B3F417BE4B@juniper.net> <alpine.DEB.2.20.1807201340240.14354@uplift.swm.pp.se> <B50DC954-CBB6-41C5-BE3A-F1DECD6046A5@juniper.net> <717202c9c6c6b3d083bfa4c8a9925e45@strayalpha.com> <6377766E-9A03-41BA-A4D4-8796F46278BD@juniper.net> <CALx6S34+rG_rx+79=iaeu5YT4pYUWRqAym6S_CNzJq9-a40Yvw@mail.gmail.com> <513E9F0D-CFAD-4009-8F86-289D9DC55A79@juniper.net> <alpine.DEB.2.20.1808160919260.19688@uplift.swm.pp.se> <CF1F342A-4947-43A7-B84D-9B9DA1A7F1EF@edvina.net>
From: Tom Herbert <tom@herbertland.com>
Date: Thu, 16 Aug 2018 07:38:08 -0700
Message-ID: <CALx6S35AsYByzY=_vBvcyc4=QWyvLbeNUgOVSU64A760Qk078w@mail.gmail.com>
Subject: Re: statement regarding keepalives
To: "Olle E. Johansson" <oej@edvina.net>
Cc: Mikael Abrahamsson <swmike@swm.pp.se>, "netconf-chairs@ietf.org" <netconf-chairs@ietf.org>, "tls-ads@ietf.org" <tls-ads@ietf.org>, "tsv-area@ietf.org" <tsv-area@ietf.org>, "tsvwg-ads@tools.ietf.org" <tsvwg-ads@tools.ietf.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-area/M5u2L8Idgq7XBSjek35NRQI_i3U>

On Thu, Aug 16, 2018 at 12:44 AM, Olle E. Johansson <oej@edvina.net> wrote:
>
>
> On 16 Aug 2018, at 09:28, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
>
> On Wed, 15 Aug 2018, Kent Watsen wrote:
>
> You bring up an interesting point; it goes to the motivation for wanting to
> do keepalives in the first place.  The text doesn't yet mention maintaining
> flow state as a motivation.
>
>
> It's not only to maintain flow state, it's also to close the connection when
> the network goes down and doesn't work anymore, and "give up" on connections
> that don't work anymore (for some definition of "anymore").
>
> I have operationally been in the situation where a server/client application
> was implemented so that the server could only handle 256 connections (some
> file descriptor limit). Every time the firewall was rebooted and lost state,
> the connections hung around forever. So the server administrators had to go
> in and restart the process to clear these connections, otherwise there were
> 256 hung connections and no new connections could be established.
>
> Sometimes the other endpoint goes down, and doesn't come back. We will for
> instance deploy home gateways probably keeping netconf-call-home sessions to
> an NMS, and we want them to be around forever, as long as they work. TCP
> level keepalives would solve this, as if the customer just powers off the
> device, after a while the session will be cleared. Using TCP keepalives here
> means you get this kind of behaviour even if the upper-layer application
> doesn't support it (netconf might have been a bad example here). It's a
> single socket option to set, so it's very easy to do.
>
> From knowing approximately what settings people have in their NAT44 and
> firewalls etc, I'd say the recommendation should be that keepalives are set
> to around a 60-300 second interval, and that the connection is killed if no
> traffic has passed in 3-5 of these intervals. Otherwise TCP will have backed
> off so far anyway that it's probably faster to just re-try the connection
> instead of waiting for TCP to re-send the packet.
>
> I have seen so many times in my 20 years working in networking where lack of
> keepalives has caused all kinds of problems. I wish everybody would turn them
> on and keep them on.
>
Olle,

They are already on; TCP has a default keepalive interval of 2 hrs. The
issue that is inevitably raised is that 2 hrs. is much too long a period
for maintaining NAT state (NAT timeouts are usually far shorter). But,
as I pointed out already, sending keepalives at a higher frequency is
not devoid of cost or problems.
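
For concreteness, here is a minimal sketch of what per-socket tuning
might look like. The helper names enable_keepalive and detection_time
are made up for illustration, the 60 second / 4 probe values are just
picked from the 60-300 second and 3-5 interval ranges suggested
earlier in the thread, and TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT
are the Linux option names, which other platforms may not provide.

```python
import socket


def enable_keepalive(sock, idle=60, interval=60, count=4):
    """Turn on TCP keepalives and tune the timers where the platform allows.

    idle, interval and count are illustrative values, not recommendations.
    """
    # The portable part: ask the stack to send keepalive probes at all.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The timer options below use Linux names; skip any that are missing
    # or rejected on the local platform.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            pass


def detection_time(idle, interval, count):
    """Rough number of seconds before a silent peer is declared dead."""
    return idle + interval * count


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive on:", bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    print("approx. detection time (s):", detection_time(60, 60, 4))
    s.close()
```

With these example values a dead peer is noticed after roughly 300
seconds, but every idle connection now generates a probe every 60
seconds instead of every 7200, i.e. 120 times as many keepalive
packets. That is the cost side of the trade-off.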

Tom

>
> As more and more connections flow over mobile networks, it seems more and
> more important, even for flows you did not expect. I have to send keepalives
> over IPv6 connections - not for NAT as on IPv4, but for middlebox devices
> that have an interesting approach and attitude towards connection management.
> ;-)
>
> The SIP Outbound RFC has a lot of reasoning behind keep-alives for
> connection failover and may be good input here.
>
> https://tools.ietf.org/html/rfc5626
>
> /O