Re: [tcpm] 793bis: question on TCP RTO timer

Michael Tuexen <michael.tuexen@lurchi.franken.de> Fri, 27 August 2021 12:22 UTC

From: Michael Tuexen <michael.tuexen@lurchi.franken.de>
In-Reply-To: <9caba130-cebc-3235-8eac-7ff08fa3809c@ymbk.nl>
Date: Fri, 27 Aug 2021 14:22:15 +0200
Cc: Extensions <tcpm@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <32F34C85-A125-40AC-993C-BC5A86204A93@lurchi.franken.de>
References: <9caba130-cebc-3235-8eac-7ff08fa3809c@ymbk.nl>
To: Geert Jan de Groot <GeertJan.deGroot@ymbk.nl>
X-Mailer: Apple Mail (2.3654.120.0.1.13)
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/7g_ZhQXpYItGNWf-qxrny45R_WI>
Subject: Re: [tcpm] 793bis: question on TCP RTO timer

> On 27. Aug 2021, at 12:52, Geert Jan de Groot <GeertJan.deGroot@ymbk.nl> wrote:
> 
> Hi folks,
> 
> I do apologise for bringing up this question so late in the game, but the lab tests are not pretty.
> 
> The issue is on 793bis section 3.8.1: the RTO timer. The 793bis document correctly points to RFC1122, RFC2988, RFC6298.
> 
> RFC6298 (and 2988) describe:
>   (2.4) Whenever RTO is computed, if it is less than 1 second, then the
>         RTO SHOULD be rounded up to 1 second.
> 
> I am now in discussions with an embedded OS manufacturer who clamps the RTO timer at a minimum of 1 second, even if the network delay and SRTT are much smaller (say, two hosts connected over a typical 1000BASE-T network). That means that if a single packet is lost, the TCP connection freezes for one second because it has to wait for the RTO timer to fire, and that timer is at least one second per requirement (2.4) of RFC6298 above.
Can't you use fast retransmission to recover from the loss? What about using RACK?
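
To illustrate the question above: fast retransmission (RFC 5681) lets the sender repair a single loss as soon as the third duplicate ACK arrives, so the RTO timer never has to fire. The following is only a minimal sketch under simplifying assumptions: the function name and the ACK trace are hypothetical, and a real stack applies further conditions before counting an ACK as a duplicate (no payload, unchanged window, outstanding data).

```python
DUP_ACK_THRESHOLD = 3  # RFC 5681: retransmit on the third duplicate ACK


def fast_retransmit_points(acks):
    """Return the ACK numbers at which a fast retransmit would be triggered.

    `acks` is the sequence of cumulative ACK numbers seen by the sender.
    Simplification: any repeated ACK number counts as a duplicate ACK.
    """
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            # Fire exactly once, when the threshold is reached.
            if dup_count == DUP_ACK_THRESHOLD:
                retransmits.append(ack)
        else:
            # New cumulative ACK: data was delivered, reset the counter.
            last_ack = ack
            dup_count = 0
    return retransmits


# Hypothetical trace: the segment starting at byte 2000 is lost, and the
# segments sent after it each elicit a duplicate ACK for 2000.
trace = [1000, 2000, 2000, 2000, 2000, 7000]
print(fast_retransmit_points(trace))  # [2000]
```

This only helps when enough segments follow the lost one to generate three duplicate ACKs; a loss at the tail of a burst still falls back to the timer unless something like RACK is in use.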

Best regards
Michael
> 
> Put like this, a connection pushing 1 Gbit/s of traffic freezes for one second if a single packet is dropped, and the OS manufacturer claims this is correct. My feeling is that it isn't: we measure SRTT for a reason, and a dropped packet is simply a signal to lower bandwidth (congestion), not to freeze traffic for a second waiting for the RTO to trigger. However, reading the draft and 6298/2988, I can't definitively fault the OS for its one-second freeze.
> 
> Before sending this to the list, for fear of kicking up dust, I asked Wesley, and he referred me to RFC8961, which suggests using the RTT, with exponential backoff, to estimate the RTO timer value without the one-second minimum of RFC6298.
> 
> I wonder what the list thinks. When the one-second rule was first mentioned, networks were significantly slower than they are today, and it seems only logical that timers based on traffic (with appropriate backoff measures) make more sense than a fixed one-second timer.
> 
> Clue appreciated,
> 
> Geert Jan
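
For readers who want to see the effect of rule (2.4) in numbers, here is a rough sketch of the RFC 6298 RTO computation with a configurable floor. The class name, the `min_rto` parameter, the 1 ms clock granularity, and the 200 microsecond RTT samples are assumptions made for illustration; they are not taken from any particular stack. Setting the floor to zero is shown only for comparison and is not a recommendation.

```python
ALPHA = 1 / 8  # SRTT gain (RFC 6298)
BETA = 1 / 4   # RTTVAR gain (RFC 6298)
K = 4          # variance multiplier (RFC 6298)


class RtoEstimator:
    """Sketch of the RFC 6298 estimator; all times are in seconds."""

    def __init__(self, min_rto=1.0, granularity=0.001):
        self.min_rto = min_rto          # floor from rule (2.4); assumed tunable here
        self.granularity = granularity  # clock granularity G; 1 ms is an assumption
        self.srtt = None
        self.rttvar = None

    def update(self, rtt):
        """Feed one RTT measurement and return the resulting RTO."""
        if self.srtt is None:
            # Rule (2.2): first measurement.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Rule (2.3): RTTVAR is updated before SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt
        rto = self.srtt + max(self.granularity, K * self.rttvar)
        # Rule (2.4): round up to the minimum.
        return max(rto, self.min_rto)


clamped = RtoEstimator(min_rto=1.0)
unclamped = RtoEstimator(min_rto=0.0)
for _ in range(20):
    rto_clamped = clamped.update(0.0002)      # 200 us LAN round trip
    rto_unclamped = unclamped.update(0.0002)

print(rto_clamped)    # 1.0
print(rto_unclamped)  # roughly 0.0012 with the assumed 1 ms granularity
```

With these assumed inputs the computed value sits around a millisecond, so the one-second result comes entirely from the floor rather than from the measured path.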