Re: [tcpm] Comments on draft-ietf-tcpm-rtorestart-01

"Scheffenegger, Richard" <rs@netapp.com> Thu, 19 December 2013 09:24 UTC

Hi Per,

>> (i) Should we have a section in the draft describing a socket option?
>
> What're the useful cases in your mind?

I don't think per-session granularity would really do any good; a global sysctl (or similar) to influence the overall behavior of the stack is fine. Even from a troubleshooting point of view, I doubt that such a fine level of control would help.

>> (ii) Should we apply RTO restart for all segments or only when the
>> amount of outstanding data is too small to support fast retransmit?
>
> Part of this complication comes from resetting cwnd on timeout. It's like
> a time-ticking bomb for TCP performance. Consider the case where the
> entire cwnd is delivered except the last one. There is no point to re-
> slow-start to restore the ack clock, because you already have
> cwnd-1 acks last round.
> 
> I plan to implement rto-restart idea in Linux as base of a new loss
> recovery, but probably not the cwnd checking part. When an RTO fires it's
> time to retransmit. The need to check cwnd implies my timer is
> problematic.

Provided that you have other means (i.e. TCP timestamps, TS) to detect whether the ACK was triggered by a spurious retransmit (RTO) or by a genuinely needed one, I'm not sure that the complexity of checking the currently outstanding segments is really needed.
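
To illustrate the kind of timestamp-based detection meant here, a minimal sketch in the spirit of the Eifel detection algorithm: the sender remembers the TSval it put on the retransmission, and if the first ACK afterwards echoes an older timestamp, that ACK was triggered by the original transmission and the timeout was spurious. The names `RetransmitState` and `is_spurious_timeout` are made up for this example and do not come from any stack or from the draft.

```python
from dataclasses import dataclass

TS_MASK = 0xFFFFFFFF  # TCP timestamps are 32-bit values that wrap around


def ts_before(a: int, b: int) -> bool:
    """True if timestamp a is earlier than b, using 32-bit serial arithmetic."""
    return ((a - b) & TS_MASK) > 0x7FFFFFFF


@dataclass
class RetransmitState:
    # Hypothetical per-connection state: the TSval carried by the first
    # retransmission sent after the RTO fired.
    retransmit_tsval: int


def is_spurious_timeout(state: RetransmitState, ack_tsecr: int) -> bool:
    """Classify the first acceptable ACK that arrives after an RTO retransmit.

    If the echoed timestamp (TSecr) is older than the retransmit's TSval,
    the ACK answers the original segment, so the retransmit was not needed.
    """
    return ts_before(ack_tsecr, state.retransmit_tsval)
```

With a check like this in place, a stack can undo the cwnd reduction after a spurious timeout regardless of which mechanism caused the timer to fire early.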

As Yuchung points out, removing this would reduce the complexity of the algorithm. If spurious retransmits are a problem, and they often are, the stack requires some other way of dealing with those anyway, because "fixing" only rto-restart won't address all the other (currently happening) instances of spurious RTOs...
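
To make the two variants concrete, a rough sketch of the timer re-arm on an ACK for new data, with and without the outstanding-segments check. The function name, its parameters, the threshold of four segments (taken from the three duplicate ACKs that fast retransmit needs), and the clamp to zero are assumptions made for this illustration, not text from the draft.

```python
def rto_restart_timeout(
    rto: float,
    elapsed_since_earliest: float,
    outstanding_segments: int,
    check_outstanding: bool = True,
    threshold: int = 4,  # assumption: fewer than 4 segments cannot yield 3 dupacks
) -> float:
    """Return the value (in seconds) to re-arm the retransmission timer with.

    rto                    -- current RTO estimate
    elapsed_since_earliest -- time since the earliest outstanding segment was sent
    outstanding_segments   -- segments sent but not yet acknowledged
    check_outstanding      -- apply rto-restart only when fast retransmit is unlikely
    """
    if check_outstanding and outstanding_segments >= threshold:
        # Enough data in flight for fast retransmit: plain restart with a full RTO.
        return rto
    # rto-restart: account for the time the earliest segment has already waited.
    # Clamping at zero is a choice made for this sketch.
    return max(rto - elapsed_since_earliest, 0.0)
```

Dropping the check amounts to always calling this with `check_outstanding=False`, which leaves a single code path.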



Season's greetings,


Richard Scheffenegger