Re: [tcpm] Comments on draft-ietf-tcpm-rtorestart-01

"Scheffenegger, Richard" <rs@netapp.com> Thu, 19 December 2013 12:00 UTC

Hi Anna,

>> As Yuchung points out, removing this would reduce the complexity of the
>> algorithm. If spurious retransmits are a problem, and they often are, the
>> stack requires some other way of dealing with those anyway, because
>> "fixing" only rto-restart won't address all the other (currently
>> happening) instances of spurious rtos...
>>
> 
> This is not only about spurious RTOs, it is also about how you recover
> lost packets. You don't want an RTO for that if a fast retransmit can be
> done.

But how do you positively know in advance whether a fast retransmit can be done? Looking at cwnd alone is not sufficient, as you also need a model of the receiver (delayed ACK: phase and frequency) and of the return path (ACK loss, delay)... Just because there may be a chance to do fast retransmits doesn't imply that fast retransmit will be the most timely loss recovery...
 
> Also for spurious RTOs, TS isn't always used, and anyway all mechanisms
> for spurious timeout detection detect and repair the problem *after* it
> happened ... this is better than nothing, but it's even better to keep the
> spurious timeout from happening in the first place. It is a simple check,
> so I think that one line of code is worthwhile.


Again, without precognition on the sender side, you'll always need some reactive mechanisms; trying to be proactive alone will not always help...

Overall, I think making rto-restart dependent on cwnd should be optional, not necessarily normative. Implementers who already have repair mechanisms that act after a spurious RTO has happened may choose to implement a simpler rto-restart; or they may choose to build a better model for triggering the RTO (i.e., improving the timeout calculation by using additional information).
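
To make the two options concrete, here is a minimal sketch of the timer re-arm decision on an ACK of new data, with the cwnd/flight-size check as a switch. It is not taken from the draft text: the function name, the `gate_on_flight` flag, and the default threshold of 4 segments are assumptions for illustration only.

```python
def rearm_interval(rto, elapsed_since_earliest, outstanding_segments,
                   gate_on_flight=True, threshold=4):
    """Seconds until the retransmission timer should fire, computed when an
    ACK for new data arrives.

    All names and the default threshold are hypothetical, chosen only to
    illustrate the discussion above.
    """
    # Gated variant: with enough segments in flight, fast retransmit is
    # assumed to be possible, so keep the conventional full-RTO restart.
    if gate_on_flight and outstanding_segments >= threshold:
        return rto
    # rto-restart: account for the time the earliest outstanding segment
    # has already been waiting; never return a negative interval.
    return max(rto - elapsed_since_earliest, 0.0)


# Small flight: the timer is shortened in both variants.
print(rearm_interval(1.0, 0.25, 2))
# Large flight: only the ungated (simpler) variant shortens the timer.
print(rearm_interval(1.0, 0.25, 10))
print(rearm_interval(1.0, 0.25, 10, gate_on_flight=False))
```

With `gate_on_flight=False` the check disappears entirely, which is the simpler variant an implementation with good post-RTO repair might prefer.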

Richard Scheffenegger