Re: [tcpm] WGLC for draft-ietf-tcpm-1323bis

"Scheffenegger, Richard" <rs@netapp.com> Tue, 14 May 2013 13:59 UTC

From: "Scheffenegger, Richard" <rs@netapp.com>
To: "mallman@icir.org" <mallman@icir.org>, Pasi Sarolahti <pasi.sarolahti@iki.fi>
Date: Tue, 14 May 2013 13:58:53 +0000
Message-ID: <012C3117EDDB3C4781FD802A8C27DD4F24B8B63F@SACEXCMBX02-PRD.hq.netapp.com>
References: <FCF05C2E-7414-4F1E-B63C-EFC5C94812E4@iki.fi> <20130510153700.ED45811C3C9C@lawyers.icir.org>
In-Reply-To: <20130510153700.ED45811C3C9C@lawyers.icir.org>
Cc: "tcpm (tcpm@ietf.org)" <tcpm@ietf.org>
Subject: Re: [tcpm] WGLC for draft-ietf-tcpm-1323bis

Hi Mark,

>   - 3.1: "Many TCP implementations base their RTT measurements upon
>     a sample of one segment per window or less.  While this yields
>     an adequate approximation to the RTT for small windows, it
>     results in an unacceptably poor RTT estimate for a LFN."
> 
>     Do you have evidence of this?  We have evidence you're wrong:
> 
>       Mark Allman, Vern Paxson.  On Estimating End-to-End Network
>       Path Properties.  Proceedings of the ACM SIGCOMM Technical
>       Symposium, Cambridge, MA, September 1999.
> 
>     That shows that the number of samples per RTT is pretty
>     immaterial to the effectiveness of the RTO.

I'd like to point out that Section 3 as a whole does not clearly distinguish between *timely measurement* of the RTT and the use of that measured RTT for further purposes, such as arriving at a useful value for the RTO.

We can also argue about the specific wording of the second sentence, but it appears to me that you read "poor RTT estimate == poor RTO estimate"?

>   - 3.1: "RTT estimator".  Note, TCP does not have an "RTT
>     estimator".  Scrub this from the document.  We have an "RTO
>     estimator".  These are different things.  Confusing them is a
>     fundamental mistake.

Correct. IMHO, RFC 1323 is looking at RTT but, agreed, only with the intended purpose of refining the RTO (where these improvements haven't been as beneficial as originally envisioned).

I rewrote Section 3.1 entirely:

3.  TCP Timestamp Option

3.1.  Introduction

   TCP measures the round trip time (RTT), primarily for the purpose of
   arriving at a decent value for the Retransmission Timeout (RTO) timer
   interval.  Accurate and current RTT estimates are necessary to adapt
   to changing traffic conditions, while a conservative estimate of the
   RTO interval is necessary to minimize spurious RTOs.

   When [RFC1323] was originally written, it was perceived that more
   timely and accurate RTT measurements would contribute to reducing
   spurious RTOs, while maintaining their timeliness.  At the time, RTO
   was also the only mechanism to make use of the measured RTT.  It has
   since been shown that taking more RTT samples does little to improve
   the RTO [Allman99].

   This document makes a clear distinction between the round trip time
   measurement (RTTM) mechanism, and subsequent mechanisms using the RTT
   signal as input, such as RTO (see Section 3.4).

   It is important to use the timestamp option with big windows, to
   allow the use of the PAWS mechanism (see Section 4).  Furthermore,
   the option is useful for all TCPs, since it simplifies the sender
   and allows the use of additional optimizations such as Eifel
   ([RFC3522], [RFC4015]) and others.

3.2.  Timestamp Option

I also made the last paragraph of 3.3 a dedicated section, to separate RTTM from RTO:

3.4.  Updating the RTO value

   [KL04] has highlighted the problem that an unmodified RTO
   calculation, which is updated with per-packet RTT samples, will
   truncate the path history too soon.  This can lead to an increase in
   spurious retransmissions, when the path properties vary on the order
   of a few RTTs, but a high number of RTT samples are taken on a much
   shorter timescale.

   Implementers should note that, with timestamps, multiple RTTMs can be
   taken per RTT.  The [RFC6298] RTO estimator has weighting factors,
   alpha and beta, based on an implicit assumption that at most one RTTM
   will be sampled per RTT.  When using multiple RTTMs per RTT to update
   the RTO estimator, the weighting factors SHOULD be decreased to take
   into account the more frequent RTTMs.

   For example, an implementation could choose to

   o  just use one sample per RTT to update the RTO estimator, or

   o  vary the gain based on the congestion window, or

   o  take an average of all the RTT measurements (and the maximum of
      the variance) received over one RTT,

   and then use that value to update the RTO estimator.  This document
   does not prescribe any particular method for modifying the RTO
   estimator.
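
(Illustration only, not draft text: a minimal Python sketch of the second option above. It assumes the simplest possible scaling, dividing the RFC 6298 gains alpha and beta by the expected number of RTT samples per RTT. The class name, the `samples_per_rtt` parameter, the 1 ms clock granularity and the scaling rule itself are assumptions made for this example; the draft text above deliberately prescribes no particular method.)

```python
from dataclasses import dataclass
from typing import Optional

ALPHA = 1 / 8    # RFC 6298 gain for SRTT
BETA = 1 / 4     # RFC 6298 gain for RTTVAR
K = 4            # RFC 6298 variance multiplier
MIN_RTO = 1.0    # seconds; RFC 6298 lower bound, also its initial RTO


@dataclass
class RtoEstimator:
    """RFC 6298-style estimator with gains scaled per sample (hypothetical)."""
    granularity: float = 0.001       # clock granularity G in seconds (assumed)
    srtt: Optional[float] = None     # smoothed RTT
    rttvar: Optional[float] = None   # RTT variation

    def update(self, rtt_sample: float, samples_per_rtt: int = 1) -> float:
        """Feed one RTT measurement (seconds) and return the new RTO.

        samples_per_rtt is the caller's estimate of how many RTTMs are
        taken per RTT; 1 reproduces the unmodified RFC 6298 update.
        """
        n = max(1, samples_per_rtt)
        if self.srtt is None:
            # First measurement: initialise as in RFC 6298.
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            # Assumption: shrink both gains so that n samples per RTT do
            # not age out the path history n times faster.
            alpha = ALPHA / n
            beta = BETA / n
            self.rttvar = (1 - beta) * self.rttvar + beta * abs(self.srtt - rtt_sample)
            self.srtt = (1 - alpha) * self.srtt + alpha * rtt_sample
        return self.rto

    @property
    def rto(self) -> float:
        if self.srtt is None:
            return MIN_RTO  # no measurement yet
        return max(MIN_RTO, self.srtt + max(self.granularity, K * self.rttvar))
```

A caller that takes roughly one RTTM per ACK might pass something like half the congestion window in segments as `samples_per_rtt` (again an assumption, reflecting delayed ACKs); passing 1 gives the plain RFC 6298 behaviour.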

>   - End of 3.3 on RTT sample weighting factors.
> 
>     (1) The problem with the history being truncated when using RTTM
>         was independently highlighted by Ludwig and Floyd.  We
>         should at least have the common courtesy to cite Sally's
>         note to e2e and Reiner's paper.

I tried to find Sally's comment (and also the reference given in your "Using Spurious Retransmissions to Adapt the Retransmission Timeout" paper), but was unsuccessful. Do you have a link?



>   - 3.4, (A): Why are we discussing this in terms of the "Kth"
>     segment?  Delayed ACKs per the standard is "2nd".  Why do we
>     have to make the discussion in terms of some theory rather than
>     in terms of what we have specified?

Fixed.

>   - 3.2 insinuates that you should not include a timestamp on an
>     RST: "TSopt MUST be sent in every non-<RST> segment", implying
>     it should not be sent on an <RST> (or you'd have just said
>     "every segment").  But, then 4.2 goes on to (rightly IMO)
>     develop why we should include it on <RST> segments.  This
>     inconsistency needs fixed.


Fixed.

>     RTTM should not be deprecated.  It should be a MAY.
> 
>     RTTM should not be discussed with breathless bullshit about hand
>     wavy math and un-demonstrated stability issues and whatnot.
> 
>     We should say that RTTM is absolutely within compliance of the
>     spec and that it will not hurt your RTO.
> 
>     We should also say that RTTM is unlikely to help your RTO.
> 
>     We should leave it to implementers to decide if RTTM is useful
>     for their purposes.
> 
>     We should specify a way to vary the gains in the standard RTO
>     algorithm based on the current cwnd.
> 
>     And, we should absolutely state that there are other uses for
>     the timestamp option (like Eifel, like PAWS) and there is
>     nothing wrong with the *option* for that purpose.

Best regards,
   Richard