Re: [tcpm] Updated Section 3 of draft-ietf-tcpm-1323bis
"Scheffenegger, Richard" <rs@netapp.com> Wed, 15 May 2013 19:07 UTC
From: "Scheffenegger, Richard" <rs@netapp.com>
To: "mallman@icir.org" <mallman@icir.org>, "tcpm (tcpm@ietf.org)" <tcpm@ietf.org>
Date: Wed, 15 May 2013 19:06:56 +0000
Message-ID: <012C3117EDDB3C4781FD802A8C27DD4F24B91C7C@SACEXCMBX02-PRD.hq.netapp.com>
Cc: "David Borman (David.Borman@quantum.com)" <David.Borman@quantum.com>
Subject: Re: [tcpm] Updated Section 3 of draft-ietf-tcpm-1323bis

Hi,

Before I again post an update with disputed sections of text, here is my current version of Section 3. Note that the title was also changed to shift the emphasis away from the RTTM/RTO update part.

I've tried to reflect all the comments in this updated text, but it might not yet reflect the consensus!

Best regards,
  Richard


3. TCP Timestamp Option

3.1. Introduction

TCP measures the round trip time (RTT), primarily for the purpose of arriving at a reasonable value for the Retransmission Timeout (RTO) timer interval. Accurate and current RTT estimates are necessary to adapt to changing traffic conditions, while a conservative estimate of the RTO interval is necessary to minimize spurious RTOs.

When [RFC1323] was originally written, it was perceived that taking RTT measurements for each segment, and also during retransmissions, would contribute to reducing spurious RTOs, while maintaining the timeliness of necessary RTOs. At the time, RTO was also the only mechanism to make use of the measured RTT. It has since been shown that taking more RTT samples has only a very limited effect on optimizing RTOs [Allman99].

This document makes a clear distinction between the round trip time measurement (RTTM) mechanism, and subsequent mechanisms using the RTT signal as input, such as RTO (see Section 3.4).

The timestamp option is important when large receive windows are used, to allow the use of the PAWS mechanism (see Section 4). Furthermore, the option is useful for all TCPs, since it simplifies the sender and allows the use of additional optimizations such as Eifel ([RFC3522], [RFC4015]) and others.

3.2. Timestamp Option

TCP is a symmetric protocol, allowing data to be sent at any time in either direction, and therefore timestamp echoing may occur in either direction. For simplicity and symmetry, we specify that timestamps always be sent and echoed in both directions. For efficiency, we combine the timestamp and timestamp reply fields into a single TCP Timestamp Option.

   TCP Timestamp Option (TSopt):

   Kind: 8

   Length: 10 bytes

      +-------+-------+---------------------+---------------------+
      |Kind=8 |  10   |   TS Value (TSval)  |TS Echo Reply (TSecr)|
      +-------+-------+---------------------+---------------------+
          1       1              4                     4

The Timestamp Option carries two four-byte timestamp fields. The Timestamp Value field (TSval) contains the current value of the timestamp clock of the TCP sending the option.

The Timestamp Echo Reply field (TSecr) is valid if the ACK bit is set in the TCP header; if it is valid, it echoes a timestamp value that was sent by the remote TCP in the TSval field of a Timestamp Option. When TSecr is not valid, its value MUST be zero. However, a value of zero does not imply that TSecr is invalid. The TSecr value will generally be from the most recent Timestamp Option that was received; however, there are exceptions that are explained below.

A TCP MAY send the Timestamp Option (TSopt) in an initial <SYN> segment (i.e., a segment containing a SYN bit and no ACK bit), and MAY send a TSopt in other segments only if it received a TSopt in the initial <SYN> or <SYN,ACK> segment for the connection. Once TSopt has been successfully negotiated (sent and received) during the <SYN>, <SYN,ACK> exchange, TSopt MUST be sent in every non-<RST> segment for the duration of the connection, and SHOULD be sent in a <RST> segment (see Section 4.2 for details). If a non-<RST> segment is received without a TSopt, a TCP MAY drop the segment and send an <ACK> for the last in-sequence segment. A TCP MUST NOT abort a TCP connection if a non-<RST> segment is received without a TSopt. If a TSopt is received on a connection where TSopt was not negotiated in the initial three-way handshake, the TSopt MUST be ignored and the packet processed normally.

In the case of crossing <SYN> segments where one <SYN> contains a TSopt and the other doesn't, both sides MAY send a TSopt in the <SYN,ACK> segment.
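
As an illustration only (not part of the proposed draft text), the option layout above could be encoded and decoded roughly as follows. The function names pack_tsopt and parse_tsopt are hypothetical, and returning None for an invalid TSecr is a choice made for this sketch rather than anything the draft prescribes.

```python
import struct

TSOPT_KIND = 8     # Kind field from the option layout
TSOPT_LEN = 10     # total option length in bytes
_FORMAT = "!BBII"  # network byte order: 1 + 1 + 4 + 4 bytes


def pack_tsopt(tsval, tsecr=0):
    """Encode a Timestamp Option; tsecr stays 0 when there is nothing to echo."""
    return struct.pack(_FORMAT, TSOPT_KIND, TSOPT_LEN,
                       tsval & 0xFFFFFFFF, tsecr & 0xFFFFFFFF)


def parse_tsopt(option, ack_bit_set):
    """Decode a Timestamp Option into (tsval, tsecr).

    tsecr is reported as None when the ACK bit is clear, because the
    field is only valid on segments that carry an ACK. (Sketch-level
    convention, assumed here for clarity.)
    """
    if len(option) != TSOPT_LEN:
        raise ValueError("Timestamp Option must be 10 bytes long")
    kind, length, tsval, tsecr = struct.unpack(_FORMAT, option)
    if kind != TSOPT_KIND or length != TSOPT_LEN:
        raise ValueError("not a Timestamp Option")
    return tsval, (tsecr if ack_bit_set else None)
```

Note that a decoded TSecr of 0 on a segment with the ACK bit set is still treated as a valid echo here, in line with the statement that a zero value does not imply invalidity.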

TSopt is required for the two mechanisms described in Sections 3.3 and 4.2. There are also other mechanisms that rely on the presence of the TSopt, e.g. [RFC3522]. If a TCP stopped sending TSopt at any time during an established session, it would interfere with these mechanisms. This update to [RFC1323] describes explicitly the previous assumption (see Section 4.2) that each TCP segment must have TSopt, once negotiated.

3.3. The RTTM Mechanism

RTTM places a Timestamp Option in every segment, with a TSval that is obtained from a (virtual) "timestamp clock". Values of this clock MUST be at least approximately proportional to real time, in order to measure actual RTT.

These TSval values are echoed in TSecr values in the reverse direction. The difference between a received TSecr value and the current timestamp clock value provides an RTT measurement.

When timestamps are used, every segment that is received will contain a TSecr value. However, these values cannot all be used to update the measured RTT. The following example illustrates why. It shows a one-way data flow with segments arriving in sequence without loss. Here A, B, C... represent data blocks occupying successive blocks of sequence numbers, and ACK(A),... represent the corresponding cumulative acknowledgments. The two timestamp fields of the Timestamp Option are shown symbolically as <TSval=x,TSecr=y>. Each TSecr field contains the value most recently received in a TSval field.

      TCP  A                                     TCP B

                      <A,TSval=1,TSecr=120> ----->

           <---- <ACK(A),TSval=127,TSecr=1>

                      <B,TSval=5,TSecr=127> ----->

           <---- <ACK(B),TSval=131,TSecr=5>

      . . . . . . . . . . . . . . . . . . . . . .

                      <C,TSval=65,TSecr=131> ---->

           <---- <ACK(C),TSval=191,TSecr=65>

      (etc.)

The dotted line marks a pause (60 time units long) in which A had nothing to send. Note that this pause inflates the RTT which B could infer from receiving TSecr=131 in data segment C.
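
The exchange above can be replayed numerically. The following is a rough sketch, not text from the draft: rtt_sample is a hypothetical helper, and B's clock reading on arrival of a data segment is assumed to equal the TSval of the ACK that B sends in response (i.e., B acknowledges immediately).

```python
TS_MODULUS = 1 << 32  # the timestamp fields are 32 bits wide


def rtt_sample(clock_now, tsecr):
    """Ticks elapsed between an echoed timestamp and the local clock.

    The subtraction is taken modulo 2**32 so that a wrapped timestamp
    clock still yields the correct small positive difference.
    """
    return (clock_now - tsecr) % TS_MODULUS


# What B could infer from each data segment in the example, as pairs of
# (B's assumed clock reading on arrival, TSecr carried by the segment).
b_view = {
    "B": (131, 127),  # back-to-back data: a plausible path RTT
    "C": (191, 131),  # sent after A's idle period
}

for name, (clock_now, tsecr) in b_view.items():
    print(name, rtt_sample(clock_now, tsecr))
```

Under these assumptions B would compute 4 ticks from segment B but 60 ticks from segment C, the larger value reflecting A's idle time rather than the path.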

Thus, in one-way data flows, RTTM in the reverse direction measures a value that is inflated by gaps in sending data. However, the following rule prevents a resulting inflation of the measured RTT:

   RTTM Rule: A TSecr value received in a segment MAY be used to update the averaged RTT measurement only if the segment advances the left edge of the send window, i.e. SND.UNA is increased.

Since TCP B is not sending data, the data segment C does not acknowledge any new data when it arrives at B. Thus, the inflated RTTM measurement is not used to update B's RTTM measurement.

3.4. Updating the RTO value

[Ludwig00] and [Floyd05] have highlighted the problem that an unmodified RTO calculation, which is updated with per-packet RTT samples, will truncate the path history too soon. This can lead to an increase in spurious retransmissions when the path properties vary on the order of a few RTTs, but a high number of RTT samples are taken on a much shorter timescale.

Implementers should note that with timestamps multiple RTTMs can be taken per RTT. The [RFC6298] RTO estimator has weighting factors, alpha and beta, based on an implicit assumption that at most one RTTM will be sampled per RTT. When multiple RTTMs per RTT are available to update the RTO estimator, this implicit assumption must be considered. An implementation suggestion is detailed in Appendix G.

{ 3.5. - former 3.4. - not changed }


Appendix G. RTO calculation modification

This document RECOMMENDS that the standard RTO calculation ([RFC6298]) be modified in the following way. We roughly know how many samples a congestion window worth of data will yield, not accounting for ACK compression and ACK losses. Such events will result in more history of the path being reflected in the final value for RTO, and are not critical.

This modification will approximate the RTO estimator described in [RFC6298], regardless of how many samples are taken per window:

   ExpectedSamples = ceiling(FlightSize / (SMSS * 2))

   alpha' = alpha / ExpectedSamples

   beta' = beta / ExpectedSamples

Note that the factor 2 in ExpectedSamples is due to "Delayed ACKs".

Instead of using alpha and beta in the algorithm of [RFC6298], use alpha' and beta':

   RTTVAR <- (1 - beta') * RTTVAR + beta' * |SRTT - R'|

   SRTT <- (1 - alpha') * SRTT + alpha' * R'

   (for each sample R')
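
For illustration, the scaled update could look roughly like the sketch below. It is not a reference implementation: the function names are hypothetical, alpha = 1/8 and beta = 1/4 are the values suggested in [RFC6298], and clamping ExpectedSamples to at least 1 is an assumption added here to avoid a division by zero when FlightSize is 0.

```python
import math

ALPHA = 1 / 8  # SRTT gain suggested in RFC 6298
BETA = 1 / 4   # RTTVAR gain suggested in RFC 6298


def expected_samples(flight_size, smss):
    """RTT samples expected per window; the factor 2 models delayed ACKs."""
    # Clamp to 1 (assumption of this sketch) so an empty flight cannot
    # produce a zero divisor.
    return max(1, math.ceil(flight_size / (smss * 2)))


def update_estimator(srtt, rttvar, sample, flight_size, smss):
    """Apply one sample R' using the scaled gains alpha' and beta'."""
    n = expected_samples(flight_size, smss)
    alpha_p = ALPHA / n
    beta_p = BETA / n
    # RTTVAR is updated first, using the old SRTT, as in RFC 6298.
    rttvar = (1 - beta_p) * rttvar + beta_p * abs(srtt - sample)
    srtt = (1 - alpha_p) * srtt + alpha_p * sample
    return srtt, rttvar
```

With FlightSize = 14600 and SMSS = 1460, five samples are expected per window, so each individual sample moves SRTT by a fifth of the usual 1/8 weight.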