[tcpm] Review of draft-fairhurst-tcpm-newcwv-03

Yuchung Cheng <ycheng@google.com> Mon, 16 July 2012 22:55 UTC

From: Yuchung Cheng <ycheng@google.com>
Date: Mon, 16 Jul 2012 15:56:04 -0700
Message-ID: <CAK6E8=d_NrKkFhRUjSJ1MEMZ_CnEEEadzRAySD7SGGoGcjCdDw@mail.gmail.com>
To: "tcpm@ietf.org Extensions" <tcpm@ietf.org>
Cc: "Arjuna Sathiaseelan (work)" <arjuna@erg.abdn.ac.uk>, "iccrg@cs.ucl.ac.uk" <iccrg@cs.ucl.ac.uk>
Subject: [tcpm] Review of draft-fairhurst-tcpm-newcwv-03

Summary:

First of all, the problem this draft is trying to solve is important:
AFAIK servers and data centers disable slow-start after idle because
it simply hurts latency too badly. Modern TCP-smart applications all
use persistent connections to amortize handshake and slow-start
overhead, and RFC2861 nullifies that to a certain extent.
While the applications don't want to create big bursts that cause
congestion, they have no better choice (low latency vs occasional
burst losses: choose one). This draft proposes a middle road. Thanks
for starting this!

Details:
Section 1
I found the term "variable-rate" application a little awkward: for
example, some constant-rate (CBR) apps may still go through slow-start,
overshoot, and experience non-validated phases. But it's a nit.

Section 4.1
Since the draft already requires SACK, I recommend using the RFC3517
pipe to better measure the amount of outstanding data in the network.
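
To illustrate why pipe can differ from FlightSize, here is a minimal
sketch at segment granularity. The per-segment flags (sacked, lost,
retransmitted) are hypothetical names for this example; a real
RFC3517 implementation works on octets and derives "lost" from its
IsLost() heuristic rather than taking it as input.

```python
def flight_size(segments):
    """Segments sent but not yet cumulatively ACKed."""
    return len(segments)


def pipe(segments):
    """Simplified, segment-granularity form of the RFC3517 pipe estimate."""
    total = 0
    for seg in segments:
        if seg["sacked"]:
            continue  # SACKed data has left the network
        if not seg["lost"]:
            total += 1  # original transmission presumed still in flight
        if seg["retransmitted"]:
            total += 1  # the retransmission itself is in flight
    return total


# Hypothetical scoreboard: one lost-and-retransmitted segment,
# one SACKed segment, three segments still in flight.
outstanding = [
    {"sacked": False, "lost": True, "retransmitted": True},
    {"sacked": True, "lost": False, "retransmitted": False},
    {"sacked": False, "lost": False, "retransmitted": False},
    {"sacked": False, "lost": False, "retransmitted": False},
    {"sacked": False, "lost": False, "retransmitted": False},
]

print(flight_size(outstanding), pipe(outstanding))
```

In this example FlightSize still counts the SACKed segment, while
pipe does not.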

Section 4.2

a) Justification of 2/3 and 1/3 would be much appreciated.
b) During the validated phase, when FS >= 2/3 cwnd, are ACKs used to
compute the new cwnd? E.g., in slow-start with cwnd=30 and FS=20, will
the next 10 ACKs increase the cwnd to 40? Also, is this phase check
applied both when transmitting new data and when receiving ACKs?
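
To make question b) concrete, here is a rough sketch of the two
readings, in segments. Everything in it is an assumption for
illustration: the function names are hypothetical, each ACK covers one
segment, no new data is sent while the 10 ACKs arrive, and slow-start
adds one segment per ACK. It is not taken from the draft.

```python
def is_validated(flight_size, cwnd):
    """Validated phase test FS >= 2/3 cwnd, in integer arithmetic."""
    return 3 * flight_size >= 2 * cwnd


def slow_start_acks(cwnd, flight_size, acks, check_each_ack):
    """Apply `acks` slow-start ACKs and return the resulting cwnd.

    check_each_ack=False: the phase is decided once, before the ACKs.
    check_each_ack=True:  the phase is re-tested on every ACK arrival.
    """
    for _ in range(acks):
        if not check_each_ack or is_validated(flight_size, cwnd):
            cwnd += 1  # slow-start increase for this ACK
        flight_size -= 1  # the ACKed segment leaves the network
    return cwnd


print(slow_start_acks(30, 20, 10, check_each_ack=False))
print(slow_start_acks(30, 20, 10, check_each_ack=True))
```

Under these assumptions the first reading ends at cwnd=40, while the
second stops growing after the first ACK because FS falls below 2/3
cwnd.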


Section 4.3.1
"  An application that remains in the non-validated phase for a period
   greater than five minutes is required to adjust its congestion
   control state.  At the end of the non-validated phase, the sender
   MUST update cwnd:

           cwnd = max(FlightSize*2, IW)."
Could you justify setting cwnd to double the FlightSize? It seems that,
with the 2/3 rule in the non-validated phase, cwnd can grow to as much
as 4/3 of its original value at the end of the non-validated phase.
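
The arithmetic behind the 4/3 observation, as a small sketch in
segments. The IW value of 10 and the helper names are assumptions for
this example; the non-validated condition is taken to be
FS < 2/3 cwnd, the complement of the validated test in Section 4.2.

```python
def cwnd_after_nvp(flight_size, iw):
    """cwnd = max(FlightSize*2, IW), as quoted from Section 4.3.1."""
    return max(2 * flight_size, iw)


def largest_non_validated_flight(cwnd):
    """Largest integer FS with FS < 2/3 cwnd (i.e. 3*FS < 2*cwnd)."""
    return (2 * cwnd - 1) // 3


IW = 10  # assumed initial window, in segments
old_cwnd = 30
fs = largest_non_validated_flight(old_cwnd)
new_cwnd = cwnd_after_nvp(fs, IW)
print(fs, new_cwnd, new_cwnd / old_cwnd)
```

With cwnd=30 the sender can sit at FS=19 in the non-validated phase
and leave it with cwnd=38, which is larger than the window it started
with and just under the 4/3 bound.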

btw, it flows better if 4.3.1 and 4.3.2 are swapped.

Section 4.3.2
"A sender that detects a packet-drop or receives an ECN marked packet
 MUST calculate a safe cwnd, based on the volume of acknowledged data:

        cwnd = FlightSize - R.

 Where, R is the volume of data that was reported as unacknowledged by
 the SACK information.  This follows the method proposed for Jump
 Start [Liu07]."
So is R 0 or 1 for an ECN-marked packet?

Could you justify your approach compared to RFC5681?
i.e. why (FlightSize - R)/2 vs FlightSize/2?
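
For comparison, a small sketch of the two reductions in segments. The
function names are hypothetical, and the halving of (FlightSize - R)
follows my reading above rather than text quoted from the draft. The
second function is the standard RFC5681 reduction,
ssthresh = max(FlightSize/2, 2*SMSS).

```python
def newcwv_loss_cwnd(flight_size, r):
    """Safe cwnd = FlightSize - R, then halved (assumed reading)."""
    return (flight_size - r) // 2


def rfc5681_ssthresh(flight_size, smss=1):
    """RFC5681: ssthresh = max(FlightSize/2, 2*SMSS), in segments."""
    return max(flight_size // 2, 2 * smss)


flight, unacked = 20, 4  # example values only
print(newcwv_loss_cwnd(flight, unacked), rfc5681_ssthresh(flight))
```

With FlightSize=20 and R=4 the first gives 8 segments and the second
10, so the draft's rule is the more conservative one whenever R > 0.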


Afterthoughts:
Since this draft attempts to replace RFC2861, maybe it can also address
this performance handicap from the introduction of RFC2861?

"We propose that the TCP sender should not increase the congestion window
   when the TCP sender has been application-limited (and therefore has
   not fully used the current congestion window).  We have explored
   these algorithms both with simulations and with experiments from an
   implementation in FreeBSD."

This happens frequently today because many application pauses are
shorter than the RTO. For example, the ACKs of smaller HTTP responses
that follow a big one are no longer used to compute cwnd. They are
used for RTO estimation only.

I think we need a general solution to cover the case when cwnd > pipe.
So far we have laminar and your draft. A unified solution would be
great, but we are making progress :)

Thanks.