Re: [tmrg] TCP Evaluation Suite

Michael Welzl <michawe@ifi.uio.no> Tue, 28 June 2011 07:16 UTC


Hi!

I think it's great to see that this is being taken ahead!

Stefan Hirschmann has completed a master's thesis with me, in which he 
investigated a different approach to Slow Start. As a starting point, I 
told him to look at this test suite for evaluations - but, 
unsurprisingly, we found it inadequate for the types of analyses needed 
to compare variations of Slow Start. He therefore extended it to cover 
some necessary scenarios - all of this can be found in his thesis:
http://heim.ifi.uio.no/~michawe/teaching/dipls/hirschmann.pdf

This may or may not have some relevance for the test suite... I'm not 
suggesting changing or extending the scope because this work has been 
going on for too long already and it would be just great if someone 
could quickly finish it now. Still, there could be some useful 
information for you in this thesis, so I felt like sharing it.

Cheers,
Michael

PS: Since Stefan actively used the test suite, he may be able to give 
some direct answers to the questions below. I think he's on this list...


On 6/28/11 9:05 AM, David wrote:
> Hi Everyone,
>
> I have been working on a revised version of the TCP test suite (see
> http://tools.ietf.org/html/draft-irtf-tmrg-tests-02) as well as its ns-2
> implementation. In the process we have identified a number of design decisions
> to be made.  Below, we propose some possibilities, but would like to get
> research group consensus on them before releasing the next draft.
>
> Some of these possibilities are based on practical issues to do with the
> implementation of the draft in ns-2. This email is meant to bring everyone up to
> date with the progress, explain some of the practical background, and open the
> floor for comments, criticisms, suggestions, etc. I especially
> solicit responses to the questions between "***".
>
> So far, most of the work has been with the basic dumbbell scenarios.
>
>
> Note there are some ASCII diagrams that require a fixed-width font to be viewed
> correctly.
>
>
> 1. TMIX TRAFFIC
>
> 1.1 Background on the traffic traces:
>
> The test suite uses Tmix to replay traces of TCP connection arrivals and their
> resulting origin-destination interactions, including the amount of data for
> each interaction. The Tmix traffic used in the scenarios is not symmetric and
> not stationary.
>
> Some analysis of the traffic would be good to have as part of the suite, and is
> being investigated.
>
>
> 1.2 Background concerning scaling the traffic.
>
> To adjust the traffic load for the given scenarios, the connection arrival times
> are scaled.
> Connections are started at:
> 	          time = scale * connection_start_time
>
> The smaller the scale, the higher (in general) the traffic load.
>
> Note that changing the connection start times also changes the way the traffic
> interacts, potentially changing the "clumping" of traffic bursts.
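
The scaling rule above can be sketched in a few lines of Python. This is only an illustration, not the Tmix or ns-2 code: the function name `scale_start_times` and the sample trace are made up.

```python
def scale_start_times(start_times, scale):
    """Return connection start times multiplied by the scale factor.

    A scale below 1 packs arrivals closer together (more load);
    a scale above 1 spreads them out (less load).
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return [scale * t for t in start_times]


# Made-up connection start times in seconds, for illustration only.
trace = [0.0, 1.5, 4.0, 10.0]
print(scale_start_times(trace, 0.5))
```

Halving the scale halves every gap between arrivals, so bursts that were separate at one scale can overlap at another.
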
>
>
> 1.3 Simulation start up
>
> Background:
>
> To accelerate the system start up, the system is "prefilled" to approximate a
> "steady state" of congestion. All connections that would have started in time=0s
> to time=prefill are started closer together, spaced over a MaxRTT period
> before time=prefill.  The idea behind this is to lessen the time taken before
> measurements can be taken. Measurements begin after time=2*prefill.
>
>
> Fixed-width font required.
> e.g.
>                      <---->
>                      MaxRTT
> |--------------------|----|-------------------------|
> t=0                       t=prefill                 t=2*prefill
>                      ^
>                      | start connections here.
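
One way to picture the prefill step is the sketch below. The message does not say how the early connections are spaced inside the MaxRTT window, so the linear compression used here is an assumption, and `prefill_start_times` is a hypothetical name rather than anything in the ns-2 implementation.

```python
def prefill_start_times(start_times, prefill, max_rtt):
    """Squeeze all starts in [0, prefill) into [prefill - max_rtt, prefill).

    Assumption: the early starts are compressed linearly, so their
    relative order and relative spacing are preserved. Starts at or
    after `prefill` are left untouched.
    """
    if not 0 < max_rtt <= prefill:
        raise ValueError("need 0 < max_rtt <= prefill")
    window_start = prefill - max_rtt
    adjusted = []
    for t in start_times:
        if t < prefill:
            adjusted.append(window_start + t * max_rtt / prefill)
        else:
            adjusted.append(t)
    return adjusted


# Made-up values: 40 s prefill, 2 s MaxRTT.
print(prefill_start_times([0, 20, 40, 50], prefill=40, max_rtt=2))
```

With these made-up numbers the connections from the first 40 s all start between t=38 s and t=40 s, and measurements would begin at t=80 s.
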
>
>
>
> *** Does anyone foresee problems with attempting to reduce the necessary
>    simulation warmup time with this method? ***
>
>
> 1.4. Packet sizes
>
> The draft calls for 10% 536B and 90% 1500B TCP packets. To achieve this, a new
> MSS record has been added to the Tmix traffic vector files, and Tmix has been
> enhanced to use it. The original traces have been processed so that the TCP MSS
> for each tmix connection is selected randomly (i.i.d.) with 10% 496B and 90%
> 1460B.
>
> Rationale: The maximum segment size is generally constant for any particular
> 	      connection.
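
The per-connection MSS choice can be sketched as below. The 40-byte difference between packet size and MSS follows from the numbers in the message (1500 - 1460 and 536 - 496); reading it as a 20-byte IPv4 header plus a 20-byte TCP header without options is an assumption, as are the names `pick_mss` and `SMALL_FRACTION` and the fixed seed.

```python
import random

HEADER_BYTES = 40            # assumed: 20 B IPv4 + 20 B TCP, no options
PACKET_SIZES = (536, 1500)   # packet sizes named in the draft
SMALL_FRACTION = 0.10        # share of connections using the small size


def pick_mss(rng):
    """Draw one MSS, independently of how much data the connection sends."""
    small, large = PACKET_SIZES
    packet = small if rng.random() < SMALL_FRACTION else large
    return packet - HEADER_BYTES


rng = random.Random(1)  # fixed seed only to make the sketch repeatable
mss_per_connection = [pick_mss(rng) for _ in range(10_000)]
small_share = mss_per_connection.count(496) / len(mss_per_connection)
print(f"share of connections with MSS 496: {small_share:.3f}")
```

Because the draw ignores connection size, the 10%/90% split holds per connection, not per packet or per byte, which is exactly what the question below asks about.
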
>
> *** Is it enough to randomly select the MSS for each connection in the trace
>      file without being concerned how much traffic each connection generates?
>      ***
>
> 2. SCALE SELECTION and LOSS TARGETS
>
> The connection arrival times are scaled for each scenario so as to achieve a
> certain average loss rate at the most congested queue on the central link. The
> draft in general omitted the target loss rates, but noted that moderate
> congestion had about a 1-2% loss rate for the access link.
>
> In these scenarios packet loss is not primarily caused by long lived TCP
> sessions in congestion avoidance mode cyclically filling the buffer. It is often
> caused by the "collision" of multiple TCP sessions starting together.
>
> It was difficult to achieve any stable results with these targets for the
> dial-up and geostationary satellite cases. Both of these scenarios experience
> very bursty loss, with the dial-up scenario being by far the worst. For these
> scenarios we propose:
>
> Geostationary satellite: Mild congestion 2%, Moderate congestion 6%
> Dial-up link (64kbps): Mild congestion 5%, Moderate congestion 15%
>
> Wireless link: to be studied.
>
> The rule of thumb was that moderate congestion has three times the loss rate
> of mild congestion. For the non-congested link traffic we propose a connection
> arrival rate of half that of the mild-congestion scenario.
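
The scale selection described at the start of this section amounts to a search over the scale factor. A minimal sketch, assuming loss falls monotonically as the scale grows (the message only says this holds "in general", and real runs are noisy): `find_scale` and `measure_loss` are hypothetical names, and `measure_loss` stands in for a complete simulation run returning the average loss rate.

```python
def find_scale(measure_loss, target, lo, hi, tol=1e-3, max_iter=50):
    """Bisect on the scale factor until the loss rate is close to target.

    Assumes measure_loss(scale) decreases as scale increases, i.e.
    a smaller scale means more traffic and therefore more loss.
    """
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        loss = measure_loss(mid)
        if abs(loss - target) <= tol:
            return mid
        if loss > target:
            lo = mid   # too much loss: spread arrivals out more
        else:
            hi = mid   # too little loss: pack arrivals closer
    return (lo + hi) / 2


def toy_loss(scale):
    """Toy stand-in for a simulation run; not a real traffic model."""
    return 0.1 / scale


scale = find_scale(toy_loss, target=0.02, lo=1.0, hi=20.0)
print(f"scale = {scale:.3f}, loss = {toy_loss(scale):.4f}")
```

With non-stationary, bursty traffic each evaluation would need a long enough run to average out the noise, which ties into the simulation times discussed next.
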
>
>
> *** Do these targets seem reasonably practical? Comments? Suggestions? ***
>
> 3. SIMULATION TIMES
>
> The draft recommended at least 100s. We have found that this is not sufficient
> in any but the data center and transoceanic scenarios.
>
> The simulation times listed below are rough minima which provide enough
> averaging for a reliable determination of the connection arrival time scaling
> for the target loss rates. Final values require further study.
>
> Approximate simulation times:
>   - data center: ~85s (including 35s warmup)
>   - transoceanic: ~100s (40s warmup)
>   - access link: ~360s (60s warmup)
>   - geostationary: ~740s (40s warmup)
>   - dial-up: ~5100s (100s warmup)
>
> NOTES
> 	1. The traffic is not stationary.
> 	2. data center and transoceanic links have thousands of concurrent
> 	TCP sessions and take a significant amount of real time.
>
>
> *** Is everyone happy with simulation times varying according to the scenario?
>      Comments? Suggestions? ***
>
>
> 3. BASIC DUMBBELL SCENARIOS:
>
> 3.1 Topology:
>
> Topologies are mainly as described in the draft, with changes highlighted below.
>
> 3.1.1 Data Center and Transoceanic
>
> Currently a central link of 1Gbps is the fastest speed that can be
> practically simulated with ns-2. We propose that these experiments use a
> 1Gbps central link.
>
>
> *** Is this reasonable? Comments? Suggestions? ***
>
> 3.1.2 Geostationary satellite
>
>
> The draft proposed a topology with a 40Mbps bidirectional central link. The
> access links were asymmetric 40/4 Mbps on one side of the central link, and
> 4/40Mbps on the other side of the central link.
>
> We propose that this scenario be altered to model a network connected to a
> hub, connected via satellite to the backbone Internet. The central link
> models the asymmetric satellite connection. See below:
>
> Fixed-width font required.
>
>      Node_1                                            Node_4
>            \                                          /
>             \             satellite link             /
>    Node_2 --- Router_1 -------------------- Router_2 --- Node_5
>             /               4Mbps--->                \
>            /                <---40Mbps                \
>      Node_3                                            Node_6
>
>
> We propose that the delay parameters are as in the draft, and that the
> access links are set at 100Mbps.
>
>
> Rationale:
> 	   1. Our (non-exhaustive) scan of current commercial satellite offerings
> 	   didn't show common use of the draft scenario.
>
> 	   2. The proposed scenario seemed more common in practice (though
> 	   sometimes the link is more asymmetric, i.e. 1Mbps up and 40Mbps down).
>
> 	   3. It is consistent with the other wired dumbbell scenarios, with
> 	   congestion on the central link.
>
> 	   4. Having congestion on some of the access links, each of which carries
> 	   different traffic, makes it difficult to determine the appropriate
> 	   "scale" factor for connection arrivals to achieve the target level of
> 	   congestion.
>
>
> *** Do you agree with this proposed change in topology? Can you see problems
>      with it? Comments? Changes? Enhancements? Caveats? ***
>
>
> 3.2 Buffer sizes
>
> The draft suggests a buffer size of 100ms. This is roughly the median base
> RTT of possible paths for the access link scenario (actually 102ms).
>
> This size is too big for the data center scenario.  It is very impractical for
> the dial-up link (less than 1 packet!).
>
> To deal with these issues, while attempting to provide a standard type of
> test, we propose the following buffer sizes:
>
>        1. Access-link, data-center, trans-oceanic, and geostationary scenario
>        central link buffers are set to the median base RTT. This results in
>        buffer sizes of 102ms, 22ms, 232ms, and 702ms respectively (these being
>        converted to the integer number of 1500B packets that achieve this,
>        rounded down).
>
>        2. The dial-up link has its central link buffer set to 6 packets. This
>        allows for 3 concurrent 2-packet bursts.
>
>        3. The wireless scenario is still being investigated.
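
The conversion from a buffer size in milliseconds to whole 1500B packets can be sketched as follows. The link rates are taken from the message where it gives them (1Gbps central link, 64kbps dial-up, 4/40Mbps satellite); which satellite direction the 702ms applies to is not stated, so both are shown. `buffer_packets` is a made-up name.

```python
PACKET_BITS = 1500 * 8  # buffer is counted in 1500-byte packets


def buffer_packets(buffer_ms, link_bps):
    """Number of whole 1500 B packets that drain in buffer_ms, rounded down.

    Integer arithmetic avoids floating-point surprises at exact boundaries.
    """
    return (buffer_ms * link_bps) // (PACKET_BITS * 1000)


print(buffer_packets(22, 1_000_000_000))    # data center at 1 Gbps
print(buffer_packets(232, 1_000_000_000))   # transoceanic at 1 Gbps
print(buffer_packets(702, 40_000_000))      # satellite, 40 Mbps direction
print(buffer_packets(702, 4_000_000))       # satellite, 4 Mbps direction
print(buffer_packets(100, 64_000))          # dial-up with the draft's 100 ms
```

The last line comes out as 0, since 100ms at 64kbps is only 800 bytes. That is the "less than 1 packet" problem noted above and the reason for a fixed 6-packet buffer on the dial-up link.
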
>
>
>
> *** Do these proposed buffer sizes seem reasonable? Comments? Suggestions?
>      Other proposals? ***
>
>
>