[Tmrg] TCP evaluation suite

lachlan.andrew at gmail.com (Lachlan Andrew) Sat, 28 June 2008 00:05 UTC

From: "lachlan.andrew at gmail.com"
Date: Fri, 27 Jun 2008 17:05:25 -0700
Subject: [Tmrg] TCP evaluation suite
Message-ID: <aa7d2c6d0806271705g4f0363e7wd20787231f19ddac@mail.gmail.com>

Greetings all,

After a long delay, here is a draft Internet Draft based on the
PFLDnet TCP evaluation suite paper.

An initial WAN-in-Lab implementation should be ready in a couple of
months, and I believe Wang Gang is still working part time on the NS
implementation.  Once that is ready, we can start setting actual
parameters by comparing the results against measurement studies.

There are still lots of "TBD" parameters and comments (in bold in the
.html).  Some open questions:

- Currently, statistics are listed as being measured over the last
*half* of the experiment.  On a 100s experiment, that gives 50s to
avoid the effect of simultaneous flows slow-starting.  However (a) on
long experiments, it is overkill for avoiding the initial slow-start,
but (b) it may be too short for the number of flows to reach
"equilibrium".  I vote that we recommend a particular warm-up time
(say 50s) independent of the length of the experiment, and start the
system "near equilibrium" (not from zero flows).

- When comparing with "standard TCP", we should specify which recent
proposals are included and which are not.  Which proposals should we
list?  I vote for:
 = SACK (included)
 = ECN  (not included)
 = Window scaling (included, even though many Windows machines don't use it)
 = Forward RTO-Recovery, F-RTO (included)
 = Appropriate Byte Counting (It is on in Windows, and was briefly on
in Linux.  If we don't include it, should we account for Linux's
suppression of delayed ACKs during initial slow start when comparing
against measurements?)

- Currently, some of it is written in the style of a paper ("We use
two flows...").  I think we should make it prescriptive ("Do this")
instead of descriptive.  Should we also use SHOULD, MAY, etc. to make
clear what is part of the "core" tests?

- Once there is an NS version of the test, it would be good to check
that the RTTs actually give a good approximation to measured RTT
distributions.

- All traffic loads have yet to be determined.  Sally suggested
setting these by matching the loss rate observed in the Internet with
the loss rate arising from NewReno.  Since many current web servers
use Linux/CUBIC, should we instead match the measurements to
simulations of an appropriate mixture of NewReno and CUBIC?

- How polished does something need to be to be registered as an
"Internet Draft"?  Can we submit this as a -00 draft, or should we get
more consensus first?
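
To make the measurement-window question above concrete, here is a minimal Python sketch of the two options: statistics over the last half of the run, versus a fixed warm-up that does not depend on the run length.  The function names, the "half"/"fixed" policy labels and the (time, value) sample format are assumptions made for illustration only; nothing here is taken from the draft.

```python
from statistics import mean


def measurement_window(duration_s, policy="half", warmup_s=50.0):
    """Return (start, end) of the interval over which statistics are taken.

    policy="half":  measure over the last half of the experiment.
    policy="fixed": discard a fixed warm-up, whatever the experiment length.
    Both policy names are labels invented for this sketch.
    """
    if policy == "half":
        return duration_s / 2.0, duration_s
    if policy == "fixed":
        if warmup_s >= duration_s:
            raise ValueError("warm-up must be shorter than the experiment")
        return warmup_s, duration_s
    raise ValueError(f"unknown policy: {policy!r}")


def mean_over_window(samples, window):
    """Average the values of (time_s, value) samples that fall in the window."""
    start, end = window
    selected = [value for t, value in samples if start <= t <= end]
    if not selected:
        raise ValueError("no samples fall inside the measurement window")
    return mean(selected)


if __name__ == "__main__":
    for duration in (100.0, 1000.0):
        half = measurement_window(duration, "half")
        fixed = measurement_window(duration, "fixed", warmup_s=50.0)
        print(f"{duration:6.0f}s run: half={half}  fixed={fixed}")
```

On a 100s run the two windows coincide (both discard 50s); on a 1000s run the "half" policy discards 500s while the fixed warm-up still discards only 50s.  The sketch says nothing about whether 50s is long enough for the number of flows to reach equilibrium.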
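
The load-setting idea above (pick the offered load whose simulated loss rate matches a measured loss rate) could be sketched as a simple bisection.  This is only an illustration under stated assumptions: `simulate_loss` stands for whatever simulation is eventually used, the loss rate is assumed to be non-decreasing in the load, and `placeholder_loss` is an arbitrary made-up formula with no empirical basis, standing in for a real NewReno/CUBIC simulation.

```python
def calibrate_load(simulate_loss, target_loss, lo=0.0, hi=1.0,
                   tol=1e-4, max_iter=60):
    """Bisect for the load in [lo, hi] whose simulated loss rate is
    closest to target_loss.

    simulate_loss: callable mapping a load to a loss rate; assumed to be
    non-decreasing in the load (an assumption of this sketch).
    """
    if not (simulate_loss(lo) <= target_loss <= simulate_loss(hi)):
        raise ValueError("target loss rate is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        if simulate_loss(mid) < target_loss:
            lo = mid   # too little loss: the load must be higher
        else:
            hi = mid   # enough loss: the load can be lower
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0


def placeholder_loss(load, cubic_fraction):
    """Hypothetical stand-in for a simulation of a NewReno/CUBIC mixture.

    The formula is arbitrary and is NOT a model of either algorithm; it
    only gives the bisection something monotone to work on.
    """
    return (0.02 + 0.01 * cubic_fraction) * load ** 2


if __name__ == "__main__":
    target = 0.01  # example target loss rate, not a measured value
    for fraction in (0.0, 0.5):
        load = calibrate_load(lambda x: placeholder_loss(x, fraction), target)
        print(f"CUBIC fraction {fraction}: calibrated load {load:.4f}")
```

With a real simulator, each call to `simulate_loss` is a full run, so the tolerance and iteration count would need to reflect the noise in the simulated loss rate.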



Anyone who wants to contribute to this draft is welcome to, whether or
not you were involved with the PFLDnet paper.  The author list is
starting from scratch.

Cheers,
Lachlan

-- 
Lachlan Andrew Dept of Computer Science, Caltech
1200 E California Blvd, Mail Code 256-80, Pasadena CA 91125, USA
Ph: +1 (626) 395-8820 Fax: +1 (626) 568-3603
http://netlab.caltech.edu/lachlan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: draft-irtf-tmrg-tests-00.xml
Type: text/xml
Size: 58887 bytes
Desc: not available
Url : http://mailman.ICSI.Berkeley.EDU/pipermail/tmrg-interest/attachments/20080627/18148a6d/attachment-0001.xml 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: draft-irtf-tmrg-tests-00.txt
Url: http://mailman.ICSI.Berkeley.EDU/pipermail/tmrg-interest/attachments/20080627/18148a6d/attachment-0001.txt 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ICSI.Berkeley.EDU/pipermail/tmrg-interest/attachments/20080627/18148a6d/attachment-0001.html