[Tmrg] tmix-linux: burst of FIN packets causes packet loss at end of experiment
ritesh at cs.unc.edu (Ritesh Kumar) Fri, 17 October 2008 00:08 UTC
From: ritesh at cs.unc.edu (Ritesh Kumar)
Date: Thu, 16 Oct 2008 20:08:05 -0400
Subject: [Tmrg] tmix-linux: burst of FIN packets causes packet loss at end of experiment
In-Reply-To: <48F6CEB8.2010604@room52.net>
References: <48F5291C.8020507@swin.edu.au> <aa7d2c6d0810141854v50c9d379wcbb54435e71e890f@mail.gmail.com> <48F6CEB8.2010604@room52.net>
Message-ID: <f47983b00810161708k7d93b412n4b787f2dd931fc12@mail.gmail.com>
On Thu, Oct 16, 2008 at 1:18 AM, Lawrence Stewart <lstewart at room52.net> wrote:

> Hi Lachlan and all,
>
> Lachlan Andrew wrote:
> [snip]
>
> > There is a fundamental statistical need to ignore the first little part,
> > so that the number of connections can reach "steady state", but I don't
> > know of any fundamental reason to ignore the last 1/3 of the experiment,
> > provided that the traffic generator ends flows cleanly. Since our suite
> > is very time-constrained, we want to cut out any unnecessary waiting.

I think it's a good idea to cut down as much wait time as possible. The
reason we choose to ignore the last 1/3rd of the experiment is that Tmix
logs full results (response times etc.) only for the connections which
complete. So near the end we have a lot of connections which don't finish
(and for which we subsequently don't log results), even though the traffic
dynamics are definitely affected by them. Hence, if you look at the
connection arrival/departure process from the tmix results (which you can
reconstruct from the connection start times and durations), you will notice
a ramp up _and_ a ramp down in the number of active connections.

I understand that in many scenarios one may not need to follow this
recommendation. However, I would instead recommend not stopping tmix at a
given time but stopping it only when all connections are done. I think
omitting the -d <time> switch does that automatically.

> > I'm Cc'ing this to TMRG in case someone on the list knows of a strong
> > reason to ignore the end of an experiment (or knows a way around the
> > SNMP problem).
>
> I certainly couldn't say that ignoring the last 1/3rd of an experiment
> is necessary in my experience. However, I have observed some behaviour
> with the FreeBSD TCP implementation which might also be relevant to
> other TCPs and pertinent to this discussion.
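(Editor's note: the ramp-up/ramp-down that Ritesh describes can be reconstructed exactly as he suggests, from connection start times and durations. A minimal sketch follows; the input data and function name are made up for illustration, not taken from the tmix log format.)

```python
# Sketch: count concurrently active connections over time from
# (start_time, duration) pairs, as one might extract from tmix logs.
# The sample data below is invented purely for illustration.

def active_connections(conns):
    """Return (time, count) samples of concurrently active connections."""
    events = []
    for start, duration in conns:
        events.append((start, +1))             # connection arrives
        events.append((start + duration, -1))  # connection departs
    events.sort()
    samples, count = [], 0
    for t, delta in events:
        count += delta
        samples.append((t, count))
    return samples

# Four hypothetical connections: the count ramps up to 3, then back to 0.
conns = [(0.0, 5.0), (1.0, 2.0), (2.0, 6.0), (9.0, 1.0)]
for t, n in active_connections(conns):
    print(t, n)
```

Plotting such samples for a real tmix run would show the ramp down near the end that motivates discarding the final portion of the experiment (or letting all connections drain).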
> Increasing the tx socket buffer size at the sender can lead to a
> situation where, at the end of the connection, the userland process
> closes the socket, but the kernel finds itself with a large buffer of
> data still needing to be sent. Going on memory here, I recall observing
> that sometimes (I haven't taken the time to narrow down when/why) some
> of the TCP variables can apparently get messed up, e.g. cwnd can take on
> some unexpected values while the buffer is being flushed. I can't be
> more specific than that right now, but I hope to sit down and nut it out
> at some point.
>
> Stepping back further, one might reasonably ask why you'd need to
> increase the tx socket buffer to a size where this problem is
> possible... I noticed by trial and error that when trying to use a
> non-real-time OS to do traffic generation, sometimes vagaries in kernel
> scheduling meant that you could end up with an empty tx socket buffer
> for periods of time during transmission if you didn't have the buffer
> sized substantially larger than the BDP of the path.
>
> I've worked around the issue by chopping the end off files after the
> time at which the traffic generation process (iperf in my case) closes
> the socket (normally a few seconds at most, depending on the test
> parameters). Definitely not ideal, I know, but it works around the issue
> and I thought it was a story from the coal face worth sharing.

Wow... that's a really interesting scenario. However, I always thought
that a close() on a connection would block until all the data for the
connection is sent. Maybe I am wrong... these overloaded syscalls are
generally weird in more than one way.

My experience with Linux's TCP implementation is the following:

1) I never tried pushing more than 1Gbps using a single connection on my
network. However, I didn't need to change the tx socket buffer parameters.
Linux TCP has socket buffer autotuning which automatically allocates more
memory to TCP buffers.
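(Editor's note: by default close() on a TCP socket does in fact return immediately, and the kernel flushes any unsent buffered data in the background, which is exactly how Lawrence's "large buffer still draining after close" situation arises. The SO_LINGER socket option changes this so close() blocks while the send buffer drains. A minimal sketch of the option, on an unconnected socket just to show the mechanics:)

```python
import socket
import struct

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Default behaviour: close() returns immediately; unsent data is flushed
# by the kernel in the background. This is why a traffic generator can
# exit while a large tx buffer is still draining onto the wire.

# SO_LINGER with l_onoff=1 makes close() block for up to l_linger seconds
# while the send buffer drains (on timeout the connection is reset).
linger = struct.pack("ii", 1, 5)  # l_onoff=1, l_linger=5 seconds
sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)

onoff, secs = struct.unpack(
    "ii", sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 8))
print(onoff, secs)
sock.close()
```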
When I tried to play with these parameters, it shut off autotuning, giving
me errors when I tried pushing a really large number of connections through
my testbed setup (~4000 simultaneous connections).

2) To make sure that we don't run into scheduling vagaries, we look at the
CPU usage in the tmix logs and check that we don't have high CPU usage for
a relatively long period of time. When running experiments, we make sure we
run with less traffic load than the tcvecs we ran the calibration with. It
is not uncommon to see different CPU usage statistics with different origin
traces at the same offered load; we hence run calibration experiments with
all our origin traces.

3) Linux TCP has a sysctl variable, net.ipv4.tcp_low_latency. It is
disabled by default, which enables "tcp prequeueing": incoming data from
the network interface is placed on a "prequeue" and processed in the
receiving process's context. I haven't tried/measured it, but maybe some
scheduling vagaries could be avoided by enabling low latency? I, however,
still prefer over-provisioning our testbed with more CPUs to figuring out
the detailed inner workings of complicated kernels like Linux :)

Warm regards,
Ritesh
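(Editor's note: the autotuning shut-off Ritesh mentions is per-socket behaviour on Linux: explicitly setting SO_SNDBUF/SO_RCVBUF disables autotuning for that socket, and Linux doubles the requested value to leave room for bookkeeping overhead, clamped by net.core.wmem_max. A minimal sketch:)

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# On Linux, explicitly setting SO_SNDBUF disables the kernel's send-buffer
# autotuning for this socket; the kernel doubles the requested value
# (to account for bookkeeping overhead) and clamps it to net.core.wmem_max.
# On other OSes the returned value may simply equal the request.
requested = 16384
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
print(effective)
sock.close()
```

Leaving the buffers untouched keeps autotuning active, which is usually the right call for bulk TCP traffic generation unless, as in Lawrence's case, scheduling gaps force oversizing.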