[iccrg] Review of draft-ietf-iccrg-tcpval-01/ Issues on traffic traces
Mirja Kühlewind <mirja.kuehlewind@tik.ee.ethz.ch> Fri, 12 December 2014 16:45 UTC
Message-ID: <548B1B90.5050004@tik.ee.ethz.ch>
Date: Fri, 12 Dec 2014 17:45:04 +0100
From: Mirja Kühlewind <mirja.kuehlewind@tik.ee.ethz.ch>
To: iccrg@irtf.org, David Hayes <davihay@ifi.uio.no>, David Ros <dros@simula.no>
Archived-At: http://mailarchive.ietf.org/arch/msg/iccrg/DdjUeOOqN83RDz5S1Qp8WdKTM3Y
Subject: [iccrg] Review of draft-ietf-iccrg-tcpval-01/ Issues on traffic traces
Hi,

I reviewed draft-ietf-iccrg-tcpval-01. I understand that the goal of this document is to finish work that was started more than 5 years ago (and actually should have been finished back then as well). In general I think it would be very useful to have a document that describes initial test cases; however, I'm not sure how well this document serves to provide "quick and easy" initial results (for comparison).

Some background information: I recently performed a larger evaluation study of my own congestion control proposal for my thesis. I know the scenarios described are only meant to provide some initial evaluation and are surely not sufficient for the exhaustive evaluation needed for a thesis, but I actually failed/gave up on using any of the described scenarios at all. I have to say that I'm not using ns-2 for simulation but my own simulation library, which can fully include VMs in the simulation without any influence from the host system (and therefore provides reproducible results). However, no matter why I'm using a different tool, I believe the idea of the draft is to describe scenarios for exactly this use case. If the draft only ends up documenting what's implemented in ns-2, it should not be an RFC but part of the ns-2 documentation instead. Therefore my first review comment is that I would rather move all ns-2 (and Tmix) specific parts to the annex (or even remove them entirely).

The reasons why I ended up not using these scenarios are:

1) It was extremely hard (not quick and easy) to get any simulation running with the given traffic traces (and more than a little load) for more than a few seconds, mainly due to implementation limitations such as the maximum number of active flows. For some reason the real kernel in the VM did not release the socket right after the end of the transmission, so the number of active flows the kernel was counting was higher than the number of active flows in the simulation.
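(One likely explanation for the lingering sockets, assuming a Linux kernel in the VM, is TCP's TIME_WAIT state: the side that closes first keeps the port reserved for 2*MSL, 60 seconds by default on Linux. For experiment teardown only, a zero-timeout SO_LINGER makes close() send a RST and skip TIME_WAIT; this is a hedged sketch of a workaround, not something the draft suggests, and a RST close must not be used inside the measured experiment since it changes TCP's closing behavior.)

```python
import socket
import struct

# Sketch: force an abortive close so the kernel releases the socket
# immediately instead of holding it in TIME_WAIT for 2*MSL.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# struct linger { int l_onoff; int l_linger; }
# l_onoff=1, l_linger=0: close() sends RST, no TIME_WAIT.
s.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))

# Read the option back to verify it took effect.
onoff, linger = struct.unpack(
    "ii", s.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 8))
s.close()
```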
There are several solutions to this problem, e.g. digging into the kernel code or just using multiple kernels instead. However, I gave up, as it was simply not quick and easy, and because of reason 2).

2) The proposed scenarios based on the given traffic traces did not allow me any useful comparison (at least for the limited cases that I tried). Basically, what I'm saying is that there was no significant difference for any TCP extension under test, because I was not changing Slow Start while most of the short flows in the traffic traces never leave Slow Start. Further, with these rather complicated traffic patterns it is basically impossible to say whether the different results you see actually depend on the TCP extension in use, or whether they are just, e.g., minor timing differences that happen to have a bigger effect.

Again, I know the goal is to finish this old work, but my main concern regarding the usefulness of this document is the traffic generation. And I believe this is also the point where the original work got stuck back then, because it is not easy. Unfortunately I have to say that everything you added (with respect to draft-irtf-tmrg-tests-02) does not make the situation better, and mostly reads like 'trial-and-error' while being very specific to this one (out-dated) data set. I'll give more detailed comments further below, but I think all this text should at least be moved to the appendix.

Then, as just indicated, the data set is from 2006. If the main reason to use these traces is to achieve more realistic traffic patterns, that goal fails, as traffic patterns have changed strongly since then (e.g. a huge proportion of the traffic is now video). Further, I don't even agree with the point that "traffic must be reasonably realistic". If the goal is "to compare and contrast proposals against standard TCP", it is more useful to use very simple traffic models, such that observed effects can actually be related to specific behaviors of an algorithm.
If the goal is to check whether the extension is safe for experimentation in the wider Internet (which seems to be a not-spelled-out goal), then it would be most important to investigate corner/extreme cases which do not occur very often in the Internet but are so different that things could break. Of course, when you design a new extension, you also have to show that it works well in a "reasonably realistic" environment, but that cannot be the goal of this document. Further, the best way to do that is to run it on the Internet. In that case you can't analyze any specific effects, as you don't know the network conditions, but you can basically say: it works or it doesn't work...

Here are some more detailed comments on the traffic generation part; I will provide further comments on the rest of the document as soon as we have resolved the issues raised above...

- 1st paragraph of section 2.1: This section argues that start time and flow size (distributions) are not enough; instead traffic should be modeled by start time, request and response size, and think time (distributions). First of all, the paragraph is slightly hard to read, and it would be great if the model used could be spelled out more clearly (state it at the beginning and then give the reasoning). However, I disagree with this model. I don't think it is necessary to model a dependency between a request and a response for TCP evaluations. Of course this might be more realistic, but as long as we don't model application behavior or even user interaction, it doesn't gain us much and makes things more complicated. Instead, we should investigate scenarios where the load levels on the forward and backward channels can be set independently of each other. Further, modeling think times instead of inter-arrival times (IATs) only makes a difference for TCP evaluations when the 'waiting' is very small, so that TCP will not fall back to Slow Start in the first case.
What I do to cover both cases is that I always just model flow sizes and IATs, but have an option to specify whether the data generated by one generator should be sent on the same TCP connection or whether a new connection should be opened for each data flow. This can lead to the case, when the IAT is small and the available capacity is small as well, that all flows are sent back to back, appearing as one long flow. However, I think it is the responsibility of the researcher doing the experiment to verify whether this case has occurred.

- section 2.2: As I said, I don't think there should be Tmix-specific text in here.

- section 2.3.2: Non-stationarity is a big problem which makes all experimentation with this data complicated (and not quick and easy). It is not at all clear how much your applied hacks actually change the traffic characteristics of the trace. Therefore it is also not clear how realistic the traffic still is in the end. (I would rather just see the traffic characteristics of this trace, such as the flow size and IAT distributions, written down as input for an artificial traffic generator; I don't think this is less realistic.) But I also believe some of the problems with non-stationarity are specific to this trace. The trace seems to take only newly started flows into account, which does not reflect the actual traffic load of the measured system at the beginning of the trace. This might be different with a trace that has been measured differently. And I can only say it again, because I think this is really important: the trace is out-dated, and we should not rely entirely on this one trace in this document.

- section 2.2.3.1: The number 500e6 just seems random and is probably very specific to this one trace. The forward reference to section 4.4.2 is not understandable.

- section 2.4: This again sounds like guesswork; I would rather define values and actually apply them to an artificial traffic model.
- section 3 should really go in the appendix.

- section 3.1 says "it is important to test congestion control in overload". That is not wrong, but in fact, if there is sufficient data to send, TCP will always try to fill the link; I would not call this overload. If there is not enough data to send, congestion control does not even become active (except Slow Start). Therefore, if you re-design the congestion control behavior and would like to evaluate it against 'standard TCP', the only interesting cases are those where TCP is able to fill the link. However, even if the total load is e.g. only 85%, there will be phases during the simulation run where the link is full. Therefore there should never be a case where A>C (or offered load > 100%). That only leads to non-steady behavior, as you correctly say. In a real system it would lead to congestion collapse, where even TCP cannot help anymore. But in a real system there is usually a user behind the computer who will just give up.

- section 3.3: Without having tried to apply this, this part doesn't seem very useful. Again the values seem arbitrary, and in the end, if I understand you correctly, you save less than one RTT of simulation time. That wouldn't solve any problems for me. And again, as I said above, this is a problem of this trace and could have been avoided if the trace had been collected differently.

- I'll comment on the rest of the document later.

To conclude, I would like to propose removing the traffic traces from the document (or moving them to the appendix, although I believe this text should go into some ns-2 documentation instead). I know the intention is to finish the original work, but especially since the traces are out-dated, the document as it stands was not useful for me. Maybe this should be discussed with the original authors.
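(To illustrate the point above that even at, e.g., 85% average offered load the link is transiently full: with random arrivals, the per-interval offered load fluctuates around the mean and regularly exceeds capacity. This is a minimal sketch under assumed, illustrative parameters — 10 Mbit/s link, fixed 50 kB flows, Poisson arrivals — none of which come from the draft.)

```python
import random

random.seed(1)

CAPACITY = 10e6 / 8      # assumed 10 Mbit/s link, in bytes per second
LOAD = 0.85              # target mean offered load (fraction of capacity)
FLOW_SIZE = 50_000       # assumed fixed flow size in bytes
DURATION = 300           # seconds simulated

# Mean inter-arrival time that yields the target average load.
mean_iat = FLOW_SIZE / (LOAD * CAPACITY)

# Offered bytes per 1-second bin with Poisson (exponential-IAT) arrivals.
bins = [0.0] * DURATION
t = 0.0
while True:
    t += random.expovariate(1.0 / mean_iat)
    if t >= DURATION:
        break
    bins[int(t)] += FLOW_SIZE

# Count the seconds in which the offered load exceeded link capacity.
busy = sum(1 for b in bins if b > CAPACITY)
print(f"{busy} of {DURATION} one-second bins offered more than capacity")
```

Even though the mean load is well below 100%, a noticeable fraction of bins exceed capacity, which is exactly the phase where congestion control is exercised.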
Instead of the traffic traces, I would like to see simple scenarios with a certain (small) number of greedy flows and/or short-flow cross traffic (with given IAT and flow size distributions). I can provide the numbers I've used, or there are also scenarios described in draft-sarker-rmcat-eval-test-00 (which often additionally uses video traffic, making things more complicated than we need for this initial evaluation). If possible, it would anyway be useful to try to align the structure and/or terminology of these two drafts.

Sorry for the long email. I hope it is still helpful.

Mirja
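(For concreteness, the kind of simple short-flow cross-traffic scenario proposed above could be generated from just two distributions. This is a hedged sketch with assumed distributions and parameters — exponential IATs and Pareto flow sizes with illustrative defaults — not values taken from any draft.)

```python
import random

random.seed(42)

def cross_traffic(n_flows, mean_iat, pareto_shape=1.2, min_size=1000):
    """Return a list of (start_time, size_bytes) pairs: short-flow
    cross traffic with exponential inter-arrival times and
    Pareto-distributed flow sizes (both choices are assumptions)."""
    t = 0.0
    flows = []
    for _ in range(n_flows):
        t += random.expovariate(1.0 / mean_iat)           # next arrival
        size = int(min_size * random.paretovariate(pareto_shape))
        flows.append((t, size))
    return flows

# e.g. 1000 short flows, one every 50 ms on average
schedule = cross_traffic(1000, mean_iat=0.05)
```

Writing a scenario down this way (two distributions plus their parameters) is exactly the artificial-generator input I would prefer over replaying a 2006 trace: it is trivially reproducible in any tool, not just ns-2/Tmix.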