Re: [bmwg] Mean vs Median

Paul Emmerich <emmericp@net.in.tum.de> Wed, 11 November 2015 09:29 UTC

To: bmwg@ietf.org
From: Paul Emmerich <emmericp@net.in.tum.de>
Message-ID: <56430A7B.20106@net.in.tum.de>
Date: Wed, 11 Nov 2015 10:29:31 +0100
Archived-At: <http://mailarchive.ietf.org/arch/msg/bmwg/RE4bg3Q1j7alW00ZzKl8d4D9NJo>

Hi,

On 10.11.15 08:15, Marius Georgescu wrote:
> For physical testing, the distribution tends to be normal (at least for my testing experiences). In this context the difference between mean and median would be negligible.
> Since in the virtual world (the future) bimodal is the new normal (if i may say so :), median seems like a better choice.
> I fully agree we should have an additional statistic, accounting for the variance somehow. I proposed the Margin of error. It can be the standard deviation, the standard error …

Normally distributed latencies are actually pretty rare when testing
devices that do any part of the processing in software. The device
doesn't even have to be virtualized. The main reason for this is batch
processing of packets, which drastically improves performance.

Simple batching leads to a uniform distribution. However, a typical
software router runs several algorithms that each try to optimize for
different things, which leads to really weird distributions.
For example, Linux uses polling in NAPI to reduce interrupts and
increase the batch size, but many drivers like Intel's ixgbe also
feature a second interrupt moderation algorithm that further reduces
interrupts. Both mechanisms interact and you get weird distributions.
You can read more about interrupt handling for network processing in
Linux in my paper about it at [1] if you are interested. (However, the
paper does not yet look at distributions; it only states a few
percentiles.)
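
To illustrate why simple batching alone gives a uniform distribution,
here is a rough sketch in Python. It is a toy model and not taken from
any of the papers: it assumes packets arrive at random times and are
only picked up at fixed batch boundaries, and it ignores processing
time, NAPI and driver interrupt moderation entirely. The function
names, the 100 us batch interval and the packet count are made up for
the example.

```python
import random
import statistics


def simulate_batch_latency(n_packets=100_000, batch_interval_us=100.0, seed=1):
    """Toy model: waiting time of packets that arrive at random times and
    are only picked up at fixed batch boundaries (assumed, not measured)."""
    rng = random.Random(seed)
    latencies = []
    for _ in range(n_packets):
        # Random arrival time in microseconds within a one-second window.
        arrival = rng.uniform(0.0, 1_000_000.0)
        # Time the packet waits until the next batch boundary.
        wait = batch_interval_us - (arrival % batch_interval_us)
        latencies.append(wait)
    return latencies


def histogram(values, n_bins, upper):
    """Count values into n_bins equally wide bins covering (0, upper]."""
    counts = [0] * n_bins
    for v in values:
        idx = min(int(v / upper * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


latencies = simulate_batch_latency()
counts = histogram(latencies, n_bins=10, upper=100.0)
mean = statistics.mean(latencies)
median = statistics.median(latencies)

print("bin counts:", counts)
print("mean: %.2f us, median: %.2f us" % (mean, median))
```

In this toy model all histogram bins come out roughly equal, and mean
and median both end up close to half the batch interval. That only
holds for the simple case; once a second moderation mechanism
interacts with the first one, the shape changes and the two statistics
no longer have to agree.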

I've uploaded an excerpt from one of my papers here [2] which shows the
latency of a Linux system running Open vSwitch without any
virtualization. An excerpt from the same paper with virtualization is
available at [3]; this looks more like a normal distribution with a long
tail. (The paper is currently under peer review for a journal and will
hopefully be published soon. I can share a pre-print privately with
anyone who is interested in the full paper.)


Paul

[1] http://www.net.in.tum.de/fileadmin/bibtex/publications/papers/SPECTS15NAPIoptimization.pdf
[2] https://www.dropbox.com/s/dl9tmslubyd2j7m/Screenshot%202015-11-11%2010.17.05.png?dl=1
[3] https://www.dropbox.com/s/tia2lwhlt08hl8l/Screenshot%202015-11-11%2010.24.57.png?dl=1