Re: [bmwg] Mean vs Median

Paul Emmerich <emmericp@net.in.tum.de> Thu, 12 November 2015 13:04 UTC

To: Stenio Fernandes <sflf@cin.ufpe.br>
Archived-At: <http://mailarchive.ietf.org/arch/msg/bmwg/Pc5u9ARZsAFKJA6O-j4pvgS9Fjc>
Cc: bmwg@ietf.org

Hi,

On 11.11.15 19:46, Stenio Fernandes wrote:
> The discussions so far have led me to conclude that in the context here,
> there is no need to make any assumptions on the sample set. Any specific
> measure of centrality or dispersion might not be precise enough in some
> cases. As stated by others, mean/median would work for well-behaved
> (e.g., normally distributed) data, but would not work for multi-modal or
> heavy-tailed ones. Recall that heavy-tailed distributions are usually
> characterized by the shape and location parameters instead of mean and
> variance. Regarding the number of samples, it is really tough to
> characterize heavy-tailed or multi-modal distributions with a few
> samples, even using advanced algorithms for maximum-likelihood estimation.

My suggestion would be to characterize the latency by reporting several
percentiles. I'd suggest the 10th, 20th, ..., 90th, and 99th (and maybe
the 99.9th, if we define the required number of samples as large enough
to capture long tails). This is still a small number of data points that
can easily be included in a test report, but it is effectively a
quantized CDF and therefore captures the whole distribution, including
its shape, reasonably well.
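
To make the idea concrete, here is a minimal sketch in Python of how such
a report could be produced. The function name, the nearest-rank method,
and the synthetic samples are all my own assumptions for illustration;
nothing here is prescribed by an existing draft, and other percentile
definitions (e.g. with interpolation) would work just as well.

```python
import math

# Percentile levels from the suggestion above. 99.9 is left out by
# default because it only makes sense with a large sample count.
LEVELS = [10, 20, 30, 40, 50, 60, 70, 80, 90, 99]


def latency_percentiles(samples, levels=LEVELS):
    """Return {level: latency} using the nearest-rank method.

    Hypothetical helper: nearest-rank is one of several common
    percentile definitions and is chosen here only for simplicity.
    """
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    n = len(ordered)
    result = {}
    for p in levels:
        # 1-based rank of the smallest sample that is >= p percent
        # of all samples.
        rank = max(1, math.ceil(p * n / 100))
        result[p] = ordered[rank - 1]
    return result


# Made-up latencies (think microseconds): the integers 1..1000.
samples = list(range(1, 1001))
report = latency_percentiles(samples)
for level in LEVELS:
    print(f"p{level}: {report[level]}")
```

Note that with this method and fewer than 1000 samples, the 99.9th
percentile is simply the maximum, which is why that level only becomes
useful once the required number of samples is defined to be large.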


Paul