Re: [bmwg] Mean vs Median

Stenio Fernandes <sflf@cin.ufpe.br> Thu, 05 November 2015 15:24 UTC

From: Stenio Fernandes <sflf@cin.ufpe.br>
Date: Thu, 05 Nov 2015 13:21:13 -0200
To: Marius Georgescu <marius.georgescul@gmail.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/bmwg/myADJHjkvfG2fDWBxmCTmwQvlWw>
Cc: k.pentikousis@eict.de, bmwg@ietf.org
Subject: Re: [bmwg] Mean vs Median

it seems we're converging here.

my understanding now is that if the recommendation is too broad as i
suggested, it might not be very useful, since people might not know
what/how to do exactly. but if it is too specific, lots of assumptions
about the data have to be made without really knowing the actual data.

a compromise may be in encouraging getting enough data samples (n>30,
preferably many more), evaluating measures of centrality, dispersion, and
error, so that the recommendation is clear enough, but neither too generic
nor too specific.
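
to make that compromise concrete, here is a minimal sketch of what "centrality, dispersion, and error" could look like for one set of test iterations. the function name `summarize` and the returned field names are made up for illustration. the margin of error uses the textbook normal approximation (z * s / sqrt(n)), which assumes roughly iid samples and a reasonably large n; it is not necessarily the formula in Section 6.3.1 of the draft.

```python
import math
import statistics
from statistics import NormalDist


def summarize(samples, confidence=0.95):
    """Summarize repeated benchmark results (hypothetical helper).

    Returns measures of centrality (mean, median), dispersion (sample
    standard deviation) and error (standard error, margin of error).
    The margin of error relies on a normal approximation, an assumption
    that may not hold for small or strongly skewed samples.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate dispersion")

    mean = statistics.mean(samples)
    median = statistics.median(samples)
    stdev = statistics.stdev(samples)      # sample standard deviation (n - 1)
    std_error = stdev / math.sqrt(n)       # standard error of the mean

    # Two-sided critical value for the requested confidence level,
    # e.g. about 1.96 for 95% (alpha = 0.05).
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)

    return {
        "n": n,
        "mean": mean,
        "median": median,
        "stdev": stdev,
        "std_error": std_error,
        "margin_of_error": z * std_error,
    }
```

reporting both mean and median side by side also makes outliers visible: if the two differ a lot, the data are skewed and deserve a closer look before picking a single summarizing function.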

jain's book is a classic, though its examples are a bit dated. le boudec's
book is newer and specific to computer networks, although it is heavy on
the math.

cheers,

stenio


On Tue, Nov 3, 2015 at 2:36 PM, Marius Georgescu <
marius.georgescul@gmail.com> wrote:

> Hello Stenio,
>
>
>
> Thanks for your comments. Please see my comments inline.
>
>
>
> *From:* steniofernandes@gmail.com [mailto:steniofernandes@gmail.com] *On
> Behalf Of *Stenio Fernandes
> *Sent:* Tuesday, November 3, 2015 5:45 PM
> *To:* GEORGESCU LIVIU MARIUS <liviumarius-g@is.naist.jp>
> *Cc:* bmwg@ietf.org; k.pentikousis@eict.de
> *Subject:* Re: [bmwg] Mean vs Median
>
>
>
> my two cents on this... see inline comments
>
>
>
> this is my first interaction with the wg... so, bear with me if i'm a bit
> wordy :-)
>
>
>
> stenio
>
>
>
>
>
> Thanks for joining the discussion.
>
>
>
> On Tue, Nov 3, 2015 at 3:57 AM, GEORGESCU LIVIU MARIUS
> <liviumarius-g@is.naist.jp> wrote:
>
> Hello BMWG,
>
>
>
> Following some of the discussion we had in IETF93 about using either mean
> or median as a summarizing function for the results of multiple test
> iterations, I added the following section in
> http://tools.ietf.org/html/draft-ietf-bmwg-ipv6-tran-tech-benchmarking-00
>
> 10. Summarizing function and repeatability
>
>
>
>
>
>    To ensure the stability of the benchmarking scores obtained using
>    the tests presented in Sections 6-9, multiple test iterations are
>    recommended. Following the recommendations of RFC2544, the average
>    was chosen to be the summarizing function for the reported values.
>    While median can be an alternative summarizing function, a rationale
>    for using one or the other is needed.
>
>
>
>
>
> average is a colloquial term. although, in this context, there might be
> nothing wrong with that, imho precise terms are preferred. measures of
> central tendency could be used as the general term, where mean and median
> fit in.
>
>
>
> Average seems to be a term accepted and used by industry, and not just a
> “colloquial” term. We could quibble about definitions and what we need to
> follow just as well. I prefer to go with RFC2544.
>
>
>
>    The median can be useful for summarizing especially when outliers
>    are not a desired quantity. However, in the overall performance of a
>    network device the outliers can represent a malfunction or
>    misconfiguration in the DUT, which should be taken into account.
>    The average is a more inclusive summarizing function. Moreover, as
>    underlined in [DeNijs], the average is less exposed to statistical
>    uncertainty. These reasons make it the RECOMMENDED summarizing
>    function for the results of different test iterations, unless stated
>    otherwise.
>
>
>
> i'm having a hard time understanding this paragraph... i) "inclusive" is
> very vague; ii) "less exposed to uncertainty" is confusing. the mean is
> just a measure of centrality, whereas measures of dispersion (e.g., sd,
> variance) can be used to assess the degree of uncertainty around that
> measure (think of confidence intervals).
>
> i know this is not the objective of the document, but the recommendation
> could be very simple. for example, stating that *one should assess
> statistical significance of results* would be enough, and then pointing
> to a strong reference. le boudec's book seems appropriate here (cf.
> chap 2).
>
>
>
> Just assessing statistical significance would be great. I would recommend
> the study of this book: “Jain, Raj. *The art of computer systems
> performance analysis*. John Wiley & Sons, 2008.”
>
> I wonder if text like this in an RFC would see the print.
>
> I am sure this would be perfect for certain types of academic papers. I
> would rather have a clearer recommendation.
>
>
>
>    To express the repeatability of the benchmarking tests through a
>    number, the Margin of error (MoE) can be used. Of course, other
>    functions, such as standard error could be employed as well. The
>    advantage the MoE has is expressing an associated confidence
>    interval by using the alpha parameter.
>
>    The recommended formula for calculating the MoE is presented in
>    Section 6.3.1.
>
>
>
> if the document will not give detailed approaches for summarizing
> performance data (and i think it shouldn't), it should provide the simplest
> recommendation possible. otherwise, in order to be scientifically
> correct, lots of assumptions must be made and stated, like iid, which
> might not hold true for all cases.
>
>
>
> In order to be scientifically correct, the most important part would be
> to assess the probability distribution of the data. I am trying to find a
> solution where that wouldn’t be necessary. Of course it would not hold for
> all cases, but the goal is to find a pareto-optimal solution where the
> summarized result would be representative enough for the test sample and
> simple enough to obtain. Or at least that’s how I see things.
>
>
>
> Marius
>



-- 
Prof. Stenio Fernandes
CIn/UFPE
http://www.steniofernandes.com