Re: [bmwg] Mean vs Median

Stenio Fernandes <sflf@cin.ufpe.br> Thu, 05 November 2015 15:24 UTC

MIME-Version: 1.0
Sender: steniofernandes@gmail.com
In-Reply-To: <001001d11655$df8f3270$9ead9750$@gmail.com>
From: Stenio Fernandes <sflf@cin.ufpe.br>
Date: Thu, 05 Nov 2015 13:21:13 -0200
Message-ID: <CAPrseCoj_+pRbQtntsKtuRfQiepV-xaC0XvTVxNK+qQA3zWx-w@mail.gmail.com>
To: Marius Georgescu <marius.georgescul@gmail.com>
Content-Type: multipart/alternative; boundary="001a1143f932c929c70523ccaf61"
Archived-At: <http://mailarchive.ietf.org/arch/msg/bmwg/myADJHjkvfG2fDWBxmCTmwQvlWw>
Cc: k.pentikousis@eict.de, bmwg@ietf.org
Subject: Re: [bmwg] Mean vs Median
Precedence: list

it seems we're converging here.

my understanding now is that if the recommendation is too broad as i
suggested, it might not be very useful, since people might not know
what/how to do exactly. but if it is too specific, lots of assumptions
about the data have to be made without really knowing the actual data.

a compromise may be to encourage collecting enough data samples (n>30,
preferably many more) and evaluating measures of centrality, dispersion, and
error, so that the recommendation is clear enough, but neither too generic
nor too specific.
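
A minimal sketch of what such a recommendation could look like in practice, using only the Python standard library. The function name `summarize` and the default z = 1.96 (a roughly 95% interval under a normal approximation, which is only reasonable for n > 30 and roughly independent iterations) are assumptions made for illustration; they do not come from the draft.

```python
import math
import statistics


def summarize(samples, z=1.96):
    """Summarize the results of repeated test iterations.

    z=1.96 is an assumed normal-approximation quantile (~95% interval).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    mean = statistics.fmean(samples)      # centrality
    median = statistics.median(samples)   # centrality, robust to outliers
    sd = statistics.stdev(samples)        # dispersion (sample sd, n - 1)
    sem = sd / math.sqrt(n)               # standard error of the mean
    moe = z * sem                         # margin of error around the mean

    return {
        "n": n,
        "mean": mean,
        "median": median,
        "sd": sd,
        "sem": sem,
        "moe": moe,
        "ci": (mean - moe, mean + moe),
    }
```

Reporting the mean and the median together with the sd and the interval lets a reader see at a glance whether the two measures of centrality disagree, which is usually a hint that the data is skewed or contains outliers.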

jain's book is a classic one, but a bit old on the examples. le boudec's
book is kind of new and specific to computer networks, although he has a
heavy hand on math.

cheers,

stenio


On Tue, Nov 3, 2015 at 2:36 PM, Marius Georgescu <
marius.georgescul@gmail.com> wrote:

> Hello Stenio,
>
>
>
> Thanks for your comments. Please see my comments inline.
>
>
>
> *From:* steniofernandes@gmail.com [mailto:steniofernandes@gmail.com] *On
> Behalf Of *Stenio Fernandes
> *Sent:* Tuesday, November 3, 2015 5:45 PM
> *To:* GEORGESCU LIVIU MARIUS <liviumarius-g@is.naist.jp>
> *Cc:* bmwg@ietf.org; k.pentikousis@eict.de
> *Subject:* Re: [bmwg] Mean vs Median
>
>
>
> my two cents on this... see inline comments
>
>
>
> this is my first interaction with the wg... so, bear with me if i'm a bit
> wordy :-)
>
>
>
> stenio
>
>
>
>
>
> Thanks for joining the discussion.
>
>
>
> On Tue, Nov 3, 2015 at 3:57 AM, GEORGESCU LIVIU MARIUS <
> liviumarius-g@is.naist.jp> wrote:
>
> Hello BMWG,
>
>
>
> Following some of the discussion we had in IETF93 about using either mean
> or median as a summarizing function for the results of multiple test
> iterations, I added the following section in
> http://tools.ietf.org/html/draft-ietf-bmwg-ipv6-tran-tech-benchmarking-00
>
> 10. Summarizing function and repeatability
>
>
>
>
>
>    To ensure the stability of the benchmarking scores obtained using
>    the tests presented in Sections 6-9, multiple test iterations are
>    recommended. Following the recommendations of RFC2544, the average
>    was chosen to be the summarizing function for the reported values.
>    While median can be an alternative summarizing function, a rationale
>    for using one or the other is needed.
>
>
>
>
>
> average is a colloquial term. although, in this context, there might be
> nothing wrong with that, imho precise terms are preferred. measures of
> central tendency could be used as the general term, where mean and median
> fit in.
>
>
>
> Average seems to be a term accepted and used by industry, and not just a
> “colloquial” term. We could quibble about definitions and what we need to
> follow just as well. I prefer to go with RFC2544.
>
>
>
>    The median can be useful for summarizing especially when outliers
>    are not a desired quantity. However, in the overall performance of a
>    network device the outliers can represent a malfunction or
>    misconfiguration in the DUT, which should be taken into account.
>    The average is a more inclusive summarizing function. Moreover, as
>    underlined in [DeNijs], the average is less exposed to statistical
>    uncertainty. These reasons make it the RECOMMENDED summarizing
>    function for the results of different test iterations, unless stated
>    otherwise.
>
>
>
> i'm having a hard time understanding this paragraph... i) "inclusive" is
> very vague; ii) "less exposed to uncertainty" is confusing. the mean is
> just a measure of centrality, whereas measures of dispersion (e.g., sd,
> variance) can be used to assess the degree of uncertainty around that
> measure (think of confidence intervals).
>
> i know this is not the objective of the document, but the recommendation
> could be very simple. for example, stating that *one should assess
> statistical significance of results *would be enough, and then pointing
> out to a strong reference. le boudec's book seems appropriate here (cf.
> chap 2).
>
>
>
> Just assessing statistical significance would be great. I would recommend
> the study of this book: “Jain, Raj. *The art of computer systems
> performance analysis*. John Wiley & Sons, 2008.”
>
> I wonder if text like this in an RFC would see the print.
>
> I am sure this would be perfect for certain types of academic papers. I
> would rather have a clearer recommendation.
>
>
>
>    To express the repeatability of the benchmarking tests through a
>    number, the Margin of error (MoE) can be used. Of course, other
>    functions, such as standard error could be employed as well. The
>    advantage the MoE has is expressing an associated confidence
>    interval by using the alpha parameter.
>
>    The recommended formula for calculating the MoE is presented in
>    Section 6.3.1.
>
>
>
> if the document will not give detailed approaches for summarizing
> performance data (and i think it shouldn't), it should provide the simplest
> recommendation possible. otherwise, in order to be scientifically
> correct, lots of assumptions must be made and provided, like iid, which
> might not hold true for all cases.
>
>
>
> In order to be scientifically correct, the most important part would be
> to assess the probability distribution of the data. I am trying to find a
> solution where that wouldn’t be necessary.  Of course it would not hold for
> all cases, but the goal is to find a pareto-optimal solution where the
> summarized result would be representative enough for the test sample and
> simple enough to obtain. Or at least that’s how I see things.
>
>
>
> Marius
>
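
To make the mean-versus-median point from the quoted draft text concrete, here is a small illustrative sketch. The latency values are invented for the example and are not measurements from any DUT; the helper name `centrality` is likewise hypothetical.

```python
import statistics

# Hypothetical latency results (ms) from repeated test iterations.
# These numbers are made up purely for illustration.
clean = [1.0, 1.1, 0.9, 1.0, 1.2, 1.0, 0.9, 1.1]

# Same runs plus one iteration affected by a hypothetical DUT glitch.
with_outlier = clean + [25.0]


def centrality(samples):
    """Return (mean, median) for a list of iteration results."""
    return statistics.fmean(samples), statistics.median(samples)


if __name__ == "__main__":
    for label, data in (("clean", clean), ("with outlier", with_outlier)):
        mean, median = centrality(data)
        print(f"{label:>12}: mean={mean:.3f} median={median:.3f}")
```

With these made-up values the median stays at 1.0 in both cases, while the mean moves from about 1.03 to about 3.69. Whether that sensitivity is a drawback (the summary is distorted by one bad run) or a benefit (the misbehaviour of the DUT shows up in the reported number) is exactly the question debated in this thread.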



-- 
Prof. Stenio Fernandes
CIn/UFPE
http://www.steniofernandes.com
