Re: [bmwg] Mean vs Median
"GEORGESCU LIVIU MARIUS" <liviumarius-g@is.naist.jp> Tue, 03 November 2015 16:41 UTC
From: GEORGESCU LIVIU MARIUS <liviumarius-g@is.naist.jp>
To: Stenio Fernandes <sflf@cin.ufpe.br>
Message-ID: <6a40a06c97e4.56396260@naist.jp>
Date: Wed, 04 Nov 2015 01:41:52 +0900
In-Reply-To: <6a409189b1be.5638e3c4@naist.jp>
Archived-At: <http://mailarchive.ietf.org/arch/msg/bmwg/yTBC5fccXJp54HFuewRxS91LouI>
Cc: k.pentikousis@eict.de, bmwg@ietf.org
Subject: Re: [bmwg] Mean vs Median

Hello Stenio,

Thanks for your comments. Please see my comments inline.

From: steniofernandes@gmail.com [mailto:steniofernandes@gmail.com] On Behalf Of Stenio Fernandes
Sent: Tuesday, November 3, 2015 5:45 PM
To: GEORGESCU LIVIU MARIUS <liviumarius-g@is.naist.jp>
Cc: bmwg@ietf.org; k.pentikousis@eict.de
Subject: Re: [bmwg] Mean vs Median

> my two cents on this... see inline comments
>
> this is my first interaction with the wg... so, bear with me if i'm a bit wordy :-)
>
> stenio

Thanks for joining the discussion.

> On Tue, Nov 3, 2015 at 3:57 AM, GEORGESCU LIVIU MARIUS <liviumarius-g@is.naist.jp> wrote:
>
> > Hello BMWG,
> >
> > Following some of the discussion we had in IETF93 about using either
> > mean or median as a summarizing function for the results of multiple
> > test iterations, I added the following section in
> > http://tools.ietf.org/html/draft-ietf-bmwg-ipv6-tran-tech-benchmarking-00 .
> >
> > 10. Summarizing function and repeatability
> >
> >    To ensure the stability of the benchmarking scores obtained using
> >    the tests presented in Sections 6-9, multiple test iterations are
> >    recommended. Following the recommendations of RFC2544, the average
> >    was chosen to be the summarizing function for the reported values.
> >    While median can be an alternative summarizing function, a rationale
> >    for using one or the other is needed.
>
> average is a colloquial term. although, in this context, there might be
> nothing wrong with that, imho precise terms are preferred. measures of
> central tendency could be used as the general term, where mean and
> median fit in.

Average seems to be a term accepted and used by industry, and not just a “colloquial” term. We could quibble about definitions and what we need to follow just as well. I prefer to go with RFC2544.

> >    The median can be useful for summarizing especially when outliers
> >    are not a desired quantity. However, in the overall performance of a
> >    network device the outliers can represent a malfunction or
> >    misconfiguration in the DUT, which should be taken into account.
> >    The average is a more inclusive summarizing function. Moreover, as
> >    underlined in [DeNijs], the average is less exposed to statistical
> >    uncertainty. These reasons make it the RECOMMENDED summarizing
> >    function for the results of different test iterations, unless stated
> >    otherwise.
>
> i'm having a hard time understanding this paragraph... i) "inclusive" is
> very vague; ii) "less exposed to uncertainty" is confusing. the mean is
> just a measure of centrality, whereas measures of dispersion (e.g., sd,
> variance) can be used to assess the degree of uncertainty around that
> measure (think of confidence intervals).
>
> i know this is not the objective of the document, but the recommendation
> could be very simple. for example, stating that one should assess
> statistical significance of results would be enough, and then pointing
> to a strong reference. le boudec's book seems appropriate here (cf.
> chap 2).

Just assessing statistical significance would be great. I would recommend the study of this book: “Jain, Raj. The art of computer systems performance analysis. John Wiley & Sons, 2008.”

I wonder if text like this in an RFC would see the print.

I am sure this would be perfect for certain types of academic papers. I would rather have a clearer recommendation.
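
For readers following the outlier argument above, a minimal sketch of how the two summarizing functions react to a single disturbed iteration. The sample values and the summarize() helper are made up for illustration and come from neither the draft nor the thread.

```python
import statistics


def summarize(samples):
    """Return the mean, median and sample standard deviation of a test series."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),
    }


# Hypothetical latency results (microseconds) from 10 test iterations.
# The last value stands in for one iteration disturbed by a DUT problem.
samples = [100, 101, 99, 100, 102, 98, 100, 101, 99, 500]

result = summarize(samples)
# The mean is pulled toward the outlier, while the median stays with the
# bulk of the iterations. The standard deviation shows how widely the
# iterations are spread around the mean.
print(result)
```

In this made-up series the mean reflects the disturbed iteration and the median hides it, which is the trade-off discussed above. Reporting a measure of dispersion next to either one makes the outlier visible regardless of which summary is chosen.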

> >    To express the repeatability of the benchmarking tests through a
> >    number, the Margin of error (MoE) can be used. Of course, other
> >    functions, such as standard error could be employed as well. The
> >    advantage the MoE has is expressing an associated confidence
> >    interval by using the alpha parameter.
> >
> >    The recommended formula for calculating the MoE is presented in
> >    Section 6.3.1.
>
> if the document will not give detailed approaches for summarizing
> performance data (and i think it shouldn't), it should provide the
> simplest recommendation possible. otherwise, in order to be
> scientifically correct, lots of assumptions must be made and provided,
> like iid, which might not hold true for all cases.

In order to be scientifically correct, the most important part would be to assess the probability distribution of the data. I am trying to find a solution where that wouldn’t be necessary. Of course it would not hold for all cases, but the goal is to find a pareto-optimal solution where the summarized result would be representative enough for the test sample and simple enough to obtain. Or at least that’s how I see things.

Marius
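
As a rough illustration of the margin of error mentioned in the quoted draft text, here is a sketch based on the textbook normal-approximation form, MoE = z * s / sqrt(n). This is an assumption: the formula in Section 6.3.1 of the draft is not reproduced in this message and may differ. The margin_of_error() helper and the sample values are hypothetical, and the calculation relies on the iid and approximate-normality assumptions raised in the thread.

```python
import math
import statistics


def margin_of_error(samples, alpha=0.05):
    """Normal-approximation margin of error for the mean of `samples`.

    Assumes iid, roughly normal results. The confidence level is
    1 - alpha, so alpha=0.05 corresponds to a 95% interval.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two test iterations are required")
    # Two-sided critical value of the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the mean, from the sample standard deviation.
    std_err = statistics.stdev(samples) / math.sqrt(n)
    return z * std_err


# Hypothetical throughput results (frames per second) from 5 iterations.
samples = [100, 102, 98, 101, 99]
mean = statistics.mean(samples)
moe = margin_of_error(samples, alpha=0.05)
print(f"{mean} +/- {moe:.2f} (alpha = 0.05)")
```

With only a handful of iterations, a Student's t critical value would usually be more appropriate than the normal one used here. The normal value keeps the sketch within the Python standard library.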