Re: [codec] draft test and processing plan for the IETF Codec

Anisse Taleb <anisse.taleb@huawei.com> Tue, 19 April 2011 03:38 UTC

Date: Tue, 19 Apr 2011 03:38:46 +0000
From: Anisse Taleb <anisse.taleb@huawei.com>
In-reply-to: <COL103-W39BD44E16F636F4DB06934D0900@phx.gbl>
To: Paul Coverdale <coverdale@sympatico.ca>, "codec@ietf.org" <codec@ietf.org>
Message-id: <F5AD4C2E5FBF304ABAE7394E9979AF7C26BC8E7C@LHREML503-MBX.china.huawei.com>
References: <COL103-W39BD44E16F636F4DB06934D0900@phx.gbl>
Subject: Re: [codec] draft test and processing plan for the IETF Codec

Paul,

Strictly speaking, the probability of failing at least one requirement increases (or, in fully dependent cases, stays constant) as the number of requirements grows. Of course, as you mention, the responses of the listeners are not really random, and a heads-or-tails model of pass/fail is far too simplistic. I didn't look at the analysis in depth, nor did I verify the numbers.
I think Greg is mostly concerned with the size of the test and how the analysis of the requirements would be used to derive a conclusion about the codec.
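
To make the first point concrete, here is a rough sketch of the simplistic coin-flip model only. It assumes every requirement is an independent pass/fail trial with the same pass probability, which is precisely the assumption questioned in this thread; the function names are illustrative and not taken from the test plan.

```python
def prob_at_least_one_failure(n_requirements: int, p_pass: float) -> float:
    """P(at least one failure), ASSUMING independent requirements
    that each pass with the same probability p_pass."""
    return 1.0 - p_pass ** n_requirements


def expected_failures(n_requirements: int, p_pass: float) -> float:
    """Expected number of failed requirements under the same assumption."""
    return n_requirements * (1.0 - p_pass)


if __name__ == "__main__":
    # 87 and 162 are the requirement counts quoted further down the thread;
    # 0.95 corresponds to a 95% confidence level per test.
    for n in (1, 10, 87, 162):
        print(n, prob_at_least_one_failure(n, 0.95), expected_failures(n, 0.95))
```

With positively correlated requirements the probability of at least one failure is lower than this model gives, so the sketch is a worst-case illustration rather than a description of the actual test.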



[Off topic]
Statistics are quite fun to play with. A while ago a conspiracy theorist tried to convince me that man never landed on the moon and that it was impossible for Apollo 11 to have made it: given the technology of the time, each of the millions of components had a significant probability of failure, so in total it was beyond doubt that such a rocket would have gone off course.

Kind regards,
/Anisse


From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of Paul Coverdale
Sent: Tuesday, April 19, 2011 4:21 AM
To: codec@ietf.org
Subject: Re: [codec] draft test and processing plan for the IETF Codec

As I mentioned earlier, the situation is not as bad as it may seem, certainly not a "1 in nonillion" chance of passing all requirements. Greg's analysis applies to flipping a coin that has a probability of .90 for "heads" and .10 for "tails." However, listener responses in a MOS test are not random - if they are, we throw that listener out of the results. The "randomness" that forms the basis of a statistical test derives from the distribution of responses ACROSS listeners rather than WITHIN listeners.

Regards,

...Paul

>-----Original Message-----
>From: Anisse Taleb [mailto:anisse.taleb@huawei.com]
>Sent: Monday, April 18, 2011 6:37 PM
>To: Jean-Marc Valin; Paul Coverdale
>Cc: codec@ietf.org
>Subject: RE: [codec] draft test and processing plan for the IETF Codec
>
>JM, Greg, Paul,
>[taking emails in chronological order was ill advised :-)]
>
>I do not disagree with the statistical pitfalls you mention. As Paul
>stated and also what I wrote in a direct reply to this, there is no
>single uber-requirement to be passed by the codec, rather a vector of
>requirements that summarize the performance of the codec compared to
>other codecs. These have to be analyzed and discussed one by one.
>
>Kind regards,
>/Anisse
>
>> -----Original Message-----
>> From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of Jean-Marc Valin
>> Sent: Thursday, April 14, 2011 3:07 PM
>> To: Paul Coverdale
>> Cc: codec@ietf.org
>> Subject: Re: [codec] draft test and processing plan for the IETF Codec
>>
>> > I don't think the situation is as dire as you make out. Your analysis
>> > assumes that all requirements are completely independent. This is not
>> > the case, in many cases if you meet one requirement you are likely to
>> > meet others of the same kind (eg performance as a function of bit rate).
>> >
>> > But in any case, the statistical analysis procedure outlined in the test
>> > plan doesn't assume that every requirement must be met with absolute
>> > certainty, it allows for a confidence interval.
>>
>> This is exactly what Greg is considering in his analysis. He's starting
>> from the assumption that the codec really meets *all* 162 requirements.
>> Consider just the NWT requirements: if we were truly no worse than the
>> reference codec, then with 87 tests against a 95% confidence interval, we
>> would be expected to fail about 4 tests just by random chance. Considering
>> both NWT and BT requirements, the odds of passing Anisse's proposed test
>> plan given the assumptions above are 4.1483e-33. See http://xkcd.com/882/
>> for a more rigorous analysis.
>>
>> Cheers,
>>
>>   Jean-Marc