Re: [codec] A concrete proposal for requirements and testing

Ron <ron@debian.org> Fri, 08 April 2011 09:34 UTC

Date: Fri, 08 Apr 2011 19:06:12 +0930
From: Ron <ron@debian.org>
To: codec@ietf.org

On Thu, Apr 07, 2011 at 12:36:51PM -0600, Peter Saint-Andre wrote:
> On 4/7/11 12:13 PM, Paul Coverdale wrote:
> 
> > It doesn't matter what any single individual on this mailing list thinks
> > about Opus quality. What matters is the collective opinion of the population
> > of naïve users who may listen to speech/music over Opus in the future, and
> > how they rate it compared to listening to speech/music via other codecs (not
> > that they will know/care what a codec is...). A well-designed test plan and
> > statistical analysis can give a good estimate of this opinion.
> 
> I completely agree. That's why it is so important for us to publish
> draft-ietf-codec-opus as a Proposed Standard.
> 
> The IETF tradition is one of "rough consensus and running code". What
> this means is that we work hard on a new technology and publish it as an
> RFC ("request for comments"). This is the rough consensus part.
> 
> Then we implement it and deploy it. Testing is something that happens
> naturally when we run a technology on networks. This is the "running
> code" part.
> 
> Implementation and deployment experience can lead to revisions and
> refinements and better specification of the technology. Of course we'll
> also seek rough consensus on any modifications, but it's not necessary,
> or even desirable, to get that implementation and deployment experience
> before we publish a specification as a Proposed Standard RFC.
> 
> Indeed, much discussion has occurred of late within the IETF about
> simplifying the Internet Standards Process by making it easier, not
> harder, to advance Internet-Drafts to Proposed Standard:
> 
> https://datatracker.ietf.org/doc/draft-housley-two-maturity-levels/
> 
> Let's get to rough consensus and then start running this code on the
> network. That will be the true test.

+1

Peter puts into words exactly what went through my mind about the best
way to get a meaningful statistical analysis of what we have today, from
a real-world mix of naïve, golden-eared, and technically savvy users,
over a huge variety of use cases.

Of course I wasn't suggesting that we should delegate Stephen to go have
a listen, then hinge the remainder of this process on his impressions.
I was suggesting that *everyone* proposing tests should go have a listen,
because if at G.711 (mono) data rates, you can't ABX (stereo) Opus from
the reference (as many good listeners apparently can't on many samples),
then clearly many of the tests that people have been proposing aren't
actually very meaningful at all.  It might be nice to have a number that
shows how much better we are than the other codecs, to brag about over
dinner at p < 0.05 significance, but it's already self-evident that we've
by far exceeded the quality goals that we initially merely aspired to.
The sky is blue.  Opus is almost transparent.  Most people who aren't
blind or deaf seem to have rough consensus on these things already.
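
For anyone who does go have a listen, here is a rough sketch of how an
ABX session is usually scored against pure guessing.  It assumes each
trial is an independent coin flip under the null hypothesis; the function
name and the session numbers are hypothetical, not results from any test.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value for an ABX session.

    Returns the probability of getting at least `correct` answers right
    out of `trials` if the listener is only guessing (p = 0.5 per trial).
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count the outcomes with `correct` or more hits, out of 2**trials total.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical sessions, for illustration only.
print(abx_p_value(12, 16))  # listener who mostly picks correctly
print(abx_p_value(9, 16))   # listener who is close to guessing
```

With these made-up numbers, 12 right out of 16 comes in at roughly 0.038,
under the 0.05 line, while 9 out of 16 comes in at roughly 0.40, which is
indistinguishable from guessing.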

Audio quality isn't the leading problem we have to solve at this stage.
The biggest problem now is the set of problems we don't know about yet.
And the only way to find them is to give it to real users, and see what
they can break.  The developers can't break it anymore, and nobody here
has shown that they can either.  That says it's time to expand our circle
of benevolent wreckers to people with more imaginative (ab)uses than we
are able to contrive here in the lab from molecules of past experience.

Please don't stop testing.  But please don't stop everyone else while
you do.  There are many more people waiting to test this who are outside
this group than we currently have within it.  It's time to put them to
work for us :)  Implement your real use case and ship it.  Nothing else
will show us what bugs really do remain as efficiently as that will.

Cheers,
Ron