Re: [codec] comparative quality testing

Roman Shpount <> Thu, 14 April 2011 18:21 UTC

Date: Thu, 14 Apr 2011 14:21:18 -0400
From: Roman Shpount <>
To: Gregory Maxwell <>
Subject: Re: [codec] comparative quality testing

I think part of the confusion comes from the fact that there are two
purposes for the comparative testing. One is to validate that the codec
meets the WG requirements. The other is to show how the new codec compares
to the industry-dominant codecs. For me, the second goal is more important
than the first. If we care about the adoption of Opus, we should consider
making the comparative test results a deliverable for the working group. It
is very hard for a real company in the open market to justify doing
something like adopting a new codec without a compelling reason. Knowing
how this codec compares to other existing codecs is a big part of providing
such a reason. If we look at the tests from this point of view, we need to
see how Opus compares to G.729 and AMR in narrowband, and to AMR-WB and
G.722 in wideband. Since there are no existing deployments of a meaningful
size for UWB and FB (apart from closed proprietary systems, like Skype), we
can compare Opus with industry leaders, such as G.719.

One can argue that we should also compare Opus with patent-free codecs,
which adds iLBC and Speex to the list, but I personally see this as less of
a requirement. iLBC never managed to get market traction outside of the
open source world, and even there nobody bothered to write an even
moderately optimized version of it. Speex is known for audio quality
problems, so it would be an easy target to beat. On the other hand, beating
it would probably not be much of a milestone and would not tell anybody a
lot about Opus quality.

There were several tests that compared Opus with non-interactive codecs,
but once again this is not something that would affect choosing Opus over
other codecs, since non-interactive codecs are clearly inappropriate for
Opus's intended purposes.

We can argue about adding more codecs to the list, but I am not sure this
would make a difference. We only need to compare against a very few to give
everybody a clear idea of Opus quality. As far as defining the criteria for
the codec being acceptable for standardization, all we really need is
comparable quality (not worse than the other codecs by some defined
margin). This is not a competition where Opus needs to win every race to be
successful.

The whole reason I am interested in formal comparative testing of Opus is
that I am impressed by its quality. I think having well-documented test
results that were cross-checked by multiple people might make a critical
difference in Opus adoption and, as a result, in the success of this
working group.

No hats, just my two cents...
Roman Shpount

On Thu, Apr 14, 2011 at 10:25 AM, Gregory Maxwell <> wrote:

> Roni Even [] wrote:
> > I do not mind if the WG will decide to remove the quality claim and
> continue
> > with developing  a royalty free codec with "good enough" quality not
> saying
> > it is better than other codecs.
> > I just think that it should be clear from the charter and requirements
> what
> > is the purpose of the work.
> It's funny how we can argue and argue, only to later realize that it comes
> down to a simple mutual misunderstanding.
> I thought everyone was already on the same page with respect to the
> goals: it's good to be as good as possible, but the chartered purpose
> of the WG was only to do a "good quality" codec that was suited
> to the listed applications and deployments.
> As a developer I know that quality testing is important, and of course
> we've done a lot of it of various types.  I strongly believe in scientific
> testing, so of course my first instinct would have been to do it here,
> but perhaps the reality of the consensus process makes that less
> reasonable—as others have pointed out, most other WGs don't really
> do anything comparable to quality testing.
> Likewise, making sure the outcome is as legally unencumbered as I can
> is also very important to me, but because of the vagaries of the
> process and the law, this isn't something that the working group itself
> makes promises about.
> So, perhaps it makes sense for the working group to not make any quality
> promises in the same way it makes no promises about patents.
> It seems clear enough to me now that we can much more easily come to
> consensus about achieving good-enough status than about formal testing
> gates and requirements.
> We should accept your suggestion—drop all the comparative quality
> requirements from the requirements draft, and stop discussing comparative
> quality here—and then make some progress on technology, rather than
> continue bickering about details where we are not going to come to
> consensus.
> The market can figure out the comparative quality question on its own.