Re: [video-codec] Performances, measurement of same (Re: The Can Has Landed)

"Timothy B. Terriberry" <tterribe@xiph.org> Mon, 23 March 2015 14:13 UTC

Message-ID: <55101F6D.8020905@xiph.org>
Date: Mon, 23 Mar 2015 07:13:01 -0700
From: "Timothy B. Terriberry" <tterribe@xiph.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:29.0) Gecko/20100101 SeaMonkey/2.26
MIME-Version: 1.0
To: video-codec@ietf.org
References: <CACrD=+_D+psUeWevMuwp0bnxqdcJpo3Zo3Og4E6kkGH1uuzxdA@mail.gmail.com> <CAMzhQmNymkEMgbw-gUGhKEgCYh1yXo8MRkP-8FNfQm8tVNhzbA@mail.gmail.com> <550B6CCC.70706@mozilla.com> <55100BD0.8030301@alvestrand.no>
In-Reply-To: <55100BD0.8030301@alvestrand.no>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/video-codec/avVeR7cOeC55XkAZdNqIB9rYWEQ>
Subject: Re: [video-codec] Performances, measurement of same (Re: The Can Has Landed)

Harald Alvestrand wrote:
> Just as a matter of curiosity, which state of the art performance is it
> that you think VP9 hasn't achieved yet?

I think if we were trying to develop a competitor to HEVC, we could say 
"Ship VP9" and be done. In fact, that is exactly what Firefox did. But I 
am (personally) worried about what comes after HEVC, and I think we have 
an opportunity to get ahead of the curve here like we did with Opus.

> How should we treat that aspect in evaluation?

I have been thinking about this a lot since Mo asked the question a few 
weeks ago, and I am having a hard time coming up with an answer that is 
not "use the normal IETF consensus process".

We've certainly tried to focus on practical algorithms in Daala. 
However, sometimes it is nice to know what the theoretical best we could 
do actually is, so we know how well some of our cheap hacks are performing.

To use a recent example: until the last few weeks, the way we had been
implementing our lapped transforms meant you needed to know the block
sizes of a block's neighbors to know what lapping size to apply. That
meant we couldn't do an exhaustive search of block sizes without making
gross simplifications. We restructured things to remove this
dependency, even though it meant the filtering was theoretically worse. 
That allowed us to write an exhaustive search, which turned out to be 
between 3% and 11% better than the heuristics we had been using to pick 
block sizes. Until we did this, we didn't have a good idea what kind of 
gains were possible, nor how to change our heuristics to produce better 
results. Now we have something we can look at to tell us both.
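
A toy sketch of that comparison follows. It is not Daala code: the cost
function, the threshold heuristic, and every name below are made up for
illustration. The one property it shares with the restructured design is
that a block's cost does not depend on its neighbors' sizes, which is
what lets the recursive search cover every partition in the split tree.

```python
# Toy model only: block_cost, the threshold, and the signal are hypothetical.

def block_cost(samples, lam=4.0):
    """Made-up rate-distortion cost of coding `samples` as a single block."""
    mean = sum(samples) / len(samples)
    distortion = sum((s - mean) ** 2 for s in samples)  # error of a flat approximation
    rate = 1.0  # pretend every block costs one unit of side information
    return distortion + lam * rate


def exhaustive_split(samples, min_size=4):
    """Return (cost, block_sizes) for the cheapest partition in the binary split tree."""
    whole = (block_cost(samples), [len(samples)])
    if len(samples) <= min_size:
        return whole
    half = len(samples) // 2
    left_cost, left_sizes = exhaustive_split(samples[:half], min_size)
    right_cost, right_sizes = exhaustive_split(samples[half:], min_size)
    split = (left_cost + right_cost, left_sizes + right_sizes)
    # Costs are independent of neighbors, so the two halves can be solved separately.
    return min(whole, split, key=lambda t: t[0])


def heuristic_split(samples, min_size=4, threshold=50.0):
    """Split whenever the block's energy around its mean exceeds a fixed threshold."""
    mean = sum(samples) / len(samples)
    energy = sum((s - mean) ** 2 for s in samples)
    if len(samples) <= min_size or energy <= threshold:
        return block_cost(samples), [len(samples)]
    half = len(samples) // 2
    left_cost, left_sizes = heuristic_split(samples[:half], min_size, threshold)
    right_cost, right_sizes = heuristic_split(samples[half:], min_size, threshold)
    return left_cost + right_cost, left_sizes + right_sizes


signal = [10] * 8 + [10, 12, 10, 12] + [50] * 4
best = exhaustive_split(signal)
guess = heuristic_split(signal)
gap = guess[0] - best[0]  # how much the heuristic leaves on the table (never negative)
```

In this toy the heuristic can only pick partitions the exhaustive search
also considers, so `gap` bounds how much a better heuristic could gain.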

So having both the heuristic and the exhaustive search is the ideal
situation, and sometimes asking for it is reasonable. Other times it may
be more work than people are willing to do up front just to decide
whether something is a good idea.

But I will also give a second example that is even more recent. One
issue Thomas Davies from Cisco pointed out to us was a noise buildup
along block boundaries in one of their test clips at low rates,
something we had not seen in our own testing. We were able to track down 
the problem to certain rounding operations in the pre/post filters and
in converting from the 12-bit data we use in our transforms down to the
8-bit data we store in our reference frames. Storing 12-bit reference
frames makes the problem go away, but increases the memory bandwidth 
required by the codec by 50%. But we've observed gains between 0.2 dB 
and (on specially constructed examples) 2 dB by doing so. Is that a cost 
people are willing to pay? I don't know the answer to that question, but
I do know that the easiest way to measure that cost, x86 cycle counts,
is also probably the least informative. It also matters what other
solutions there are (we have at least one more, with a different set of 
unappealing costs), or how hard we want to look for them.
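
To make the trade-off concrete, here is a toy sketch. It is not Daala's
filter or its actual rounding, and all of the names are hypothetical. It
only shows how rounding a 12-bit value down to an 8-bit reference every
frame can discard small updates that a 12-bit reference keeps, and where
the 50% figure comes from if samples are stored at exactly 12 bits
instead of 8.

```python
# Toy model only: these helpers are illustrative, not the codec's code path.

def store_8bit(v12):
    """Round a 12-bit sample to 8 bits (round half up), then expand back to the 12-bit scale."""
    v8 = min(255, (v12 + 8) >> 4)  # 16 twelve-bit steps per eight-bit step
    return v8 << 4


def store_12bit(v12):
    """Keep the full 12-bit sample in the reference frame."""
    return v12


def run(frames, step, store, start=2048):
    """Add `step` (in 12-bit units) each frame, passing the result through `store`."""
    ref = start
    for _ in range(frames):
        ref = store(ref + step)
    return ref


# A per-frame change of 3 is below half an 8-bit step (8), so it is rounded away every time.
kept = run(10, 3, store_12bit)  # the 12-bit reference follows the change
lost = run(10, 3, store_8bit)   # the 8-bit reference never moves from its start value

# Memory traffic per reference sample, assuming samples are packed at exactly 12 bits.
bandwidth_ratio = 12 / 8
```

The difference `kept - lost` grows with the number of frames in this
toy, while `bandwidth_ratio` stays fixed at 1.5.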

I have a really hard time seeing how to design an up-front process that
answers questions like this. Maybe others here have better ideas than I do.