Re: [rtcweb] Comments on H.264 and VP8 performance comparisons

Bo Burman <bo.burman@ericsson.com> Tue, 22 October 2013 16:26 UTC

From: Bo Burman <bo.burman@ericsson.com>
To: Harald Alvestrand <harald@alvestrand.no>, "rtcweb@ietf.org" <rtcweb@ietf.org>
Thread-Topic: [rtcweb] Comments on H.264 and VP8 performance comparisons
Thread-Index: Ac7JIe76OhlqjGwCQFaenvACTkQ0VwF2Vc2AAA12J5D//+fngP//xXVg
Date: Tue, 22 Oct 2013 16:26:37 +0000
Message-ID: <BBE9739C2C302046BD34B42713A1E2A22DFC951C@ESESSMB105.ericsson.se>
References: <BBE9739C2C302046BD34B42713A1E2A22DF9F8D2@ESESSMB105.ericsson.se> <52664A39.5040105@alvestrand.no> <BBE9739C2C302046BD34B42713A1E2A22DFC7383@ESESSMB105.ericsson.se> <52669059.4080700@alvestrand.no>
In-Reply-To: <52669059.4080700@alvestrand.no>
Subject: Re: [rtcweb] Comments on H.264 and VP8 performance comparisons
X-BeenThere: rtcweb@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Real-Time Communication in WEB-browsers working group list <rtcweb.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/rtcweb>, <mailto:rtcweb-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/rtcweb>
List-Post: <mailto:rtcweb@ietf.org>
List-Help: <mailto:rtcweb-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/rtcweb>, <mailto:rtcweb-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 22 Oct 2013 16:26:45 -0000

Hi all, 

On 10/22/2013 04:28 PM, Harald Alvestrand wrote:

> Having a 2.9% difference by swapping the order of arguments seems wrong; I'll pass this on to the guy who wrote that particular script.

Actually, this is not so weird at all. With the way your script measures, it makes a big difference which codec is the "anchor" (the first argument) and which is the "test subject" (the second argument). The reason is that a 10% bit rate increase does not become a 10% decrease when you turn the arguments around; seen from the other side it is a decrease of only about 9.1% (1/1.1 = 0.909). An illustrative example:

Assume there are only two sequences: one that VP8 is really good at, so that H.264 needs twice the bit rate to reach the same quality, and one that H.264 is really good at, so that it only needs half the bits to reach the same quality. Then, using the VP8 codec as the anchor, you get:

Sequence 1: H.264 needs twice the number of bits (+100%) 
Sequence 2: H.264 needs half the number of bits (-50%) 
You average these two and you get (100%-50%)/2 = 25% more bits on average for H.264

If you, on the other hand, use H.264 as the anchor (first argument), you get

Sequence 1: VP8 needs half the number of bits (-50%) 
Sequence 2: VP8 needs twice the number of bits (+100%) 
You average these two and you get (-50% + 100%)/2 = 25% more bits on average for VP8

Hence the script is biased in favor of its first argument: whichever codec is used as the anchor appears to need fewer bits on average.
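
A minimal Python sketch of the two-sequence example above may make the arithmetic easier to check. The bit rates and the function name mean_percent_change are hypothetical, chosen only to match the example; this is not the code in visual_metrics.py, just the plain averaging of per-sequence percentage differences described here.

```python
def mean_percent_change(anchor_bits, test_bits):
    """Average per-sequence bit rate change of the test codec
    relative to the anchor codec, in percent."""
    changes = [100.0 * (t - a) / a for a, t in zip(anchor_bits, test_bits)]
    return sum(changes) / len(changes)


# Hypothetical bit rates needed to reach the same quality on two sequences.
# Sequence 1: H.264 needs twice the bits; sequence 2: H.264 needs half.
vp8 = [1000.0, 1000.0]
h264 = [2000.0, 500.0]

h264_vs_vp8 = mean_percent_change(vp8, h264)  # VP8 is the anchor
vp8_vs_h264 = mean_percent_change(h264, vp8)  # H.264 is the anchor

print("H.264 relative to VP8 anchor: %+.1f%%" % h264_vs_vp8)
print("VP8 relative to H.264 anchor: %+.1f%%" % vp8_vs_h264)
```

Both calls report +25%, so with identical data each codec looks worse whenever it is the second argument.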

Best Regards
Bo Burman

> On 10/22/2013 04:20 PM, Bo Burman wrote:
> > Hi all,
> >
> > We understand that there are many people that are interested in a VP8 vs x264 comparison that includes rate controls.
> >
> > We therefore performed a quick study of the VP8 and x264 rate controls to see whether there were any major differences between them in the way they select quantizers. We found that VP8 uses what can be described as QP-toggling: By lowering the QP of, say, every 6th frame, you increase the quality of these markedly. The following frames also benefit from the increased quality (even though they have to make do with less bits than before) since they can predict from the high-quality frame.
> >
> > However, we found that the rate-control of x264 does not QP-toggle in this fine granular way, and x264 may therefore be at a disadvantage when comparing against VP8. So we ran a test using a modified version of x264 to allow QP-toggling during rate control. The modification is very simple and a patch for the x264 code is found in the bottom of this email.
> >
> > With this change, H.264 baseline (implemented using the modified x264 with rate control turned on) performs equally to VP8 with rate-control on; the difference is 1.3% in favor of VP8, but if you swap the order of the two codecs in draw_graphs.sh, H.264 baseline wins with 1.6%. This result is thus a tie between VP8 and H.264 baseline, and this is consistent with our previous fixed-QP test, and far from the (at least) 13% advantage for VP8 stated by Google.
> >
> > The modification is simple and it may well be possible to improve x264 by more sophisticated modifications. But pursuing this track is what we oppose, we do not want comparisons where the outcome depends on how well you modify a particular rate control mechanism - the rate control is codec-agnostic after all. To avoid this we still maintain our view that codec comparisons should be done using fixed QP settings, which is also the practice in MPEG and VCEG.
> >
> > We used the test parameters from Google's test repository https://git.chromium.org/git/webm/vpx_codec_comparison.git, the version ada7d7937a54e47fc18e2e0a1287aea29dc5d1c5.
> 
> Thanks!
> 
> Can you verify that you did see the same 12.7% difference when you ran the unmodified ada7d checkout? I want to be
> sure that I haven't missed any external influences on the results.
> 
> Having a 2.9% difference by swapping the order of arguments seems wrong; I'll pass this on to the guy who wrote that
> particular script.
> 
> >
> > Here are the x264 and comparison script patches:
> > --
> >
> > diff --git a/encoder/ratecontrol.c b/encoder/ratecontrol.c
> > index 1a39a49..2cf256f 100644
> > --- a/encoder/ratecontrol.c
> > +++ b/encoder/ratecontrol.c
> > @@ -1707,7 +1707,18 @@ int x264_ratecontrol_mb_qp( x264_t *h )
> >  {
> >       x264_emms();
> >       float qp = h->rc->qpm;
> > -    if( h->param.rc.i_aq_mode )
> > +
> > +       if (h->sh.i_type != SLICE_TYPE_I){
> > +          if ((h->fenc->i_poc % 12 )== 0)
> > +              qp -= 3.0;
> > +          else
> > +              qp += 0.0;
> > +
> > +       } else {
> > +          qp -= 5.0;
> > +       }
> > +
> > +    if( h->param.rc.i_aq_mode)
> >       {
> >            /* MB-tree currently doesn't adjust quantizers in unreferenced frames. */
> >           float qp_offset = h->fdec->b_kept_as_ref ? h->fenc->f_qp_offset[h->mb.i_mb_xy] : h->fenc->f_qp_offset_aq[h->mb.i_mb_xy];
> >
> >
> > draw_graphs.sh:
> >
> > change
> > ./visual_metrics.py metrics_template.html "*0.txt" stats/h264 stats/vp8 > vp8_vs_h264_quality.html
> > to
> > ./visual_metrics.py metrics_template.html "*0.txt" stats/vp8 stats/h264 > vp8_vs_h264_quality2.html
> >
> > Best Regards,
> > Bo Burman
> >
> >> -----Original Message-----
> >> From: rtcweb-bounces@ietf.org [mailto:rtcweb-bounces@ietf.org] On
> >> Behalf Of Harald Alvestrand
> >> Sent: 22 October 2013 11:50
> >> To: rtcweb@ietf.org
> >> Subject: Re: [rtcweb] Comments on H.264 and VP8 performance
> >> comparisons
> >>
> >> On 10/14/2013 11:12 PM, Bo Burman wrote:
> >>> Hi all,
> >>>
> >>> We would like to counter Google's suggestion that our test has only "demonstrated that it is possible to reduce VP8's performance" (updated draft on VP8 http://datatracker.ietf.org/doc/draft-alvestrand-rtcweb-vp8/).
> >>> In fact, what we did in our test was mostly undoing some very peculiar x264 settings made by Google in their test from April 3. By instead using the x264 settings Google themselves proposed in their earlier test (from March 12), and removing threading, the difference went down from 41% to 16%. (This is without touching the VP8 parameters.)
> >>>
> >>> The last change we made was to remove the rate control from the comparison, something that is standard practice in the world of video standardization. This involved changing both the x264 and VP8 parameters. After that, the difference went down to -1%.
> >>> In summary, the following steps were taken in our comparison:
> >>>
> >>> 1) Downloading the latest software: 41% became 36%
> >>> 2) Removing threading: 26%
> >>> 3) Removing bit padding: 18%
> >>> 4) Removing other differences between Google's March 12 and April 3rd tests: 16%
> >>> 5) Removing rate controller: -1%
> >>>
> >>>
> >> Just a quick update on this - I did not manage to get a new draft
> >> ready before the deadline, so I'll have to resort to sending email:
> >>
> >> I applied steps 1), 2) and 3) to the repository mentioned in the draft.
> >> The numbers I got were different, but significant. Below are the
> >> VP8-to-x264 differences I encountered each step of the way.
> >>
> >> - Master branch before October 15: Difference 71.52%
> >> - Updating x264 from 198a7ea (aug 16 2012) to c832fe (March 1 2013):
> >> Difference 71.44%
> >>     (I could not go beyond this because yasm 1.2 was required for newer versions).
> >> - Adding the --thread 1 parameter: Difference 26.11%
> >> - Removing the --nal-hrd=cbr parameter that was suggested by x264
> >> people: Difference 13.87%
> >> - Removing vbv-maxrate and vbv-init 0.8: Difference 12.74%
> >>
> >> I did not shorten the clips to 10 seconds, nor did I try to control
> >> rate by constant cq instead of a rate controller; if Bo's numbers are right, this shouldn't matter.
> >>
> >> I did try removing "vbv-bufsize ${rate}", since this was the remaining difference I could find for Bo's point 4 "other differences", but that was actually harmful (difference increased to 19.05%), so I put it back.
> >>
> >> I checked this in to the repository - the commit is here:
> >>
> >> http://git.chromium.org/gitweb/?p=webm/vpx_codec_comparison.git;a=commitdiff;h=ada7d7937a54e47fc18e2e0a1287aea29dc5d1c5
> >>
> >>
> >> This number does not fit the impression the video codec team has from other tests - they think VP8 can do a lot more than 13% better than baseline - but at the draft deadline, we had not found the set of VP8 parameters that showed this for this particular clip set.
> >>
> >> Commentary: I'm surprised at x264's choice of default value for the --thread parameter. Accepting a 26% bitrate hit seems like a large price to pay for going faster by default. Is there a bug here?