Re: [rtcweb] Comments on H.264 and VP8 performance comparisons

On 10/22/2013 06:26 PM, Bo Burman wrote:
> Hi all,
>
> On 10/22/2013 04:28 PM, Harald Alvestrand wrote:
>
>> Having a 2.9% difference by swapping the order of arguments seems wrong; I'll pass this on to the guy who wrote that particular script.
> Actually, this is not so weird at all. With the way of measuring that is done in your script, it makes a huge difference which sequence is the "anchor" (the first argument) and which is the "test subject" (the second argument). The reason is that a 10% bit rate increase is not equivalent to a 10% decrease when you turn the arguments around.

Thanks for that explanation - makes sense!
Also makes sense that it wouldn't matter for small differences, or ones 
where the differences were roughly the same for all the clips - the case 
we're in, where clips vary widely in relative performance, is a bit special.

BTW, yesterday I talked to one of the guys who did the June tweaks to
the x264 parameters - he explained that he had done a (relatively)
constant bitrate encoding for VP8 (mode=cbr), and was trying to achieve
roughly the same result for x264; the advice he was given was to use the
parameters you saw, with vbv-maxrate ${rate} and vbv-init 0.8. Both VP8
and x264 seem to gain about a percentage point if they are allowed to do
a more variable bitrate, so this is not a huge differentiator.

> An illustrative example:
>
> Assume there are only two sequences, one that VP8 is really good at, so H.264 needs twice the bit rate to reach the same quality, and one that H.264 is really good at, so that it only needs half the bits to reach the same quality. Then, using the VP8 codec as the anchor you get:
>
> Sequence 1: H.264 needs twice the number of bits (+100%)
> Sequence 2: H.264 needs half the number of bits (-50%)
> You average these two and you get (100%-50%)/2 = 25% more bits on average for H.264
>
> If you, on the other hand, use H.264 as the anchor (first argument), you get
>
> Sequence 1: VP8 needs half the number of bits (-50%)
> Sequence 2: VP8 needs twice the number of bits (+100%)
> You average these two and you get (-50% + 100%)/2 = 25% more bits on average for VP8
>
> Hence the script is biased in favor of its first argument (the anchor).
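
The arithmetic in the example above can be reproduced in a few lines of Python. This is only a sketch of the example, not the code in visual_metrics.py: the function names and the bit counts (100/200/50) are made up for illustration. The geometric-mean variant is an addition for comparison, shown as one order-independent way of averaging ratios; it is not something the script is known to do.

```python
import math


def percent_change(anchor_bits, test_bits):
    """Extra bits the test codec needs relative to the anchor, in percent."""
    return (test_bits / anchor_bits - 1.0) * 100.0


def arithmetic_mean_change(pairs):
    """Plain average of the per-sequence percentages (order-sensitive)."""
    return sum(percent_change(a, t) for a, t in pairs) / len(pairs)


def geometric_mean_change(pairs):
    """Average of the log-ratios, converted back to a percentage."""
    mean_log = sum(math.log(t / a) for a, t in pairs) / len(pairs)
    return (math.exp(mean_log) - 1.0) * 100.0


# Hypothetical (anchor bits, test bits) at equal quality for the two sequences.
vp8_anchor = [(100.0, 200.0), (100.0, 50.0)]   # VP8 first, H.264 second
h264_anchor = [(t, a) for a, t in vp8_anchor]  # same data, arguments swapped

print(arithmetic_mean_change(vp8_anchor))   # H.264 "needs 25% more"
print(arithmetic_mean_change(h264_anchor))  # VP8 "needs 25% more"
print(geometric_mean_change(vp8_anchor))    # approximately 0 either way
```

With the plain average, whichever codec is passed second appears to need 25% more bits; with the log-ratio average, swapping the arguments only flips the sign of the result.
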
>
> Best Regards
> Bo Burman
>
>> On 10/22/2013 04:20 PM, Bo Burman wrote:
>>> Hi all,
>>>
>>> We understand that there are many people that are interested in a VP8 vs x264 comparison that includes rate controls.
>>>
>>> We therefore performed a quick study of the VP8 and x264 rate controls to see whether there were any major differences between them in the way they select quantizers. We found that VP8 uses what can be described as QP-toggling: By lowering the QP of, say, every 6th frame, you increase the quality of these markedly. The following frames also benefit from the increased quality (even though they have to make do with fewer bits than before) since they can predict from the high-quality frame.
>>> However, we found that the rate control of x264 does not QP-toggle in this fine-grained way, and x264 may therefore be at a disadvantage when comparing against VP8. So we ran a test using a modified version of x264 to allow QP-toggling during rate control. The modification is very simple and a patch for the x264 code is found at the bottom of this email.
>>> With this change, H.264 baseline (implemented using the modified x264 with rate control turned on) performs on par with VP8 with rate control on; the difference is 1.3% in favor of VP8, but if you swap the order of the two codecs in draw_graphs.sh, H.264 baseline wins by 1.6%. This result is thus a tie between VP8 and H.264 baseline, and this is consistent with our previous fixed-QP test, and far from the (at least) 13% advantage for VP8 stated by Google.
>>> The modification is simple and it may well be possible to improve x264 by more sophisticated modifications. But pursuing this track is exactly what we oppose: we do not want comparisons where the outcome depends on how well you modify a particular rate control mechanism - the rate control is codec-agnostic after all. To avoid this we still maintain our view that codec comparisons should be done using fixed QP settings, which is also the practice in MPEG and VCEG.
>>> We used the test parameters from Google's test repository https://git.chromium.org/git/webm/vpx_codec_comparison.git, the version ada7d7937a54e47fc18e2e0a1287aea29dc5d1c5.
>>
>> Thanks!
>>
>> Can you verify that you did see the same 12.7% difference when you ran the unmodified ada7d checkout? I want to be
>> sure that I haven't missed any external influences on the results.
>>
>> Having a 2.9% difference by swapping the order of arguments seems wrong; I'll pass this on to the guy who wrote that
>> particular script.
>>
>>> Here are the x264 and comparison script patches:
>>> --
>>>
>>> diff --git a/encoder/ratecontrol.c b/encoder/ratecontrol.c
>>> index 1a39a49..2cf256f 100644
>>> --- a/encoder/ratecontrol.c
>>> +++ b/encoder/ratecontrol.c
>>> @@ -1707,7 +1707,18 @@ int x264_ratecontrol_mb_qp( x264_t *h ) {
>>>        x264_emms();
>>>        float qp = h->rc->qpm;
>>> -    if( h->param.rc.i_aq_mode )
>>> +
>>> +       if (h->sh.i_type != SLICE_TYPE_I){
>>> +          if ((h->fenc->i_poc % 12 )== 0)
>>> +              qp -= 3.0;
>>> +          else
>>> +              qp += 0.0;
>>> +
>>> +       } else {
>>> +          qp -= 5.0;
>>> +       }
>>> +
>>> +    if( h->param.rc.i_aq_mode)
>>>        {
>>>             /* MB-tree currently doesn't adjust quantizers in unreferenced frames. */
>>>            float qp_offset = h->fdec->b_kept_as_ref ? h->fenc->f_qp_offset[h->mb.i_mb_xy] : h->fenc->f_qp_offset_aq[h->mb.i_mb_xy];
>>>
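
The offset logic in the patch above can be paraphrased in Python to see which frames end up with the lower QP. This is a sketch, not x264 code: `toggled_qp` is a hypothetical helper, the base QP of 30 is arbitrary, and it assumes that POC advances by 2 per frame (so that `i_poc % 12 == 0` selects every 6th frame, matching the description earlier in the mail) and that only the first frame is an I slice.

```python
def toggled_qp(base_qp, is_i_slice, poc):
    """Paraphrase of the patch: -5 for I slices, -3 when POC is a multiple of 12."""
    if is_i_slice:
        return base_qp - 5.0
    if poc % 12 == 0:
        return base_qp - 3.0
    return base_qp  # corresponds to the patch's "qp += 0.0" branch


# Assumptions: POC = 2 * frame index, and only frame 0 is an I slice.
qps = [toggled_qp(30.0, n == 0, 2 * n) for n in range(13)]
print(qps)
```

Under these assumptions, frame 0 gets QP 25, frames 6 and 12 get QP 27, and all other frames keep the rate controller's QP of 30.
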
>>>
>>> draw_graphs.sh:
>>>
>>> change
>>> ./visual_metrics.py metrics_template.html "*0.txt" stats/h264 stats/vp8 > vp8_vs_h264_quality.html
>>> to
>>> ./visual_metrics.py metrics_template.html "*0.txt" stats/vp8 stats/h264 > vp8_vs_h264_quality2.html
>>>
>>> Best Regards,
>>> Bo Burman
>>>
>>>> -----Original Message-----
>>>> From: rtcweb-bounces@ietf.org [mailto:rtcweb-bounces@ietf.org] On
>>>> Behalf Of Harald Alvestrand
>>>> Sent: den 22 oktober 2013 11:50
>>>> To: rtcweb@ietf.org
>>>> Subject: Re: [rtcweb] Comments on H.264 and VP8 performance
>>>> comparisons
>>>>
>>>> On 10/14/2013 11:12 PM, Bo Burman wrote:
>>>>> Hi all,
>>>>>
>>>>> We would like to counter Google's suggestion that our test has only
>>>>> "demonstrated that it is possible to reduce VP8's performance"
>>>>> (updated draft on VP8 http://datatracker.ietf.org/doc/draft-alvestrand-rtcweb-vp8/).
>>>>>
>>>>> In fact, what we did in our test was mostly undoing some very
>>>>> peculiar x264 settings made by Google in their test from April 3.
>>>>> By instead using the x264 settings Google themselves proposed in
>>>>> their earlier test (from March 12), and removing threading, the
>>>>> difference went down from 41% to 16%. (This is without touching the
>>>>> VP8 parameters.)
>>>>>
>>>>> The last change we made was to remove the rate control from the
>>>>> comparison, something that is standard practice in the world of
>>>>> video standardization. This involved changing both the x264 and VP8
>>>>> parameters. After that, the difference went down to -1%.
>>>>>
>>>>> In summary, the following steps were taken in our comparison:
>>>>>
>>>>> 1) Downloading the latest software: 41% became 36%
>>>>> 2) Removing threading: 26%
>>>>> 3) Removing bit padding: 18%
>>>>> 4) Removing other differences between Google's March 12 and April 3rd tests: 16%
>>>>> 5) Removing rate controller: -1%
>>>>>
>>>>>
>>>> Just a quick update on this - I did not manage to get a new draft
>>>> ready before the deadline, so I'll have to resort to sending email:
>>>>
>>>> I applied steps 1), 2) and 3) to the repository mentioned in the draft.
>>>> The numbers I got were different, but significant. Below are the
>>>> VP8-to-x264 differences I encountered at each step of the way.
>>>>
>>>> - Master branch before October 15: Difference 71.52%
>>>> - Updating x264 from 198a7ea (Aug 16 2012) to c832fe (March 1 2013): Difference 71.44%
>>>>      (I could not go beyond this because yasm 1.2 was required for newer versions).
>>>> - Adding the --threads 1 parameter: Difference 26.11%
>>>> - Removing the --nal-hrd=cbr parameter that was suggested by x264 people: Difference 13.87%
>>>> - Removing vbv-maxrate and vbv-init 0.8: Difference 12.74%
>>>>
>>>> I did not shorten the clips to 10 seconds, nor did I try to control
>>>> rate by constant cq instead of a rate controller; if Bo's numbers are right, this shouldn't matter.
>>>>
>>>> I did try removing "vbv-bufsize ${rate}", since this was the remaining difference I could find for Bo's point 4 "other differences", but that was actually harmful (difference increased to 19.05%), so I put it back.
>>>> I checked this in to the repository - the commit is here:
>>>>
>>>> http://git.chromium.org/gitweb/?p=webm/vpx_codec_comparison.git;a=commitdiff;h=ada7d7937a54e47fc18e2e0a1287aea29dc5d1c5
>>>>
>>>>
>>>> This number does not fit the impression the video codec team has from
>>>> other tests - they think
>>>> VP8 can do a lot more than 13% better than baseline - but at the
>>>> draft deadline, we had not found the set of VP8 parameters that showed this for this particular clip set.
>>>>
>>>> Commentary: I'm surprised at x264's choice of default value for the --threads parameter. Accepting a 26% bitrate hit seems like a large price to pay for going faster by default. Is there a bug here?