Re: [rtcweb] More H.264 vs VP8 tests

On 2013-06-22 18:54, Leon Geyser wrote:
> Hi Bo,
>
> Thanks for the additional testing done.
>
> I am not 100% sure why we have to test with fixed qp-levels when WebRTC is
> going to be used over the internet. I would assume that WebRTC would be
> used over the internet...
> My understanding of rate-control might be completely wrong. Why test
> modes that might not even be used in real-world scenarios?

Hi Leon (responding for Bo who is enjoying his vacation),

You are absolutely right: these codecs are going to be used with
rate controllers when deployed in the real world. However, we still do
not think it is a good idea to have rate control turned on when comparing
them. The reason is that the rate controller influences the quality so
much that it is very hard to get any comparable signal if you include
it. Let me give you one example: let's say that you use the same
codec, but two different rate controllers. Rate control A is very 
liberal with its bits in the beginning of the sequence, whereas rate 
control B is very frugal. On a video conferencing type sequence, where 
the camera is static and not much is happening, rate control A will win 
by a huge margin over rate control B. The reason is that the first frame 
will be encoded very well and this data can be used during the remainder 
of the sequence. Rate control B will not stand a chance, even
though it has more bits for the remaining images. And this is with the
same video codec!

We could of course try to make sure that the two rate controllers have 
similar settings; however, since they are not identical they cannot be 
made to work exactly the same. Even small changes in rate control 
settings could have a huge effect on the end result.

Rather than trying to make two pieces of software work the same, it is 
much easier to just shut off the rate controllers. This gives a direct 
measurement of the relative strengths of the codecs. This is also the 
reason why organizations such as JCT-VC have used this method to compare
their codecs.
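
For anyone who wants to see how fixed-QP results are turned into a single
percentage, here is a minimal Python sketch of the commonly used
Bjøntegaard delta-rate (BD-rate) calculation: fit a cubic polynomial of
log(bitrate) against PSNR for each codec, integrate both fits over the
shared PSNR range, and convert the average difference back to a
percentage. The function name bd_rate and its arguments are my own, it
assumes NumPy and four (rate, PSNR) points per codec from four fixed QP
values, and it is neither Google's script nor the exact JCT-VC tooling, so
its numbers may differ slightly from both.

```python
import numpy as np


def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bitrate difference (in percent) of test vs. reference.

    Illustrative sketch only: a cubic fit of log(rate) as a function of
    PSNR, integrated over the PSNR interval covered by both curves.
    A negative result means the test codec needs fewer bits.
    """
    psnr_ref = np.asarray(psnr_ref, dtype=float)
    psnr_test = np.asarray(psnr_test, dtype=float)
    log_ref = np.log(np.asarray(rate_ref, dtype=float))
    log_test = np.log(np.asarray(rate_test, dtype=float))

    # Cubic fit of log(rate) over PSNR, one polynomial per codec.
    poly_ref = np.polyfit(psnr_ref, log_ref, 3)
    poly_test = np.polyfit(psnr_test, log_test, 3)

    # Only compare over the quality range that both curves cover.
    lo = max(psnr_ref.min(), psnr_test.min())
    hi = min(psnr_ref.max(), psnr_test.max())
    if hi <= lo:
        raise ValueError("the two curves have no overlapping PSNR range")

    # Definite integrals of both fitted polynomials over [lo, hi].
    int_ref = np.polyint(poly_ref)
    int_test = np.polyint(poly_test)
    area_ref = np.polyval(int_ref, hi) - np.polyval(int_ref, lo)
    area_test = np.polyval(int_test, hi) - np.polyval(int_test, lo)

    # Mean log-rate difference, converted back to a percentage.
    avg_log_diff = (area_test - area_ref) / (hi - lo)
    return float((np.exp(avg_log_diff) - 1.0) * 100.0)
```

Each codec contributes one (rate, PSNR) point per QP value, so no rate
controller is involved anywhere in the calculation.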

Best Regards,
Stefan

>
> On 22 June 2013 15:41, Bo Burman <bo.burman@ericsson.com> wrote:
>
>     Hi all,
>
>     We have had a look at Google's comparison between VP8 and H.264
>     constrained baseline that was posted on April 3rd
>     (http://www.ietf.org/mail-archive/web/rtcweb/current/msg07028.html).
>     Like the post mentioned above, this post contains (if the
>     attachments make it to the list) information on the exact tools and
>     options used for encoding, and should thus be repeatable by anyone
>     interested.
>
>     As was already stated by others on this list, one major problem is
>     that Google's test involves the rate control mechanism. Typically
>     codecs are measured with rate control turned off, since it acts as a
>     huge noise on the measurement. Instead we propose to compare the
>     codecs using fixed qp-levels. The qp-level is the quantization
>     parameter that affects the rate/distortion tradeoff. Comparing using
>     fixed qp-levels is what has been used when benchmarking HEVC against
>     H.264 in the JCT-VC standardization, for instance. We are going to
>     select a codec (essentially bit stream format), not a rate control
>     mechanism: Once the codec is selected you can choose whatever rate
>     control mechanism you wish.
>
>     We used Google's excellent framework as the baseline and changed the
>     parameter settings in order to make it possible to measure using
>     fixed qp. We used the same sequences, but limited them to the first
>     10 seconds since they varied from 10 seconds to minutes; this also
>     eased computation time.
>
>     We used two H.264 encoder implementations: X264, which is an
>     open-source codec that can operate in everything from real-time to
>     slow, and JM which is the reference implementation that was used to
>     develop H.264. JM is very slow but attempts to be very efficient in
>     terms of bits per quality. The results were as follows:
>
>     X264 baseline vs VP8: H.264 wins by 1%
>     JM baseline vs VP8: H.264 wins by 4%
>
>     Running times:
>     X264: 1 hour 3 minutes
>     VP8: 2 hours 0 minutes
>     JM: order of magnitude slower
>
>     It is interesting to note that the measurements are more stable in
>     the new test; the variance of the percentages for the sequences is
>     now around 70, down from around 700 in Google's test of April 3rd.
>     We believe this is due to the removal of the rate controller,
>     which acts like noise on the measurements.
>
>     We also tried setting H.264 to constrained high (no interlace and no
>     B-pictures, compared to high). The results were then:
>
>     X264 constrained high vs VP8: H.264 wins by 25%
>     JM constrained high vs VP8: H.264 wins by 24%
>
>     We also note that the script that Google provided to calculate the
>     rate differences ("BD-rate") does not give exactly the same numbers
>     as the JCT-VC-way of calculating BD-rate. The main difference is
>     that the JM score for constrained high is better (around 29%) if the
>     JCT-VC way of calculating BD-rate is used.
>
>     In summary we think that proper testing can conclude that there is
>     no clear performance advantage to any codec between VP8 and H.264
>     baseline. When comparing VP8 against H.264 constrained high on the
>     other hand, it seems like there is an advantage for H.264
>     constrained high.
>
>     The attached file includes the files necessary to reproduce the test.
>
>     Best Regards,
>
>     Bo Burman