Re: [rtcweb] More H.264 vs VP8 tests

Stefan Håkansson LK <stefan.lk.hakansson@ericsson.com> Sat, 29 June 2013 13:18 UTC

From: Stefan Håkansson LK <stefan.lk.hakansson@ericsson.com>
To: Harald Alvestrand <harald@alvestrand.no>
Date: Sat, 29 Jun 2013 13:17:58 +0000
Message-ID: <1447FA0C20ED5147A1AA0EF02890A64B1C308D3B@ESESSMB209.ericsson.se>
References: <BBE9739C2C302046BD34B42713A1E2A22DECC12F@ESESSMB105.ericsson.se> <51C96E36.2000907@alvestrand.no>
Cc: "rtcweb@ietf.org" <rtcweb@ietf.org>
Subject: Re: [rtcweb] More H.264 vs VP8 tests

On 6/25/13 12:17 PM, Harald Alvestrand wrote:
> Again - thanks for releasing this openly!
>
> I ran the scripts (with a few tweaks; you run on a system where sh is
> bash, not dash, for instance), and got the same numbers within +/- 0.5%
> (probably some binary version skew);

I re-ran the scripts today on another computer with a different OS, and I
get exactly the same results as Bo sent out. However, I noticed that if the
input clips are not cut at 10 s (but used at their full length), the
results differ slightly, though still within +/- 0.5%. Could this be the
reason you get slightly different numbers?
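
For anyone who wants to check the clip-length effect, here is a minimal sketch of cutting a clip to its first 10 seconds. It assumes the clips are headerless 8-bit 4:2:0 (I420) raw files, which the thread does not state, and the function names, resolution, and frame rate are illustrative only; the actual test scripts may trim differently.

```python
def i420_frame_bytes(width: int, height: int) -> int:
    """Bytes in one 8-bit 4:2:0 frame: one luma plane plus two subsampled chroma planes."""
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    return width * height + 2 * chroma_w * chroma_h


def bytes_for_duration(width: int, height: int, fps: float, seconds: float) -> int:
    """Bytes occupied by the first `seconds` of a raw clip at the given frame rate."""
    frames = int(round(fps * seconds))
    return frames * i420_frame_bytes(width, height)


def trim_clip(src_path, dst_path, width, height, fps, seconds=10.0, chunk=1 << 20):
    """Copy at most the first `seconds` of src_path to dst_path; return bytes written."""
    remaining = bytes_for_duration(width, height, fps, seconds)
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while remaining > 0:
            data = src.read(min(chunk, remaining))
            if not data:  # source is shorter than the requested duration
                break
            dst.write(data)
            remaining -= len(data)
            written += len(data)
    return written


# Hypothetical example: a CIF (352x288) clip at 30 fps.
# 10 seconds is 300 frames of 152064 bytes each.
print(bytes_for_duration(352, 288, 30, 10))  # 45619200
```

Comparing results from the trimmed and untrimmed files for the same clip would show whether the length alone accounts for the difference.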


> we may have disagreements on the
> parameters to use, but we agree on the numbers those parameters produce.
>
> (I have since modified the Google framework to include a script that
> pulls in the sources for the needed binaries and compiles them - if you
> want to make 100% sure people are working from the same sources, you may
> want to rebase to a newer version of the comparison toolkit.)
>
> On 06/22/2013 03:41 PM, Bo Burman wrote:
>> Hi all,
>>
>> We have had a look at Google's comparison between VP8 and H.264 constrained baseline that was posted on April 3rd (http://www.ietf.org/mail-archive/web/rtcweb/current/msg07028.html). Like that post, this one contains (if the attachments make it to the list) information on the exact tools and options used for encoding, so the test should be repeatable by anyone interested.
>>
>> As others on this list have already stated, one major problem is that Google's test involves the rate control mechanism. Codecs are typically measured with rate control turned off, since it adds a large amount of noise to the measurement. Instead, we propose to compare the codecs using fixed qp-levels. The qp-level is the quantization parameter that affects the rate/distortion tradeoff. Comparison at fixed qp-levels is what was used when benchmarking HEVC against H.264 in the JCT-VC standardization, for instance. We are going to select a codec (essentially a bit stream format), not a rate control mechanism: once the codec is selected, you can choose whatever rate control mechanism you wish.
>>
>> We used Google's excellent framework as the baseline and changed the parameter settings to make it possible to measure using fixed qp. We used the same sequences, but limited them to the first 10 seconds, since their lengths varied from 10 seconds to minutes; this also reduced computation time.
>>
>> We used two H.264 encoder implementations: x264, which is an open-source encoder that can operate at speeds ranging from real-time to slow, and JM, which is the reference implementation that was used to develop H.264. JM is very slow but attempts to be very efficient in terms of bits per quality. The results were as follows:
>>
>> x264 baseline vs VP8: H.264 wins by 1%
>> JM baseline vs VP8: H.264 wins by 4%
>>
>> Running times:
>> x264: 1 hour 3 minutes
>> VP8: 2 hours 0 minutes
>> JM: an order of magnitude slower
>>
>> It is interesting to note that the measurements are more stable in the new test; the variance of the per-sequence percentages is now around 70, down from around 700 in Google's test of April 3rd. We believe this is due to the removal of the rate controller, which acts like noise on the measurements.
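
To put the two variance figures on a more intuitive scale, the sketch below converts them to standard deviations in percentage points. The helper names are illustrative, and the use of the population variance is an assumption; the post does not say which variance formula was used.

```python
import math
from statistics import pvariance


def std_from_variance(var: float) -> float:
    """Standard deviation corresponding to a given variance."""
    return math.sqrt(var)


def spread(per_sequence_pct):
    """Return (population variance, standard deviation) of per-sequence percentages."""
    var = pvariance(per_sequence_pct)
    return var, math.sqrt(var)


# The two variance figures quoted in the post.
print(round(std_from_variance(70), 1))   # 8.4
print(round(std_from_variance(700), 1))  # 26.5
```

A tenfold drop in variance therefore corresponds to the per-sequence spread shrinking by a factor of about 3.2 (the square root of 10).
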
>>
>> We also tried setting H.264 to constrained high (that is, high profile without interlace and without B-pictures). The results were then:
>>
>> x264 constrained high vs VP8: H.264 wins by 25%
>> JM constrained high vs VP8: H.264 wins by 24%
>>
>> We also note that the script that Google provided to calculate the rate differences ("BD-rate") does not give exactly the same numbers as the JCT-VC way of calculating BD-rate. The main difference is that the JM score for constrained high is better (around 29%) if the JCT-VC way of calculating BD-rate is used.
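
For readers unfamiliar with BD-rate, here is a minimal sketch of the classic calculation: fit a cubic to log(bitrate) as a function of PSNR for each codec, integrate both fits over the shared PSNR range, and convert the average log difference to a percentage. It does not reproduce either Google's script or the JCT-VC calculation, whose details are not given in this thread; the function names and the sample points are hypothetical.

```python
import math


def _cubic_through(xs, ys, x):
    """Evaluate at x the unique cubic through four (xs, ys) points (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _integrate(xs, ys, lo, hi):
    """Integrate the cubic over [lo, hi]; Simpson's rule is exact for cubics."""
    mid = (lo + hi) / 2.0
    return (hi - lo) / 6.0 * (
        _cubic_through(xs, ys, lo)
        + 4.0 * _cubic_through(xs, ys, mid)
        + _cubic_through(xs, ys, hi)
    )


def bd_rate(anchor, test):
    """Average bitrate difference (percent) of `test` vs `anchor` at equal PSNR.

    Each argument is a list of four (bitrate, psnr) points with distinct PSNR
    values. A negative result means `test` needs fewer bits for the same quality.
    """
    def split(points):
        if len(points) != 4:
            raise ValueError("exactly four rate points are expected")
        pts = sorted(points, key=lambda p: p[1])
        return [p[1] for p in pts], [math.log(p[0]) for p in pts]

    psnr_a, log_rate_a = split(anchor)
    psnr_t, log_rate_t = split(test)
    lo = max(psnr_a[0], psnr_t[0])    # shared PSNR interval
    hi = min(psnr_a[-1], psnr_t[-1])
    if hi <= lo:
        raise ValueError("PSNR ranges do not overlap")
    avg_diff = (
        _integrate(psnr_t, log_rate_t, lo, hi)
        - _integrate(psnr_a, log_rate_a, lo, hi)
    ) / (hi - lo)
    return (math.exp(avg_diff) - 1.0) * 100.0


# Hypothetical points, not measurements: the test codec uses 10% fewer bits
# than the anchor at every PSNR level, so the result is close to -10.0.
anchor_points = [(100, 30.0), (200, 33.0), (400, 36.0), (800, 39.0)]
test_points = [(rate * 0.9, psnr) for rate, psnr in anchor_points]
print(bd_rate(anchor_points, test_points))
```

Choices such as the interpolation method and the integration range are where implementations commonly diverge, which may be one source of the discrepancy described above.
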
>>
>> In summary, we think that proper testing shows that neither VP8 nor H.264 baseline has a clear performance advantage over the other. When comparing VP8 against H.264 constrained high, on the other hand, H.264 constrained high appears to have an advantage.
>>
>> The attached file includes the files necessary to reproduce the test.
>>
>> Best Regards,
>>
>> Bo Burman