Re: [rtcweb] New VP8 vs H.264 tests uploaded
Harald Alvestrand <harald@alvestrand.no> Fri, 05 April 2013 07:19 UTC
Message-ID: <515E7AD6.7040001@alvestrand.no>
Date: Fri, 05 Apr 2013 09:18:46 +0200
From: Harald Alvestrand <harald@alvestrand.no>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130308 Thunderbird/17.0.4
MIME-Version: 1.0
To: "Mo Zanaty (mzanaty)" <mzanaty@cisco.com>
References: <CAPVCLWbajJNS-DbXS-AJjakwovBKhhpXAmBaR_LYKjCyk7UnYg@mail.gmail.com> <515D3FA1.6050305@gmail.com> <515D96A2.1000602@cisco.com> <CAGgHUiRLAmGz7H5iY_cpiiKPPN6JXo1jc2-U7TZLe6k-qETo9Q@mail.gmail.com> <3879D71E758A7E4AA99A35DD8D41D3D90F69B243@xmb-rcd-x14.cisco.com>
In-Reply-To: <3879D71E758A7E4AA99A35DD8D41D3D90F69B243@xmb-rcd-x14.cisco.com>
Content-Type: multipart/alternative; boundary="------------050806080901090103050000"
Cc: "Cullen Jennings (fluffy)" <fluffy@cisco.com>, "rtcweb@ietf.org" <rtcweb@ietf.org>
Subject: Re: [rtcweb] New VP8 vs H.264 tests uploaded
On 04/05/2013 01:28 AM, Mo Zanaty (mzanaty) wrote:
>
> Realtime/low latency and constrained bitrate are obviously important
> for the actual implementation used. Thomas was pointing out that these
> factors have nothing to do with the codec technology itself, since
> they are purely encoder implementation optimizations. There is nothing
> in the VP8 or H.264 standard that uniquely provides realtime/low
> latency or constrained bitrate. Those are attributes of encoder
> implementations which are not part of the standard.
>
I'd challenge that assumption. If one technology behaves much better
under constrained bitrate and/or low latency than the other, that might
just possibly be linked to the technology.

As an obvious example, consider B-frames; due to their basic "predict
from the future" nature, they will incur a latency penalty if used. A
test that depends on B-frames will therefore be an invalid test for
real-time operation.

What we can always be sure of is that when a certain quality is
demonstrated under certain conditions, that quality is achievable under
those conditions. We should be careful about drawing larger conclusions
than that.

> So the question was whether we care about evaluating codec technology
> or specific implementations. If the former, then tests should be
> staged in the same way codec experts evaluate codec technology/tools.
>
by "the same way codec experts evaluate codec technology/tools", do you
mean the way MPEG does it?
Having just started working in MPEG, with their procedures for quality
evaluations .... I can't say I'm terribly impressed by the evaluation
methods used. I'm also very unimpressed with the openness of the process.

> If the latter, then tests should be staged using the target
> implementations.
>
> I'm not aware of conferencing applications which use x264, because it
> was designed and optimized for transcoding (dvd rips to blu-ray) not
> conferencing.
> Most importantly, x264 cbr mode is inappropriate for
> conferencing since it is for broadcast MPEG transport streams that
> must be absolutely CBR to avoid M2TS-mux overflow or underflow, and it
> will actually insert filler data instead of real frame data to hit the
> CBR rate exactly. Looking at the results which show the worst H.264
> bitrate (62% above VP8) in gipsrecstat_1280_720_50_1485kbps.mkv, there
> is almost as much filler data as real frame data, meaning the true
> bitrate of real frame data is almost half what is reported in the
> results. (See attached if it makes it through.)
>
It made it through, but I can't interpret it much. Certainly sounds like
CBR is a setting to avoid.

> While the results are bad, the methodology, effort and transparency
> are very good (if we want to compare implementations not standards). I
> can rerun without the bogus fillers and post the results next week,
> unless someone else can do it faster. But as Thomas pointed out, the
> technologies themselves are comparable as far as coding tools, so any
> results which show significant differences are either suspect or
> explained by differences in encoder implementations or settings not
> the codec technology itself.
>
I've pushed (as my first contribution to the actual code; the tests
themselves were done by other Googlers) some changes that will make it a
little easier for new people who execute them to get predictable results
from the tests we published.

Looking forward to more contributions!
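
The filler-data arithmetic in Mo's quoted paragraph can be sketched in
a few lines of Python. This is only an illustration: the byte counts and
clip duration below are made-up values, not measurements from
gipsrecstat_1280_720_50_1485kbps.mkv, and the function names are
hypothetical. Counting the filler bytes in the first place (for example
by summing H.264 filler-data NAL units) is assumed to happen elsewhere.

```python
def bitrate_kbps(num_bytes: int, duration_s: float) -> float:
    """Convert a byte count over a duration into kbit/s (1 kbit = 1000 bits)."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return num_bytes * 8 / 1000 / duration_s


def split_bitrate(total_bytes: int, filler_bytes: int, duration_s: float):
    """Return (reported_kbps, payload_kbps).

    reported_kbps counts every byte in the stream, filler included.
    payload_kbps counts only real frame data.
    """
    if not 0 <= filler_bytes <= total_bytes:
        raise ValueError("filler_bytes must be between 0 and total_bytes")
    reported = bitrate_kbps(total_bytes, duration_s)
    payload = bitrate_kbps(total_bytes - filler_bytes, duration_s)
    return reported, payload


if __name__ == "__main__":
    # Hypothetical 10-second clip: 1,000,000 bytes of frame data plus
    # 900,000 bytes of filler ("almost as much filler as real data").
    reported, payload = split_bitrate(1_900_000, 900_000, 10.0)
    print(f"reported: {reported:.0f} kbps, payload: {payload:.0f} kbps")
```

With numbers in that proportion the payload rate comes out at a little
over half the reported rate, which is the effect described for the
worst-case clip.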
> Mo
>
> *From:* rtcweb-bounces@ietf.org [mailto:rtcweb-bounces@ietf.org] *On
> Behalf Of* Leon Geyser
> *Sent:* Thursday, April 04, 2013 12:56 PM
> *To:* Thomas Davies (thdavies)
> *Cc:* rtcweb@ietf.org
> *Subject:* Re: [rtcweb] New VP8 vs H.264 tests uploaded
>
>     If the purpose is to show whether vp8 is superior as a
>     *technology* to h264 CBP, then I think the comparison should use
>     the best settings you have (ideally with a special full-on
>     non-real time implementation) and test against the JM reference
>     encoder. Ideally you would use the same or similar GOP structures,
>     number of references, prediction and QP hierarchies.
>
> I thought WebRTC was meant for real-time communication. What would it
> benefit us if we test settings that won't be used or can't be used in
> practice?
>
> The tests need to test the encoders at realtime/low latency and at a
> constrained bitrate mode like CBR. We aren't archiving videos here :)
>
> A graph that shows the bitrate over time for each clip could be
> useful to make sure that no encoder spikes the bitrate too high at
> certain moments.
> I welcome changes to the encoder settings as long as they stay
> realtime/low latency and constrained bitrate.
>
> On 4 April 2013 17:05, Thomas Davies <thdavies@cisco.com> wrote:
>
> Harald,
>
> I think there are quite a few problems with the comparison you have
> posted.
>
> 1. Looking at the sequences there is a very major difference between
> the initial intra frame qualities. When I encode just one frame of
> sequence gipsrecomotion using the parameters in the script at 1Mb/s
> then the intra frame is 3 times larger with vp8 than with x264.
>
> With video conferencing content, the quality of the initial I frame
> has a big impact that can last for many seconds - certainly the length
> of these clips. You can easily get gains by increasing the quality
> difference between an I frame and subsequent frames.
>
> x264 seems to have a policy of initially undershooting the bitrate
> substantially and ramping up, whereas vpxenc has a different approach.
> During this ramp up period the quality is very much worse. I can't
> find a way to persuade x264 to behave differently.
>
> This is a good illustration of why including rate control in
> comparisons is a bad idea.
>
> 2. Likewise, looking at the individual frame sizes, it seems vpxenc is
> using a quality hierarchy with a length of 8 ("hierarchical-P")
> where every 8th frame is about 4x bigger than the others. x264 has a
> constant target per frame.
>
> Hierarchical P frames are a really good idea, and can easily get you
> 10-20% gain with a big separation like this, at a cost in latency.
> Again I don't know how to make x264 do this, but the technique is
> applicable to any codec and is used in the JM reference.
>
> 3. The x264 settings are a bit of a black art, but appear not to be
> ideal after all. I am definitely no expert but I found that when
> encoding gipsrecomotion at 1Mb/s:
>
> - setting --threads 1 improves quality by a full 1dB (vpxenc seems to
>   run single threaded by default)
> - reducing the number of references from 3 to 2 (--ref 2) reduces the
>   load very substantially at very little loss (0.2dB or so).
>
> So with --threads 1 --ref 2, I found x264 ran more than 2x faster than
> vpxenc for this data point and had much better quality than before.
> vpxenc is still better (about 1dB), but very possibly within the range
> of hierarchical P coding improvements.
>
> Incidentally, I don't think that x264 performs particularly well at
> these high complexity settings, at least for video conferencing, no
> doubt as other more practical settings have been targeted. x264
> appears to have a quality ceiling that the JM does not have.
>
> 4. Another (smaller) issue is that the reported PSNR is combined luma
> and chroma over all frames.
> It's relatively easy to improve chroma
> PSNR at a small cost in bits, and usually it is best to ignore chroma
> PSNR or (possibly) give it a small weight. The arithmetic mean of
> frame PSNRs is generally used rather than the PSNR of the whole
> sequence, also. I would very much like separate component PSNRs in
> tests. The figures I quote above are luma PSNR.
>
> If the purpose is to show whether vp8 is superior as a *technology* to
> h264 CBP, then I think the comparison should use the best settings you
> have (ideally with a special full-on non-real time implementation) and
> test against the JM reference encoder. Ideally you would use the same
> or similar GOP structures, number of references, prediction and QP
> hierarchies.
>
> Comparing different real-time implementations of different codecs
> trying to do high quality coding with different GOP structures and
> using rate control with different strategies is just a waste of time.
> The first two elements in the list above are alone worth a very
> significant amount of bit rate.
>
> On the other hand, a quick perusal of the actual tools would suggest
> that vp8 and h264 CBP are likely "comparable" and the variation
> between implementations of the same technology would be bigger than
> the variation between the technologies. If we could agree on that,
> then a lot of time could be saved.
>
> best regards
>
> Thomas
>
>
> On 04/04/13 09:53, Sergio Garcia Murillo wrote:
>
> Hi Adrian,
>
> Could you explain how the encoding parametrization is comparable?
>
> x264 --nal-hrd cbr --vbv-maxrate ${rate} --vbv-bufsize ${rate} \
>   --vbv-init 0.8 --bitrate ${rate} --fps ${frame_rate} \
>   --profile baseline --no-scenecut --keyint infinite --preset veryslow \
>   --input-res ${width}x${height} \
>   --tune psnr \
>   -o ./encoded_clips/h264/${clip_stem}_${rate}kbps.mkv ${filename} \
>   2> ./logs/h264/${clip_stem}_${rate}kbps.txt
>
> vs:
>
> ./bin/vpxenc --lag-in-frames=0 --target-bitrate=${rate} --kf-min-dist=3000 \
>   --kf-max-dist=3000 --cpu-used=0 --fps=${frame_rate}/1 --static-thresh=0 \
>   --token-parts=1 --drop-frame=0 --end-usage=cbr --min-q=2 --max-q=56 \
>   --undershoot-pct=100 --overshoot-pct=15 --buf-sz=1000 \
>   --buf-initial-sz=800 --buf-optimal-sz=1000 --max-intra-rate=1200 \
>   --resize-allowed=0 --drop-frame=0 --passes=1 --good --noise-sensitivity=0 \
>   -w ${width} -h ${height} ${filename} --codec=vp8 \
>   -o ./encoded_clips/vp8/${clip_stem}_${rate}kbps.webm \
>   &>./logs/vp8/${clip_stem}_${rate}kbps.txt
>
> Best regards
> Sergio
>
> On 03/04/2013 18:20, Adrian Grange wrote:
>
> We have uploaded a new set of test results comparing VP8 to H.264.
> This latest set contains fixes for some of the problems in the
> previous set. We would like to extend our thanks to those who made
> suggestions as to how we could improve our methodology and
> encourage suggestions as to how we can make further improvements.
>
> In these tests we run x264 with the "veryslow" preset and VP8 with
> the "good, speed 0" setting in an attempt to produce comparable
> results.
>
> An overview of our results is available as follows:
>
> - A Quality comparison (psnr):
>   http://downloads.webmproject.org/ietf_tests/vp8_vs_h264_quality.html
>
> - An Encode Speed comparison:
>   http://downloads.webmproject.org/ietf_tests/vp8_vs_h264_speed.html
>
> - A comparison of the aggregate time required to decode all of the
>   clips in the test:
>   http://downloads.webmproject.org/ietf_tests/vp8vsh264-decodetime.txt
>
> All of our test scripts can either be downloaded from:
>
> http://downloads.webmproject.org/ietf_tests/vp8_vs_h264.tar.xz
>
> or checked out of our git/gerrit repository:
>
> git clone http://git.chromium.org/webm/vpx_codec_comparison.git
>
> The file README.txt, contained within, presents details of how to
> build and run the tests.
>
> The compressed video files--the output from the quality tests--can
> also be downloaded:
>
> VP8:
>
> http://downloads.webmproject.org/ietf_tests/vp8_videos/index.html
>
> H.264:
>
> http://downloads.webmproject.org/ietf_tests/h264_videos/index.html
>
> Adrian Grange
>
> _______________________________________________
> rtcweb mailing list
> rtcweb@ietf.org
> https://www.ietf.org/mailman/listinfo/rtcweb
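
Thomas's point 4 distinguishes the arithmetic mean of per-frame PSNRs
from the PSNR of the whole sequence. The difference can be shown with a
minimal Python sketch. It assumes 8-bit samples (peak value 255) and
takes per-frame luma mean squared errors as input; the function names
and the two MSE values are hypothetical illustration data, not figures
from the published tests.

```python
import math

PEAK = 255.0  # assumption: 8-bit video samples


def psnr_from_mse(mse: float, peak: float = PEAK) -> float:
    """PSNR in dB for a given mean squared error; infinite when mse is 0."""
    if mse < 0:
        raise ValueError("mse must be non-negative")
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


def mean_of_frame_psnrs(frame_mses: list[float]) -> float:
    """Average the per-frame PSNR values (the convention Thomas describes)."""
    if not frame_mses:
        raise ValueError("need at least one frame")
    return sum(psnr_from_mse(m) for m in frame_mses) / len(frame_mses)


def sequence_psnr(frame_mses: list[float]) -> float:
    """PSNR of the whole sequence: average the MSEs first, then convert.

    Assumes all frames have the same number of samples.
    """
    if not frame_mses:
        raise ValueError("need at least one frame")
    return psnr_from_mse(sum(frame_mses) / len(frame_mses))


if __name__ == "__main__":
    # One very clean frame and one poor frame (hypothetical luma MSEs).
    luma_mses = [1.0, 100.0]
    print(f"mean of frame PSNRs: {mean_of_frame_psnrs(luma_mses):.2f} dB")
    print(f"sequence PSNR:       {sequence_psnr(luma_mses):.2f} dB")
```

The mean of frame PSNRs is never lower than the sequence PSNR, and the
gap widens as frame quality becomes more uneven, so the two conventions
should not be mixed when comparing encoders. Passing only luma MSEs, as
here, gives the separate luma figure Thomas asks for.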
- [rtcweb] New VP8 vs H.264 tests uploaded Adrian Grange
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Sergio Garcia Murillo
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Harald Alvestrand
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Luca De Cicco
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Harald Alvestrand
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Luca De Cicco
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Thomas Davies
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Matthew Kaufman
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Adam Roach
- [rtcweb] Fundamental asymmetry [was Re: New VP8 v… Marc Petit-Huguenin
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Leon Geyser
- Re: [rtcweb] New VP8 vs H.264 tests uploaded (UNC… Roy, Radhika R CIV USARMY (US)
- Re: [rtcweb] New VP8 vs H.264 tests uploaded (UNC… Thomas Davies (thdavies)
- Re: [rtcweb] New VP8 vs H.264 tests uploaded (UNC… Harald Alvestrand
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Harald Alvestrand
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Mo Zanaty (mzanaty)
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Mo Zanaty (mzanaty)
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Sergio Garcia Murillo
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Harald Alvestrand
- Re: [rtcweb] New VP8 vs H.264 tests uploaded (UNC… Roy, Radhika R CIV USARMY (US)
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Kieran Kunhya
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Mo Zanaty (mzanaty)
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Kieran Kunhya
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Randell Jesup
- Re: [rtcweb] New VP8 vs H.264 tests uploaded Leon Geyser