[video-codec] Benjamin Kaduk's Discuss on draft-ietf-netvc-testing-08: (with DISCUSS and COMMENT)
Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 13 June 2019 05:47 UTC
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-netvc-testing@ietf.org, Matthew Miller <linuxwolf+ietf@outer-planes.net>, netvc-chairs@ietf.org, linuxwolf+ietf@outer-planes.net, video-codec@ietf.org
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Date: Wed, 12 Jun 2019 22:47:29 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/video-codec/S4HGKlm3OJLD-Nhgeo7R2p8GtLA>
Subject: [video-codec] Benjamin Kaduk's Discuss on draft-ietf-netvc-testing-08: (with DISCUSS and COMMENT)
Benjamin Kaduk has entered the following ballot position for
draft-ietf-netvc-testing-08: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-netvc-testing/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

I suspect I will end up balloting Abstain on this document, given how far
it is from something I could support publishing (e.g., a freestanding,
clear description of test procedures), but I do think there are some key
issues that need to be resolved before publication. Perhaps some of them
stem from a misunderstanding of the intended goal of the document -- I am
reading this document as attempting to lay out procedures that are of
general utility in evaluating a codec or codecs, but it is possible that
(e.g.) it is intended as an informal summary of some choices made in a
specific operating environment to make a specific decision. Additional
text to set the scope of the discussion could go a long way.

Section 2

There are a lot of assertions here without any supporting evidence or
reasoning. Why is subjective better than objective? What if objective
gets a lot better in the future? What if a test should be important but
the interested people don't have the qualifications and the qualified
people are too busy doing other things?

Section 2.1

Why is p<0.5 an appropriate criterion? Even where p-values are still
used in the scientific literature (a practice that is decreasing in
popularity), the threshold is more often 0.05, or even 0.00001 (e.g.,
for high-energy physics).
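The p < 0.5 concern can be made concrete with a quick calculation. This
is an illustrative sketch only: the draft does not specify the
underlying statistical test, so a simple one-sided binomial sign test on
paired viewer preferences is assumed here.

```python
from math import comb

def sign_test_p(k, n):
    """One-sided binomial sign-test p-value: the probability of k or
    more viewers preferring one codec out of n, under the null
    hypothesis that viewers have no real preference (a fair coin)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# 11 of 20 viewers preferring codec A is barely better than chance,
# yet the result clears a p < 0.5 bar.
print(round(sign_test_p(11, 20), 4))  # 0.4119: "significant" at p < 0.5
# A conventional p < 0.05 threshold needs about 15 of 20 preferences.
print(round(sign_test_p(15, 20), 4))  # 0.0207
```

In other words, p < 0.5 admits results that are statistically
indistinguishable from a coin flip, which is the reviewer's point.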
Section 3

Normative C code contained outside of the RFC being published is hardly
an archival way to describe an algorithm. There isn't even a git commit
hash listed to ensure that the referenced material doesn't change!

Sections 3.5, 3.6, 3.7

I don't see how MSSSIM, CIEDE2000, VMAF, etc. are not normative
references. If you want to use the indicated metric, you have to follow
the reference.

Section 4.2

There is a dearth of references here. This document alone is far from
sufficient to perform these calculations.

Section 4.3

There is a dearth of references here as well. What are libaom and
libvpx? What is the overlap "BD-Rate method" and where is it specified?

Section 5.2

This mention of "[a]ll current test sets" seems to imply that this
document is part of a broader set of work. The Introduction should make
clear what broader context this document is to be interpreted within.
(I only note this once in the Discuss portion, but noted some other
examples in the Comment section.)

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 1

Please give the reader a background reading list to get up to speed with
the general concepts, terminology, etc. (E.g., I happen to know what the
"luma plane" is, but that's not the case for all consumers of the RFC
series.)

Section 2.1

It seems likely that we should note that the ordering of the algorithms
in question should be randomized (presented as left vs. right, first vs.
second, etc.).

Section 2.3

   A Mean Opinion Score (MOS) viewing test is the preferred method of
   evaluating the quality.  The subjective test should be performed as
   either consecutively showing the video sequences on one screen or on
   two screens located side-by-side.  The testing procedure should

When would it be appropriate to perform the test differently?

   normally follow rules described in [BT500] and be performed with
   non-expert test subjects.  The result of the test will be (depending
   on

(I couldn't follow the links to [BT500] and look; is this a
restricted-distribution document?)

Section 3.4

A forward reference or other expansion for BD-Rate would be helpful.

Section 3.7

   perception of video quality [VMAF].  This metric is focused on
   quality degradation due compression and rescaling.  VMAF estimates

nit: "due to"

Section 4.1

Decibel is a logarithmic scale that requires a fixed reference value in
order for numerical values to be defined (i.e., to "cancel out the
units" before the transcendental logarithmic function is applied). I
assume this is intended to take the reference as the full-fidelity
unprocessed original signal, but it may be worth making that explicit.

Section 4.2

Why is it necessary to mandate trapezoidal integration for the numerical
integration? There are fairly cheap numerical methods available that
have superior performance and are well known.

Section 5.2.x

How important is it to have what is effectively a directory listing in
the final RFC?

Sections 5.2.2, 5.2.3

   This test set requires compiling with high bit depth support.

Compiling? Compiling what? Again, this needs to be set in the broader
context.

Section 5.3

Please expand CQP on first usage. I don't think the broader scope in
which the "operating modes" are defined has been made clear.

Sections 5.3.4, 5.3.5

   supported.  One parameter is provided to adjust bitrate, but the
   units are arbitrary.  Example configurations follow:

Example configurations *of what*?

Section 6.2

   Normally, the encoder should always be run at the slowest, highest
   quality speed setting (cpu-used=0 in the case of AV1 and VP9).
   However, in the case of computation time, both the reference and

What is "the case of computation time"?

   changed encoder can be built with some options disabled.
   For AV1, --disable-ext_partition and --disable-ext_partition_types
   can be passed to the configure script to substantially speed up
   encoding, but the usage of these options must be reported in the
   test results.

Again, this is assuming some context of command-line tools that is not
clear from the document.
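As an aside on the Section 4.2/4.3 remarks above, the overlap BD-rate
style calculation the reviewer asks about can be sketched roughly as
follows. This is a hypothetical illustration, not the draft's normative
code: it assumes piecewise-linear interpolation of log-rate versus
quality over the two curves' overlapping quality range, with the
mandated trapezoid rule for the integration.

```python
import math

def interp(x, pts):
    """Piecewise-linear interpolation of log-rate at quality x.
    pts: list of (quality, log_rate) pairs sorted by quality."""
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)
    raise ValueError("x outside curve range")

def bd_rate_percent(ref, test):
    """Average log-rate gap between two rate/quality curves over their
    overlapping quality range, integrated with the trapezoid rule and
    expressed as a percent rate change (negative = bitrate savings).
    ref, test: lists of (bitrate_kbps, quality) points."""
    curves = [sorted((q, math.log10(r)) for r, q in pts)
              for pts in (ref, test)]
    lo = max(c[0][0] for c in curves)   # overlap lower bound
    hi = min(c[-1][0] for c in curves)  # overlap upper bound
    # Merged knots inside the overlap; the gap is piecewise linear
    # between them, so the trapezoid rule is exact at these points.
    knots = sorted({lo, hi,
                    *(q for c in curves for q, _ in c if lo <= q <= hi)})
    area = 0.0
    for a, b in zip(knots, knots[1:]):
        da = interp(a, curves[1]) - interp(a, curves[0])
        db = interp(b, curves[1]) - interp(b, curves[0])
        area += (b - a) * (da + db) / 2
    return (10 ** (area / (hi - lo)) - 1) * 100

# Toy data: the "test" encoder hits every quality at 90% of the bitrate.
ref = [(100, 30.0), (200, 33.0), (400, 36.0), (800, 39.0)]
test = [(90, 30.0), (180, 33.0), (360, 36.0), (720, 39.0)]
print(round(bd_rate_percent(ref, test), 1))  # -10.0, i.e. 10% savings
```

Note that for piecewise-linear data the trapezoid rule is already exact,
which may explain the draft's choice; the reviewer's objection applies
when the curves are treated as smooth and higher-order quadrature would
pay off.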