[Cellar] AD Review: draft-ietf-cellar-ffv1

Adam Roach <adam@nostrum.com> Thu, 02 January 2020 23:26 UTC

Return-Path: <adam@nostrum.com>
X-Original-To: cellar@ietfa.amsl.com
Delivered-To: cellar@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 40DF4120105; Thu, 2 Jan 2020 15:26:23 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.98
X-Spam-Level:
X-Spam-Status: No, score=-1.98 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, T_SPF_HELO_PERMERROR=0.01, T_SPF_PERMERROR=0.01] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (1024-bit key) header.d=nostrum.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id zSlKVk51tSZF; Thu, 2 Jan 2020 15:26:20 -0800 (PST)
Received: from nostrum.com (raven-v6.nostrum.com [IPv6:2001:470:d:1130::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 6980B1200C3; Thu, 2 Jan 2020 15:26:17 -0800 (PST)
Received: from Svantevit.local (99-152-146-228.lightspeed.dllstx.sbcglobal.net [99.152.146.228]) (authenticated bits=0) by nostrum.com (8.15.2/8.15.2) with ESMTPSA id 002NQEYC019202 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NO); Thu, 2 Jan 2020 17:26:15 -0600 (CST) (envelope-from adam@nostrum.com)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=nostrum.com; s=default; t=1578007576; bh=RWpafsbT5pX4x5o/EfRLCwKC93OvU1BEkYleTYKjJA4=; h=To:From:Subject:Date; b=iYIo1PFCP8vGGEG9D2z7pfJe9dbHvyit2tlWzDBi/ZXT974BHr2tiTqMMB8CFWe6G 9y9G+v9TZdXAesTKbIEmkLf00SoMqqfxZU9UUt8Fvo7Dwq54tsg2LYFbXj3BfmIxJ9 oYdz0maamo5bqLrVEbUxMzZh1r/VUVUnV9Wgwdec=
X-Authentication-Warning: raven.nostrum.com: Host 99-152-146-228.lightspeed.dllstx.sbcglobal.net [99.152.146.228] claimed to be Svantevit.local
To: draft-ietf-cellar-ffv1.all@ietf.org, cellar@ietf.org
From: Adam Roach <adam@nostrum.com>
Message-ID: <b71e9496-c924-b970-9094-2b29ba173bdd@nostrum.com>
Date: Thu, 02 Jan 2020 17:26:07 -0600
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:60.0) Gecko/20100101 Thunderbird/60.9.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/cellar/4MNYpqz1VGtZ1oiQIfgfw_cdOKk>
Subject: [Cellar] AD Review: draft-ietf-cellar-ffv1
X-BeenThere: cellar@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Codec Encoding for LossLess Archiving and Realtime transmission <cellar.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/cellar>, <mailto:cellar-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cellar/>
List-Post: <mailto:cellar@ietf.org>
List-Help: <mailto:cellar-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/cellar>, <mailto:cellar-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 02 Jan 2020 23:26:23 -0000

This is my AD review for draft-ietf-cellar-ffv1, which I am processing
at Alexey Melinikov's request. Thanks to everyone who has taken the time
to document these formats for publication as an RFC.

I have identified a couple of issues that will need to be addressed prior to
IETF last call (which are clearly identified below), as well as a number of
smaller issues that can be addressed as normal IETF last call comments.

---------------------------------------------------------------------------

§1:

 >  The latest version of this document is available at
 >  https://raw.github.com/FFmpeg/FFV1/master/ffv1.md
 >  (https://raw.github.com/FFmpeg/FFV1/master/ffv1.md)

Please add a note to the RFC editor to remove this paragraph prior to
publication (or remove the paragraph in the draft at this time).

---------------------------------------------------------------------------

§2:

 >  The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
 >  "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
 >  document are to be interpreted as described in [RFC2119].

Please update this to match the updated boilerplate given in RFC 8174.

---------------------------------------------------------------------------

§2.2.1:

 >  The pseudo-code used is based upon the C programming
 >  language [ISO.9899.1990] and uses its "if/else", "while" and "for"
 >  functions as well as functions defined within this document.

It's a minor nit, but these are keywords, not functions.

---------------------------------------------------------------------------

§2.2.2:

 >  Note: the operators and the order of precedence are the same as used
 >  in the C programming language [ISO.9899.1990].

Is there a reason to cite C90 here instead of any of the four newer 
versions?

---------------------------------------------------------------------------

§2.2.2:

 >  Note: the operators and the order of precedence are the same as used
 >  in the C programming language [ISO.9899.1990].
...
 >  "a ^ b" means a raised to the b-th power.

One of these two statements is incorrect: the "^" operator in C is a bitwise
XOR, not a power operator. C has no built-in exponentiation operator, and
relies on a library-provided `pow(a,b)` function.

Because this introduces an ambiguity into the specification, it will need
to be corrected before entering IETF last call.

---------------------------------------------------------------------------

§2.2.2:

 >  "a >> b" means arithmetic right shift of two's complement integer
 >  representation of a by b binary digits.

This needs to indicate whether the resulting value is sign-extended
or zero-padded. We can't rely on C's definition here, as right
shifting a signed value is implementation-defined by the standard:

     The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 
has an
     unsigned type or if E1 has a signed type and a nonnegative value, the
     value of the result is the integral part of the quotient of E1 / 
2E2. If
     E1 has a signed type and a negative value, the resulting value is
     implementation-defined.

Because this introduces an ambiguity into the specification, it will need
to be corrected before entering IETF last call.

---------------------------------------------------------------------------

§2.2.5:

 >  a~b,c.  the 'b,c'-th value of a sequence of a

Please explain this with more words. I can't figure out what it means.

---------------------------------------------------------------------------

§2.2.8:

 >  relies on its "Container" to store the "NumBytes" values, see the
 >  section on the Mapping FFV1 into Containers (#mapping-ffv1-into-
 >  containers).

Nit: "...values; see the section..."

Also, please replace the MD-style anchor with a section number.

---------------------------------------------------------------------------

§3:

 >  For each "Slice" (as described in the section on Slices (#slice)) of
 >  a "Frame", the "Planes", "Lines", and "Samples" are coded in an order
 >  determined by the "Color Space" (see the section on Color Space
 >  (#color-spaces)).  Each "Sample" is predicted by the median predictor
 >  as described in the section of the Median Predictor (#median-
 >  predictor) from other "Samples" within the same "Plane" and the
 >  difference is stored using the method described in Coding of the
 >  Sample Difference (#coding-of-the-sample-difference).

Please replace the MD-style anchors with section numbers.

---------------------------------------------------------------------------

§3.4:

 >  context = Q_{0}[l - tl] +
 >            Q_{1}[tl - t] +
 >            Q_{2}[t - tr] +
 >            Q_{3}[L - l]  +
 >            Q_{4}[T - t]

This is the first time in the document we're encountering Q. Please add a
forward reference to its definition.

---------------------------------------------------------------------------

§3.7.2:

 >  As an example, a "Frame" that is two "Pixels" wide and two "Pixels"
 >  high, could be comprised of the following structure:

Nit: "...could be composed of the following structure:" OR "...could 
comprise
the following structure:"

---------------------------------------------------------------------------

§3.8.1.1:

 >  S_{i+1,C_{i}} =  zero_state_{S_{i,C_{i}}} XOR
 >            l_i =  L_i                      XOR
 >            t_i =  R_i - r_i                <==
 >            b_i =  0                        <==>
 >            L_i <  R_i - r_i

Substantial parts of this syntax are undefined at this point in the 
document.

Presumably "XOR" means "exclusive or," although it is unclear whether 
this is
intended to be a boolean or bitwise operation.

The "<==" and "<==>" notation is also without explanation, and I can't 
even make
an educated guess at their intended meaning.

There is no explanation what it might mean to combine a series of assignment
operations (and one comparison operation) with these XOR, <==, and <==> 
symbols.

This needs to be clarified (for all of figures 7 through 9) before the
document can go to IETF last call.

---------------------------------------------------------------------------

§3.8.1.1.1:

 >  Above describes the range decoding, encoding is defined as any
 >  process which produces a decodable bytestream.

Would it be reasonable to include a non-normative example of one such
process?

---------------------------------------------------------------------------

§3.8.1.2:
 >  exact contexts used are best described by the
 >  following code, followed by some comments.

I can't find the referenced comments.

---------------------------------------------------------------------------

§3.8.1.2:

 > pseudo-code                                                   | type
 > --------------------------------------------------------------|-----
 >  void put_symbol(RangeCoder *c, uint8_t *state, int v, int \   |
 >  is_signed) {                                                  |
 >      int i;                                                    |
 >      put_rac(c, state+0, !v);                                  |
 >      if (v) {                                                  |
 >          int a= abs(v);                                        |
 >          int e= log2(a);                                       |
 >                                                                |
 >          for (i = 0; i < e; i++) {                             |
 >              put_rac(c, state+1+min(i,9), 1); //1..10         |
 > }                                                     |
 >                                                                |
 >          put_rac(c, state+1+min(i,9), 0);                      |
 >          for (i = e-1; i >= 0; i--) {                          |
 >              put_rac(c, state+22+min(i,9), (a>>i)&1); //22..31 |
 > }                                                     |
 >                                                                |
 >          if (is_signed) {                                      |
 >              put_rac(c, state+11 + min(e, 10), v < 0); //11..21|
 > }                                                     |
 > }                                                         |
 > }                                                             |

The "type" column here appears superfluous, and should either employed or
removed.

The behavior of the function "put_rac" isn't defined anywhere. While the 
purpose
of its first two parameters can be deduced from context, the purpose of the
third parameter is baffling. For avoidance of doubt, please add a 
description of
what this function does, and what its parameters mean.

---------------------------------------------------------------------------

§3.8.1.6:

 >  The alternative state transition table has been built using iterative
 >  minimization of frame sizes and generally performs better than the
 >  default.  To use it, the coder_type (see the section on coder_type
 >  (#codertype)) MUST be set to 2 and the difference to the default MUST
 >  be stored in the "Parameters", see the section on Parameters
 >  (#parameters).  The reference implementation of FFV1 in FFmpeg uses
 >  this table by default at the time of this writing when Range coding
 >  is used.

Please replace the MD-style anchors with section numbers.

---------------------------------------------------------------------------

§3.8.2.1:

 > pseudo-code                                                   | type
 > --------------------------------------------------------------|-----
 >  int get_ur_golomb(k) {                                        |
 >      for (prefix = 0; prefix < 12; prefix++) {                 |
 >          if (get_bits(1)) {                                    |
 >              return get_bits(k) + (prefix << k)                |
 > }                                                     |
 > }                                                         |
 >      return get_bits(bits) + 11                                |
 > }                                                             |
 >                                                                |
 >  int get_sr_golomb(k) {                                        |
 >      v = get_ur_golomb(k);                                     |
 >      if (v & 1) return - (v >> 1) - 1;                         |
 >      else       return   (v >> 1);                             |
 >  }

The "type" column here appears superfluous, and should either employed or
removed.

Please clarify the meaning of "ur" and "sr" in the function names.

---------------------------------------------------------------------------

§3.8.2.1.1:

 >                       +----------------+-------+
 >                       | bits           | value |
 >                       +================+=======+
 >                       | 1              | 0     |
 >                       +----------------+-------+
 >                       | 01             | 1     |
 >                       +----------------+-------+
 >                       | ...            | ...   |
 >                       +----------------+-------+
 >                       | 0000 0000 0001 | 11    |
 >                       +----------------+-------+
 >                       | 0000 0000 0000 | ESC   |
 >                       +----------------+-------+

This is pretty confusing, not least of all because the base of the 
number on the
right is unclear (given that all the shown values contain 1's and 0's
exclusively). I would suggest:

                         +----------------+-------+
                         | bits           | value |
                         +================+=======+
                         | 1              | 0     |
                         +----------------+-------+
                         | 01             | 1     |
                         +----------------+-------+
                         | ...            | ...   |
                         +----------------+-------+
                         | 00 0000 0001   | 9     |
                         +----------------+-------+
                         | 000 0000 0001  | 10    |
                         +----------------+-------+
                         | 0000 0000 0001 | 11    |
                         +----------------+-------+
                         | 0000 0000 0000 | ESC   |
                         +----------------+-------+

---------------------------------------------------------------------------

§3.8.2.2.1:

 > pseudo-code                                                   | type
 > --------------------------------------------------------------|-----
 > log2_run[41]={                                                |
 >   0, 0, 0, 0, 1, 1, 1, 1,                                      |
 >   2, 2, 2, 2, 3, 3, 3, 3,                                      |
 >   4, 4, 5, 5, 6, 6, 7, 7,                                      |
 >   8, 9,10,11,12,13,14,15,                                      |
 > 16,17,18,19,20,21,22,23,                                      |
 > 24,                                                           |
 > };                                                            |
 >                                                                |
 >  if (run_count == 0 && run_mode == 1) {                        |
 >      if (get_bits(1)) {                                        |
 >          run_count = 1 << log2_run[run_index];                 |
 >          if (x + run_count <= w) {                             |
 > run_index++;                                      |
 > }                                                     |
 >      } else {                                                  |
 >          if (log2_run[run_index]) {                            |
 >              run_count = get_bits(log2_run[run_index]);        |
 >          } else {                                              |
 >              run_count = 0;                                    |
 > }                                                     |
 >          if (run_index) {                                      |
 > run_index--;                                      |
 > }                                                     |
 >          run_mode = 2;                                         |
 > }                                                         |
 > }                                                             |


The "type" column here appears superfluous, and should either employed or
removed.


 >  The log2_run function is also used within [ISO.14495-1.1999].

The reference to "log2_run" as a "function" is perplexing, as log2_run 
is defined
as an array in the above pseudocode.

---------------------------------------------------------------------------

§3.8.2.3:

Some light documentation of the "state" structure would make this pseudocode
much easier to follow.

---------------------------------------------------------------------------

§4:

 >  Within the following sub-sections, pseudo-code is used to explain the
 >  structure of each FFV1 bitstream component, as described in the
 >  section on Pseudo-Code (#pseudocode).

Please replace the MD-style anchor with a section number.

 > +--------+-------------------------------------------+
 >         | Symbol | Definition |
 > +========+===========================================+
 >         | u(n)   | unsigned big endian integer using n bits |
 > +--------+-------------------------------------------+
 >         | sg     | Golomb Rice coded signed scalar symbol |
 >         |        | coded with the method described in Signed |
 >         |        | Golomb Rice Codes (#golomb-rice-mode) |
 > +--------+-------------------------------------------+
 >         | br     | Range coded Boolean (1-bit) symbol with |
 >         |        | the method described in Range binary |
 >         |        | values (#range-binary-values) |
 > +--------+-------------------------------------------+
 >         | ur     | Range coded unsigned scalar symbol coded |
 >         |        | with the method described in Range non |
 >         |        | binary values (#range-non-binary-values) |
 > +--------+-------------------------------------------+
 >         | sr     | Range coded signed scalar symbol coded |
 >         |        | with the method described in Range non |
 >         |        | binary values (#range-non-binary-values) |
 > +--------+-------------------------------------------+

Please replace the several MD-style anchors with a section number.

---------------------------------------------------------------------------

§4:

 >  The same context that is initialized to 128 is used for all fields in
 >  the header.

This phrasing is confusing, as it reads like it's referring to some
previously-mentioned context that is best identified as containing 
"128." After
convincing myself that I didn't miss such a thing, I *think* you might mean:

    All fields in the header use the same context, which is
    initialized to 128.

Although, even with that adjustment, I'm not sure which header you're 
talking
about. The only header mentioned anywhere earlier in the document is the
"Slice Header," and that seems unlikely to be what you're referring to.



---------------------------------------------------------------------------

§4.1:

 >              state_transition_delta[ i ]                       | sr
...
 >          QuantizationTableSet( i )                             |

This psuedo-code appears to use both "[]" and "()" as array operators.  I
understand this is only pseudocode, but please try to remain consistent with
the meaning of operators.

Also, "QuantiztionTableSet" is missing a type designation.

---------------------------------------------------------------------------

§4.1.16:

 > +-------+--------------------------------------------+
 >         | value | error detection/correction type |
 > +=======+============================================+
 >         | 0     | 32-bit CRC on the global header |
 > +-------+--------------------------------------------+
 >         | 1     | 32-bit CRC per slice and the global header |
 > +-------+--------------------------------------------+
 >         | Other | reserved for future use |
 > +-------+--------------------------------------------+

I'm still perplexed by the intended meaning of "header."

---------------------------------------------------------------------------

§4.2.3.4:

 >  FFV1 SHOULD use "V_FFV1" as the Matroska "Codec ID".

Normative requirements on external documents make those documents normative
dependencies. Please move "[Matroska]" from the "Informative References" 
section
to the "Normative References" section.

---------------------------------------------------------------------------

§4.1.17:

 >             +-------+-------------------------------------+
 >             | value | relationship                        |
 >             +=======+=====================================+
 >             | 0     | Frames are independent or dependent |
 >             |       | (keyframes and non keyframes)       |
 >             +-------+-------------------------------------+
 >             | 1     | Frames are independent (keyframes   |
 >             |       | only)                               |
 >             +-------+-------------------------------------+
 >             | Other | reserved for future use             |
 >             +-------+-------------------------------------+
 >
 >                                 Table 14

I'm having a hard time understanding "Other" in this table. Section 4.3
defines "keyframe" to be of "br" type. I'm confused about how a boolean
variable can take on more than two values.

---------------------------------------------------------------------------

§4.4:

 >  Background: due to some non conforming
 >  encoders, some bitstreams where found with 40 extra bits
 >  corresponding to "error_status" and "slice_crc_parity", a decoder
 >  conforming to the revised specification could not do the difference
 >  between a revised bitstream and a buggy bitstream.

This is hard to parse. I suggest rephrasing like:

    Background: Due to some non-conforming
    encoders, some bitstreams were found with 40 extra bits
    corresponding to "error_status" and "slice_crc_parity". As a
    result, a decoder conforming to the revised specification could not
    distinguish between a revised bitstream and a buggy bitstream.

---------------------------------------------------------------------------

§4.6:

 > pseudo-code                                                   | type
 > --------------------------------------------------------------|-----
 >  SliceContent( ) {                                             |
 >      if (colorspace_type == 0) {                               |
 >          for (p = 0; p < primary_color_count; p++) {           |
 >              for (y = 0; y < plane_pixel_height[ p ]; y++) {   |
 >                  Line( p, y )                                  |
 > }                                                 |
 > }                                                     |
 >      } else if (colorspace_type == 1) {                        |
 >          for (y = 0; y < slice_pixel_height; y++) {            |
 >              for (p = 0; p < primary_color_count; p++) {       |
 >                  Line( p, y )                                  |
 > }                                                 |
 > }                                                     |
 > }                                                         |
 > }                                                             |

Without a "type" indicated on the "Line( p, y )" lines, this encoding 
appears
to be ambiguous.

---------------------------------------------------------------------------

§4.7:

 > pseudo-code                                                   | type
 > --------------------------------------------------------------|-----
 >  Line( p, y ) {                                                |
 >      if (colorspace_type == 0) {                               |
 >          for (x = 0; x < plane_pixel_width[ p ]; x++) {        |
 >              sample_difference[ p ][ y ][ x ]                  |
 > }                                                     |
 >      } else if (colorspace_type == 1) {                        |
 >          for (x = 0; x < slice_pixel_width; x++) {             |
 >              sample_difference[ p ][ y ][ x ]                  |
 > }                                                     |
 > }                                                         |
 > }                                                             |

Although redundant with the request above regarding section 4.6, please
add "type" indicators for the "sample_difference[ p ][ y ][ x ]" lines.

---------------------------------------------------------------------------

§4.7.4:

 >  "Sample" value is computed based on median predictor and context
 >  described in the section on Samples (#samples).

Please replace the MD-style anchor with a section number.

---------------------------------------------------------------------------

§4.9

 >  The Quantization Table Sets are stored by storing the number of equal
 >  entries -1 of the first half of the table (represented as "len - 1"
 >  in the pseudo-code below) using the method described in Range Non
 >  Binary Values (#range-non-binary-values).

Please replace the MD-style anchor with a section number.

---------------------------------------------------------------------------

§6:

While it does provide compression (my understanding is that a typical
ratio is about 2:1), FFV1 still produces a very large bitstream, due to
its lossless properties. This does present some security challenges that
need to be cited.

I would expect this document to include text that is roughly equivalent 
to the
final three paragraphs of RFC 4175, Section 8.

---------------------------------------------------------------------------

§7:

Please replace the seven various MD-style anchors with section numbers.

---------------------------------------------------------------------------

§7:

 >  This registration is done using the template defined in [RFC6838] and
 >  following [RFC4855].

To avoid the need to dereference the cited RFC, please also include the IANA
table name ("Media Types") in this paragraph.

---------------------------------------------------------------------------

§7:

 >  This
 >  media type is framed binary data Section 4.8 of [RFC6838].

I think this means to say something like:

    This media type is framed binary data; see Section 4.8 of [RFC6838].

---------------------------------------------------------------------------

§7:

 >  Published specification:
 >
 >  [I-D.ietf-cellar-ffv1] and RFC XXXX.

I can't figure out why two documents are mentioned here -- it would seem 
that
both of these refer to the same document.

---------------------------------------------------------------------------

 > 9.  Appendixes
 >
 > 9.1.  Decoder implementation suggestions
 >
 > 9.1.1.  Multi-threading Support and Independence of Slices

This seems like an unnecessarily deep nesting for its contents. Consider
collapsing into a single section instead of three nested sections.

---------------------------------------------------------------------------

Section 10 should contain instructions to the RFC editor to remove it 
prior to
publication.

Thanks!

/a