Re: [Cellar] clarity for the EBML CRC Element (and some bit of FFV1 CRC)

Jerome Martinez <jerome@mediaarea.net> Wed, 06 January 2016 18:16 UTC

On 06/01/2016 05:35, Dave Rice wrote:
> [...]
>
> Here is a proposed rewrite of the CRC-32 Element definition. It is 
> based upon the matroska.org’s definition with additional edits for 
> clarity. The CRC-32 Element contains a 32 bit Cyclic Redundancy Check 
> value of all the Element Data of the Parent Element as stored except 
> for the CRC-32 Element itself. The CRC element SHOULD be the first in 
> its Parent Element for easier reading. All Elements at Level 1 of an 
> EBML Document SHOULD include a CRC-32 Element as a Child Element. The 
> CRC in use is the IEEE-CRC-32 algorithm as used in the ISO 3309 
> standard and in section 8.1.1.6.2 of ITU-T recommendation V.42. The 
> CRC value MUST use little endian storage.

Defining a CRC is a pain: implementation details often differ between 
formats and are missing from the spec, even though developers need them.
For example, for IEEE-CRC-32, the Wikipedia page
https://en.wikipedia.org/wiki/Polynomial_representations_of_cyclic_redundancy_checks
lists both Gzip (similar to the one used by Matroska) and MPEG-2 as 
CRC-32 (as well as "ITU-T V.42").
The original FFV1 spec also says "The CRC generator polynom used is the 
standard IEEE CRC polynom".

But:
- MPEG-2 uses an initial register content of 0xFFFFFFFF and runs on a 
Big Endian bitstream (the final value is not stored; it must be 0 if not 
reversed, 0xFFFFFFFF if reversed). Note: MPEG-2 doesn't claim to be IEEE 
compliant.
- Matroska uses an initial register content of 0xFFFFFFFF and runs on a 
Little Endian bitstream; the final value is reversed and stored in 
Little Endian.
- FFV1 uses an initial register content of 0x00000000 and runs on a 
Big Endian bitstream; the final value is not reversed and is stored in 
Big Endian.
All three differ despite being in the "CRC-32" group.
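
To make the difference concrete, here is a minimal Python sketch of the 
two Big Endian (most significant bit first) cases. The function name 
crc32_msb_first and the test payload are only illustrative, not taken 
from any spec. The two calls differ only in the initial value, and 
running the same computation over the payload followed by its Big Endian 
CRC leaves the register at 0, which is the "must be 0 if not reversed" 
check mentioned above.

```python
POLY = 0x04C11DB7  # CRC-32 generator polynomial, MSB-first form


def crc32_msb_first(data: bytes, init: int) -> int:
    """Bitwise CRC-32, most significant bit first, no final inversion."""
    reg = init
    for byte in data:
        reg ^= byte << 24  # feed the next byte into the top of the register
        for _ in range(8):
            if reg & 0x80000000:
                reg = ((reg << 1) ^ POLY) & 0xFFFFFFFF
            else:
                reg = (reg << 1) & 0xFFFFFFFF
    return reg


payload = b"123456789"  # arbitrary test input
mpeg2 = crc32_msb_first(payload, 0xFFFFFFFF)  # MPEG-2 style initial value
ffv1 = crc32_msb_first(payload, 0x00000000)   # FFV1 style initial value

# Re-run over payload + Big Endian CRC: the register ends at 0 in both cases.
residue_mpeg2 = crc32_msb_first(payload + mpeg2.to_bytes(4, "big"), 0xFFFFFFFF)
residue_ffv1 = crc32_msb_first(payload + ffv1.to_bytes(4, "big"), 0x00000000)
```

The same payload gives two different CRC values here, although both 
computations use the same polynomial.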

I read section 8.1.1.6.2 of ITU-T recommendation V.42, it says:
"As a typical implementation at the transmitter, the initial content of 
the register of the device computing the remainder of the division is 
preset to all 1s"

"typical" is not "MUST."

And I don't see in section 8.1.1.6.2 that the result should be reversed.

I don't have access to ISO 3309.

So I propose to be more explicit:
"The CRC in use is the IEEE-CRC-32 algorithm as used in the ISO 3309 
standard and in section 8.1.1.6.2 of ITU-T recommendation V.42, with 
initial value of 0xFFFFFFFF. The CRC value MUST be computed on a little 
endian bitstream and MUST use little endian storage."
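
As an illustration of the proposed wording, here is a minimal Python 
sketch. It assumes that the Matroska CRC-32 matches the Gzip/zlib one 
(initial value 0xFFFFFFFF, computed on a little endian bitstream, final 
value reversed), which is what zlib.crc32 implements. The helper name 
crc32_le_bytes is hypothetical.

```python
import struct
import zlib


def crc32_le_bytes(element_data: bytes) -> bytes:
    """Return the 4 bytes to store, assuming the zlib CRC-32 variant."""
    crc = zlib.crc32(element_data) & 0xFFFFFFFF
    return struct.pack("<I", crc)  # "<I" = unsigned 32-bit, little endian


stored = crc32_le_bytes(b"123456789")  # arbitrary test input
print(stored.hex())
```

For the test input "123456789" the CRC value is 0xCBF43926, so the bytes 
written to the file are 26 39 F4 CB.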

I already added some details on the FFV1 spec some time ago (when I 
tried to reuse MPEG-2 CRC code for FFV1):
https://github.com/FFmpeg/FFV1/commit/0e67a72a75485b95261be2e1f39258004666c4a1
(the initial value is 0x00000000)
Maybe I need to add that it must be computed on a big endian bitstream 
and that the result is stored not reversed, in order to be explicit on 
all implementation details (big endian storage is forced for the whole 
spec, so there is no need to add that info here).

I am not a CRC expert, just a developer facing issues when I need to 
check CRCs, so don't hesitate to correct me if my analysis of the issues 
I hit when implementing is wrong. Maybe there is already a "standard" 
way to define a CRC at the IETF (I need to dig further, but I see e.g. 
the polynomial in some IETF documents).

Note: checking (not a reference)
http://reveng.sourceforge.net/crc-catalogue/17plus.htm
V.42 is defined as:
width=32 poly=0x04c11db7 init=0xffffffff refin=true refout=true 
xorout=0xffffffff check=0xcbf43926 name="CRC-32"
CRC-32/MPEG-2 is defined as:
width=32 poly=0x04c11db7 init=0xffffffff refin=false refout=false 
xorout=0x00000000 check=0x0376e6e7 name="CRC-32/MPEG-2"
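
The parameters in these two entries can be read as inputs to one generic 
routine. Below is a minimal Python sketch of that reading; the function 
names crc and reflect are mine, and it only handles widths of 8 bits or 
more. "check" is the CRC of the ASCII string "123456789", so the sketch 
can be verified against the two catalogue values above.

```python
def reflect(value: int, width: int) -> int:
    """Reverse the order of the lowest `width` bits of value."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def crc(data: bytes, width: int, poly: int, init: int,
        refin: bool, refout: bool, xorout: int) -> int:
    """Bitwise CRC driven by the catalogue parameters (width >= 8)."""
    top = 1 << (width - 1)
    mask = (1 << width) - 1
    reg = init
    for byte in data:
        if refin:  # input bytes are processed least significant bit first
            byte = reflect(byte, 8)
        reg ^= byte << (width - 8)
        for _ in range(8):
            reg = ((reg << 1) ^ poly) if reg & top else (reg << 1)
            reg &= mask
    if refout:  # bit order of the register is reversed before output
        reg = reflect(reg, width)
    return reg ^ xorout


check_input = b"123456789"
crc32_v42 = crc(check_input, 32, 0x04C11DB7, 0xFFFFFFFF, True, True, 0xFFFFFFFF)
crc32_mpeg2 = crc(check_input, 32, 0x04C11DB7, 0xFFFFFFFF, False, False, 0x00000000)
print(hex(crc32_v42), hex(crc32_mpeg2))
```

With such a parameter list in a spec, a developer can test an 
implementation against the check value instead of guessing the missing 
details.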

If it is obvious that "IEEE-CRC-32" means the way it is implemented in 
Matroska, and only that way, then there is no need for more details, and 
I guess we need to update the FFV1 spec with the same kind of details 
("CRC-32/FFV1"?).


Jérôme