Re: [codec] Format for the codec specification

Jean-Marc Valin <jean-marc.valin@usherbrooke.ca> Sat, 25 September 2010 14:23 UTC

On 10-09-25 04:55 AM, Riccardo Bernardini wrote:
> A question just came to my mind: what about bugs in the C code? Suppose,
> for example, that the algorithm requires the computation of a DCT of a
> block of samples and that there is some typo in the constants used in
> the C code, so that what the C code computes is not the DCT, but
> something similar and the difference is such that it really does not
> matter in most cases, but causes very bad performances (when compared
> with theoretical description) in others. In this case, the precedence
> would go to the "wrong" description.

Judging from the bugs I have had in CELT over the course of its
development, real bugs tend to be a lot more subtle than what you
describe above: corner cases that a text description would likely not
cover either. In any case, should an error be found in the source code,
the best course would be to simply publish an updated version.

> Another doubt about having a C code as a normative reference: could
> portability issues make the code "ambiguous"? For example, a code could
> use the assumption that int is _exactly_ 32 bits long and short is
> _exactly_ 16 bits. If the same code is run on a processor that uses,
> say, 64 bits for int and 32 for shorts, maybe the results could be
> different. Of course, you can solve this _specific_ issue by including
> stdint.h (not available everywhere, though) and using types like
> int16_t, but what I want to point out is that we must pay attention to
> this type of portability issues, if we want the meaning of the C code
> being non-ambiguous. Another example, much more subtle, could be

That's already how the CELT and SILK source code is written. There are
typedefs for integers with exactly 16 bits or exactly 32 bits.
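
As a rough illustration of the idea (not the actual CELT or SILK
headers), fixed-width typedefs with a compile-time check could look
like the sketch below. The names codec_int16, codec_int32 and
sat_add16 are made up for this example, and the short/int fallback is
only an assumption that the checks then verify.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical type names, for illustration only. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>
typedef int16_t codec_int16;
typedef int32_t codec_int32;
#else
/* Fallback guess for compilers without stdint.h; checked below. */
typedef short codec_int16;
typedef int   codec_int32;
#endif

/* Compile-time checks: the array size becomes -1 (a compile error)
   if a type does not occupy exactly the expected number of bits. */
typedef char codec_int16_is_16_bits[(sizeof(codec_int16) * CHAR_BIT == 16) ? 1 : -1];
typedef char codec_int32_is_32_bits[(sizeof(codec_int32) * CHAR_BIT == 32) ? 1 : -1];

/* Saturating 16-bit addition. The sum is formed in the 32-bit type,
   so the result does not depend on the platform's width of int. */
static codec_int16 sat_add16(codec_int16 a, codec_int16 b)
{
    codec_int32 sum = (codec_int32)a + (codec_int32)b;
    if (sum > 32767) return 32767;
    if (sum < -32768) return -32768;
    return (codec_int16)sum;
}

/* Small runtime self-check, doubling as a usage example. */
static void check_sat_add16(void)
{
    assert(sat_add16(100, 23) == 123);
    assert(sat_add16(30000, 10000) == 32767);
    assert(sat_add16(-30000, -10000) == -32768);
}
```

If the platform's short or int had a different size, the build would
fail at the typedef checks instead of silently computing something
else.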

> As far as I know (but I am not a C language-lawyer), the standard grants
> that conversion pointer-to-integer-to-pointer will give the original
> pointer back, but nothing is said about the case above (imagine an
> architecture where addresses can be only even). Although the code above
> would work on a typical x86 architecture, I would not use it in a code
> that should play the role of a reference.

Assuming that pointers can be stored in integers is bad practice, and we
just have to make sure the reference source code contains none of it.
There's no need for it in a codec anyway.
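
To sketch why it isn't needed: positions in a sample buffer can be
kept as indices (size_t) or pointer differences (ptrdiff_t), so a
pointer never has to be converted to an integer. The helpers below
are hypothetical and not taken from any codec source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the sample with the largest absolute
   value. Returns 0 for an empty buffer. Ties keep the first index. */
static size_t peak_index(const short *x, size_t n)
{
    size_t i, best = 0;
    long best_mag = -1;
    assert(x != NULL || n == 0);
    for (i = 0; i < n; i++) {
        /* Widen before negating so the most negative short is safe. */
        long mag = x[i] < 0 ? -(long)x[i] : (long)x[i];
        if (mag > best_mag) {
            best_mag = mag;
            best = i;
        }
    }
    return best;
}

/* Distance in samples between two positions in the same buffer.
   ptrdiff_t is the standard type for this; no integer cast of a
   pointer is involved. */
static ptrdiff_t samples_between(const short *from, const short *to)
{
    return to - from;
}
```

For a buffer {3, -9, 4, 9}, peak_index returns 1, and the distance
from the start of the buffer to its last sample is 3.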

> Although I agree that errors and ambiguities could creep in the
> mathematical description, maybe it is easier to write a correct and
> unambiguous mathematical description rather than a program code (in
> whatever language is written).

Quite the other way around. Writing a correct and unambiguous 
mathematical description for thousands of lines of code is nearly
impossible. Either you write it in plain English and get all the 
ambiguities of natural language, or you write *everything* with 
equations, in which case what you have is the equivalent of pseudo-code. 
Also, a C reference has the advantage that you can verify it simply by 
compiling it.
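
One possible way to exploit that, sketched under the assumption that
an implementation is expected to match the reference output sample
for sample (a real conformance test might allow a tolerance instead):
run both on the same input and look for the first difference. The
function name first_mismatch is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check: returns the index of the first sample where the
   output under test differs from the reference output, or n if the
   two buffers are identical over n samples. */
static size_t first_mismatch(const short *ref, const short *test, size_t n)
{
    size_t i;
    assert((ref != NULL && test != NULL) || n == 0);
    for (i = 0; i < n; i++) {
        if (ref[i] != test[i])
            return i;
    }
    return n;
}
```

A return value equal to n means the two outputs agree; anything
smaller points at the first sample to investigate.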

Cheers,

	Jean-Marc