Re: [codec] Format for the codec specification

Riccardo Bernardini <riccardo.bernardini@uniud.it> Sat, 25 September 2010 09:00 UTC

Quoting Jean-Marc Valin <jean-marc.valin@usherbrooke.ca>:

> Hi Roni,
>
> Thanks for the correction on the case of G.719. Despite that, my  
> point still stands: there *are* codecs for which the decoder is  
> standardized without a bit-exact definition. As long as we define  
> the amount of deviation permitted, then there is no problem.
>
> Oh, and I do agree that the C code should take precedence over the  
> mathematical description.

A question just came to my mind: what about bugs in the C code?
Suppose, for example, that the algorithm requires computing the DCT of
a block of samples and that there is a typo in the constants used in
the C code, so that what the code computes is not the DCT but
something close to it. The difference may not matter in most cases,
yet cause very bad performance (compared with the theoretical
description) in others. In this case, precedence would go to the
"wrong" description.
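
A minimal sketch of the kind of bug meant here, under invented assumptions: a 4-point DCT-II whose k = 1 basis row is stored as a constant table, and a second table with two digits swapped in one entry. The names (ROW1_GOOD, ROW1_TYPO, dct_coeff, typo_error) are hypothetical and do not come from any real codec.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_LEN 4

/* Basis row k = 1 of a 4-point DCT-II: cos(pi * (2n + 1) / 8), n = 0..3. */
static const double ROW1_GOOD[BLOCK_LEN] = {
    0.9238795325, 0.3826834324, -0.3826834324, -0.9238795325
};

/* Same row with a hypothetical typo: two digits swapped in the last entry. */
static const double ROW1_TYPO[BLOCK_LEN] = {
    0.9238795325, 0.3826834324, -0.3826834324, -0.9328795325
};

/* Unnormalized DCT coefficient: inner product of a basis row and a block. */
static double dct_coeff(const double *row, const double *block)
{
    double sum = 0.0;
    size_t n;

    assert(row != NULL && block != NULL);
    for (n = 0; n < BLOCK_LEN; n++) {
        sum += row[n] * block[n];
    }
    return sum;
}

/* Absolute difference between the coefficients from the two tables. */
static double typo_error(const double *block)
{
    double d = dct_coeff(ROW1_GOOD, block) - dct_coeff(ROW1_TYPO, block);
    return d < 0.0 ? -d : d;
}
```

For a block whose last sample is zero the two tables give the same coefficient, so a test vector of that kind would never expose the typo. For a block whose last sample is 1000 the coefficients differ by about 9.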

I agree that the example above is somewhat weak (I just invented it
"on the spot"), but I hope the spirit is clear: despite all the
attention we pay, some subtle but important bugs could find their way
into the C code. I also agree that we can always publish an erratum,
or an RFC that obsoletes the old one.

Another doubt about having C code as a normative reference: could
portability issues make the code "ambiguous"? For example, the code
could assume that int is _exactly_ 32 bits long and short is _exactly_
16 bits. If the same code is run on a processor that uses, say, 64
bits for int and 32 for short, the results could be different. Of
course, you can solve this _specific_ issue by including stdint.h (not
available everywhere, though) and using types like int16_t, but my
point is that we must pay attention to this kind of portability issue
if we want the meaning of the C code to be unambiguous. Another
example, much more subtle, could be

   int goo, foo=3;
   int *pt;

   pt  = (int*) foo;
   goo = (int) pt;  /* Is really goo==foo? */

As far as I know (but I am not a C language lawyer), the standard
guarantees that a pointer-to-integer-to-pointer conversion gives the
original pointer back, but says nothing about the case above, integer
to pointer to integer (imagine an architecture where addresses can
only be even). Although the code above would work on a typical x86
architecture, I would not use it in code that is supposed to play the
role of a reference.

By the way, I agree that the code above is quite silly, but I have
seen things like this used, for example, in multithreaded applications
where the programmer wanted to pass an integer to a function that
expected a pointer.
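
A portable alternative for that situation, as a minimal sketch: pass the address of the integer rather than casting its value to a pointer. The functions worker and run_worker are hypothetical stand-ins for a callback with the usual "void *arg" signature and its caller; no real threading API is used here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback with the usual "void *arg" signature. */
static int worker(void *arg)
{
    const int *value = arg;   /* arg points at an int owned by the caller */

    assert(value != NULL);
    return *value * 2;
}

/* Calls the worker, passing the integer by address instead of by cast. */
static int run_worker(int input)
{
    int local = input;        /* must stay alive until worker() returns */

    return worker(&local);
}
```

In a real multithreaded program the integer would have to outlive the thread that reads it, for example by being heap-allocated or by joining the thread before the variable goes out of scope.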

What about using some other language with fewer portability issues,
such as Ada or Java? Better yet, some language with formally defined
semantics? (I do not know of any, but maybe someone does.)

[Oh, by the way, I *do not want* to start a flame war about which  
language to use.  Rather than fighting over it, I would go with C.  I  
just wanted to share with you some doubts of mine.]

Although I agree that errors and ambiguities could creep into the
mathematical description too, maybe it is easier to write a correct
and unambiguous mathematical description than program code (in
whatever language it is written).



> Cheers,
>
> 	Jean-Marc
>
> On 10-09-24 06:06 PM, Roni Even wrote:
>> 6.5 Description of the codec
>> The description of the coding algorithm of this Recommendation is made in
>> terms of bit-exact
>> fixed-point mathematical operations. The ANSI-C code indicated in clause 10,
>> which constitutes an
>> integral part of this Recommendation, reflects this bit-exact, fixed-point
>> descriptive approach. The
>> mathematical descriptions of the encoder and decoder can be implemented in
>> other fashions,
>> possibly leading to a codec implementation which does not comply with this
>> Recommendation.
>> Therefore, the algorithm description of the ANSI-C code of clause 10 shall
>> take precedence over the
>> mathematical descriptions whenever discrepancies are found. A set of test
>> signals, which can be
>> used together with the ANSI-C code in order to verify bit-exactness, is
>> available as an electronic
>> attachment to this Recommendation.
>>
>>
>> Roni Even
>>
>>
>>



-- 
Riccardo Bernardini
DIEGM -- University of Udine
via delle Scienze 208
33100 Udine
Tel: +39-0432-55-8271
Fax: +39-0432-55-8251
