Re: [codec] Format for the codec specification

Jean-Marc Valin <jean-marc.valin@usherbrooke.ca> Wed, 29 September 2010 02:45 UTC

On 10-09-28 10:28 PM, Stephen Botzko wrote:
> Though I've never heard clear reasons, it seems to me that the
> "decoder-only" folks are mostly focused on applications with a massive
> encoder/decoder imbalance, while the "encoder and decoder" folks tend to
> be focused on applications with equal numbers of encoder and decoders.

I think the main logic here is that when your codec leaves very few
degrees of freedom to the encoder (the extreme case being an ADPCM
codec, but CELP generally falls into that as well), then it's fine to
make the encoder normative. Voice codecs have typically fallen
into that category because they were smaller than music codecs. Where I
think it makes sense to *not* have a normative encoder is when the
encoder has a lot of freedom. For example, an MP3 encoder has the
freedom to define its own psychoacoustic model, decide when to use
short windows, and so on. And we have seen just how much MP3 encoders
have improved over the years. This would have been impossible if the
encoder had been specified normatively.

I would say that the current codec we have here is closer to MP3 than it 
is to (e.g.) G.729. There is a *lot* of freedom in the encoder. Not only 
in how to switch between its three main modes (SILK, CELT, hybrid), but 
within each of these modes. Because of that, I believe that encoders 
will continue to evolve and get better over time, just like they did for 
MP3.

> So I am thinking that making the encoder normative makes sense, given
> that this application is centered on VOIP, so we want to ensure
> consistent quality in all endpoints.

I don't really see a problem with quality. Most implementors will likely
end up using either the reference encoder or an improved version of it
developed over time. I don't think anyone will complain about having
better quality than the standard specifies.

Cheers,

	Jean-Marc