Re: [codec] Format for the codec specification

Stephan Wenger <stewe@stewe.org> Wed, 29 September 2010 05:03 UTC

Date: Tue, 28 Sep 2010 22:03:55 -0700
From: Stephan Wenger <stewe@stewe.org>
To: Jean-Marc Valin <jean-marc.valin@usherbrooke.ca>, Stephen Botzko <stephen.botzko@gmail.com>
Cc: "codec@ietf.org" <codec@ietf.org>
Subject: Re: [codec] Format for the codec specification

Hi all, especially Jean-Marc and Stephen,

I think you are both approaching this question too technically.  My
understanding is that the speech codecs from the ITU and from 3GPP and 3GPP2
are, or have been, mostly telco-driven developments.  In many jurisdictions,
telcos have to offer their customers a guaranteed quality level.  It is
easier for them to ensure that this level is achieved by disallowing vendor
differentiation in quality.

Personally, I believe that a codec targeted for a best effort network is
best designed without a strict encoder specification.

That said, I continue to argue that we will be better off with a bit-exact
decoder design.  Again, the reason is not technical but related to (patent)
licensing.  While it is comparatively easy, in a bit-exact decoder design, to
determine whether a given patent claim is essential to practicing the
standard, this is not the case for a design that is based on bitstream
syntax, aspects of the decoder operation, and minimum performance
requirements (which, I believe, is roughly where some people are heading).

Note that it is even more important for a royalty-free (RF) codec development
to have a clear understanding of which claims are essential than for a RAND
codec.  The reason is simple: in the RF case, the development has to stay
clear of any and all claims that may not be available under RF terms (which
are "exotic" in this industry), whereas for a RAND codec, people only have to
worry about the inclusion of claims that may not be available for RAND
licensing at all (which are very few in this industry).  As a result, a
non-bit-exact decoder design has to be much more conservative in exercising
technology (as more claims may be swept in by advanced designs) than a
bit-exact decoder design, neutralizing, IMO, most of the positive effects a
non-bit-exact design may have from a performance viewpoint.

In summary, I continue to argue in favor of the MPEG model: (bit-exactly)
standardize the bitstream and the decoder operation on it.

A test-model-level encoder design document is desirable as well, as might be
minimum performance specs for the encoder, but, IMO, neither needs to be
normative.

Regards,
Stephan





On 9.28.2010 19:46 , "Jean-Marc Valin" <jean-marc.valin@usherbrooke.ca>
wrote:

> On 10-09-28 10:28 PM, Stephen Botzko wrote:
>> Though I've never heard clear reasons, it seems to me that the
>> "decoder-only" folks are mostly focused on applications with a massive
>> encoder/decoder imbalance, while the "encoder and decoder" folks tend to
>> be focused on applications with equal numbers of encoders and decoders.
> 
> I think the main logic here is that when your codec leaves very few
> degrees of freedom to the encoder (the extreme case being an ADPCM
> codec, but CELP generally falls into that as well), then it's fine to
> have the encoder be normative. Voice codecs have typically fallen
> into that category because they were smaller than music codecs. Where I
> think it makes sense to *not* have a normative encoder is when the
> encoder has a lot of freedom. For example, an MP3 encoder has the
> freedom to define its own psychoacoustic model, decide when to use
> short windows, and so on. And we have seen just how much MP3 encoders
> have improved over the years. This would have been impossible if the
> encoder had been specified normatively.
> 
> I would say that the current codec we have here is closer to MP3 than it
> is to (e.g.) G.729. There is a *lot* of freedom in the encoder. Not only
> in how to switch between its three main modes (SILK, CELT, hybrid), but
> within each of these modes. Because of that, I believe that encoders
> will continue to evolve and get better over time, just like they did for
> MP3.
> 
>> So I am thinking that making the encoder normative makes sense, given
>> that this application is centered on VoIP, so we want to ensure
>> consistent quality in all endpoints.
> 
> I don't really see a problem with quality. Most implementors will likely
> end up using either the reference encoder or improvements made to it
> over time. I don't think anyone will complain about having better quality
> than the standard specified.
> 
> Cheers,
> 
> Jean-Marc