Re: [dispatch] [AVT] Proposal to form Internet Wideband AudioCodec WG

"Michael Ramalho (mramalho)" <mramalho@cisco.com> Mon, 20 July 2009 18:28 UTC

From: "Michael Ramalho (mramalho)" <mramalho@cisco.com>
To: "DRAGE, Keith (Keith)" <drage@alcatel-lucent.com>, Dan York <dyork@voxeo.com>, dispatch@ietf.org
Cc: avt@ietf.org
Subject: Re: [dispatch] [AVT] Proposal to form Internet Wideband AudioCodec WG

Keith,

Comments in-line w/MAR:

Regards,

Michael A. Ramalho, Ph.D.

-----Original Message-----
From: avt-bounces@ietf.org [mailto:avt-bounces@ietf.org] On Behalf Of
DRAGE, Keith (Keith)
Sent: Thursday, June 04, 2009 6:01 AM
To: Dan York; dispatch@ietf.org
Cc: avt@ietf.org
Subject: Re: [AVT] [dispatch] Proposal to form Internet Wideband
AudioCodec WG

I would argue that G.711 is implemented, not because anyone can
implement it, but because everyone else has implemented it. It
essentially forms the lowest common denominator for interoperability. If
fancy codec x doesn't work, then assuming you have the bandwidth
available, G.711 probably will. And I suspect it is royalty free not
because it was always so, but because any IPR that existed has pretty
much expired by now.

G.711 is also the one codec that is probably the most neutral to
transcoding. I can go from codec A to G.711 and back to codec A with less
impact than with any other intermediate codec.

MAR: Let's look at that statement. G.711 is a simple *waveform* codec
that achieves a (single frequency) SNR of approximately 25 dB over a
dynamic range of approximately 25 dB. The reason you can "go to G.711
and back" is simply that the other (narrowband) codecs you use have
lower fidelity (i.e., on average much more distortion, or equivalently
a lower SNR). The distortion introduced by G.711 is less than the
distortion already introduced by the other codecs.

MAR: An equivalent statement is that for the 0 - 4kHz band (narrowband),
25 dB SNR is good enough for telephony.

For wideband there probably is not any one codec that has achieved that
position.

MAR: Well, the game changes with VOICE when it comes to wideband. Speech
has a spectral tilt to it - less power per Hz with increasing frequency
(as it comes from a finite energy acoustic source - this MUST be the
case in the limit). The end result is that the PSD of an ensemble of
speech signals (i.e., averaged over all phonetic content) falls by
approximately 6 dB per octave.

MAR: If you use G.711 for wideband (16k sampling) - the higher
frequencies will have more distortion (i.e., an SNR much less than
25 dB). This is why G.711 sounds somewhat bad in wideband applications.
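
As a back-of-the-envelope illustration of the point above, the sketch
below assumes a flat (white) quantization-noise floor and a speech PSD
falling 6 dB per octave, and takes the 25 dB figure as the SNR at a
1 kHz reference. The function name band_snr_db, the 1 kHz reference, and
the white-noise simplification are assumptions made for illustration;
none of them is stated in this thread.

```python
import math

def band_snr_db(ref_snr_db, ref_freq_hz, freq_hz, tilt_db_per_octave=6.0):
    """SNR at freq_hz, assuming a flat noise floor and a signal PSD that
    drops tilt_db_per_octave for every doubling of frequency.

    Assumption: the noise is white. Real companded quantization noise is
    signal dependent, so this is only a rough model.
    """
    octaves_above_ref = math.log2(freq_hz / ref_freq_hz)
    return ref_snr_db - tilt_db_per_octave * octaves_above_ref

if __name__ == "__main__":
    # Illustrative reference point: 25 dB at 1 kHz (an assumption).
    for freq_hz in (1000, 2000, 4000, 8000):
        print(freq_hz, "Hz:", band_snr_db(25.0, 1000, freq_hz), "dB")
```

Under these assumptions the SNR drops by 6 dB for each octave above the
reference, so the octave that wideband adds above 4 kHz is the worst
served.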

MAR: What I believe you are asking for is: 1) a simple "waveform codec",
2) but one that is "pre-equalized" to whiten the spectral tilt in speech
signals. For example, virtually all filter-bank-based codecs (e.g.,
MLT/MCT-based) do this via the "bit allocation" process (sometimes in
conjunction with pre-emphasis).

MAR: I believe most of the codecs put forward for this proposed "codec"
working group are speech-model based. If you want a "waveform based"
solution that you can "transform into and out of" (something which I
call a "do-no-harm" codec property) - you simply need pre-equalization
plus a relatively simple waveform codec behind it (G.711 or something
nearly as simple).
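
A minimal sketch of that idea, under stated assumptions: a first-order
pre-emphasis filter stands in for the "pre-equalization", and a
continuous mu-law compander with a uniform 8-bit quantizer stands in for
the simple waveform codec. This is not bit-exact G.711 (which uses a
segmented approximation), and the coefficient ALPHA = 0.9 and all
function names are hypothetical choices for illustration.

```python
import math

MU = 255.0          # standard mu-law constant
ALPHA = 0.9         # pre-emphasis coefficient (illustrative assumption)
GAIN = 1.0 + ALPHA  # worst-case gain of the pre-emphasis filter

def pre_emphasis(x, a=ALPHA):
    """y[n] = x[n] - a*x[n-1]: boosts high frequencies relative to low."""
    out, prev = [], 0.0
    for s in x:
        out.append(s - a * prev)
        prev = s
    return out

def de_emphasis(y, a=ALPHA):
    """x[n] = y[n] + a*x[n-1]: inverse of pre_emphasis (up to rounding)."""
    out, prev = [], 0.0
    for s in y:
        prev = s + a * prev
        out.append(prev)
    return out

def mulaw_encode(u):
    """Compand u in [-1, 1] and quantize it to an 8-bit code (0..255)."""
    u = max(-1.0, min(1.0, u))
    y = math.copysign(math.log1p(MU * abs(u)) / math.log1p(MU), u)
    return int(round((y + 1.0) * 127.5))

def mulaw_decode(code):
    """Map an 8-bit code back to a sample in [-1, 1]."""
    y = code / 127.5 - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def encode(x):
    """Pre-emphasize, scale into [-1, 1], then compand and quantize."""
    return [mulaw_encode(p / GAIN) for p in pre_emphasis(x)]

def decode(codes):
    """Expand, undo the scaling, then de-emphasize."""
    return de_emphasis([mulaw_decode(c) * GAIN for c in codes])

if __name__ == "__main__":
    fs = 16000  # wideband sampling rate
    x = [0.5 * math.sin(2 * math.pi * 6000 * n / fs) for n in range(160)]
    y = decode(encode(x))
    print("max abs error:", max(abs(a - b) for a, b in zip(x, y)))
```

Because decoding ends with the de-emphasis filter, the quantization
noise is shaped to fall off with frequency as well, roughly following
the speech tilt instead of staying flat.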

MAR: Note these codecs will be less bandwidth efficient than the usual
(model-based) suspects - as the model-based codecs generally only
parameterize the spectrum (NOT the waveform) above 4kHz. But give me
60~90 kbps ... and such a codec becomes a relatively easy task for any
signal processing graduate student (at least one that would have me on
their thesis committee).
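
For scale, a quick arithmetic sketch: assuming 16 kHz sampling (as
mentioned above for wideband) and 8 bits per sample as in G.711, plain
companded PCM would need 128 kbps, so a 60~90 kbps budget leaves roughly
3.75 to 5.6 bits per sample. The helper names are hypothetical.

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample):
    """Bit rate of fixed-width PCM: one code word per sample."""
    return sample_rate_hz * bits_per_sample

def bits_per_sample(bitrate_bps, sample_rate_hz):
    """Average bit budget per sample for a given bit rate."""
    return bitrate_bps / sample_rate_hz

if __name__ == "__main__":
    fs = 16000  # wideband sampling rate (assumption from the text above)
    print("8-bit PCM at 16 kHz:", pcm_bitrate_bps(fs, 8), "bps")
    for rate_bps in (60000, 90000):
        print(rate_bps, "bps ->", bits_per_sample(rate_bps, fs), "bits/sample")
```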

MAR: Lastly, waveform-based codecs are generally better for speech
recognition applications - a property that we might also want to
consider.

I don't believe building a better, even royalty-free, codec will
guarantee a position in the marketplace. The world is littered with
better solutions that never made it. You need a market where everyone
chooses to implement it so that you can use that codec without having to
go to transcoding. Pasting IETF on the front cover does not achieve
that.

I don't really have a problem with people trying to sit down and design
a new codec and IETF then publishing it. Without some other selling
point, however, it just becomes yet another codec competing in the
marketplace. As such, put it somewhere where it does not interfere with
other work. Sticking it on a separate mailing list, and not letting it
compete for valuable IETF face-to-face slots as a working group, is fine.

regards

Keith