Re: [codec] #8: Sample rates?

"Benjamin M. Schwartz" <bmschwar@fas.harvard.edu> Wed, 14 April 2010 01:54 UTC

Message-ID: <4BC52068.1080906@fas.harvard.edu>
Date: Tue, 13 Apr 2010 21:54:48 -0400
From: "Benjamin M. Schwartz" <bmschwar@fas.harvard.edu>
User-Agent: Thunderbird 2.0.0.23 (X11/20091019)
MIME-Version: 1.0
To: Koen Vos <koen.vos@skype.net>
References: <062.89d7aa91c79b145b798b83610e45ce71@tools.ietf.org> <003101cadb1c$828b3990$87a1acb0$@de> <j2l6e9223711004130926nfaa975e3y129cc8cc21c52a84@mail.gmail.com> <m2v28bf2c661004130941g2e2bf956ld512b5d162df9080@mail.gmail.com> <g2h6e9223711004131029m3bfeb1ddq1a0e2bbd8418102f@mail.gmail.com> <m2s28bf2c661004131111pd7880c03m5f225ad464819414@mail.gmail.com> <s2i6e9223711004131143v3f3d2123pc94fe430a59b5776@mail.gmail.com> <CB68DF4CFBEF4942881AD37AE1A7E8C74AB3D92271@IRVEXCHCCR01.corp.ad.broadcom.com> <y2q6e9223711004131303l15fb87ffoe1039c56d21c565f@mail.gmail.com> <20100413164818.546929eae97cjjr6@mail.skype.net> <z2g6e9223711004131723qa66e5a82y3bea15ae44ae5ba0@mail.gmail.com> <4BC514CE.2080800@fas.harvard.edu> <20100413183602.86565rmv5hve5d6q@mail.skype.net>
In-Reply-To: <20100413183602.86565rmv5hve5d6q@mail.skype.net>
X-Enigmail-Version: 0.95.7
Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature"; boundary="------------enig501F2A1B87538068C3844442"
Cc: codec@ietf.org, stephen botzko <stephen.botzko@gmail.com>
Subject: Re: [codec] #8: Sample rates?
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
Reply-To: bens@alum.mit.edu

Koen Vos wrote:
> Quoting "Benjamin M. Schwartz":
>> 1. Why would high frequencies be unheard?  Cheap speakers and microphones
>> have difficulties with low frequencies, but not high frequencies, and
>> routinely go all the way up past the limit of hearing.
> 
> Not all hardware supports arbitrary/high sampling rates.  PSTN gateways
> don't go above 8 kHz.  Same for some mobile devices.

True.

>> 2. Why would it need to be negotiated?  For a suitably designed format,
>> the encoder could choose not to waste bits on high frequencies without
>> any negotiation or extra signalling.
> 
> Without signaling, how would the encoder know that the farend decoder
> will not take advantage of frequencies above a certain threshold?

When I say signalling, I mean signalling within the codec bitstream.  The
encoder can change its behavior based on knowledge of the receiver's
configuration, but the bitstream does not need any extra signalling to
indicate the change in behavior.

>>> Signaling the bandwidth, and defining the
>>> internal codec rate as fullband should let us lock down the RTP
>>> timestamp rate at 48 kHz (which I think is desirable).
>>
>> I do agree that having "only one mode" would be ideal, to maximize
>> interoperability.  I wonder whether we can achieve high enough
>> computational efficiency for this to be viable.
> 
> Changing the RTP timestamp sampling rate causes no computational
> complexity, does it?  Perhaps an extra multiplication for each packet or
> so?  The point was that RTP timestamp sampling rate should be
> disconnected from the actual audio signals.

Right, but Stephen also suggested "defining the internal codec rate as
fullband".  From this, I imagined a scenario in which all (compliant) IWAC
implementations MUST decode all IWAC streams, which always have a sampling
rate of 48 kHz.  I think this is a great idea for achieving really good
interoperability.
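
To make the timestamp point concrete, here is a minimal Python sketch of an
RTP timestamp clock fixed at 48 kHz while the audio itself runs at some other
rate.  The function name and the 20 ms frame duration are illustrative
assumptions, not anything agreed in this thread.

```python
RTP_CLOCK_HZ = 48000  # fixed RTP timestamp clock, as proposed above


def rtp_timestamp_increment(samples_in_frame, audio_rate_hz):
    """Timestamp ticks covered by one frame, whatever the audio rate is."""
    ticks, remainder = divmod(samples_in_frame * RTP_CLOCK_HZ, audio_rate_hz)
    if remainder:
        raise ValueError("frame does not span a whole number of RTP ticks")
    return ticks


# A 20 ms frame (assumed duration) advances the timestamp by the same
# amount at every audio rate: one multiply and one divide per packet.
for rate in (8000, 16000, 48000):
    samples = rate * 20 // 1000
    print(rate, samples, rtp_timestamp_increment(samples, rate))
```

The per-packet cost is a single multiplication and division, which matches
Koen's estimate that the timestamp rate adds essentially no complexity.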

If the receiver is a PSTN gateway, then an "internal codec rate" of 8 kHz
would presumably give an equally good quality/bitrate tradeoff with lower
encoder and decoder complexity.  However, if we can make IWAC sufficiently
low-complexity, operating at 48 kHz may be acceptable.  It will help if we
can structure the codec so that operating at lower bandwidth is very
efficient.  For example, it may be possible to structure a transform codec
such that unneeded high frequencies can cheaply be zeroed on encode and
ignored on decode.
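
As a rough illustration of that last idea, here is a Python sketch using a
plain orthonormal DCT as a stand-in for whatever transform the codec might
actually use.  The names `dct_encode` and `dct_decode`, the 96-sample frame,
and the 4 kHz cutoff are all assumptions made for the example.  The encoder
computes only the lowest `keep` coefficients, so its cost scales with the
bandwidth it codes, and the decoder treats every missing coefficient as zero.

```python
import math


def dct_encode(frame, keep):
    """Orthonormal DCT-II, computing only the lowest `keep` coefficients."""
    n = len(frame)
    coeffs = []
    for k in range(keep):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        acc = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                  for i, x in enumerate(frame))
        coeffs.append(scale * acc)
    return coeffs


def dct_decode(coeffs, n):
    """Inverse transform (DCT-III); absent high coefficients count as zero."""
    out = []
    for i in range(n):
        acc = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            acc += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(acc)
    return out


N = 96                      # samples per frame at 48 kHz (assumed size)
keep = N * 4000 // 24000    # bins below 4 kHz; bin k sits at k * 48000 / (2 * N) Hz

# Test frame built from two low-frequency basis functions (bins 3 and 10),
# so nothing is lost by dropping the bins at and above `keep`.
frame = [math.cos(math.pi * (i + 0.5) * 3 / N)
         + 0.5 * math.cos(math.pi * (i + 0.5) * 10 / N)
         for i in range(N)]

coeffs = dct_encode(frame, keep)
decoded = dct_decode(coeffs, N)
max_error = max(abs(a - b) for a, b in zip(frame, decoded))
print(len(coeffs), max_error)
```

Both loops here run over `keep` bins rather than all N, which is the sense in
which the unneeded high frequencies cost nothing.  A real codec would use a
fast transform, so the actual savings would depend on how well that transform
can be pruned.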

--Ben