Re: [codec] #8: Sample rates?

"Christian Hoene" <hoene@uni-tuebingen.de> Wed, 14 April 2010 10:59 UTC


Hi,
 
comments inline:
 
From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of stephen botzko
Sent: Wednesday, April 14, 2010 4:55 AM
To: bens@alum.mit.edu
Cc: codec@ietf.org
Subject: Re: [codec] #8: Sample rates?
 
When I said signaling I meant SDP, not anything in the bitstream itself.  I was not excluding audio bandwidth changes mid-call as
part of network adaptation, though, as we all agree, this needs to be carefully designed.

I agree it is best if the decoder does not require any knowledge of the SDP negotiation (or any other information beyond the RTP
packet stream itself) in order to correctly decode the audio -- which I think is what you were concerned about.
CH: Negotiating codec parameters with SDP has a long tradition. Take µ-law (RTP payload type 0), for example: the sampling rate is
given in the SDP. The number of channels is also negotiated for many codecs. I think the sampling rate and the number of channels can
be handled with SDP. However, I would avoid other codec-specific parameters. In the case of AMR in particular, the negotiation is
quite complex, and that should be avoided for the Internet codec.
Christian
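
To make the SDP point concrete, here is a minimal sketch of reading the sampling rate and channel count from an "a=rtpmap:" attribute, which has the form "a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]". The function name parse_rtpmap and the RtpMap type are made up for illustration, and the second example line uses a hypothetical encoding name "IWAC" with the dynamic payload type 96, which is an assumption and not a registered payload format.

```python
from typing import NamedTuple


class RtpMap(NamedTuple):
    payload_type: int
    encoding: str
    clock_rate: int
    channels: int


def parse_rtpmap(line: str) -> RtpMap:
    """Parse one SDP 'a=rtpmap:' attribute line (illustrative helper)."""
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute")
    # Split "<payload type> <encoding>/<clock rate>[/<channels>]".
    pt_text, _, fmt = line[len(prefix):].strip().partition(" ")
    parts = fmt.strip().split("/")
    if len(parts) < 2:
        raise ValueError("rtpmap needs an encoding name and a clock rate")
    # The channel count is optional and defaults to 1 when omitted.
    channels = int(parts[2]) if len(parts) > 2 else 1
    return RtpMap(int(pt_text), parts[0], int(parts[1]), channels)


if __name__ == "__main__":
    print(parse_rtpmap("a=rtpmap:0 PCMU/8000"))
    # Hypothetical entry for the new codec; name and payload type are assumed.
    print(parse_rtpmap("a=rtpmap:96 IWAC/48000/2"))
```

Only these two values come out of the SDP here; anything more codec-specific would have to be carried elsewhere or not negotiated at all.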

It would be a nice property if reducing the acoustic bandwidth also allowed the MIPS to be reduced, but I do not think it is a
requirement; I'd personally rather manage complexity with a Low Complexity profile (if that is really needed), since then I could
keep the acoustic bandwidth (accepting a higher bit rate instead).

Stephen Botzko
On Tue, Apr 13, 2010 at 9:54 PM, Benjamin M. Schwartz <bmschwar@fas.harvard.edu> wrote:
Koen Vos wrote:
> Quoting "Benjamin M. Schwartz":
>> 1. Why would high frequencies be unheard?  Cheap speakers and microphones
>> have difficulties with low frequencies, but not high frequencies, and
>> routinely go all the way up past the limit of hearing.
>
> Not all hardware supports arbitrary/high sampling rates.  PSTN gateways
> don't go above 8 kHz.  Same for some mobile devices.
True.

>> 2. Why would it need to be negotiated?  For a suitably designed format,
>> the encoder could choose not to waste bits on high frequencies without
>> any
>> negotiation or extra signalling.
>
> Without signaling, how would the encoder know that the far-end decoder
> will not take advantage of frequencies above a certain threshold?
When I say signalling, I mean signalling within the codec bitstream.  The
encoder can change its behavior based on knowledge of the receiver's
configuration, but the bitstream does not need any extra signalling to
indicate the change in behavior.

>>> Signaling the bandwidth, and defining the
>>> internal codec rate as fullband should let us lock down the RTP
>>> timestamp
>>> rate at 48 kHz (which I think is desirable).
>>
>> I do agree that having "only one mode" would be ideal, to maximize
>> interoperability.  I wonder whether we can achieve high enough
>> computational efficiency for this to be viable.
>
> Changing the RTP timestamp sampling rate causes no computational
> complexity, does it?  Perhaps an extra multiplication for each packet or
> so?  The point was that the RTP timestamp sampling rate should be
> disconnected from the actual audio signals.
Right, but Stephen also suggested "defining the internal codec rate as
fullband".  From this, I imagined a scenario in which all (compliant) IWAC
implementations MUST decode all IWAC streams, which always have a sampling
rate of 48 kHz.  I think this is a great idea, to achieve really good
interoperability.
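
Regarding Koen's point about the timestamp clock, a small sketch of the arithmetic, assuming the RTP timestamp clock is fixed at 48 kHz as proposed above. The function names are hypothetical. The timestamp step per packet depends only on the frame duration, so it is the same whatever sampling rate the audio path actually uses.

```python
RTP_CLOCK_HZ = 48_000  # fixed timestamp clock, as proposed in this thread


def timestamp_increment(frame_ms: int) -> int:
    """RTP timestamp ticks per packet for a frame of frame_ms milliseconds."""
    return RTP_CLOCK_HZ * frame_ms // 1000


def samples_to_ticks(num_samples: int, audio_rate_hz: int) -> int:
    """Convert a frame length in samples at the audio rate to 48 kHz ticks."""
    ticks, remainder = divmod(num_samples * RTP_CLOCK_HZ, audio_rate_hz)
    if remainder:
        raise ValueError("frame length is not a whole number of RTP ticks")
    return ticks


if __name__ == "__main__":
    # A 20 ms frame at 8, 16 and 48 kHz audio rates maps to the same step.
    for rate in (8_000, 16_000, 48_000):
        samples = rate * 20 // 1000
        print(rate, samples, samples_to_ticks(samples, rate))
```

This is the "extra multiplication for each packet" mentioned above; the cost of actually running the codec at 48 kHz is a separate question.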

If the receiver is a PSTN gateway, then an "internal codec rate" of 8 kHz
would presumably produce as good quality/bitrate with lower encoder and
decoder complexity.  However, if we can make IWAC sufficiently
low-complexity, operating at 48 kHz may be acceptable.  It will help if we
can structure the codec so that operating at lower bandwidth is very
efficient.  For example, it may be possible to structure a transform codec
such that unneeded high frequencies can cheaply be zeroed on encode and
ignored on decode.
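
A rough sketch of that last idea, under loud assumptions: it uses a naive, unwindowed DCT-II on a single frame, whereas a real codec would more likely use an overlapped transform with a fast algorithm. All function names are hypothetical. The encoder computes only the coefficients below the cutoff (which is the same as zeroing the rest), and the decoder only reads those, so the work scales with the number of bins kept rather than with the full 48 kHz bandwidth.

```python
import math


def bins_to_keep(cutoff_hz: int, sample_rate_hz: int, n: int) -> int:
    """Number of DCT bins below cutoff_hz (bin k sits at k * fs / (2 * n))."""
    return min(n, (2 * n * cutoff_hz) // sample_rate_hz)


def dct_low(frame, keep):
    """Unnormalized DCT-II, computing only the first `keep` coefficients."""
    n = len(frame)
    return [
        sum(frame[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(keep)
    ]


def idct_low(coeffs, n):
    """Inverse transform that treats every missing high bin as zero."""
    out = []
    for t in range(n):
        acc = coeffs[0] / 2 if coeffs else 0.0
        for k in range(1, len(coeffs)):
            acc += coeffs[k] * math.cos(math.pi * (t + 0.5) * k / n)
        out.append(acc * 2 / n)
    return out


if __name__ == "__main__":
    n = 96  # 2 ms frame at 48 kHz, kept small because the DCT here is naive
    keep = bins_to_keep(4_000, 48_000, n)
    frame = [math.cos(math.pi * (t + 0.5) * 5 / n) for t in range(n)]
    decoded = idct_low(dct_low(frame, keep), n)
    print(keep, max(abs(a - b) for a, b in zip(frame, decoded)))
```

With this structure the cost of both loops is proportional to keep * n instead of n * n, which is the kind of saving a narrowband receiver such as a PSTN gateway would want.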

--Ben