Re: [codec] #8: Sample rates?

"Roni Even" <ron.even.tlv@gmail.com> Wed, 14 April 2010 11:42 UTC

From: Roni Even <ron.even.tlv@gmail.com>
To: 'Christian Hoene' <hoene@uni-tuebingen.de>, 'stephen botzko' <stephen.botzko@gmail.com>
Date: Wed, 14 Apr 2010 14:35:54 +0300
Cc: codec@ietf.org
Subject: Re: [codec] #8: Sample rates?

Hi,

Negotiation of codec parameters is not a tradition; it is needed if there
are optional modes that the decoder may support, so that the sender can
know whether the receiver can receive a specific mode. If there are only
mandatory modes you may be able to provide the information in-band, but this
is not negotiation. Also note that while the signaling may use a reliable
channel, the media path is not reliable and may suffer packet loss that may
cause the loss of important parameters. We have such an example in the H.264
parameter sets, where they can be carried in the SDP for reliability or
in-band as part of the payload.
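
A minimal sketch of the distinction drawn above, assuming hypothetical mode names (nothing here is defined for the codec): an optional mode may be sent only if the receiver has advertised it, while a mandatory mode needs no negotiation.

```python
def choose_mode(sender_preference, receiver_advertised, mandatory):
    """Pick the first mode the sender prefers that the receiver can decode.

    sender_preference: modes in the sender's order of preference.
    receiver_advertised: optional modes the receiver announced in signaling.
    mandatory: modes every decoder must support (no negotiation needed).
    All mode names are illustrative assumptions.
    """
    for mode in sender_preference:
        if mode in receiver_advertised or mode in mandatory:
            return mode
    raise ValueError("no mode in common with the receiver")


preference = ["fullband", "wideband", "narrowband"]
print(choose_mode(preference, {"wideband"}, {"narrowband"}))
print(choose_mode(preference, set(), {"narrowband"}))
```

Without the advertised set the sender can only fall back to the mandatory mode, which is the case where in-band information alone is enough.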

 

Roni Even

From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of
Christian Hoene
Sent: Wednesday, April 14, 2010 1:59 PM
To: 'stephen botzko'
Cc: codec@ietf.org
Subject: Re: [codec] #8: Sample rates?

 

Hi,

comments inline:


From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of
stephen botzko
Sent: Wednesday, April 14, 2010 4:55 AM
To: bens@alum.mit.edu
Cc: codec@ietf.org
Subject: Re: [codec] #8: Sample rates?

 

When I said signaling I meant SDP, not anything in the bitstream itself.  I
was not excluding audio bandwidth changes mid-call as part of network
adaptation.  Though as we all agree this needs to be carefully designed.

I agree it is best if the decoder does not require any knowledge of the SDP
negotiation (or any other information beyond the RTP packet stream itself)
in order to correctly decode the audio -- which I think is what you were
concerned about.

CH: Negotiating codec parameters with SDP has a long tradition. Take for
example µ-law (RTP payload type 0): here you negotiate the sampling rate.
Also, the number of channels is negotiated for many codecs. I think
sampling rate and number of channels can be handled with SDP. However, I
would avoid other codec-specific parameters. In the case of AMR especially,
the negotiation is quite complex, and that should be avoided for the
Internet codec.
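
As an illustration of where sampling rate and channel count sit in SDP, here is a small sketch that parses an `a=rtpmap` attribute of the form `a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]`. The function name and the returned dictionary keys are my own choices, not part of any library.

```python
def parse_rtpmap(line):
    """Parse an SDP rtpmap attribute into its negotiated audio parameters."""
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute")
    payload_type, _, encoding = line[len(prefix):].strip().partition(" ")
    parts = encoding.strip().split("/")
    if len(parts) < 2:
        raise ValueError("rtpmap needs <encoding name>/<clock rate>")
    return {
        "payload_type": int(payload_type),
        "encoding": parts[0],
        "clock_rate": int(parts[1]),
        # For audio the optional third field is the channel count; default 1.
        "channels": int(parts[2]) if len(parts) > 2 else 1,
    }


print(parse_rtpmap("a=rtpmap:0 PCMU/8000"))
print(parse_rtpmap("a=rtpmap:96 L16/48000/2"))
```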

Christian

It would be a nice property if reducing the acoustic bandwidth also allowed
the MIPS to be reduced, but I do not think it is a requirement;  I'd
personally rather manage complexity with a Low Complexity profile (if that
is really needed), since then I could keep the acoustic bandwidth (accepting
a higher bit rate instead).


Stephen Botzko

On Tue, Apr 13, 2010 at 9:54 PM, Benjamin M. Schwartz
<bmschwar@fas.harvard.edu> wrote:

Koen Vos wrote:
> Quoting "Benjamin M. Schwartz":
>> 1. Why would high frequencies be unheard?  Cheap speakers and microphones
>> have difficulties with low frequencies, but not high frequencies, and
>> routinely go all the way up past the limit of hearing.
>
> Not all hardware supports arbitrary/high sampling rates.  PSTN gateways
> don't go above 8 kHz.  Same for some mobile devices.

True.


>> 2. Why would it need to be negotiated?  For a suitably designed format,
>> the encoder could choose not to waste bits on high frequencies without
>> any negotiation or extra signalling.
>
> Without signaling, how would the encoder know that the far-end decoder
> will not take advantage of frequencies above a certain threshold?

When I say signalling, I mean signalling within the codec bitstream.  The
encoder can change its behavior based on knowledge of the receiver's
configuration, but the bitstream does not need any extra signalling to
indicate the change in behavior.


>>> Signaling the bandwidth, and defining the
>>> internal codec rate as fullband should let us lock down the RTP
>>> timestamp rate at 48 kHz (which I think is desirable).
>>
>> I do agree that having "only one mode" would be ideal, to maximize
>> interoperability.  I wonder whether we can achieve high enough
>> computational efficiency for this to be viable.
>
> Changing the RTP timestamp sampling rate causes no computational
> complexity, does it?  Perhaps an extra multiplication for each packet or
> so?  The point was that the RTP timestamp sampling rate should be
> disconnected from the actual audio signals.

Right, but Stephen also suggested "defining the internal codec rate as
fullband".  From this, I imagined a scenario in which all (compliant) IWAC
implementations MUST decode all IWAC streams, which always have a sampling
rate of 48 kHz.  I think this is a great idea, to achieve really good
interoperability.
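
On the timestamp point quoted above, a rough sketch of what a fixed 48 kHz RTP clock costs per packet: one multiply and one divide, whatever rate the audio was captured at. The function names are illustrative assumptions.

```python
RTP_CLOCK_HZ = 48000  # fixed timestamp clock proposed in this thread


def timestamp_increment(frame_samples, audio_rate_hz, clock_hz=RTP_CLOCK_HZ):
    """RTP ticks covered by one frame of audio sampled at audio_rate_hz."""
    ticks, remainder = divmod(frame_samples * clock_hz, audio_rate_hz)
    if remainder:
        raise ValueError("frame duration is not a whole number of clock ticks")
    return ticks


def next_timestamp(timestamp, frame_samples, audio_rate_hz):
    """Advance a 32-bit RTP timestamp by one frame, with wrap-around."""
    return (timestamp + timestamp_increment(frame_samples, audio_rate_hz)) & 0xFFFFFFFF


# A 20 ms frame advances the timestamp by the same amount at every audio rate.
for samples, rate in [(160, 8000), (320, 16000), (960, 48000)]:
    print(rate, timestamp_increment(samples, rate))
```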

If the receiver is a PSTN gateway, then an "internal codec rate" of 8 kHz
would presumably produce as good quality/bitrate with lower encoder and
decoder complexity.  However, if we can make IWAC sufficiently
low-complexity, operating at 48 kHz may be acceptable.  It will help if we
can structure the codec so that operating at lower bandwidth is very
efficient.  For example, it may be possible to structure a transform codec
such that unneeded high frequencies can cheaply be zeroed on encode and
ignored on decode.
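
To make that last idea concrete, here is a toy sketch using a plain, unnormalised DCT-II rather than whatever transform the codec would really use (that substitution is my assumption, as are all names below). The encoder computes only the bins below a cutoff, and the decoder sums only over the bins it received, so the work scales with the kept bandwidth.

```python
import math


def cutoff_bin(cutoff_hz, sample_rate_hz, n):
    """Number of DCT-II bins below cutoff_hz; bin k sits near k*fs/(2n) Hz."""
    return min(n, (cutoff_hz * 2 * n) // sample_rate_hz)


def dct(frame, keep=None):
    """Unnormalised DCT-II, computing only the first `keep` coefficients."""
    n = len(frame)
    keep = n if keep is None else keep
    return [sum(frame[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
            for k in range(keep)]


def idct(coeffs, n):
    """Inverse of dct(); bins beyond len(coeffs) are treated as zero."""
    out = []
    for i in range(n):
        acc = coeffs[0] / 2.0
        for k in range(1, len(coeffs)):
            acc += coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
        out.append(2.0 * acc / n)
    return out


n = 48                              # tiny frame, for illustration only
keep = cutoff_bin(4000, 48000, n)   # bins needed for 4 kHz of audio bandwidth
frame = [math.cos(math.pi / n * (i + 0.5) * 3) for i in range(n)]  # low tone
decoded = idct(dct(frame, keep), n)
print(keep, max(abs(a - b) for a, b in zip(frame, decoded)))
```

With a 20 ms frame at 48 kHz (960 samples) the same cutoff keeps 160 bins, the number of samples an 8 kHz codec would handle in that time.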

--Ben