Re: [codec] #8: Sample rates?

"Christian Hoene" <hoene@uni-tuebingen.de> Wed, 14 April 2010 12:11 UTC

Hi,
 
I am fine with dropping any SDP negotiation on codec parameters including sampling rate and channels. I like the idea of splitting
signaling and transportation issues.
 
But one question remains. We had the question of limiting the complexity for some kinds of devices by choosing a lower sampling rate
or a lower number of channels. Shall this negotiation be done with SDP or in-band?
 
Christian
 
 
---------------------------------------------------------------
Dr.-Ing. Christian Hoene
Interactive Communication Systems (ICS), University of Tübingen 
Sand 13, 72076 Tübingen, Germany, Phone +49 7071 2970532 
http://www.net.uni-tuebingen.de/
 
From: stephen botzko [mailto:stephen.botzko@gmail.com] 
Sent: Wednesday, April 14, 2010 1:52 PM
To: Roni Even
Cc: Christian Hoene; codec@ietf.org
Subject: Re: [codec] #8: Sample rates?
 
Good points, thanks for clarifying.

Personally I favor carrying those H.264 parameter sets on the media path, since there are situations (switched multipoint calls for
one) where the timing matters.  With that use case, if reliable-but-too-late delivery occurs, there are decoding errors even if
there is no packet loss.  

Though of course SDP transmission alone may be suitable for other applications, and it is perfectly legal to send them both ways.

Stephen Botzko
2010/4/14 Roni Even <ron.even.tlv@gmail.com>
Hi,
Negotiation of codec parameters is not a tradition; it is needed if there are optional modes that the decoder can support, in order
to allow the sender to know whether the receiver can receive the specific mode. If there are mandatory modes you may be able to provide
the information in-band, but this is not negotiation. Also note that while the signaling may use a reliable channel, the media path is
not reliable and may suffer packet loss that may cause the loss of important parameters. We have such an example in the H.264 parameter
sets, which can be carried in the SDP for reliability or in-band as part of the payload.
 
Roni Even
 
 
From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of Christian Hoene
Sent: Wednesday, April 14, 2010 1:59 PM
To: 'stephen botzko'

Cc: codec@ietf.org
Subject: Re: [codec] #8: Sample rates?
 
Hi,
 
comments inline:
 
From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of stephen botzko
Sent: Wednesday, April 14, 2010 4:55 AM
To: bens@alum.mit.edu
Cc: codec@ietf.org
Subject: Re: [codec] #8: Sample rates?
 
When I said signaling I meant SDP, not anything in the bitstream itself.  I was not excluding audio bandwidth changes mid-call as
part of network adaptation.  Though as we all agree this needs to be carefully designed.

I agree it is best if the decoder does not require any knowledge of the SDP negotiation (or any other information beyond the RTP
packet stream itself) in order to correctly decode the audio -- which I think is what you were concerned about.
CH: Negotiating codec parameters with SDP has a long tradition. Take for example µ-law (RTP payload type 0): here you negotiate the
sampling rate. Also, the number of channels is negotiated for many codecs. I think sampling rate and number of channels can be handled
with SDP. However, I would avoid other codec-specific parameters. In particular, in the case of AMR the negotiation is quite complex
and should be avoided for the Internet codec.
Christian
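
For readers less familiar with SDP: sampling rate and channel count travel in the `a=rtpmap` attribute, whose general shape is `a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]`. The following is only a minimal sketch of reading those two values; the names `RtpMap` and `parse_rtpmap` are made up for illustration and are not taken from any library or from this thread.

```python
from typing import NamedTuple, Optional


class RtpMap(NamedTuple):
    """Fields of one SDP rtpmap attribute (illustrative type, not a library API)."""
    payload_type: int
    encoding: str
    clock_rate: int
    channels: int


def parse_rtpmap(line: str) -> Optional[RtpMap]:
    """Parse 'a=rtpmap:<pt> <name>/<rate>[/<channels>]'; return None for other lines."""
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        return None
    pt_text, _, fmt = line[len(prefix):].strip().partition(" ")
    parts = fmt.strip().split("/")
    if len(parts) < 2:
        return None
    # The channel count is optional and defaults to 1 when omitted.
    channels = int(parts[2]) if len(parts) > 2 else 1
    return RtpMap(int(pt_text), parts[0], int(parts[1]), channels)


if __name__ == "__main__":
    for sdp_line in ("a=rtpmap:0 PCMU/8000", "a=rtpmap:96 L16/48000/2"):
        print(parse_rtpmap(sdp_line))
```

Anything beyond these two values (the AMR-style mode sets mentioned above, for instance) would go into codec-specific `a=fmtp` parameters, which is the part proposed to be avoided.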

It would be a nice property if reducing the acoustic bandwidth also allowed the MIPS to be reduced, but I do not think it is a
requirement;  I'd personally rather manage complexity with a Low Complexity profile (if that is really needed), since then I could
keep the acoustic bandwidth (accepting a higher bit rate instead).

Stephen Botzko
On Tue, Apr 13, 2010 at 9:54 PM, Benjamin M. Schwartz <bmschwar@fas.harvard.edu> wrote:
Koen Vos wrote:
> Quoting "Benjamin M. Schwartz":
>> 1. Why would high frequencies be unheard?  Cheap speakers and microphones
>> have difficulties with low frequencies, but not high frequencies, and
>> routinely go all the way up past the limit of hearing.
>
> Not all hardware supports arbitrary/high sampling rates.  PSTN gateways
> don't go above 8 kHz.  Same for some mobile devices.
True.

>> 2. Why would it need to be negotiated?  For a suitably designed format,
>> the encoder could choose not to waste bits on high frequencies without
>> any
>> negotiation or extra signalling.
>
> Without signaling, how would the encoder know that the farend decoder
> will not take advantage of frequencies above a certain threshold?
When I say signalling, I mean signalling within the codec bitstream.  The
encoder can change its behavior based on knowledge of the receiver's
configuration, but the bitstream does not need any extra signalling to
indicate the change in behavior.

>>> Signaling the bandwidth, and defining the
>>> internal codec rate as fullband should let us lock down the RTP
>>> timestamp
>>> rate at 48 kHz (which I think is desirable).
>>
>> I do agree that having "only one mode" would be ideal, to maximize
>> interoperability.  I wonder whether we can achieve high enough
>> computational efficiency for this to be viable.
>
> Changing the RTP timestamp sampling rate causes no computational
> complexity, does it?  Perhaps an extra multiplication for each packet or
> so?  The point was that RTP timestamp sampling rate should be disconnected
> from the actual audio signals.
Right, but Stephen also suggested "defining the internal codec rate as
fullband".  From this, I imagined a scenario in which all (compliant) IWAC
implementations MUST decode all IWAC streams, which always have a sampling
rate of 48 kHz.  I think this is a great idea, to achieve really good
interoperability.
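
To make the "extra multiplication for each packet" concrete, here is a rough sketch of advancing an RTP timestamp on a fixed 48 kHz clock while the audio itself runs at a different rate. It assumes frame sizes that map to a whole number of clock ticks; `RTP_CLOCK_HZ`, `rtp_ticks` and `next_timestamp` are illustrative names, not part of any proposal in this thread.

```python
RTP_CLOCK_HZ = 48_000  # assumed fixed RTP timestamp clock


def rtp_ticks(samples: int, audio_rate_hz: int, clock_hz: int = RTP_CLOCK_HZ) -> int:
    """Convert a frame length in audio samples to RTP clock ticks."""
    ticks, remainder = divmod(samples * clock_hz, audio_rate_hz)
    if remainder:
        raise ValueError("frame does not map to a whole number of RTP ticks")
    return ticks


def next_timestamp(timestamp: int, samples: int, audio_rate_hz: int) -> int:
    """Advance a 32-bit RTP timestamp by one frame, wrapping modulo 2**32."""
    return (timestamp + rtp_ticks(samples, audio_rate_hz)) & 0xFFFFFFFF


if __name__ == "__main__":
    # A 20 ms frame advances the timestamp by the same amount at every audio rate.
    for rate in (8_000, 16_000, 48_000):
        print(rate, rtp_ticks(rate // 50, rate))
```

Under these assumptions the per-packet cost is one multiplication and one division, independent of the codec's internal rate.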

If the receiver is a PSTN gateway, then an "internal codec rate" of 8 kHz
would presumably produce as good quality/bitrate with lower encoder and
decoder complexity.  However, if we can make IWAC sufficiently
low-complexity, operating at 48 kHz may be acceptable.  It will help if we
can structure the codec so that operating at lower bandwidth is very
efficient.  For example, it may be possible to structure a transform codec
such that unneeded high frequencies can cheaply be zeroed on encode and
ignored on decode.

--Ben
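
As a rough illustration of the last idea, the sketch below assumes a generic transform codec whose coefficients for one frame are spread uniformly from 0 Hz to the Nyquist frequency. That layout, and the helper names `kept_bins` and `zero_high_bins`, are assumptions made for the example and say nothing about how an actual codec would be structured.

```python
from typing import List, Sequence


def kept_bins(num_bins: int, sample_rate_hz: int, cutoff_hz: float) -> int:
    """Number of low-frequency coefficients needed to cover 0..cutoff_hz."""
    nyquist = sample_rate_hz / 2
    if cutoff_hz >= nyquist:
        return num_bins
    return max(0, int(num_bins * cutoff_hz / nyquist))


def zero_high_bins(coeffs: Sequence[float], sample_rate_hz: int, cutoff_hz: float) -> List[float]:
    """Return a copy of the frame with every coefficient above cutoff_hz set to zero."""
    keep = kept_bins(len(coeffs), sample_rate_hz, cutoff_hz)
    return list(coeffs[:keep]) + [0.0] * (len(coeffs) - keep)


if __name__ == "__main__":
    # With 960 coefficients per frame at 48 kHz, a 4 kHz (PSTN-style) cutoff
    # leaves only the lowest sixth of the coefficients to encode or decode.
    print(kept_bins(960, 48_000, 4_000))
```

In such a layout the encoder spends no bits on the zeroed range and a narrowband decoder can skip it, while the nominal stream rate stays at 48 kHz.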