Re: [AVT] RE: <draft-ietf-avt-rtp-vmr-wb-03.txt>: sampling rate

Magnus Westerlund <magnus.westerlund@ericsson.com> Mon, 13 September 2004 14:51 UTC

Message-ID: <4145B1B6.4090000@ericsson.com>
Date: Mon, 13 Sep 2004 16:41:58 +0200
From: Magnus Westerlund <magnus.westerlund@ericsson.com>
To: Qiaobing Xie <Qiaobing.Xie@motorola.com>
Subject: Re: [AVT] RE: <draft-ietf-avt-rtp-vmr-wb-03.txt>: sampling rate
References: <0B08EA1BF5F6304992CDC985EE02209E02A74365@sdebe002.americas.nokia.com> <1225B53B-03EB-11D9-A048-000A957FC5F2@csperkins.org> <41433EE5.3030604@motorola.com>
In-Reply-To: <41433EE5.3030604@motorola.com>
Cc: Colin Perkins <csp@csperkins.org>, avt@ietf.org, sassan.ahmadi@nokia.com

Hi Sassan and Colin,

I think we have two issues:

A. Is there any benefit to indicating or requesting the sampling
frequency used at the sender?

B. Is it necessary to use the sampling frequency as the RTP timestamp
rate?

I will start with A, which I think is easier to explain and can also
provide some information for issue B. If you find that any of my
assumptions or statements are incorrect, please correct me.

My understanding of VMR-WB, after a conversation with Jonas Svedberg,
is that VMR-WB will provide a somewhat better encoding of 8kHz material
if it is indicated that the input is 8kHz. However, neither
compatibility nor decoder operation requires signalling the case where
8kHz is used as input to the encoder. The result is that the only case
that needs to be signalled between encoder and decoder is the one where
the decoder will use output at 8kHz, because if the decoder can request
that the encoder use 8kHz input, some improvement of the 8kHz material
is achieved. In the other cases, where the receiver is capable of
16kHz, it doesn't matter to the receiver, from a decoding point of
view, whether the original audio was 8 or 16kHz.

Colin, looking at issue B: is it really necessary for the RTP
timestamp frequency to equal the sampling rate used? I would say NO to
that question. My reasoning is the following.

- Much audio input is sampled from a source at a higher rate than the
encoder can handle. Thus a resampling and pre-processing stage is
employed to reach the encoder's input frequency, rather than producing
that rate directly in the hardware. One reason is that the
pre-processing may actually yield better results than what the hardware
can achieve at the given input rate. Another reason may be that one
would like to avoid switching the hardware between rates when changing
the encoding.

- The frame-based decoders do not need to know the encoder's input
rate. The encoder may anyway resample the input to other rates for
internal processing and for band-limited signals. I would claim that
VMR-WB, AMR-WB+ and AAC are all examples of codecs that perform this
kind of trick. On the receiver side they produce an output signal at
whatever sampling frequency the receiver finds most useful. This may
cause clipping of the higher frequencies, but more commonly the output
is at a higher clock rate, simply for ease of use, even though no more
information is provided.

- The frame-based codecs only need an RTP timestamp rate that allows
the receiver to correctly reconstruct the timeline when the encoding is
done with the most audio bandwidth. In the VMR-WB case this is 16kHz.
AMR-WB+ is even stranger, as we have selected an RTP timestamp rate
such that all internal sampling frequencies result in integer timestamp
ticks. This actually allows one to correctly calculate frame alignment
when the internal sampling frequency changes. We also considered that
the rate can be converted to several common sampling frequencies with
few partial sample alignments.
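
The "integer timestamp ticks" property in the last point can be checked
mechanically. Below is a minimal sketch, assuming a fixed number of
samples per frame at the internal sampling frequency. The list of
internal rates, the frame length and the candidate timestamp rates are
hypothetical values chosen for illustration; they are not taken from
this message or quoted from any codec specification.

```python
from fractions import Fraction

def ticks_per_frame(ts_rate_hz, internal_rate_hz, frame_samples):
    """Exact number of RTP timestamp ticks covered by one frame."""
    return Fraction(frame_samples * ts_rate_hz, internal_rate_hz)

def all_integer(ts_rate_hz, internal_rates_hz, frame_samples):
    """True if every internal rate gives a whole number of ticks per frame."""
    return all(
        ticks_per_frame(ts_rate_hz, rate, frame_samples).denominator == 1
        for rate in internal_rates_hz
    )

# Hypothetical values for illustration only.
INTERNAL_RATES_HZ = [12800, 14400, 16000, 25600, 32000, 38400]
FRAME_SAMPLES = 512

for candidate_hz in (16000, 48000, 72000):
    ok = all_integer(candidate_hz, INTERNAL_RATES_HZ, FRAME_SAMPLES)
    print(candidate_hz, ok)
```

With these illustrative numbers only the last candidate gives whole
ticks for every internal rate, which is the kind of property that lets a
receiver compute frame boundaries exactly across a rate change.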

Thus I would use this to argue that indicating the actual sampling
frequency is not necessary, as long as the receiver is capable of
correctly reconstructing the media stream with its timing information
in full resolution.

In the VMR-WB case I would think that having only one timestamp rate of
16kHz does not affect codec operation and would simplify the handling
when some senders do use 8kHz, especially when gateways sometimes need
to encode 8kHz material from pre-recorded responses and in other cases
WB channel data. This avoids the need to perform RTP timestamp rate
switches.
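
A minimal sketch of the single-rate idea: with a fixed 16kHz RTP clock,
the timestamp advances by the same amount per frame whether the encoder
was fed 8kHz or 16kHz material. The 20 ms frame duration and all
function names are assumptions made for illustration.

```python
RTP_CLOCK_HZ = 16000  # the single timestamp rate argued for above
FRAME_MS = 20         # ASSUMED frame duration, for illustration

def timestamp_increment(clock_hz=RTP_CLOCK_HZ, frame_ms=FRAME_MS):
    """RTP timestamp ticks per frame; independent of encoder input rate."""
    return clock_hz * frame_ms // 1000

def samples_per_frame(input_rate_hz, frame_ms=FRAME_MS):
    """Input samples the encoder consumes per frame at a given rate."""
    return input_rate_hz * frame_ms // 1000

def timestamps(input_rates_hz, start=0):
    """Timestamps for consecutive frames whose input rate may vary."""
    out = []
    ts = start
    for _rate in input_rates_hz:  # the input rate is deliberately unused
        out.append(ts)
        ts = (ts + timestamp_increment()) % 2**32  # 32-bit RTP timestamp
    return out

# A gateway mixing 8kHz prompts with wideband channel data.
print(timestamps([8000, 8000, 16000, 16000]))
```

The encoder sees 160 or 320 input samples per frame depending on the
source, but the timestamp sequence on the wire is identical, so no rate
switch is ever signalled.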

If it is desired that a receiver be able to request that the sender use
8kHz input, then one should introduce a MIME parameter for this.
However, I would like to avoid using the "rate" parameter, as it
results in unnecessary barriers in the form of signalling and RTP
timestamp rate switching.
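
As a rough sketch of what such a request could look like on the
receiver side: the clock rate in the rtpmap line stays at 16000, and the
preference travels in a separate format parameter. The parameter name
"input-rate" is purely hypothetical and invented for this example; no
such parameter is defined in the draft. The payload type number is also
an arbitrary choice.

```python
def receiver_media_attributes(payload_type, prefer_8khz_input):
    """Build SDP attribute lines for a receiver of VMR-WB.

    The RTP clock rate is always 16000; only the optional,
    HYPOTHETICAL "input-rate" parameter changes.
    """
    lines = [f"a=rtpmap:{payload_type} VMR-WB/16000"]
    if prefer_8khz_input:
        # "input-rate" is a made-up parameter name for illustration.
        lines.append(f"a=fmtp:{payload_type} input-rate=8000")
    return lines

# A receiver that will play out at 8kHz asks for 8kHz encoder input.
for line in receiver_media_attributes(97, prefer_8khz_input=True):
    print(line)
```

Because the timestamp rate never changes, a sender that ignores or does
not understand the parameter still interoperates.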


Cheers

Magnus Westerlund

Multimedia Technologies, Ericsson Research EAB/TVA/A
----------------------------------------------------------------------
Ericsson AB                | Phone +46 8 4048287
Torshamsgatan 23           | Fax   +46 8 7575550
S-164 80 Stockholm, Sweden | mailto: magnus.westerlund@ericsson.com
