Re: [AVT] RE: <draft-ietf-avt-rtp-vmr-wb-03.txt>: sampling rate

Randell Jesup <rjesup@wgate.com> Fri, 24 September 2004 19:44 UTC

To: Colin Perkins <csp@csperkins.org>
Subject: Re: [AVT] RE: <draft-ietf-avt-rtp-vmr-wb-03.txt>: sampling rate
References: <0B08EA1BF5F6304992CDC985EE02209E02342B61@sdebe002.americas.nokia.com> <751B9B52-03EA-11D9-A048-000A957FC5F2@csperkins.org> <ybur7p8ynaq.fsf@jesup.eng.tvol.net.jesup.eng.tvol.net> <366D9364-0D38-11D9-A100-000A957FC5F2@csperkins.org>
From: Randell Jesup <rjesup@wgate.com>
Date: Fri, 24 Sep 2004 15:33:15 -0400
In-Reply-To: <366D9364-0D38-11D9-A100-000A957FC5F2@csperkins.org>
Message-ID: <ybuy8iz798k.fsf@jesup.eng.tvol.net.jesup.eng.tvol.net>
Cc: magnus.westerlund@ericsson.com, sassan.ahmadi@nokia.com, avt@ietf.org, Qiaobing.Xie@motorola.com

Colin Perkins <csp@csperkins.org> writes:
>>         Not that such things exist currently, but playing devil's
>> advocate, one could create a codec that takes a variable-sampling-rate
>> input depending on current conditions.  (Similar to video codecs that
>> vary the frame rate, resolution, and/or bitrate within a stream.)
>
>Sure, one could. I don't think it would be a good idea though: the
>convention that the sampling rate equals the timestamp rate is very helpful
>when building multi-format systems.

        Actually, such a codec could be VERY useful.  One of the big
problems with audio codecs nowadays is that they're situation-specific.
A "good" codec for speech is not a good codec for music, and a good music
codec may not conceal packet loss as well and uses a lot more bandwidth
than necessary when the stream is plain speech.

        You could define a codec that generates (effectively) two or more
internal encodings depending on the audio data at that point in time
(perhaps switching per-packet or even more often), and that could switch
the input sample rate on a frame-by-frame basis to support that.  The
likely logical solution would be to have the decoder always output at the
highest sample rate that could be needed, and to use that same rate for
all the RTP timestamps, regardless of whether the encoder is currently
sampling at (say) 32 kHz, 16 kHz, or 8 kHz.
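
        To make the timestamp arithmetic concrete, here is a minimal
sketch in Python.  It is an illustration only, not part of any draft: the
Frame type, the function names, the 20 ms frame sizes and the choice of
32000 Hz as the fixed timestamp rate are all assumptions made for the
example.  Each frame advances the RTP timestamp by its duration measured
in ticks of the fixed clock, whatever rate the encoder used internally.

```python
from dataclasses import dataclass

# Assumed for illustration: the fixed RTP timestamp rate is the highest
# sample rate this hypothetical codec can switch to.
TIMESTAMP_RATE_HZ = 32000


@dataclass
class Frame:
    sample_rate_hz: int  # rate the encoder used internally for this frame
    num_samples: int     # samples in the frame at that internal rate


def timestamp_increment(frame, clock_hz=TIMESTAMP_RATE_HZ):
    """Return the number of fixed-clock ticks covered by one frame."""
    ticks, remainder = divmod(frame.num_samples * clock_hz,
                              frame.sample_rate_hz)
    if remainder:
        raise ValueError("frame duration is not a whole number of ticks")
    return ticks


def assign_timestamps(frames, start=0):
    """Return the RTP timestamp of each frame, wrapping at 32 bits."""
    timestamps = []
    ts = start
    for frame in frames:
        timestamps.append(ts)
        ts = (ts + timestamp_increment(frame)) & 0xFFFFFFFF
    return timestamps


# Four 20 ms frames; the encoder switches between 8, 16 and 32 kHz.
frames = [Frame(8000, 160), Frame(16000, 320),
          Frame(32000, 640), Frame(8000, 160)]
print(assign_timestamps(frames))  # [0, 640, 1280, 1920]
```

Every 20 ms frame advances the timestamp by 640 ticks, so a receiver sees
one steady timeline and never needs to know which internal rate was in
use for a given packet.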

        Such a codec might be a nice thing - it'd use extra bandwidth
only when needed.  Music-on-hold would sound good (though it might handle
packet loss less smoothly), and speech would use fewer bits and might be
more robust to packet loss (as with iLBC).

        By insisting on the point that "timestamp rate == sample rate",
you close off a series of possibilities such as this, or make them much
more painful (by requiring dual synchronized streams at different rates,
or frequent and expensive renegotiations of RTP stream parameters).

>>         While I don't see a _need_ for a multi-rate audio codec to use
>> a fixed timestamp rate, I also don't see a _need_ for the timestamp rate
>> to be the sample rate for audio, other than tradition.  Perhaps I'm
>> missing something.
>
>It's a very useful convention, not a requirement. I don't see anything in
>this codec that justifies breaking that convention.

        And I guess that's the crux - you see it as a convention to be
followed unless there's a VERY strong argument.  I and others seem to feel
that it's at most a minor convention which gives minimal benefit, and
that, given a reasonable argument that breaking it brings some advantage,
we should do so.

-- 
Randell Jesup, Worldgate Communications, ex-Scala, ex-Amiga OS team
rjesup@wgate.com
