Re: [codec] #15: Efficiently combine pre-encoded audio

There is one more application for efficiently combining pre-encoded
audio: playing announcements or recorded audio. Standard network or
IVR announcements can be encoded once and then efficiently inserted or
combined into an audio stream. If pre-encoded audio is supported and
the client supports AVT tones, it is trivial to develop a very
efficient IVR server which does not require any CODEC encoding or
decoding.
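
To illustrate the insertion case, here is a rough sketch of how a
server could play a stored announcement by only writing RTP headers
around frames that were encoded once, offline. The 12-byte header
layout is the standard fixed RTP header; everything else is my
assumption for the example: the function names, payload type 96, and
160 timestamp units per frame (20 ms at 8 kHz).

```python
import struct

RTP_VERSION = 2


def rtp_packet(payload, seq, timestamp, ssrc, payload_type=96, marker=False):
    """Wrap one pre-encoded frame in a fixed 12-byte RTP header.

    payload_type=96 is an assumed dynamic payload type, not a real
    assignment. No padding, extension, or CSRC entries are used.
    """
    byte0 = RTP_VERSION << 6                      # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",
        byte0,
        byte1,
        seq & 0xFFFF,                             # 16-bit sequence number wraps
        timestamp & 0xFFFFFFFF,                   # 32-bit timestamp wraps
        ssrc & 0xFFFFFFFF,
    )
    return header + payload


def play_announcement(frames, seq, timestamp, ssrc, samples_per_frame=160):
    """Yield RTP packets for already-encoded frames.

    The frames are copied as-is; no encoder or decoder is involved.
    seq and timestamp continue from wherever the outgoing stream is.
    samples_per_frame=160 assumes 20 ms frames at 8 kHz.
    """
    for i, frame in enumerate(frames):
        yield rtp_packet(
            frame,
            seq + i,
            timestamp + i * samples_per_frame,
            ssrc,
            marker=(i == 0),                      # mark the start of the talkspurt
        )
```

The per-packet cost here is one header pack and one copy, which is the
whole point: the announcement is never decoded or re-encoded.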

Efficient decoder-side VAD is also very helpful for speech
recognition, where it saves cycles in the end-pointer. This way audio
needs to be decoded and passed to the speech recognition system only
when voice is present.
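
A minimal sketch of that gating, assuming the codec exposes some cheap
voice-activity check on the encoded frame. The has_voice, decode, and
recognizer callables are hypothetical placeholders, not part of any
existing codec API.

```python
def recognize_stream(frames, has_voice, decode, recognizer):
    """Decode and forward only the frames flagged as speech.

    has_voice(frame) is a hypothetical decoder-side VAD check that
    inspects the encoded frame without a full decode. decode(frame)
    and recognizer(pcm) are likewise placeholders supplied by the
    caller. Returns the number of frames that were actually decoded.
    """
    decoded = 0
    for frame in frames:
        if not has_voice(frame):
            continue                 # silence: skip decode and end-pointer work
        recognizer(decode(frame))
        decoded += 1
    return decoded
```

With mostly silent input, the decoder and the recognizer front end run
on only a fraction of the frames received.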

Bottom line: if we have both efficient decoder-side VAD and combining
of pre-encoded audio, we can develop some very efficient VXML servers,
voice mail, and IVR systems, not just conferencing servers.
_____________________________
Roman Shpount - www.telurix.com

On Wed, May 12, 2010 at 12:43 PM, Benjamin M. Schwartz
<bmschwar@fas.harvard.edu> wrote:
> On 05/12/2010 12:37 PM, Jean-Marc Valin wrote:
>>
>> Benjamin M. Schwartz wrote:
>>>
>>> I think I failed to communicate that by VAD I mean _not sending packets_
>>> during inactivity. For the packets that are sent, the overhead should
>>> average much less than 1 bit per frame.
>>
>> What you're describing is called DTX (discontinuous transmission).
>
> Oops. Right.  What I'm trying to say is that DTX, based on encoder-side VAD,
> also greatly reduces the (average) computational burden on a conference
> mixer.  Of course, if everyone's really talking at once then VAD can't help.
>
> --Ben