Re: [codec] #15: Efficiently combine pre-encoded audio

"Benjamin M. Schwartz" <bmschwar@fas.harvard.edu> Wed, 12 May 2010 16:40 UTC

Date: Wed, 12 May 2010 12:22:25 -0400
From: "Benjamin M. Schwartz" <bmschwar@fas.harvard.edu>
To: Jean-Marc Valin <jean-marc.valin@octasic.com>, codec@ietf.org
Reply-To: bens@alum.mit.edu

On 05/12/2010 12:03 PM, Jean-Marc Valin wrote:
> Benjamin M. Schwartz wrote:
>> but how is decoder VAD
>> better than encoder VAD? Encoder VAD saves even more CPU, saves
>> bandwidth, and enables easier jitter buffering.
>
> There's a few reasons why I think decoder-side is better:
> - The decision for an encoder-side VAD would take some amount of space
> in the bit-stream

I think I failed to communicate that by VAD I mean _not sending packets_ 
during inactivity.  For the packets that are sent, the overhead should 
average much less than 1 bit per frame.

I'm not suggesting sending 200 packets a second containing a flag 
indicating no voice activity, followed by carefully coded background 
noise.  That would be silly.
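
A minimal sketch of that behaviour, assuming a plain energy-threshold VAD 
with a short hangover.  The function names, the -50 dBFS threshold and the 
hangover length are hypothetical and chosen only for illustration; nothing 
here comes from a proposed codec design.  The point is that an inactive 
frame produces no packet at all, rather than a packet carrying a "no 
voice" flag.

```python
import math

THRESHOLD_DB = -50.0  # hypothetical activity threshold, in dBFS


def frame_level_db(samples):
    """RMS level of one frame of float samples in [-1, 1], in dBFS."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def packets_to_send(frames, threshold_db=THRESHOLD_DB, hangover=2):
    """Return the indices of the frames that would actually be transmitted.

    A frame is sent if it is active, or if it falls within `hangover`
    frames after the last active one.  All other frames are simply
    not sent.
    """
    sent = []
    quiet_run = hangover  # start in the silent state
    for i, frame in enumerate(frames):
        if frame_level_db(frame) >= threshold_db:
            quiet_run = 0
            sent.append(i)
        elif quiet_run < hangover:
            quiet_run += 1
            sent.append(i)
    return sent
```

With one active frame followed by silence, only the active frame and the 
hangover frames go on the wire; the rest cost no bandwidth.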

> - If we make an encode-side VAD mandatory, then all encoders will have
> to spend the CPU cycles, even when it's not needed. If it's not
> mandatory, then the decoder cannot rely on it, so it still needs to
> implement a VAD

I don't see this as "mandatory".  The encoder can turn off VAD, and 
probably should for full-quality applications.

> - A decoder VAD does not need to be specified in an exact way, so
> implementers can choose different implementations depending on the
> information they need.

The only thing that needs exact specification is the signalling.  The 
encoder may use it or not use it as it pleases.

> - You cannot "game" a decode-side VAD.

I don't know what this means.

>> Are you thinking about some sort of adaptive thresholding that requires
>> knowing all streams' volume levels?
>
> Well, knowing the relative amplitudes of each stream can allow you to
> take more intelligent decisions, e.g. when you have to choose the "most
> active speaker". That's something you can't really get from an encoder VAD.
>
>> Anyway, VAD can run on both encode and decode sides at the same time.
>
> That would just mean nobody would bother implementing the encode side.

I expect encode-side VAD on a conference call to save more than a factor 
of 2 in bandwidth, which makes it very desirable, especially for large 
deployments.  People will use it to save bandwidth (especially if it's on 
by default in the reference implementation).  The decode-side CPU savings 
are just a minor bonus side-effect.
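
The arithmetic behind that kind of estimate can be sketched as follows, 
assuming packets are sent only while a participant is active and ignoring 
hangover and any comfort-noise updates.  The activity fractions are 
illustrative assumptions, not measurements, and `dtx_saving_factor` is a 
hypothetical helper name.

```python
def dtx_saving_factor(activity_fractions):
    """Ratio of packets sent without VAD to packets sent with VAD.

    `activity_fractions` holds, for each participant, the fraction of
    time that participant is actually talking (between 0 and 1).
    """
    always_on = len(activity_fractions)  # every stream sends all the time
    with_vad = sum(activity_fractions)   # each stream sends only when active
    if with_vad == 0:
        return float("inf")
    return always_on / with_vad


# Two parties who each talk half the time: a factor of 2.
two_party = dtx_saving_factor([0.5, 0.5])

# Four parties who each talk a quarter of the time: a factor of 4.
four_party = dtx_saving_factor([0.25] * 4)
```

Under these assumptions the saving grows with the number of mostly silent 
participants, which is why the effect matters most on large calls.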

--Ben
