Re: [codec] #15: Efficiently combine pre-encoded audio

"Benjamin M. Schwartz" <bmschwar@fas.harvard.edu> Wed, 12 May 2010 16:06 UTC

Message-ID: <4BEACEBF.7080403@fas.harvard.edu>
Date: Wed, 12 May 2010 11:52:31 -0400
From: "Benjamin M. Schwartz" <bmschwar@fas.harvard.edu>
To: Jean-Marc Valin <jean-marc.valin@octasic.com>
References: <062.bc75a3b3c4a980df34535f87c9484935@tools.ietf.org> <071.30b67e93d22f0bfedf46b5035d133441@tools.ietf.org> <1F68067D-33B9-4F0C-B31B-B3A56A72DBA4@cisco.com> <4BEAC888.50109@fas.harvard.edu> <4BEACCD7.8080401@octasic.com>
In-Reply-To: <4BEACCD7.8080401@octasic.com>
Cc: codec@ietf.org
Subject: Re: [codec] #15: Efficiently combine pre-encoded audio
Reply-To: bens@alum.mit.edu

On 05/12/2010 11:44 AM, Jean-Marc Valin wrote:
> Benjamin M. Schwartz wrote:
>> The cheapest solution, of course, is transmit-side activity detection.
>> Maybe we need to specify a way for a receiver to request that the
>> transmitter employ (or not employ) VAD.
>
> I think you can do better than an encoder VAD.

I know that CELT makes decoder VAD very efficient, but how is decoder VAD 
better than encoder VAD?  Encoder VAD saves even more CPU, saves 
bandwidth, and enables easier jitter buffering.

Are you thinking about some sort of adaptive thresholding that requires 
knowing all streams' volume levels?
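
To make that question concrete, here is a minimal sketch of one thing such adaptive thresholding could look like at a mixer. It is only an illustration of the idea, not anything CELT or any proposal in this thread specifies. The function names, the -50 dB floor, and the 20 dB margin are all made up for the example, and it assumes the mixer already has a per-stream level estimate (from a cheap partial decode or otherwise).

```python
import math


def rms_dbfs(frame):
    """RMS level of a frame of float samples in [-1, 1], in dB full scale."""
    if not frame:
        return -math.inf
    mean_sq = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(mean_sq) if mean_sq > 0 else -math.inf


def active_streams(levels_db, floor_db=-50.0, margin_db=20.0):
    """Pick the streams worth mixing, given every stream's level.

    levels_db maps a stream id to its level in dB. A stream counts as
    active only if it is above an absolute floor AND within margin_db of
    the loudest stream. The second condition is the part that needs
    knowledge of all streams, so a lone encoder could not evaluate it.
    Both thresholds are hypothetical values chosen for illustration.
    """
    if not levels_db:
        return set()
    loudest = max(levels_db.values())
    threshold = max(floor_db, loudest - margin_db)
    return {sid for sid, level in levels_db.items() if level >= threshold}


if __name__ == "__main__":
    levels = {"a": -20.0, "b": -35.0, "c": -45.0, "d": -60.0}
    print(sorted(active_streams(levels)))
```

With these made-up numbers, a fixed -50 dB threshold applied independently at each encoder would keep streams a, b, and c, while the relative rule keeps only a and b because c is more than 20 dB below the loudest talker.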

Anyway, VAD can run on both encode and decode sides at the same time.

--Ben