Re: [codec] #3: 2.2. Conferencing: Support of binaural audio?

Stefan Sayer <> Tue, 23 March 2010 21:04 UTC

Date: Tue, 23 Mar 2010 22:04:42 +0100
From: Stefan Sayer <>
To: Gregory Maxwell <>
Cc: 'Slava Borilin' <>, "" <>
Subject: Re: [codec] #3: 2.2. Conferencing: Support of binaural audio?

o Gregory Maxwell [03/23/2010 07:59 PM]:
> Christian Hoene [] wrote:
>> (a) does not have any impact on the requirements but in case of (b) codec requirements are the support of stereo speech transmission and support for efficient mixing.
> I agree that (b) requires support for stereo. 
I agree as well: stereo encoding support is required, and for spatial
audio conferencing with centralized mixing it is also sufficient. The
question here would be: can the codec optimize stereo encoding if it
knows that the stereo signal contains sources which are spatially
rendered at known angles (so that the two channels are strongly
correlated with a known time difference)? Or will the usual methods
employed in stereo encoding already catch this?
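
To make the "strongly correlated with a known time difference" point
concrete, here is a minimal sketch, not taken from any codec, that
recovers the inter-channel delay of a single rendered source by brute-force
cross-correlation. The test signal, the sample rate, the 5-sample delay
and the function name estimate_delay are all assumptions made up for
illustration; a real stereo coder would do something far more refined.

```python
import math


def estimate_delay(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive result means the right channel lags the left channel.
    Brute-force cross-correlation over lags in [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, sample in enumerate(left):
            m = n + lag
            if 0 <= m < len(right):
                score += sample * right[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical test signal: a decaying 440 Hz burst, with the right
# channel delayed by 5 samples (values chosen only for this example).
SAMPLE_RATE = 16000
DELAY = 5
source = [
    math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) * math.exp(-n / 200)
    for n in range(400)
]
left = source + [0.0] * DELAY
right = [0.0] * DELAY + source

estimated = estimate_delay(left, right, max_lag=20)
print("estimated delay in samples:", estimated)
```

If the renderer's angle (and hence the delay) is already known, the
search above is unnecessary; the open question is whether a codec could
exploit that knowledge better than its generic stereo tools do.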

It is also my view that for decentralized mixing, a positioning 
mechanism should not be part of the codec.

Note that support for stereo and multi-channel is also explicitly
mentioned in applications 2.3 Telepresence and 2.6 Live distributed
music performances / Internet music lessons.

> Of course, minimizing the codec computational burden is helpful for scaling conferencing systems.

> Can anyone share some typical conference scaling numbers which they consider interesting?
> It's been my view that all comers are fast enough that conferencing on commercial scale isn't
> much of an issue:  That, yes, it might take some serious processing grunt to handle 10,000 users on a conferencing system,
> [...]

For PSTN-quality (G.711) conferencing, a good PC-based server without
any extra DSP hardware can today handle 5000 users, but that drops to
about 60% of that figure if you use the GSM codec, and to about 23% for
the iLBC or Speex codec. So minimizing the computational burden is
indeed helpful.
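
Spelled out as users per server, those percentages work out as in the
small sketch below. The 5000-user baseline and the 60% / 23% figures are
the rough numbers quoted above; the variable and function names are only
illustrative.

```python
# Rough figures from the message above: a G.711 baseline of 5000 users
# per server, and the fraction of that capacity left with other codecs.
G711_USERS = 5000
RELATIVE_CAPACITY = {
    "G.711": 1.00,
    "GSM": 0.60,
    "iLBC or Speex": 0.23,
}


def users_per_server(baseline_users, fraction):
    """Scale the baseline user count by a codec's relative capacity."""
    return round(baseline_users * fraction)


capacity = {
    codec: users_per_server(G711_USERS, fraction)
    for codec, fraction in RELATIVE_CAPACITY.items()
}

for codec, users in capacity.items():
    print(f"{codec}: about {users} users per server")
```

So the same machine goes from about 5000 users with G.711 to about 3000
with GSM and about 1150 with iLBC or Speex.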