Re: [codec] #3: 2.2. Conferencing: Support of binaural audio?
"codec issue tracker" <trac@tools.ietf.org> Sat, 01 May 2010 10:44 UTC
#3: 2.2. Conferencing: Support of binaural audio?
------------------------------------+---------------------------------------
 Reporter:  hoene@…                 |       Owner:
     Type:  enhancement             |      Status:  new
 Priority:  major                   |   Milestone:
Component:  requirements            |     Version:
 Severity:  -                       |    Keywords:
------------------------------------+---------------------------------------

Comment(by hoene@…):

[Hoene]: I am trying to compile the different requirements that have been mentioned on this list.
 - low complexity (with just one active speaker) vs. multiple speaker mixing vs. spatial audio/stereo mixing
 - centralized vs. distributed
 - few participants vs. hundreds of listeners and talkers
 - individual distribution of audio streams vs. IP multicast or RTP group communication
 - efficient encoding of multiple streams having the same content (but different quality).

To make things easier, why not split the teleconferencing scenario in two: High quality and Scalable?

The high quality scenario, intended for a low number of users, could have features like
 - Distributed processing and mixing
 - High computational resources to support spatial audio mixing (at the receiver) and multiple encodings of the same audio stream at different qualities (at the sender)
 - Enough bandwidth to allow direct N to N transmissions of audio streams (no multicast or group communication). This would be good for the latency, too.

The scalable scenario is the opposite:
 - Central processing and mixing for many participants.
 - N to 1 and 1 to N communication using efficient distribution mechanisms (RTP group communication and IP multicast).
 - Low complexity mixing of many using tricks like VAD, encoding at lowest rate to support many receivers having different paths, you name it...

High quality:
 - Quite the same requirement as an end-to-end audio transmission: high quality and low latency.
 - Maybe additionally: variable bit rate encoding to achieve a multiplexing gain at the receiver, and thus a fast control loop to cope with variable bitrates on transmission paths.
 - Maybe stereo/multichannel support to send the spatial audio to the headphones or loudspeakers.

Scalable:
 - Efficient encoding/transcoding for multiple different qualities (at the conference bridge)
 - The control loop does not need to react fast, because (multicast) group communication requires encoding at low quality anyhow.
 - Receiver-side activity detection for music and voice with low complexity (for the conference bridge)
 - Efficient mixing of two to four(?) active flows (is this achievable without the complete process of decoding and encoding again?)

[Raymond]: High quality is a given, but I would like to emphasize the importance of low latency.

(1) It is well-known that the longer the latency, the lower the perceived quality of the communication link. [...]

(2) The lower the latency, the less audible the echo, and thus the lower the required echo return loss. Hence, lower latency means easier echo control and a simpler echo canceller, and as people already mentioned previously, below a certain delay, an echo is simply perceived as a harmless side-tone and no echo canceller is needed. It seems to me that echo control in conference calls is more difficult than in point-to-point calls. While I have hardly ever heard echoes in domestic point-to-point calls, in my experience with conference calls at work, even with the G.711 codec (which has almost no delay), I sometimes still hear echoes (I just heard another one this afternoon). If a relatively long-delay IETF codec is used, the echo control will be even more problematic.

(3) In normal phone calls or conference calls, people routinely have a need to interrupt each other, but beyond a certain point, long latency makes it very difficult for people to interrupt each other on the call.
This is because when you try to interrupt another person, that person doesn’t hear your interruption until a certain time later, so he keeps talking; when you hear that he did not stop talking when you interrupted, you stop; then he hears your interruption, so he stops. When you hear that he has stopped, you start talking again, but by then he has also heard that you stopped (due to the long delay), so he also starts talking again. The net result is that with a long latency, when you try to interrupt him, you and he end up stopping and starting at roughly the same time for a few cycles, making it difficult to interrupt each other.

[Jean-Marc]: The decoder complexity is very important. Not only because of the mixing issue, but also because the decoder is generally not allowed to take shortcuts to save on complexity (unlike the encoder). As for compressed-domain mixing, as you say it is not always available, but *if* we can do it (even if only partially), then that can result in a "free" reduction in decoder complexity for mixing.

--
Ticket URL: <http://trac.tools.ietf.org/wg/codec/trac/ticket/3#comment:1>
codec <http://tools.ietf.org/codec/>
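
The "low complexity mixing of many using tricks like VAD" and "mixing of two to four(?) active flows" items above can be illustrated with a small sketch. It is only an illustration under stated assumptions, not something proposed on the list: it works on already-decoded 16-bit PCM frames (so it does not address compressed-domain mixing), it uses plain frame energy as a crude stand-in for a real VAD, and the names frame_energy, mix_active, max_active and energy_floor are all hypothetical.

```python
from typing import Dict, List


def frame_energy(frame: List[int]) -> float:
    """Mean squared amplitude of one PCM frame (crude activity measure, not a real VAD)."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def mix_active(frames: Dict[str, List[int]],
               max_active: int = 3,
               energy_floor: float = 1000.0) -> List[int]:
    """Mix only the loudest `max_active` flows whose energy reaches `energy_floor`.

    `frames` maps a participant id to that participant's decoded frame.
    Both thresholds are illustrative assumptions.
    """
    # Rank all flows by energy, loudest first.
    ranked = sorted(frames.items(),
                    key=lambda kv: frame_energy(kv[1]),
                    reverse=True)
    # Keep at most max_active flows, and only those above the floor.
    active = [f for _, f in ranked[:max_active]
              if frame_energy(f) >= energy_floor]
    if not active:
        # Nobody is talking: emit silence of the same frame length.
        length = len(next(iter(frames.values()), []))
        return [0] * length
    length = min(len(f) for f in active)
    mixed = []
    for i in range(length):
        total = sum(f[i] for f in active)
        # Clip the sum to the signed 16-bit range.
        mixed.append(max(-32768, min(32767, total)))
    return mixed


if __name__ == "__main__":
    frames = {
        "a": [1000, -1000, 1000, -1000],
        "b": [10, 10, 10, 10],              # background noise, below the floor
        "c": [20000, 20000, -20000, -20000],
    }
    print(mix_active(frames, max_active=2))
```

With many participants, the cost of the summation depends on max_active rather than on the number of flows; only the energy ranking touches every flow, which is the kind of saving the scalable scenario is after.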