Re: [codec] Next Steps for WG

Roman Shpount <roman@telurix.com> Sat, 15 January 2011 05:24 UTC

In-Reply-To: <276344716.988745.1295041270412.JavaMail.root@lu2-zimbra>
References: <006601cbb3c4$d2cd6870$78683950$@uni-tuebingen.de> <276344716.988745.1295041270412.JavaMail.root@lu2-zimbra>
Date: Sat, 15 Jan 2011 00:27:07 -0500
Message-ID: <AANLkTikB++mV7G5ukho2ipWo=QL7iP9h_R0qu5RZK5Vi@mail.gmail.com>
From: Roman Shpount <roman@telurix.com>
To: Koen Vos <koen.vos@skype.net>
Cc: codec@ietf.org
Subject: Re: [codec] Next Steps for WG

My question was not so much about VAD as about combining multiple streams.
What I want to implement is switching between multiple talkers based on VAD
(computed by some external method). In this case we need to indicate to the
remote party that this is a new stream and, ideally, get the remote decoder
into a state from which it can immediately start decoding the new audio
correctly, without a glitch. So the minimum requirement would be a flag or a
packet that tells the decoder to reset its state. Ideally, we need an
algorithm that derives the proper decoder state from some audio history
stored on the conferencing server, and a way to send this state to the remote
party so that its decoder can be synchronized. I don't think this should
affect codec performance. It is just something that needs to be accommodated
in the bitstream by adding a packet type that either resets the decoder state
or sets it to some specified value.
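
To make the minimum requirement concrete, here is a rough sketch in Python of
the server-side switching logic. It is only an illustration under stated
assumptions: the RESET packet type, the Frame record, and the switch_talkers
function are hypothetical names and are not part of the Opus bitstream or of
any existing API. The VAD decision is taken as an input, since it is computed
externally, and encoded payloads are forwarded without being decoded.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple

# Hypothetical packet kinds; the codec does not define these.
RESET = "reset"  # asks the remote decoder to reset its state
AUDIO = "audio"  # ordinary encoded payload, forwarded untouched


@dataclass
class Frame:
    talker: str     # identifier of the source stream
    payload: bytes  # encoded audio for one frame interval
    active: bool    # VAD decision, computed by some external method


def switch_talkers(
    ticks: Iterable[List[Frame]],
) -> Iterator[Tuple[str, str, bytes]]:
    """Forward one talker at a time, marking each switch with a RESET packet.

    Each element of `ticks` holds the frames received from all talkers for
    one frame interval. Yields (kind, talker, payload) tuples.
    """
    current: Optional[str] = None
    for frames in ticks:
        active = [f for f in frames if f.active]
        if not active:
            continue  # nobody is talking, so send nothing for this interval
        # Stay with the current talker while it is active; otherwise take
        # the first active one.
        chosen = next((f for f in active if f.talker == current), active[0])
        if chosen.talker != current:
            if current is not None:
                # The remote decoder still holds state from the old stream.
                yield (RESET, chosen.talker, b"")
            current = chosen.talker
        yield (AUDIO, chosen.talker, chosen.payload)
```

The "ideal" variant would replace the empty RESET payload with a decoder state
computed from the audio history kept on the server; how to compute that state
is exactly the open question, so the sketch does not attempt it.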

I would also prefer that this work across all the codec modes and not
require the conferencing server to restrict what kinds of encoded audio it
can accept.
_____________
Roman Shpount


On Fri, Jan 14, 2011 at 4:41 PM, Koen Vos <koen.vos@skype.net> wrote:

> Even without DTX, the Voice modes will expose a Voice Activity flag in the
> decoder API. This will reveal without any decoding whether a payload
> contains active speech.
>
> And all decoder modes have low CPU demands, so decoding all streams to
> determine which ones are most active is relatively cheap.
>
> koen.
>
>
> ----- Original Message -----
> From: "Christian Hoene" <hoene@uni-tuebingen.de>
> To: "Jean-Marc Valin" <jmvalin@jmvalin.ca>, "Roman Shpount" <roman@telurix.com>
> Cc: codec@ietf.org
> Sent: Friday, January 14, 2011 12:27:02 AM
> Subject: Re: [codec] Next Steps for WG
>
> The easiest way to mix the multiple streams is to take advantage of the DTX
> feature, which is available in the Silk and hybrid modes. However, I do not
> know whether DTX or VAD is supported in the CELT mode.
>
> Christian
>
> > -----Original Message-----
> > From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf
> > Of Jean-Marc Valin
> > Sent: Friday, January 14, 2011 3:51 AM
> > To: Roman Shpount
> > Cc: codec@ietf.org
> > Subject: Re: [codec] Next Steps for WG
> >
> > Hi Roman,
> >
> > We believe that we meet all requirements that were identified. As for
> > mixing multiple encoded streams, that was marked as desirable in the
> > draft because it's very hard to achieve without sacrificing efficiency.
> > To some extent, if you constrain the Opus encoder to a subset of the
> > available options (CELT only without short blocks or post-filter), then
> > you can create streams that can be mixed at roughly half the normal
> > complexity. I think that's about as much as we can do here.
> >
> > Cheers,
> >
> >       Jean-Marc
> >
> > On 11-01-13 07:34 PM, Roman Shpount wrote:
> > > Just out of curiosity, has the OPUS codec been reviewed against the
> > > list of requirements that we put together?
> > >
> > > In particular, I raised the requirement that multiple encoded streams
> > > from different sources can be efficiently combined. Is it possible
> > > with OPUS? If it is, then how?
> > > _____________________________
> > > Roman Shpount - www.telurix.com <http://www.telurix.com>
> > >
> > >
> > > On Thu, Jan 13, 2011 at 5:29 PM, Cullen Jennings <fluffy@cisco.com
> > > <mailto:fluffy@cisco.com>> wrote:
> > >
> > >
> > >     The OPUS draft authors believe the bitstream for OPUS will be ready
> > >     to be "frozen" by the end of the month. At that point we plan to
> > >     spend the following month testing and, barring any surprises, will
> > >     likely go to WGLC after that.
> > >
> > >     There have been some questions about what "frozen" means in an IETF
> > >     context. The bitstream could change any time up to when the IESG
> > >     approves it, which is after Working Group Last Call and after IETF
> > >     Last Call, so there is no guarantee that nothing will change, but
> > >     by "frozen" we mean that we believe no changes are needed and
> > >     don't plan to make changes unless some significant problem is
> > >     found.
> > >
> > >     Thanks,
> > >     Cullen, Jonathan, and Mike
> > >
> > >
> > >     _______________________________________________
> > >     codec mailing list
> > >     codec@ietf.org <mailto:codec@ietf.org>
> > >     https://www.ietf.org/mailman/listinfo/codec
> > >
>