Re: [codec] Next Steps for WG

Roman Shpount <roman@telurix.com> Sat, 15 January 2011 22:05 UTC

First of all, the codec definition should be independent of the transport
protocol, and we might need this functionality when RTP SSRCs are not
available.

Furthermore, in case of RTP, there are two problems with your suggestion:

1. Quite a few clients do not allow the remote party to change the SSRC
without a re-INVITE. These SIP clients note the SSRC of the first received
RTP packet and ignore RTP packets with a different SSRC.

2. Even when the client allows switching the SSRC on the fly, the first few
RTP packets are typically discarded due to RTP probation. After this,
clients typically need to pre-fill the jitter buffer, and only then do they
start audio playback. This produces a gap of 40 ms to 100 ms in the audio,
which is very audible and highly undesirable.
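
To make both problems concrete, here is a rough Python sketch. It is only an
illustration under assumptions of mine: the class and function names are
hypothetical, and the 20 ms packet time and the probation and pre-fill packet
counts are example values, not measurements from any particular client.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LockedReceiver:
    """Toy receiver that latches the SSRC of the first RTP packet it sees."""
    locked_ssrc: Optional[int] = None
    played: List[int] = field(default_factory=list)  # sequence numbers accepted

    def on_packet(self, ssrc: int, seq: int) -> bool:
        if self.locked_ssrc is None:
            self.locked_ssrc = ssrc      # remember the SSRC of the first packet
        if ssrc != self.locked_ssrc:
            return False                 # packet from a different SSRC is ignored
        self.played.append(seq)
        return True


def switch_gap_ms(ptime_ms: int, probation_packets: int, prefill_packets: int) -> int:
    """Silence after an SSRC switch: discarded probation packets plus
    the packets that must arrive before the jitter buffer starts playback."""
    return (probation_packets + prefill_packets) * ptime_ms


# Problem 1: the new talker arrives with a new SSRC and is never played.
rx = LockedReceiver()
rx.on_packet(ssrc=0x1111, seq=1)
rx.on_packet(ssrc=0x2222, seq=2)
print(rx.played)                  # [1]

# Problem 2: with 20 ms packets (assumed), small packet counts already
# land in the 40 ms to 100 ms range.
print(switch_gap_ms(20, 1, 1))    # 40
print(switch_gap_ms(20, 2, 3))    # 100
```

A real client would use the source validation procedure from the RTP
specification rather than this simplified latch, but the resulting gap follows
the same arithmetic.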

Finally, what I was looking for was a bit more than just a decoder reset.
Ideally, I want to set the decoder state to a known value so that it can
correctly start decoding the audio packets that I will send to it.
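
The difference between a plain reset and setting the state can be shown with a
toy predictive codec. This is a minimal Python sketch, assuming a trivial
delta coder whose only state is the previous sample; the names DeltaDecoder,
set_state and encode are hypothetical and have nothing to do with any real
codec proposal.

```python
from typing import List


class DeltaDecoder:
    """Toy predictive decoder: each code is a difference from the previous sample."""

    def __init__(self) -> None:
        self.prev = 0

    def reset(self) -> None:
        self.prev = 0                 # forget all history

    def set_state(self, prev: int) -> None:
        self.prev = prev              # adopt a state supplied by the sender

    def decode(self, deltas: List[int]) -> List[int]:
        out = []
        for d in deltas:
            self.prev += d
            out.append(self.prev)
        return out


def encode(samples: List[int], prev: int) -> List[int]:
    """Encode samples as differences, starting from encoder state `prev`."""
    deltas = []
    for s in samples:
        deltas.append(s - prev)
        prev = s
    return deltas


# The server has been encoding the new talker for a while, so its
# encoder state is the last sample in its audio history (assumed value).
history_last = 500
new_audio = [510, 505, 498]
deltas = encode(new_audio, prev=history_last)

reset_only = DeltaDecoder()
reset_only.reset()
print(reset_only.decode(deltas))      # [10, 5, -2], wrong audio

synced = DeltaDecoder()
synced.set_state(history_last)
print(synced.decode(deltas))          # [510, 505, 498], matches new_audio
```

With a reset alone, the server would have to restart its own encoder from the
reset state as well; sending the state lets it keep encoding from its history.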

P.S. As a side note, the Intel Performance Primitives codecs implement a packet
type in their wave file format that resets the decoder. This packet is used to
simplify the implementation of performance and regression tests. I think they
use a standard-sized packet with all bits set to zero for this purpose.
We could at least do something similar.
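
Something along these lines is what I have in mind for an in-band reset. This
is a hedged Python sketch, not the IPP API or file format: FRAME_SIZE,
ToyDecoder and decode_stream are hypothetical, and it assumes that an all-zero
frame is never a valid audio frame for the codec.

```python
from typing import List

FRAME_SIZE = 4                     # hypothetical fixed frame size in bytes
RESET_FRAME = bytes(FRAME_SIZE)    # standard-sized frame with all bits zero


class ToyDecoder:
    """Stand-in decoder whose state is a running sum of payload bytes."""

    def __init__(self) -> None:
        self.state = 0
        self.resets = 0

    def reset(self) -> None:
        self.state = 0
        self.resets += 1

    def decode_frame(self, frame: bytes) -> int:
        self.state += sum(frame)
        return self.state


def decode_stream(decoder: ToyDecoder, stream: bytes) -> List[int]:
    """Decode fixed-size frames; an all-zero frame resets the decoder
    and produces no output of its own."""
    if len(stream) % FRAME_SIZE != 0:
        raise ValueError("stream must be a whole number of frames")
    out = []
    for i in range(0, len(stream), FRAME_SIZE):
        frame = stream[i:i + FRAME_SIZE]
        if frame == RESET_FRAME:
            decoder.reset()
            continue
        out.append(decoder.decode_frame(frame))
    return out


stream = bytes([1, 2, 3, 4]) + bytes([1, 1, 1, 1]) + RESET_FRAME + bytes([1, 2, 3, 4])
dec = ToyDecoder()
print(decode_stream(dec, stream))  # [10, 14, 10]
```

The first and last frames are identical and decode to the same value because
the reset frame between them clears the state, which is what makes such a
packet useful for repeatable regression tests.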

_____________
Roman Shpount


On Sat, Jan 15, 2011 at 8:12 AM, Kevin P. Fleming <kpfleming@digium.com> wrote:

> On 01/14/2011 11:27 PM, Roman Shpount wrote:
>
>> My question was not so much about the VAD as about multiple stream
>> combining. What I want to implement is switching between multiple
>> talkers based on VAD (the computation of which is done by some external
>> methods). In this case we need to indicate to the remote party that this
>> is a new stream and ideally get the remote decoder in such a state that
>> it can immediately start correctly decoding the new audio without a
>> glitch. So, the minimum requirement would be to have a flag or a packet
>> that indicates to the decoder that it needs to reset its state. Ideally, we
>> need an algorithm to get the proper decoder state based on some audio
>> history stored on the conferencing server, and to send this decoder
>> state to the remote party so that it can be synchronized. I don't think
>> this should affect the codec performance. This is just something that
>> needs to be accommodated in the bitstream by adding a packet type which
>> either resets the decoder state or sets it to some specified value.
>>
>
> This sort of mechanism already exists when using RTP as the transport
> mechanism, by setting the marker bit and changing the SSRC to indicate that
> the payload in the packet is from a different source than the previous
> packet. In my opinion there's no need for the codec bitstream to have any
> provisions for such an indication.
>
> --
> Kevin P. Fleming
> Digium, Inc. | Director of Software Technologies
> 445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
> skype: kpfleming | jabber: kfleming@digium.com
> Check us out at www.digium.com & www.asterisk.org