Re: [codec] #8: Sample rates?

stephen botzko <stephen.botzko@gmail.com> Tue, 13 April 2010 17:29 UTC

In-Reply-To: <m2v28bf2c661004130941g2e2bf956ld512b5d162df9080@mail.gmail.com>
References: <062.89d7aa91c79b145b798b83610e45ce71@tools.ietf.org> <002a01cadac8$68dbf380$3a93da80$@de> <w2k6e9223711004130337l5ecfccdbl153ac4895aedfdf@mail.gmail.com> <4BC4586F.1010709@digium.com> <o2u6e9223711004130620lb04d335auaafacfa34b0d6fe7@mail.gmail.com> <001e01cadb17$886fcec0$994f6c40$@de> <v2p6e9223711004130756p52726f8bo2db445e749ffe662@mail.gmail.com> <003101cadb1c$828b3990$87a1acb0$@de> <j2l6e9223711004130926nfaa975e3y129cc8cc21c52a84@mail.gmail.com> <m2v28bf2c661004130941g2e2bf956ld512b5d162df9080@mail.gmail.com>
Date: Tue, 13 Apr 2010 13:29:05 -0400
Received: by 10.141.91.16 with SMTP id t16mr5910013rvl.128.1271179746108; Tue, 13 Apr 2010 10:29:06 -0700 (PDT)
Message-ID: <g2h6e9223711004131029m3bfeb1ddq1a0e2bbd8418102f@mail.gmail.com>
From: stephen botzko <stephen.botzko@gmail.com>
To: Roman Shpount <roman@telurix.com>
Content-Type: multipart/alternative; boundary="000e0cd1123c366a6e04842199ef"
Cc: codec@ietf.org
Subject: Re: [codec] #8: Sample rates?

Superwideband (and even fullband) do make speech somewhat more intelligible,
and also reduce listener fatigue.  Telepresence and other videoconferencing
equipment use those acoustic bandwidths today, so it would be nice if CODEC
supported at least superwideband also.

Personally I see some value in carriage of music.  Sometimes our equipment
is used for music performance.  Distance learning is another use case where
music has some value, since course and training materials frequently do
include videos with music.  Though of course conversational speech is the
dominant use case.

BTW, videoconferencing devices do almost always support RTCP.  It is
regrettable that so many VoIP devices do not.  Anyway, I do not think our
charter scope includes invention of a new mechanism for signaling the
network quality.

Stephen Botzko

On Tue, Apr 13, 2010 at 12:41 PM, Roman Shpount <roman@telurix.com> wrote:

> I am not sure if this was decided, but should this new CODEC support music
> encoding? If we don't plan to support music, we should probably stick to a 16
> kHz sampling rate. If we need music, I would suggest having a 24 kHz (or
> higher sampling rate) variant. I am not sure how many people here care about
> a non-voice CODEC. For all practical purposes I don't. I would argue,
> at least, for a fixed 16 kHz sampling rate CODEC variant.
>
> P.S. On the same note, does anybody here care about using this CODEC with
> multicast? Is there a single commercial multicast voice deployment? From
> what I've seen, all multicast does is make IETF voice standards harder to
> understand or implement.
>
> P.P.S. RTCP is almost universally not implemented. The biggest VoIP gateway
> on the market does not generate RTCP. If we rely on any RTCP
> functionality for bandwidth control, it will probably be ignored.
> ______________________________
> Roman Shpount -  www.telurix.com
>
>
> On Tue, Apr 13, 2010 at 12:26 PM, stephen botzko <stephen.botzko@gmail.com
> > wrote:
>
>> TCP is a different case, since for this we are using RTCP to signal our
>> feedback, and I don't think it has the facility you are envisioning.
>>
>> Also, I disagree with your presumption that multicast is out of scope.  I
>> don't know of any other packetization RFCs that expressly rule out
>> multicast, and multicast can be used for interactive applications.
>>
>> This concept seems pretty theoretical to me.  If we need to manage
>> complexity / quality tradeoffs, why not just use profiles (as AVC/H.264
>> does) or create a low complexity variant (like G.729A).  I really don't see
>> the need for *dynamic* complexity management.
>>
>> BTW, you seem to be assuming that a lower sample rate results in
>> significantly less complexity.  The savings there might not be as great as
>> you think, especially if the receiver needs to resample anyway (to prevent
>> those sound card limitations you were talking about before).
>>
>> Stephen Botzko
>>
>> On Tue, Apr 13, 2010 at 11:18 AM, Christian Hoene <hoene@uni-tuebingen.de
>> > wrote:
>>
>>> Hi,
>>>
>>> comments inline:
>>>
>>> *From:* stephen botzko [mailto:stephen.botzko@gmail.com]
>>> *Sent:* Tuesday, April 13, 2010 4:56 PM
>>> *To:* Christian Hoene
>>> *Cc:* codec@ietf.org
>>>
>>> *Subject:* Re: [codec] #8: Sample rates?
>>>
>>>
>>>
>>> This would make the signaling more complicated - personally I am not
>>> convinced it is worth it.
>>>
>>> CH: It is a difficult tradeoff. However, signaling overload is done in
>>> Skype.  Such signaling might be very useful for mobile devices, which
>>> want to save power and thus lower their CPU clock, or for wireless IP-based
>>> headphones, which do not have large batteries. I am thinking of signaling the
>>> states: overloaded, fine, and low. That should be enough for most
>>> operational cases.
>>>
>>>
>>> I think a better avenue is to bound overall complexity, and to focus on
>>> dynamically adapting to network conditions (as opposed to dynamic complexity
>>> management).
>>>
>>> CH: I would just like to point out that good old TCP supports both:
>>> congestion control to adapt to network conditions, and flow control to take
>>> into account an overloaded (= full) receiver.
>>>
>>> You can't dynamically negotiate complexity in many scenarios anyway - for
>>> instance it makes no sense if you are using multicast.
>>>
>>> CH: Multicast is out of scope anyhow. We are considering an interactive
>>> codec.
>>>
>>> CH: The conferencing scenario might be somewhat more difficult to handle but
>>> will not be a big problem.
>>>
>>> Christian
>>>
>>>
>>>
>>> Stephen Botzko
>>>
>>>  On Tue, Apr 13, 2010 at 10:42 AM, Christian Hoene <
>>> hoene@uni-tuebingen.de> wrote:
>>>
>>> Hi,
>>>
>>>
>>>
>>> It still might make sense to negotiate the maximal supported sampling
>>> rate via SDP or, if possible, to select one out of multiple sampling rates,
>>> if the audio receiver can cope with multiple rates well. The internal
>>> sampling frequency of the codec NEED NOT be affected by the external
>>> sampling frequency.
>>>
>>>
>>>
>>> However, the decoder might want to signal to the encoder that the
>>> decoding is requiring too many computational resources and that a less
>>> complex coding mode (or a lower sampling frequency) should be taken.
>>>
>>>
>>>
>>> Christian
>>>
>>>
>>>
>>>
>>>
>>> ---------------------------------------------------------------
>>>
>>> Dr.-Ing. Christian Hoene
>>>
>>> Interactive Communication Systems (ICS), University of Tübingen
>>>
>>> Sand 13, 72076 Tübingen, Germany, Phone +49 7071 2970532
>>> http://www.net.uni-tuebingen.de/
>>>
>>>
>>>
>>> *From:* stephen botzko [mailto:stephen.botzko@gmail.com]
>>> *Sent:* Tuesday, April 13, 2010 3:21 PM
>>> *To:* Kevin P. Fleming
>>> *Cc:* Christian Hoene; codec@ietf.org
>>> *Subject:* Re: [codec] #8: Sample rates?
>>>
>>>
>>>
>>> Though I generally avoid MAY, this could be a case where it makes sense.
>>>
>>> Something like:
>>>
>>> CODEC MAY reduce the acoustic bandwidth at lower bit rates in order to
>>> optimize audio quality.
>>>
>>> This is free of any technology assumption about *how* the acoustic
>>> bandwidth is reduced.  The MAY indicates that it is permissible.  But if the
>>> CODEC algorithm doesn't need to reduce the acoustic bandwidth, then we are
>>> making no statement that it SHOULD (or SHOULD NOT).
>>>
>>> Kevin is distinguishing dynamic changes to the sample rate (for bandwidth
>>> management) from multiple fixed sample rates; and I agree that is a key
>>> distinction.
>>>
>>> I have not heard any clear application requirement for more than one
>>> fixed sampling rate.  Though if there is such a requirement, IMHO we would
>>> have to negotiate the rate within SDP in the usual way, and it would affect
>>> the RTP timestamps, jitter buffers, etc.  G.722.1 / G.722.1C is one
>>> precedent - it is the same core codec, but can run at two different sample
>>> rates (negotiated by SDP).
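
To illustrate the G.722.1 / G.722.1C precedent, here is a minimal sketch of how a receiver might pick one of several fixed sample rates from an SDP offer. The encoding name G7221 and the 16000 / 32000 clock rates follow the G.722.1 RTP payload format; the payload type numbers (102, 115), the OFFER list, and the function names are made up for this example, and the fmtp bitrate parameter is left out for brevity.

```python
# Hypothetical SDP offer: the same core codec at two fixed sample rates.
# Payload types 102 and 115 are arbitrary dynamic values chosen for the example.
OFFER = [
    "a=rtpmap:115 G7221/32000",  # G.722.1C (superwideband)
    "a=rtpmap:102 G7221/16000",  # G.722.1 (wideband)
]


def parse_rtpmap(line):
    """Parse 'a=rtpmap:<pt> <encoding>/<clock>[/<channels>]' into (pt, name, clock)."""
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute: %r" % line)
    pt, _, rest = line[len(prefix):].partition(" ")
    parts = rest.strip().split("/")
    return int(pt), parts[0], int(parts[1])


def pick_payload(offer_lines, supported_clock_rates):
    """Return the first offered (pt, name, clock) whose clock rate is supported."""
    for line in offer_lines:
        pt, name, clock = parse_rtpmap(line)
        if clock in supported_clock_rates:
            return pt, name, clock
    return None  # no common sample rate


# A wideband-only endpoint falls through to the 16 kHz entry.
print(pick_payload(OFFER, {16000}))
```

Because the rate is chosen once per session, the RTP timestamp clock and the jitter buffer stay fixed for the whole call.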
>>>
>>> Stephen Botzko
>>>
>>> On Tue, Apr 13, 2010 at 7:41 AM, Kevin P. Fleming <kpfleming@digium.com>
>>> wrote:
>>>
>>> stephen botzko wrote:
>>>
>>> > Dynamically changing sample rates on the system level adds some
>>> > complexity for RTP, since the timestamp granularity is supposed to be
>>> > the sample rate.
>>>
>>> And jitter buffers, and anything else that is based on timestamps and
>>> sample rates/counts. If the desire is for the codec to be able to change
>>> sample rates to adjust to network conditions, then I agree with
>>> Stephen... the 'external' sample rate (input to the encoder and output
>>> from the decoder) should be fixed, and this is what would be negotiated
>>> in SDP and used for RTP timestamps. The codec can downsample in the
>>> encoder and upsample in the decoder if it has decided to transmit fewer
>>> bits across the network.
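
A rough sketch of the fixed external rate idea described above: the RTP timestamp always advances at the SDP-negotiated clock rate, while the number of samples the codec actually codes per frame may vary. The 48 kHz external rate, the 20 ms frame size, and all names here are assumptions for illustration only, not anything the group has decided.

```python
RTP_CLOCK_HZ = 48000  # assumed external sample rate negotiated in SDP
FRAME_MS = 20         # assumed frame duration in milliseconds


def frame_schedule(initial_ts, internal_rates_hz):
    """Return a (rtp_timestamp, coded_samples) pair for each frame.

    internal_rates_hz gives the codec's internal sample rate per frame.
    The timestamp step depends only on the external clock, so jitter
    buffers never see a change in timestamp granularity.
    """
    step = RTP_CLOCK_HZ * FRAME_MS // 1000  # timestamp ticks per frame
    schedule = []
    for i, rate in enumerate(internal_rates_hz):
        ts = (initial_ts + i * step) % 2**32       # RTP timestamps are 32-bit
        coded_samples = rate * FRAME_MS // 1000    # samples coded internally
        schedule.append((ts, coded_samples))
    return schedule


# The codec drops its internal rate twice; the timestamp step stays at 960.
for ts, n in frame_schedule(0, [48000, 16000, 8000]):
    print(ts, n)
```

In this arrangement the decoder would upsample each frame back to the external rate before playout, so everything outside the codec sees a constant sample clock.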
>>>
>>> --
>>> Kevin P. Fleming
>>> Digium, Inc. | Director of Software Technologies
>>> 445 Jan Davis Drive NW - Huntsville, AL 35806 - USA
>>> skype: kpfleming | jabber: kfleming@digium.com
>>> Check us out at www.digium.com & www.asterisk.org
>>>
>>>
>>>
>>>
>>>