Re: [codec] requirements #8 (new): Sample rates?

Jean-Marc Valin <jean-marc.valin@octasic.com> Wed, 26 January 2011 21:22 UTC

Message-ID: <4D40909D.10503@octasic.com>
Date: Wed, 26 Jan 2011 16:22:37 -0500
From: Jean-Marc Valin <jean-marc.valin@octasic.com>
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7
MIME-Version: 1.0
To: Koen Vos <koen.vos@skype.net>
References: <731662711.1415662.1296076131142.JavaMail.root@lu2-zimbra>
In-Reply-To: <731662711.1415662.1296076131142.JavaMail.root@lu2-zimbra>
Content-Type: text/plain; charset="windows-1252"; format="flowed"
Content-Transfer-Encoding: 8bit
X-Originating-IP: [10.100.60.27]
Cc: codec@ietf.org, Pochol@WebfootGames.com
Subject: Re: [codec] requirements #8 (new): Sample rates?

On 11-01-26 04:08 PM, Koen Vos wrote:
> 1. How to call these custom modes (for instance in the SDP descriptor).
> As long as we agree that it is not "Opus", then we prevent compatibility
> issues and we keep our on-the-fly switching flexibility within the
> standard Opus format.  I'm fine with "Opus-custom".

OK, unless there are objections I think we can go with "Opus-custom".

> 2. How to enable the custom modes in the code.  I think there must be a
> small barrier so that users don't unintentionally start using them.
> Maybe a well-commented #define somewhere, plus an API control flag to
> put the codec in custom mode?

Agreed. The API will *have* to be different because in the normal mode
there won't even be a need to specify things like frame size. As for a
#define, it also makes sense because it saves code size by default.
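
To make the "#define plus a different API" idea concrete, here is a minimal
sketch in C. Every name in it (CODEC_ENABLE_CUSTOM_MODES, codec_config,
codec_config_standard, codec_config_custom) is hypothetical and is not taken
from the Opus or CELT sources. It only illustrates the shape: the standard
path takes no rate or frame size, and the custom path fails unless the build
switch is on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build switch. It is defined here so the sketch exercises
 * the custom path; a default build would leave it undefined, which
 * compiles the custom-mode code out. */
#define CODEC_ENABLE_CUSTOM_MODES 1

/* Hypothetical configuration record (illustrative names only). */
typedef struct {
    int custom;       /* 0 = standard mode, 1 = custom mode */
    int sample_rate;  /* Hz; only meaningful in custom mode */
    int frame_size;   /* samples per frame; only meaningful in custom mode */
} codec_config;

/* Standard mode: the caller specifies neither rate nor frame size. */
static codec_config codec_config_standard(void)
{
    codec_config cfg = { 0, 0, 0 };
    return cfg;
}

/* Custom mode: the caller must ask for it explicitly.
 * Returns 0 on success, -1 if the arguments are invalid or if custom
 * modes were not compiled in. */
static int codec_config_custom(codec_config *cfg, int sample_rate,
                               int frame_size)
{
    assert(cfg != NULL);
#ifdef CODEC_ENABLE_CUSTOM_MODES
    if (sample_rate <= 0 || frame_size <= 0)
        return -1;
    cfg->custom = 1;
    cfg->sample_rate = sample_rate;
    cfg->frame_size = frame_size;
    return 0;
#else
    (void)sample_rate;
    (void)frame_size;
    return -1;
#endif
}
```

With the switch undefined, codec_config_custom() always returns -1, so a
user cannot end up in a custom mode by accident.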

> 3. Should there be a set of "official" custom modes, to aid interop even
> with Opus-custom?  Users could still pick and choose any modes they
> want, but having the list could make users gravitate towards a few
> common choices.

I think the following are the main custom modes we want to recommend:
1) 48 kHz, frame sizes 128, 256, 512, 1024 (switchable)
2) 44.1 kHz, frame sizes 128, 256, 512, 1024 (switchable)
3) 48 kHz, frame size 64
4) 44.1 kHz, frame size 64

though the other modes will still be available.
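
As a rough illustration, the four recommended modes above could be written
down as a small table, with a helper that converts a frame size to its
duration. This is only a sketch: the type and function names (custom_mode,
recommended_modes, frame_duration_us, find_recommended_mode) are made up for
this example and do not come from any existing API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one recommended custom mode. */
typedef struct {
    int sample_rate;     /* Hz */
    int frame_sizes[4];  /* samples per frame; unused slots are 0 */
    size_t n_sizes;      /* number of valid entries in frame_sizes */
} custom_mode;

/* The four modes listed above, in the same order. */
static const custom_mode recommended_modes[] = {
    { 48000, { 128, 256, 512, 1024 }, 4 },
    { 44100, { 128, 256, 512, 1024 }, 4 },
    { 48000, { 64, 0, 0, 0 }, 1 },
    { 44100, { 64, 0, 0, 0 }, 1 },
};

/* Frame duration in microseconds, rounded to the nearest integer. */
static long long frame_duration_us(int sample_rate, int frame_size)
{
    assert(sample_rate > 0 && frame_size > 0);
    return ((long long)frame_size * 1000000LL + sample_rate / 2)
           / sample_rate;
}

/* Index of the first recommended mode that offers this rate and frame
 * size, or -1 if the combination is not on the recommended list. */
static int find_recommended_mode(int sample_rate, int frame_size)
{
    size_t n = sizeof(recommended_modes) / sizeof(recommended_modes[0]);
    for (size_t i = 0; i < n; i++) {
        const custom_mode *m = &recommended_modes[i];
        if (m->sample_rate != sample_rate)
            continue;
        for (size_t j = 0; j < m->n_sizes; j++) {
            if (m->frame_sizes[j] == frame_size)
                return (int)i;
        }
    }
    return -1;
}
```

For example, a 64-sample frame lasts about 1.33 ms at 48 kHz and about
1.45 ms at 44.1 kHz, while a 1024-sample frame lasts about 21.3 ms and
23.2 ms respectively.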

Cheers,

	Jean-Marc

> koen.
>
>
>
> ----- Original Message -----
> From: "Gregory Maxwell" <gmaxwell@juniper.net>
> To: Pochol@WebfootGames.com, "Jean-Marc Valin" <jean-marc.valin@octasic.com>
> Cc: codec@ietf.org
> Sent: Wednesday, January 26, 2011 11:32:01 AM
> Subject: Re: [codec] requirements #8 (new): Sample rates?
>
>
>
> I'm very concerned that some people may believe that 44.1k is just
> another checkbox that can be added without cost or much consideration.
> This isn't the case. I think it's essential that we delineate realtime
> VoIP style usage from other applications.
>
> I support Jean-Marc's option (1), which I think allows us to have our
> cake and eat it too. It's the closest thing to a "cost free" option that
> I think we're going to get. This option basically separates the codec
> into two profiles, one which imposes sampling rate restrictions in
> exchange for many important advantages, and one which does not, but
> misses the advantages.
>
> JM's option (2) would also work. But I don't like the idea of the market
> confusion created by a totally separate codec which has significant
> overlap with Opus, but isn't Opus.  Basically, I think it's silly for
> part of the working group's output to compete with itself.  I'd prefer
> to just have "Opus" and "Opus-custom" or whatever.
>
>
> There are several reasons that supporting 44.1kHz in the primary Opus
> profile would be bad:
>
> One reason for this is that quite a bit of hardware (even on desktops)
> can only do 48kHz (or closely related rates), and even when the
> hardware can do multiple rates it can only ever do one rate at a time.
> So if it's even possible that multiple applications may play sound at
> once, the only way to avoid resampling is for all of them to run at a
> common rate.
>
> For the 48kHz related rates we can do very computationally cheap
> handling of different rates purely inside the codec. If a user's
> hardware supports any mode out of the 48kHz family, then they'll need
> no costly resampling at all. And if they do need to run at 44.1k they
> can resample in and out of the codec without imposing on (or
> negotiating with) the far end.
>
> Another one is that Opus (as described in the draft, without 44.1kHz)
> can switch between any of its supported modes, all on the fly, without
> creating any surprising impositions on the clients. This is possible
> only because of the closely related nature of the supported rates, and
> 44.1kHz can't be accommodated in this scheme.
>
> The on-the-fly switching and the lack of any requirement to negotiate
> also mean that two Opus devices can communicate without transcoding,
> even if they were spliced long after negotiation (e.g. as part of a
> conference gateway).
>
>
> On the other end of the spectrum, the current CELT library and
> bitstream are extremely flexible. They support a great many frame sizes
> and sample rates and I've personally argued against every limitation
> we've imposed on them, because I like the idea that CELT can fit into
> every niche requirement (like the DAB frame sizes).
>
> But this flexibility has a serious price for interoperable
> implementations: They must carry substantially more code (the limited
> rates/sample sizes mean that a simple table can replace several
> hundred lines of tricky bit-exact initialization code), cope with
> increased peak CPU usage (e.g. if some device only speaks 64-sample
> frames at 96kHz you might need 10x the CPU power to speak to it
> compared to your preferred mode), and undertake more complicated
> negotiation and testing (CELT can support far more unique mixtures of
> sample rate, frame-size, and channel count than there are RTP payload
> types).
>
> So basically, I support fully supporting oddball configurations as a
> well specified standardized mode, but I oppose subjecting the general
> VoIP/RTP users to the increased complexity and limitations of the more
> rate/framesize agile configurations.
>
> I expect that most users who care about the 'custom' modes are doing
> other specialized things and won't be expected to ever interop with a
> random Opus phone except (maybe) via a gateway, and that most of them
> won't even speak RTP, so the separation shouldn't even make much
> difference to them at all.
>
>
> Thoughts?
>
>
>
>
> ________________________________________
> From: codec-bounces@ietf.org [codec-bounces@ietf.org] On Behalf Of
> Pascal Pochol [Pochol@WebfootGames.com]
> Sent: Wednesday, January 26, 2011 6:06 PM
> To: Jean-Marc Valin
> Cc: codec@ietf.org
> Subject: Re: [codec] requirements #8 (new): Sample rates?
>
> Hello,
>
> I just wanted to give my 2 cents about native 44.1kHz support.
>
> 99% of all the audio we use CELT (and eventually Opus) for is encoded
> at 44.1kHz. It is provided to us that way. That means that without
> 44.1kHz support we'll have to up-sample 99% of our audio, most likely
> in a preprocess build, making us maintain 2 sets of audibly identical
> files. We had to do it before with Speex, where we converted from 44.1
> down to 32kHz to use its native ultrawideband, but it really wasn't the
> easiest. We had thousands of files duplicated, and every now and then a
> few of these got updated from the source, forcing us to redo massive
> conversions each time to make sure we didn't miss one somewhere.
>
> Also, about upsampling not costing much, I beg to differ. We had to
> work with hardware that could decode Speex ultrawideband 32kHz just
> fine, but the decoding alone was eating up all our CPU, leaving not
> much else to do the real work that we needed to do. We had to use 16kHz
> instead to make it all work. 48kHz versus 44.1kHz might not look like a
> big difference when working on a desktop, but when you're counting
> bytes to see how you can reduce your memory consumption it could mean
> the world.
>
> So strictly from the point of view of a user of the codec, native
> 44.1kHz would certainly make working with CELT/Opus a lot easier. I'm
> guessing that I'm not the only one in that case, based on this thread.
> Easier would also lead to faster adoption.
>
> Sorry, I didn't intend to write that much. In short, 44.1kHz: great if
> you can do it, if not we'll just have to work around it.
>
> -Pascal
>
> _______________________________________________ codec mailing list
> codec@ietf.org https://www.ietf.org/mailman/listinfo/codec