Re: [codec] requirements #8 (new): Sample rates?

Jean-Marc Valin <jean-marc.valin@octasic.com> Wed, 26 January 2011 13:48 UTC

Message-ID: <4D4026D4.6010404@octasic.com>
Date: Wed, 26 Jan 2011 08:51:16 -0500
From: Jean-Marc Valin <jean-marc.valin@octasic.com>
To: Koen Vos <koen.vos@skype.net>
Cc: codec@ietf.org
Subject: Re: [codec] requirements #8 (new): Sample rates?

Hi,

I think it goes beyond just having native support for 44.1 kHz. There are
several existing applications that currently use CELT in modes that do not
have a 2.5/5/10/20 ms frame size. There are different reasons for this
depending on the application. Some of these are:
1) Some implementers choose frame sizes that are powers of two so that they
can use their platform's highly optimized FFT.
2) For ultra low delay applications, such as "jamming over the network" 
remote music performances, one can reduce latency by choosing a frame size 
that matches the soundcard's buffer size (which is almost always a power of 
two).
3) There are even users who consider 2.5 ms to be too high a delay for
their applications and use frames of 1.33 ms (64 samples at 48 kHz).
4) In some cases, transport or protocol issues impose some constraints on 
the frame size. For example, the DAB digital radio standard requires frames 
of 24 ms, which CELT can also handle.
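
To make the arithmetic behind these cases concrete, here is a minimal sketch in Python. It is not CELT code; the helper names (frame_samples, frame_duration_ms, is_power_of_two) are made up for illustration. It converts between frame duration and sample count using exact fractions, so a non-integer frame length such as 2.5 ms at 44.1 kHz shows up as a fraction rather than being rounded silently.

```python
from fractions import Fraction


def frame_samples(duration_ms, rate_hz):
    """Exact number of samples in a frame of duration_ms at rate_hz.

    duration_ms may be an int or a decimal string such as "2.5".
    """
    return Fraction(duration_ms) * rate_hz / 1000


def frame_duration_ms(samples, rate_hz):
    """Exact duration in milliseconds of a frame of `samples` samples."""
    return Fraction(samples * 1000, rate_hz)


def is_power_of_two(n):
    """True if n is a positive integer power of two."""
    return n > 0 and (n & (n - 1)) == 0


if __name__ == "__main__":
    # Hypothetical survey of the frame sizes discussed above.
    for rate in (48000, 44100):
        for ms in ("2.5", "5", "10", "20", "24"):
            n = frame_samples(ms, rate)
            whole = n.denominator == 1
            pow2 = whole and is_power_of_two(n.numerator)
            print(f"{ms:>4} ms @ {rate} Hz: {float(n):8.2f} samples, "
                  f"whole={whole}, power_of_two={pow2}")
    # A 64-sample frame at 48 kHz, as in case 3).
    print(f"64 samples @ 48000 Hz: {float(frame_duration_ms(64, 48000)):.2f} ms")
```

Under these definitions, the 2.5/5/10/20 ms frames at 48 kHz come out as whole sample counts that are not powers of two, while the same durations at 44.1 kHz are not all whole numbers of samples.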

I agree that the use cases I listed above may not be part of our "core use 
cases". However, CELT already handles all these cases with no change 
required in the code. Considering that people will keep using CELT for 
these "special" applications, the only question is how we want it to 
happen. I can see essentially two possible approaches:

1) Defining "custom modes" that handle the use cases above and that operate 
differently from the "main modes" that we have defined (e.g. no bandwidth 
switching on the fly).
2) Excluding these modes from Opus and leaving these people to use "CELT"
as a non-standard codec.

I don't have a strong opinion on the subject, but I tend to lean towards 1) 
mainly because we would avoid having Opus and CELT as two "competing" 
standards. Also, it would mean that the IETF would have change control over 
all the operating modes instead of "splitting" change control in two. OTOH, 
I do agree that we would need a way to avoid causing compatibility issues 
(perhaps by using a different labelling for these "custom modes").

Cheers,

	Jean-Marc

On 11-01-26 12:59 AM, Koen Vos wrote:
> Hi Raymond,
>
> Streaming of CD music was not an anticipated use case or requirement. That
> said, 44.1 kHz support at the API is certainly nice to have.
>
> My understanding of how CELT works is that a *native* 44.1 kHz mode with a
> power-of-2 frame size would break the flexibility to seamlessly switch API
> sampling rates or coded audio bandwidth on the fly. (Jean-Marc can shed
> more light on this.) This feature makes it easy for applications to switch
> during a call between sources or streams that have different sampling rates.
>
> I'm also not as negative about resampling as you are: modern hybrid-FIR/IIR
> designs are very efficient, and I'm convinced a 44.1<->48 kHz resampler
> with transparent quality can be built with 10~20 MACs per sample. And
> resampling happens everywhere anyway: most hardware ADCs and DACs resample
> internally, and even Opus internally resamples in many of its modes already.
>
> So if desired we could support 44.1 kHz at the API level, without losing
> the nice flexibility properties we've so painstakingly built into the
> design. (Agree we have to think about how to handle 2.5 and 5 ms frames at
> 44.1 kHz, but I'm sure we'll find a solution.)
>
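
As a rough illustration of the resampling figures quoted above, the sketch below works out the exact 44.1-to-48 kHz rate ratio and the multiply-accumulate (MAC) rate implied by a per-sample budget. It is not taken from Opus or SILK. The function names are hypothetical, and counting the budget per output sample at 48 kHz is an assumption, since the quoted text does not say which rate "per sample" refers to.

```python
from fractions import Fraction


def resample_ratio(in_rate_hz, out_rate_hz):
    """Output/input rate ratio reduced to lowest terms."""
    return Fraction(out_rate_hz, in_rate_hz)


def macs_per_second(macs_per_sample, rate_hz):
    """Total MAC rate, assuming the budget is counted per sample at rate_hz."""
    return macs_per_sample * rate_hz


if __name__ == "__main__":
    ratio = resample_ratio(44100, 48000)
    print(f"44.1 kHz -> 48 kHz ratio: {ratio.numerator}/{ratio.denominator}")
    # Assumption: 10 to 20 MACs per output sample at 48 kHz.
    for budget in (10, 20):
        print(f"{budget} MACs/sample -> {macs_per_second(budget, 48000)} MACs/s")
```

The reduced ratio is 160/147, so a rational resampler would conceptually interpolate by 160 and decimate by 147; how a hybrid FIR/IIR design reaches the quoted budget is outside what this sketch shows.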
> best,
> koen.
>
>
> ---------------------------------------------------------------------------
> *From: *"Raymond (Juin-Hwey) Chen" <rchen@broadcom.com>
> *To: *codec@ietf.org
> *Cc: *"jean-marc valin" <jean-marc.valin@usherbrooke.ca>
> *Sent: *Tuesday, January 25, 2011 4:53:38 PM
> *Subject: *Re: [codec] requirements #8 (new): Sample rates?
>
> I agree that supporting sampling rates of 48, 16, and 8 kHz all makes
> sense, but I also think it would be highly desirable to support another
> common sampling rate used by music CDs: 44.1 kHz.
>
> For music streaming applications, a large percentage of the music sources
> will have this 44.1 kHz sampling rate. I know it is possible to up-sample
> them to 48 kHz first and then pass them through the 48 kHz version of
> Opus. However, doing such up-sampling requires additional processing power
> and additional latency, and if the sampling rate conversion is not done
> with a high-quality method (which usually requires higher complexity and
> higher delay), audio quality degradation may be introduced in the process.
> For these reasons, many people prefer not to do the 44.1 kHz to 48 kHz
> sampling rate conversion and prefer to encode the 44.1 kHz music directly
> instead. Therefore, it is highly desirable for Opus to support this
> 44.1 kHz sampling rate.
>
> I understand that it may be inconvenient for the SILK mode or the SILK+CELT
> hybrid mode to support 44.1 kHz. However, for the CELT-only mode, my
> understanding is that the current CELT C code already supports the
> 44.1 kHz sampling rate. Since the CELT-only mode is also the most suitable
> mode to encode music, it would then make perfect sense for the CELT-only
> mode to also support this 44.1 kHz in addition to 48 kHz.
>
> Furthermore, currently the CELT mode at 48 kHz supports frame sizes of 2.5,
> 5, 10, and 20 ms to be consistent with the frame sizes of the SILK mode and
> SILK+CELT hybrid mode, and this results in frame sizes and FFT window
> sizes that are not powers of 2, which reduces the implementation
> efficiency. If the CELT mode supports the 44.1 kHz sampling rate, then
> since it is no longer possible to support exactly 2.5, 5, 10, and 20 ms at
> this sampling rate anyway, I think we might as well choose the FFT window
> sizes to be powers of 2 to allow more efficient implementation. Again, such
> power-of-2 FFT sizes at 44.1 kHz are already supported in the current C
> code for CELT, so no new development is needed.
>
> In summary, I propose that we allow the CELT-only mode of the Opus codec to
> officially support the 44.1 kHz sampling rate using power-of-2 FFT window
> sizes to avoid sampling rate conversion and to allow the most efficient
> implementation.
>
> Raymond
>
> *From:* codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] *On Behalf
> Of* Roman Shpount
> *Sent:* Monday, January 24, 2011 5:08 PM
> *To:* codec@ietf.org
> *Cc:* jean-marc.valin@usherbrooke.ca
> *Subject:* Re: [codec] requirements #8 (new): Sample rates?
>
> I actually do believe that the requirement is already addressed by section
> 4.1 of the document. I was responding to a comment that only full band is
> required, which is clearly not what we agreed upon and not what's in the
> document.
> _____________
> Roman Shpount
>
> On Mon, Jan 24, 2011 at 7:40 PM, codec issue tracker <trac@tools.ietf.org
> <mailto:trac@tools.ietf.org>> wrote:
>
> #8: Sample rates?
>
>
> Comment(by gmaxwell@…):
>
> On 11-01-24 07:14 PM, Roman Shpount wrote:
>  > I would like to see 8 and 16 kHz as required rates for the codec to
>  > ensure interoperability with existing narrowband and wideband codecs.
>  > In other words, we should be able to negotiate an 8 or 16 kHz sample
>  > rate if audio will be transcoded for PSTN or a wideband codec such as
>  > G.722.
>
> Jean-Marc wrote:
>  > I believe such a requirement for narrowband/wideband is already present,
>  > but I don't mind making it even more explicit if necessary.
>
>
> I believe that since we are close enough to finalizing the draft, we should
> try to include proposed language with our issues. Otherwise there may be
> no clear path forward.
>
> Some of the requirements specify that compatibility with
> wideband/narrowband is important, but for the avoidance of doubt, I'll
> suggest:
>
> At the end of 4.1. Operating space:
>
> "Because interoperation with existing wideband and narrowband facilities
> is essential, at least one method of interoperation must be provided
> regardless of the codec's operating mode, sample rate, or bitrate."
>
> Of course, I would be perfectly happy to leave this out entirely (as I
> believe the application section 2.1 already implies the requirement) or
> use some other language.
>
> --
>
> ------------------------------------+---------------------------------------
>  Reporter:  hoene@…                 |      Owner:  jean-marc.valin@…
>      Type:  enhancement             |     Status:  new
>  Priority:  minor                   |  Milestone:
> Component:  requirements            |    Version:
>  Severity:  Active WG Document      |   Keywords:
> ------------------------------------+---------------------------------------
>
> Ticket URL: <http://trac.tools.ietf.org/wg/codec/trac/ticket/8#comment:4>
>
> codec <http://tools.ietf.org/codec/>
>
> _______________________________________________
> codec mailing list
> codec@ietf.org <mailto:codec@ietf.org>
>
> https://www.ietf.org/mailman/listinfo/codec