Re: [codec] #20: Computational complexity?

"codec issue tracker" <trac@tools.ietf.org> Mon, 24 May 2010 14:14 UTC

#20: Computational complexity?
------------------------------------+---------------------------------------
 Reporter:  hoene@…                 |       Owner:     
     Type:  defect                  |      Status:  new
 Priority:  major                   |   Milestone:     
Component:  requirements            |     Version:     
 Severity:  -                       |    Keywords:     
------------------------------------+---------------------------------------

Comment(by hoene@…):

 [Benjamin]:
 I think this is a typo, and you mean "lessened the pressure to reduce
 bitrates and complexity, and has shifted the focus to fidelity and delay
 instead".
 ...
 I'd also like some clarification as to whether we're talking about ... or
 (b) future hardware designed to support the IWAC.  If (b), then this is
 just a matter of negotiating the acceptable
 encode+decode complexity, for eventual implementation by DSP or ASIC.

 [Raymond]:
 My main point in this area is just that there are complexity-sensitive
 applications such as low-end devices and VoIP gateways where a low codec
 complexity is important or even necessary.  In other words, if the IETF
 codec complexity is too high, then either it will become much more
 expensive for some applications (e.g. gateways), or it may not even be
 feasible to put the codec in some low-end devices. ...
 I merely wanted to use Bluetooth headset as an example to make a point
 that a low codec complexity is desirable and a high codec complexity can
 have negative consequences.

 [Kevin]: This is all perfectly reasonable, but given the likely timeframe
 for this codec to be produced and published as a standards-track RFC,
 'low complexity' in this discussion really means the 2012-2013 version of
 'low complexity', not today's. It seems highly likely that the MIPS
 capacity of the DSPs designed into Bluetooth headsets in 2012 will be
 vastly greater than what is used today, if there is an application to take
 advantage of the additional MIPS.

 [Raymond]: I completely agree with you that the IETF codec development
 should not be constrained by a low-complexity device designed in 2009 or
 earlier, and we should look toward the time frame of 2012 and 2013
 instead.

 In my previous emails I have indicated that due to many reasons, over the
 last several years the processing power of Bluetooth headsets has been
 increasing at a rate much slower than what's predicted by Moore's Law, and
 it doesn't look like this will change significantly in the next few years.
 I also said that for the current-generation Bluetooth headset chips, the
 maximum codec complexity they can support is probably somewhere around 5
 MIPS on a 16-bit fixed-point DSP, and by the time the IETF codec becomes a
 standard, the number may go up to 10 MIPS, or 15 MIPS at most.

 Thus, if we want mono Bluetooth headsets in the FUTURE (i.e. in the next
 several years) to be able to run the IETF codec in the narrowband or
 wideband mode at least, a good complexity target to shoot for is 10 to 20
 MIPS on a fixed-point DSP.

 [JM]:
 I'm just curious about where you got your MIPS figures for Bluetooth? I'm
 not familiar with the type of DSPs used in those applications, but from a
 quick search of more "general-purpose" DSPs (TI, ADI and similar), the
 lowest speed I was able to find (sold in 2010) was a 50 MIPS DSP. Any idea
 what type of DSPs are currently used in Bluetooth?

 [Raymond]:
 I should have clarified:
 I am not proposing that we limit the complexity of all IETF codec modes to
 10 to 20 MIPS.  That would be unreasonable, especially for
 high-sampling-rate and high-fidelity applications.

 Just as we seem to have reached a consensus that a low-delay mode is
 necessary to address delay-sensitive applications (as is specified in the
 codec requirement document), we can also have a low-complexity mode to
 address complexity-sensitive applications such as low-end/mobile devices
 and gateways. It is for such a low-complexity mode that 10 - 20 MIPS is a
 good target to shoot for, at least for narrowband and wideband.
 (Super-wideband and full-band can be layered coding on top of that and do
 not need to be subject to this 10 - 20 MIPS target.) For other coding
 modes that require more processing power, this 10 - 20 MIPS target
 obviously would not apply.

 Also, if we don't like to have too many different coding modes, and if
 some modes can be combined, for example, if the low-delay mode can also
 achieve low complexity, then we can combine the low-delay mode and the
 low-complexity mode into a single mode.  We can have another mode that's
 more efficient in bit-rate but may have higher delay and complexity to
 address those applications that are less sensitive to delay and
 complexity.

 [JM]: Oh, I realise your original message did not state 10-20 MIPS as the
 target for all modes. My only question was about how you came to the
 10-20 MIPS figure in the first place. Was it from a DSP vendor's roadmap
 or something similar? Or what would be the relevant existing DSP from
 which we could extrapolate future performance? As I said, I'm not too
 familiar with the DSPs used in Bluetooth because all the regular DSPs I
 can find are 50 MIPS or above.

 [Raymond]: Bluetooth headsets normally don't use general-purpose DSPs from
 those traditional DSP houses that you mentioned below.  A lot of mid- to
 low-end Bluetooth headsets (where most of the shipping volume is) don't
 even have any DSP and instead just rely on an ARM processor to do all
 voice/audio processing and the Bluetooth protocol stack.  The 5 MIPS
 number for current-generation BT headsets was quoted to me by an
 experienced Bluetooth audio engineering manager in terms of MIPS on an ARM
 processor, but since most speech coding engineers are more familiar with
 the MIPS for a general-purpose 16-bit fixed-point DSP, I converted the
 MIPS on an ARM to the equivalent MIPS on a fixed-point DSP, and it came to
 around 5 MIPS.  The 10 to 20 MIPS number is what we may expect in several
 years as Moore's Law helps to increase the processing power of Bluetooth
 headsets.

 High-end Bluetooth headset chips may have a DSP on-chip, but it is usually
 either a proprietary DSP or a DSP core not from the traditional DSP houses
 but from companies like ARM.

 One thing to keep in mind, though, is that even if you have a DSP on a
 Bluetooth headset chip, and it is clocked at 50 MHz or higher, it doesn't
 mean that you can use all or most of that DSP processing power for speech
 or audio coding.  This is because it has to handle many other signal
 processing tasks such as acoustic echo cancellation, single-mic or
 multi-mic noise suppression, wind noise suppression, packet loss
 concealment, voice prompt decoding, and even voice command recognition, to
 name just a few.
 You can't even get half of the DSP processing power solely dedicated to
 speech coding.  You will be lucky if you can get 20% of the DSP for speech
 coding.  Remember that currently the speech coding part (CVSD codec) takes
 0% of the DSP or ARM because it is done in chip hardware.
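
 The budget argument above can be checked with a line of arithmetic. The
 sketch below is only an illustration: the function name is hypothetical,
 and it assumes a 50 MHz DSP retires roughly one instruction per cycle
 (about 50 MIPS), which the discussion does not state.

```python
def codec_budget_mips(dsp_mips: float, codec_share: float) -> float:
    """Return the MIPS left for speech coding, given the DSP's total MIPS
    and the fraction of it that the codec is allowed to use."""
    if not 0.0 <= codec_share <= 1.0:
        raise ValueError("codec_share must be between 0 and 1")
    return dsp_mips * codec_share


# Assumption: 50 MHz clock ~ 50 MIPS; 20% is the "lucky" share quoted above.
budget = codec_budget_mips(dsp_mips=50.0, codec_share=0.20)
print(f"{budget:.1f} MIPS available for speech coding")
```

 Under these assumptions the codec gets about 10 MIPS, which is the low end
 of the 10 to 20 MIPS range discussed above.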

 [Christian]: What do you think about STL2005/9 described in ITU-T G.191?
 It might be a good metric to measure the codec performance in
 low-complexity mode. STL2009 is not yet published officially but the STL
 2005 manual is available for free.
 http://www.itu.int/rec/T-REC-G.191-200508-I/en
 Chapter 13, Page 159 describes a set of basic operators and their
 complexity weights. STL2009 is similar but adds guidelines on data
 movement, a program ROM estimation tool for fixed-point C code, and a
 complexity evaluation tool for floating-point C code.

 However, STL2005 might not be optimal for soft-phone complexity prediction
 because many operations have overflow control and saturation, which
 typically cannot be translated into a single native assembler instruction.
 Thus, for soft-phones I would recommend using slightly modified
 measures...
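
 To illustrate the point about saturation, here is a minimal sketch that
 contrasts a plain wrapping 16-bit add with a saturating one. The function
 names are hypothetical and are not STL operators; the sketch only shows
 that, on a processor without a saturating add instruction, the saturating
 version needs the add plus extra comparisons and clamping.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def wrap_add16(a: int, b: int) -> int:
    """16-bit add with two's-complement wraparound (a single native add)."""
    s = (a + b) & 0xFFFF
    return s - 0x10000 if s > INT16_MAX else s


def sat_add16(a: int, b: int) -> int:
    """16-bit add that clamps to the int16 range instead of wrapping.
    Without hardware support this costs an add plus two range checks."""
    s = a + b
    if s > INT16_MAX:
        return INT16_MAX
    if s < INT16_MIN:
        return INT16_MIN
    return s


print(wrap_add16(30000, 10000))  # wraps around to a negative value
print(sat_add16(30000, 10000))   # clamps at INT16_MAX
```

 A complexity measure that charges one unit for the saturating add would
 therefore underestimate its cost on such a general-purpose CPU.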

 [Raymond]:
 The ITU-T software tool library has been used to measure the complexity of
 many fixed-point standard speech codecs, and we have done that, too.  So,
 yes, I think it is a good and well-accepted way to measure the complexity
 of fixed-point codecs.

 For floating-point implementations, although I don't know of a similar
 software tool for complexity measurement, it is not that difficult to do
 something similar and add some complexity-counting code to existing
 floating-point C code to measure the complexity of a floating-point codec.
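
 As a rough sketch of what such complexity-counting code could look like,
 the example below routes floating-point operations through counting
 wrappers and converts the weighted count into millions of weighted
 operations per second. Everything here is an assumption made for
 illustration: the class and function names are hypothetical, the weights
 are placeholders rather than the G.191 values, and the 160-sample frame at
 50 frames per second is just a narrowband example.

```python
from collections import Counter

# Placeholder weights for illustration only; NOT taken from ITU-T G.191.
WEIGHTS = {"add": 1, "mul": 1, "div": 18}


class OpCounter:
    """Performs float operations while counting how often each is used."""

    def __init__(self):
        self.counts = Counter()

    def add(self, a, b):
        self.counts["add"] += 1
        return a + b

    def mul(self, a, b):
        self.counts["mul"] += 1
        return a * b

    def div(self, a, b):
        self.counts["div"] += 1
        return a / b

    def weighted_ops(self):
        """Sum of operation counts, each scaled by its weight."""
        return sum(WEIGHTS[op] * n for op, n in self.counts.items())


def frame_energy(ops, frame):
    """Mean energy of one frame, computed through the counting wrappers."""
    acc = 0.0
    for x in frame:
        acc = ops.add(acc, ops.mul(x, x))
    return ops.div(acc, len(frame))


def weighted_mops(weighted_ops_per_frame, frames_per_second):
    """Convert a per-frame weighted count to millions of ops per second."""
    return weighted_ops_per_frame * frames_per_second / 1e6


ops = OpCounter()
frame = [0.5] * 160  # assumed: one 20 ms frame at 8 kHz sampling
energy = frame_energy(ops, frame)
print(ops.counts, ops.weighted_ops(), weighted_mops(ops.weighted_ops(), 50))
```

 In a real codec the wrappers would be applied to the whole encode and
 decode path, and the worst-case frame rather than a single frame would
 determine the reported figure.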

-- 
Ticket URL: <http://trac.tools.ietf.org/wg/codec/trac/ticket/20#comment:2>
codec <http://tools.ietf.org/codec/>