Re: [codec] #21: Supporting Wireless Links?
"codec issue tracker" <trac@tools.ietf.org> Sun, 09 May 2010 17:56 UTC
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: codec issue tracker <trac@tools.ietf.org>
X-Trac-Version: 0.11.6
Precedence: bulk
Auto-Submitted: auto-generated
X-Mailer: Trac 0.11.6, by Edgewall Software
To: hoene@uni-tuebingen.de
X-Trac-Project: codec
Date: Sun, 09 May 2010 17:55:53 -0000
X-URL: http://tools.ietf.org/codec/
X-Trac-Ticket-URL: http://trac.tools.ietf.org/wg/codec/trac/ticket/21#comment:1
Message-ID: <071.09efd48992c6281692d6860b09d86a77@tools.ietf.org>
References: <062.a00b15332f6e9da39f0d81d14d24c64d@tools.ietf.org>
X-Trac-Ticket-ID: 21
In-Reply-To: <062.a00b15332f6e9da39f0d81d14d24c64d@tools.ietf.org>
Cc: codec@ietf.org
Subject: Re: [codec] #21: Supporting Wireless Links?
X-BeenThere: codec@ietf.org
X-Mailman-Version: 2.1.9
Reply-To: codec@ietf.org
List-Id: Codec WG <codec.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/codec>, <mailto:codec-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/codec>
List-Post: <mailto:codec@ietf.org>
List-Help: <mailto:codec-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/codec>, <mailto:codec-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sun, 09 May 2010 17:56:08 -0000
#21: Supporting Wireless Links?
------------------------------------+---------------------------------------
 Reporter:  hoene@…                 |       Owner:
     Type:  defect                  |      Status:  new
 Priority:  major                   |   Milestone:
Component:  requirements            |     Version:
 Severity:  -                       |    Keywords:
------------------------------------+---------------------------------------

Comment(by hoene@…):

[Raymond]: Many telephone terminal devices at the edge of the Internet use embedded processors with limited processing power, and the processors also have to handle many tasks other than speech coding. If the IETF codec complexity is too high, some such devices may not have sufficient processing power to run it. Even if the codec can fit, some battery-powered mobile devices may prefer to run a lower-complexity codec to reduce power consumption and battery drain.

For example, even if you make an Internet phone call from a computer, you may like the convenience of using a Bluetooth headset that allows you to walk around a bit and have hands-free operation. Currently most Bluetooth headsets have small form factors with a tiny battery. This puts a severe constraint on power consumption. Bluetooth headset chips typically have very limited processing capability, and they have to handle many other tasks such as echo cancellation and noise reduction. There is just not enough processing power to handle a relatively high-complexity codec.

Most BT headsets today rely on the extremely low-complexity, hardware-based CVSD codec at 64 kb/s to transmit narrowband voice, but CVSD has audible coding noise, so it degrades the overall audio quality. If the IETF codec has low enough complexity, it would be possible to directly encode and decode the IETF codec bit-stream at the BT headset, thus avoiding the quality degradation of CVSD transcoding.
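
To make the "extremely low-complexity but noisy" point about CVSD concrete, here is a minimal sketch of a CVSD-style (continuously variable slope delta) codec in Python. It is not taken from the thread and is not a conformant Bluetooth implementation: the class and function names are hypothetical, the parameter values (step limits, decay, leak, run length) are assumptions chosen for illustration, and a real decoder would also filter its output. It assumes one output bit per input sample at a 64 kHz sample rate, which is what gives 64 kb/s.

```python
import math


class CvsdState:
    """Adaptive delta-modulator state shared by encoder and decoder.

    All parameter values are illustrative assumptions.
    """

    def __init__(self, step_min=10.0, step_max=1280.0,
                 step_decay=1.0 - 1.0 / 1024.0, leak=1.0 - 1.0 / 32.0, run=4):
        self.step_min, self.step_max = step_min, step_max
        self.step_decay, self.leak, self.run = step_decay, leak, run
        self.estimate = 0.0   # current reconstruction of the signal
        self.step = step_min  # current adaptive step size
        self.recent = []      # the last `run` bits

    def update(self, bit):
        """Apply one bit and return the new estimate."""
        self.recent = (self.recent + [bit])[-self.run:]
        if len(self.recent) == self.run and len(set(self.recent)) == 1:
            # A run of identical bits means the estimate is lagging: grow.
            self.step = min(self.step + self.step_min, self.step_max)
        else:
            # Otherwise let the step size decay slowly toward its minimum.
            self.step = max(self.step * self.step_decay, self.step_min)
        delta = self.step if bit else -self.step
        self.estimate = self.leak * self.estimate + delta
        self.estimate = max(-32768.0, min(32767.0, self.estimate))
        return self.estimate


def cvsd_encode(samples):
    """Emit one bit per sample: 1 if the sample is at or above the estimate."""
    state, bits = CvsdState(), []
    for x in samples:
        bit = 1 if x >= state.estimate else 0
        state.update(bit)
        bits.append(bit)
    return bits


def cvsd_decode(bits):
    """Rebuild the waveform from the bits alone (no output filter here)."""
    state = CvsdState()
    return [state.update(b) for b in bits]


def snr_db(reference, decoded):
    """Signal-to-noise ratio of the decoded signal against the reference."""
    signal = sum(r * r for r in reference)
    noise = sum((r - d) ** 2 for r, d in zip(reference, decoded))
    return 10.0 * math.log10(signal / noise)


RATE = 64000  # assumed input sample rate in Hz; 1 bit/sample gives 64 kb/s
tone = [4000.0 * math.sin(2 * math.pi * 200 * n / RATE)
        for n in range(RATE // 10)]  # 0.1 s of a 200 Hz test tone
bits = cvsd_encode(tone)
decoded = cvsd_decode(bits)
half = len(tone) // 2  # skip the start-up phase when measuring quality
print("bits per second:", len(bits) * 10)
print("SNR after warm-up: %.1f dB" % snr_db(tone[half:], decoded[half:]))
```

Each sample costs one comparison and a handful of additions and multiplications, which is why such a scheme fits in hardware on a headset chip. The price is visible in the reported SNR: the reconstruction keeps hunting around the input by roughly one step size, which is heard as coding noise.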

[Koen]: By the time the Bluetooth Special Interest Group has adopted a future IETF codec standard, Moore's law will surely have multiplied CPU resources in the BT device by one order of magnitude, won't it? Not sure it makes sense to apply today's BT constraints to tomorrow's codec.

I'm not even convinced Bluetooth is a relevant use case for an Internet codec. BT devices are audio devices more than VoIP end points: BT always connects to the Internet through another device. You could simply first decode incoming packets and send PCM data to the BT device, or use a high-quality/high-bitrate codec like G.722. The requirements for BT devices and the Internet are just too different. Similarly, GSM phones use AMR on the network side and a different codec towards the BT device. The required transcoding causes no quality problems because BT supports high bitrates.

[Raymond]: Hi Koen, Responding to your earlier email about the Bluetooth headset application:

(1) Although BT SIG standardization is a preferred route, it is technically feasible to negotiate and use a non-Bluetooth-SIG codec.

(2) Someone familiar with BT SIG told me that it would probably take only 6 months to add an optional codec to the BT SIG spec and 12 to 18 months to add a mandatory codec.

(3) The IETF codec is scheduled to be finalized in 14 months and submitted to IESG in 18 months. Even if we take the BT SIG route and spend 6 to 18 months there, the total time of 2 to 3 years from now means that Moore's Law would only increase the CPU resources 2X to 3X, and definitely no more than 4X, not 10X.

(4) Most importantly, guess what, in the last several years the Bluetooth headset chips have been growing their processing power at a MUCH, MUCH slower rate than what Moore's Law says they should. Sometimes they did not increase the speed at all for years. The reasons?
The ASP (average selling price) of Bluetooth chips plummeted, making it unattractive to invest significant resources to make them significantly faster. Also, for low-end and mid-range BT headsets, the BT chips were often considered "good enough" and there wasn't a strong drive to increase the computing resources. In addition, the BT headsets got smaller over the last few years; the corresponding reduction in battery size required a reduction in power consumption, which also limited how fast the processor speed could grow. In the next several years, it is highly likely that the computing capabilities of Bluetooth headset chips will continue to grow at a rate substantially below what Moore's Law predicts.

(5) Although Bluetooth supports G.711 as an optional codec, basically no one uses it because it is too sensitive to bit errors. Essentially all the BT mono headsets on the market today are narrowband (8 kHz sampling) headsets using CVSD. There isn't any real wideband support yet, so your comment about G.722 doesn't apply. Even after wideband-capable BT headsets come out, for many years to come the majority of the BT headsets (especially mid-range to low-end) will still be narrowband only, running only CVSD. Hence, the quality degradation of the CVSD transcoding is real and will be with us for quite a while, so it is desirable for the IETF codec to have a low-complexity mode that can run directly on the BT headsets to avoid the quality degradation of CVSD when using BT headsets to make Internet phone calls.

(6) Even if you could use G.711 or G.722 in the BT headsets, they both operate at 64 kb/s. A low-complexity mode of the IETF codec can operate at half or one quarter of that bit-rate. This will help conserve the BT headsets' radio power because of the lower transmit duty cycle. It will also help the Bluetooth + WiFi co-existence technologies.

(7) Already a lot of people are used to using Bluetooth headsets to make phone calls today.
If they have a choice, many of these people will also want to use Bluetooth headsets to make Internet phone calls, not only through computers, but also through smart phones connected to WiFi or cellular networks. As more and more states and countries pass laws to ban the use of cell phones that are not in hands-free mode while driving, the number of Bluetooth headset users will only increase with time, and many of them will want to make Internet-based phone calls.

Given all the above, I would argue that the Bluetooth headset is a very relevant application that the IETF codec should address with a low-complexity mode.

[Koen]: You seem to suggest that the IETF Internet codec should fit Bluetooth requirements in order to enable transcoding-free operation all the way from the Internet, through the Internet-connected device, to the BT wireless audio device.

A similar argument would hold for ITU-T cellular codecs: AMR-WB and G.718 could have been designed with BT as an application. In reality, these codecs have very little in common with BT codecs, because of the vastly different requirements in terms of

- complexity
- memory footprint
- bitrate
- scalability
- bit error robustness
- packet loss robustness.

Do you think it's realistic for us to come up with a design that fulfills the needs of both worlds?

The alternative is to separately design codecs for Internet applications and BT devices, and continue the practice of transcoding on the Internet-connected device. That would have a better chance of maximizing quality in all scenarios.

[Raymond]: […] If a high-quality, low-complexity, wider bandwidth IETF codec mode can be implemented in Skype and the Bluetooth headset to avoid the CVSD transcoding (together with a wideband upgrade of the transducers and audio path in the BT headset, of course), then not only will you get much better speech quality in your Skype calls than what you have experienced, but you will also get a lower latency.
This is because transcoding between the Skype codec and CVSD accumulates not only the coding distortion of the two codecs, but also their coding delays. Although CVSD is a sample-by-sample codec, BT headsets still transmit the CVSD bit-stream in 3.75 ms or 7.5 ms packets, and they can potentially add a one-way delay of up to 20 to 25 ms through the Bluetooth headset (the exact delay depends on the implementation). While we were discussing whether a 5 ms packet size can even be considered, for many years Bluetooth headsets have been using an even smaller 3.75 ms packet size.

[Raymond]: I agree that there are some fundamental differences in the requirements for cellular codecs and Bluetooth codecs which caused the codecs in these two types of devices to each go their own way. However, these differences are (or can be) substantially smaller between an Internet codec and Bluetooth codecs, so I think it is easier for Internet devices and Bluetooth devices to use the same codec to avoid the additional delay and coding distortion of transcoding.

(1) Royalty-free requirement: Cellular codecs are usually royalty-bearing, and that's acceptable in the cellular world. Not so with Bluetooth. Bluetooth devices are meant to be simple and low cost. As such, Bluetooth SIG basically only wants to standardize royalty-free technologies. That's an important reason why they picked the CVSD codec, an old royalty-free technology from 1970. We are trying to make the IETF codec royalty-free, so in this regard the goal is consistent with the Bluetooth SIG's royalty-free requirement for codecs.

(2) Bit-rate requirement: Cellular radio spectrum is a limited, fixed resource that doesn't change with time, and cellular operators spent billions of dollars in radio spectrum auctions.
Thus, it is extremely important for cellular codecs to have bit-rates as low as possible, with an average bit-rate often going below 1 bit/sample, to maximize the number of cellular subscribers a given amount of radio spectrum can support. In contrast, the bit-rate is not nearly as big a concern for Bluetooth. Initially Bluetooth SIG picked the relatively high-bit-rate 64 kb/s CVSD narrowband codec (8 bits/sample) for its simplicity and royalty-free nature, among other things. Since the speeds of the Internet backbone and access networks keep growing with time, the bit-rate of an Internet codec is also not nearly as big a concern as in cellular codecs, and an Internet codec around 2 bits/sample can have better trade-offs (e.g. higher quality, lower delay, and lower complexity) for Internet applications than what cellular codecs can provide. Incidentally, Bluetooth SIG is moving toward 4 bits/sample. As you can see, in terms of the bit-rate requirement, an Internet codec is much closer to Bluetooth codecs than cellular codecs are.

(3) Complexity requirement: Bluetooth headsets have much lower processing power and much smaller batteries than cell phones. The complexity of cellular codecs, typically in the range of 20 to 40 MHz on a DSP, is too high to fit most Bluetooth headsets. However, unlike cell phones and Bluetooth headsets, where each is a specific type of device with a relatively narrow range of device complexity, Internet voice/audio applications can potentially encompass a large variety of different device types, from desktop computers at the high end with > 3 GHz multi-core CPUs to IP phones and possibly even Bluetooth headsets at the low end with a processor of only a few tens of MHz. It is up to the IETF codec WG to decide how complex the IETF codec should be.
We can standardize just one codec mode that works well for computer-to-computer calls but can't fit in low-end devices, or we can keep that mode but also have a low-complexity mode that can be implemented in low-end devices. Frankly, I think the second approach makes much more sense since it allows many more devices to benefit from the IETF codec and enables the large number of Bluetooth headset users to avoid the additional distortion and delay associated with transcoding when making Internet calls.

(4) Delay requirement: Because cellular codecs need to achieve bit-rates as low as possible, they sacrificed the coding delay and used a 20 ms frame size, since using a 10 or 5 ms frame size would increase the bit-rate for a given level of speech quality. On the other hand, a Bluetooth headset needs to have a low delay since its delay is added to the already long cell phone delay. For the IETF codec, again it is up to the codec WG to decide what kind of codec delay we want, and again I think it makes sense to have a higher-delay mode with higher bit-rate efficiency for bit-rate-sensitive applications and another low-delay mode for delay-sensitive applications, since one size doesn't fit all. If the IETF codec delay is forced to be one size, the resulting codec will be (potentially very) suboptimal for some applications.

You wrote:

> Do you think it's realistic for us to come up with a design that
> fulfills the needs of both worlds?

With a one-size-fits-all approach, probably not, but with a multi-mode approach, I think so.

[Stephen]: Though the Bluetooth angle is interesting, it is clearly out-of-scope for this WG. Of course Bluetooth SIG could pick up CODEC later on if they think it meets their requirements.

[Christian]: I just want to share some insights from the recent development of Bluetooth's Hands-Free Profile (HFP) version 1.5, which supports wideband speech.
One main requirement was a frame size of 7.5 ms, because the Bluetooth MAC protocol supports scheduling at this interval. Actually, to achieve this they modified SBC to work on 15 blocks instead of 4, 8, 12, or 16 blocks, and they decided against G.722.

The lesson to learn is about the importance of the MAC protocol. To get an efficient, low-power, mobile device you have to consider the impact of packet scheduling. If packets are scheduled regularly, the MAC protocol can work more efficiently. The more irregularly packets arrive, the more expensive a packet transmission gets.

Actually, you can translate the cost of packet scheduling into bits per packet. Depending on the wireless technology, it might vary substantially. The worst case is 802.11b at 11 Mbps with the long preamble: there you can add about 1300 bytes to every packet just for physical headers and MAC scheduling. However, more modern technologies like LTE and IEEE 802.11n are much more efficient in terms of per-packet overhead.

If you ask me, one important usage scenario is over wireless links supporting low-power mobile devices. If we ignore this scenario it will be a huge mistake:

a) Battery-powered mobile devices must be energy efficient to reduce the size of the batteries. Also, they should not demand too many computational resources, otherwise they would consume too much energy.

b) Wireless IP access is also in scope because many devices get Internet access via LTE, WLAN, WiMAX, UMTS, etc.

Bluetooth headsets are somewhat of a special case. Actually, they are two cases:

1) Headphones (A2DP): For me it is not clear which is more efficient, cost effective, or energy saving: supporting the Internet CODEC on top of A2DP (which is, by the way, already possible according to Bluetooth spec A2DP V1.2), or using the Internet CODEC up to the Bluetooth AP and transcoding to SBC there.

2) Mic (HFP): Here the scheduling interval of 3.75 or 7.5 ms might be an important requirement that the Internet CODEC cannot always fulfill, because it must adapt its parameters to the Internet transmission path, not just to the Bluetooth link.

Thus, I would recommend writing a liaison statement to the Bluetooth AVT group and asking whether they would be interested in including the Internet CODEC in a future version of A2DP. Certainly, this need not happen soon, because they can do so only once the Internet CODEC is finished. Supporting HFP might be more difficult than A2DP because of the very tough requirements on efficiency.

PS: No, not version 1.5 but its not-yet-published successor.

[Roni]: I think that this is going in the wrong direction. I already suggested that, since the group will do one codec, we first need to decide on the applications. The initial request was for a wideband codec for the Internet, and this is the application that should dictate the requirements. We can look at the other applications in the requirements and charter and continue with the requirements that are in line with the original application. Other applications are nice to have if they do not add more requirements to the codec to be defined here. We are talking here about more requirements which, in my understanding, are not in the scope of the WG.

[Stephen]: I agree that Wireless IP is in scope; one could also argue that LTE might be in scope, since it will be using IPv6.

As we all know, the Internet runs over multiple physical layers, and usually an end-to-end connection uses more than one physical layer. So I am not sure how we can translate layer 2 constraints into requirements for CODEC itself. I can see how it might impact the RTP packetization: allowing frames to be split across packets would allow media-aware devices to adjust the packetization to optimize delivery.

[Koen]: I continue to fail to see the connection between the Internet codec and Bluetooth, for the reasons below.

(1) Bluetooth != Internet: Bluetooth devices are wireless audio devices, not VoIP end points, and are indeed used mostly for (mobile) PSTN calls.

(2) Diverging requirements: A codec/mode that meets the BT requirements for ultra-low complexity will have a relatively poor coding efficiency, resulting in lower audio quality and/or a higher bitrate. Both of these negatively impact the user experience over the Internet. Therefore, you do not want to run a BT codec over the Internet if you can use a more efficient codec instead.

(3) Transcoding: Even when using a BT audio device, a well-designed VoIP end point will always transcode between the Internet codec and the BT codec, because:

   a) the reason given in 2) above
   b) the BT device lacks the CPU power and memory to run the entire VoIP stack
   c) it allows for a packet-loss concealment operation in between the two lossy legs of the end-to-end connection.

Note that such transcoding is also standard with DECT devices, where base stations even transcode between G.722 and G.722 (yes: twice the same codec).

In short, there is no benefit from the BT and Internet codecs being modes of one and the same codec. This complete lack of overlap means that:

   I) it is better to standardize two separate codecs
   II) Bluetooth is out of scope for the Internet codec.

[Raymond]: In the same order as your numbered list below:

(1) True, Bluetooth != Internet for now, but why not look into the future and explore what is possible and would be very good to have?

(2) Your argument here doesn't make sense to me. For PC-to-PC calls, there is no reason to use an ultra-low-complexity mode, so you don't need to suffer the lower coding efficiency, and therefore your concern is not relevant.
For any call that involves a Bluetooth headset, it would be better to use a low-complexity mode of the IETF codec on the Bluetooth headset than to go through CVSD transcoding and suffer the significant quality degradation of CVSD and the additional coding delay due to transcoding.

(3) a) The reason in 2) is not a reason, as I explained above.

    b) Didn’t you say Moore’s Law will take care of that? :-) Furthermore, I am not sure it is necessary for the Bluetooth headset to run the entire VoIP stack.

    c) It is not at all clear that doing PLC twice, once on each of the two lossy links, is necessarily better than doing PLC just once after the packets have gone through both lossy links. It may well be that the latter approach is better, considering that the transcoding distortion of the second link may go up substantially when encoding the output of the PLC for the first lossy link.

I) Standardizing two separate codecs takes more time and effort and requires transcoding, which increases total coding distortion and total coding delay.

II) For you and some others it is out of scope, but for others it is not. Different people have different views. My view is that if we can cover this very useful usage scenario without too much trouble, why leave it out?

[Stephen]: Bluetooth is clearly out-of-scope for this group.

-- 
Ticket URL: <http://trac.tools.ietf.org/wg/codec/trac/ticket/21#comment:1>
codec <http://tools.ietf.org/codec/>
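
As a rough check on the Moore's Law exchange earlier in this comment (10X versus "2X to 3X, and definitely no more than 4X" over 2 to 3 years), the arithmetic can be sketched in a few lines of Python. The doubling periods of 18 and 24 months are assumptions based on how Moore's Law is commonly quoted; they are not stated in the thread, and the function names are hypothetical.

```python
import math


def growth_factor(years, doubling_period_years):
    """Resource multiplier after `years` if capacity doubles every period."""
    return 2.0 ** (years / doubling_period_years)


def years_to_reach(factor, doubling_period_years):
    """Time needed to multiply resources by `factor` at that doubling rate."""
    return math.log2(factor) * doubling_period_years


for doubling in (1.5, 2.0):      # assumed doubling periods, in years
    for years in (2.0, 3.0):     # the 2 to 3 year horizon from the thread
        factor = growth_factor(years, doubling)
        print(f"doubling every {doubling} y, after {years} y: {factor:.1f}x")
    needed = years_to_reach(10.0, doubling)
    print(f"  a 10x increase needs about {needed:.1f} years")
```

Under these assumptions the multiplier over 2 to 3 years lies between 2X and 4X, and a full order of magnitude takes roughly 5 to 7 years, even before accounting for the argument that headset chips have grown more slowly than Moore's Law.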