Re: [rtcweb] Opus over the Internet?

"Mandyam, Giridhar" <mandyam@quicinc.com> Fri, 31 August 2012 20:59 UTC

From: "Mandyam, Giridhar" <mandyam@quicinc.com>
To: Koen Vos <koen.vos@skype.net>, "rtcweb@ietf.org" <rtcweb@ietf.org>
Date: Fri, 31 Aug 2012 20:59:42 +0000

Thanks for getting back to me. It looks like the assembled speaker panels and the mix of talkers followed a methodology similar to that of 3GPP and 3GPP2.

3GPP TR 26.935 (Packet Switched (PS) conversational multimedia applications; Performance characterisation of default codecs) lists in Table 12 three types of background noise conditions for AMR-WB:  car, street and cafeteria.  Cafeteria is the closest to office IMO -  it includes background babble in an open environment.

-Giri


From: Koen Vos [mailto:koen.vos@skype.net]
Sent: Friday, August 31, 2012 1:30 PM
To: Mandyam, Giridhar; rtcweb@ietf.org
Subject: RE: [rtcweb] Opus over the Internet?

Hi Giri,

Dynastat's subjective test involved the following design parameters:

- subjects: 32, four panels of eight subjects, native speakers of North American English
- randomizations: one per listening panel
- talkers: eight (four males, four females), native speakers of North American English
- speech samples: four 8 sec. sentence-pairs per talker
- experimental design: partially-balanced, randomized-blocks
Dynastat constructed the randomizations (i.e., pseudo-randomized presentation sequences) for each listening panel. Each randomization included 256 trials (i.e., 8 talkers x 32 test conditions).
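
As a rough illustration of the trial count above, the sketch below enumerates the 8 x 32 talker/condition pairs and shuffles them once per listening panel. The identifiers, the seeds, and the plain shuffle are assumptions made for illustration only; this is not Dynastat's procedure and it does not reproduce the partially-balanced, randomized-blocks design.

```python
import random

# Figures taken from the design parameters listed above.
TALKERS = [f"talker_{i}" for i in range(1, 9)]         # 8 talkers
CONDITIONS = [f"cond_{i:02d}" for i in range(1, 33)]   # 32 test conditions
PANELS = 4                                             # 4 panels of 8 listeners


def presentation_sequence(seed):
    """Return one pseudo-randomized sequence of (talker, condition) trials.

    Hypothetical helper: a seeded shuffle stands in for whatever
    randomization scheme the test house actually used.
    """
    rng = random.Random(seed)
    trials = [(talker, cond) for talker in TALKERS for cond in CONDITIONS]
    rng.shuffle(trials)
    return trials


# One randomization per listening panel, keyed by panel number.
sequences = {panel: presentation_sequence(seed=panel)
             for panel in range(1, PANELS + 1)}

for panel, trials in sequences.items():
    print(f"panel {panel}: {len(trials)} trials, first = {trials[0]}")
```

Every panel hears the same 256 talker/condition pairs; only the presentation order differs between panels.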

We tested only office noise at 15 dB SNR because we didn't have more conditions available.

As web browsers are commonly used in an office, knowledge of a codec's performance in that kind of environment would seem helpful to this working group.  Do I understand correctly that the 3GPP and 3GPP2 tests did not include any office noise conditions?

best,
koen.
________________________________
From: Mandyam, Giridhar [mandyam@quicinc.com]
Sent: Friday, August 31, 2012 8:16 AM
To: Koen Vos; rtcweb@ietf.org
Subject: RE: [rtcweb] Opus over the Internet?

Thank you for pointing out these results.

I think these results represent a very good start, but the results as provided may not be sufficient to characterize Opus.  I've already cited the Dynastat study in 3GPP2 for EVRC-B characterization in a separate email.  In addition to packet loss (in the 3GPP2 study, the analog would be frame error rate), background noise conditions were varied (based on MNRU, street noise, car noise).  This is the kind of testing that is typical for codecs standardized in 3GPP and 3GPP2.

There are several details missing.  For instance,

Was only office noise tested? If so, why?
How many listeners were used?
What was the mix of talkers (e.g. no. of female, no. of male)?

If you can point to more details of the testing, that would be very helpful for members of the group.

-Giri Mandyam

From: rtcweb-bounces@ietf.org [mailto:rtcweb-bounces@ietf.org] On Behalf Of Koen Vos
Sent: Friday, August 31, 2012 7:38 AM
To: rtcweb@ietf.org
Subject: Re: [rtcweb] Opus over the Internet?

Lars Eggert wrote:
> That's all great, but is there any qualitative comparison data of the different codecs available?

Please have a look at the test Dynastat did with SILK: http://developer.skype.com/resources/SILKDataSheet.pdf

Note that SILK in Opus is quite similar to the old SILK tested by Dynastat.  All changes were thoroughly tested to not degrade performance, and in most cases improved it.  So the SILK Dynastat results give a lower bound on the performance of a subset of the Opus modes.

While these plots don't show G.722, it was in fact included in the test for clean input signals without packet loss.  According to the test results:
- SILK at 8.85 kbps is outperformed by G.722 at 64 kbps with statistical significance
- SILK at 12.65 kbps is a statistical tie with G.722 at 64 kbps
- SILK at 18.85 kbps and above outperforms G.722 at 64 kbps with statistical significance.

The packet loss in that test was maximally random (i.e., not bursty).  While that may not perfectly mimic real networks, I can think of no reason why the results would be reversed or very different in practice.

To give one more measure of suitability for the Internet: Skype has used SILK to deliver 100s of billions of Skype-to-Skype voice minutes over the Internet, and our users are very satisfied with it.

Hope this helps,
koen.

Lars Eggert wrote:
That's all great, but is there any qualitative comparison data of the different codecs available?

--
Sent from a mobile device; please excuse typos.

On Aug 30, 2012, at 21:07, "Jean-Marc Valin" <jmvalin at mozilla.com> wrote:

> On 12-08-30 09:14 AM, Markus.Isomaki at nokia.com wrote:
>> Are there any tests that actually compare different codecs, say Opus
>> vs. G.722 vs. AMR-WB, in typical Internet use, meaning some loss and
>> jitter? I suppose the performance is only partially related to the
>> codec, and partially to other implementation decisions, so it might
>> be difficult to compare apples-with-apples. But since people are
>> arguing that Opus is an "Internet codec", it would be actually nice
>> to see some results in this sense. Or does "Internet codec" mean
>> something else?
>
> The term "Internet codec" means different things to different people,
> but to me this means more than "having good packet loss concealment".
> Sure we have PLC (though it was never compared to G.722 or AMR-WB) and
> we also have (optional) low bit-rate embedded redundancy (also called
> FEC) and (also optional) independent frames, but that's just one aspect.
>
> One of the first things we've been asked when designing Opus was to make
> the rate *really* adaptable because we never know what kind of rates
> will be available. This not only meant having a wide range of bitrates,
> but also being able to vary in small increments. This is why Opus scales
> from about 6 kb/s to 512 kb/s, in increments of 0.4 kb/s (one byte with
> 20 ms frames). Compare that to AMR-WB, which scales from 6.6 to 23.85 in
> (on average) ~2.5 kb/s increments. The reason Opus can have more than
> 1200 possible bitrates is because on the Internet, other layers in the
> protocol stack provide octet granularity framing/sizing. We don't need
> to spend 11 bits signalling the bitrate because UDP already encodes the
> packet size.
>
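The granularity figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: the 6 and 512 kb/s endpoints and the 20 ms frame are taken from the message, the names are made up here, and the count ignores any per-packet constraints of the actual Opus bitstream.

```python
import math

FRAME_MS = 20      # frame duration used in the message
MIN_KBPS = 6.0     # "about 6 kb/s" (assumed exact for this sketch)
MAX_KBPS = 512.0   # upper end quoted in the message


def kbps_for_payload(n_bytes, frame_ms=FRAME_MS):
    """Bitrate in kb/s when n_bytes are sent once per frame."""
    return n_bytes * 8 / frame_ms   # bits per millisecond == kb/s


# Adding one byte to every 20 ms frame raises the rate by this much.
step_kbps = kbps_for_payload(1)

# Smallest and largest whole-byte payloads inside the quoted range.
min_bytes = math.ceil(MIN_KBPS * FRAME_MS / 8)
max_bytes = math.floor(MAX_KBPS * FRAME_MS / 8)

# Each payload size is a distinct bitrate.
n_rates = max_bytes - min_bytes + 1

# Bits an explicit rate index would need if the packet length
# did not already carry this information.
bits_to_signal = math.ceil(math.log2(n_rates))

print(f"step: {step_kbps} kb/s")
print(f"payload sizes: {min_bytes}..{max_bytes} bytes ({n_rates} rates)")
print(f"explicit rate index: {bits_to_signal} bits")
```

With these inputs the step is 0.4 kb/s, there are 1266 distinct payload sizes, and an explicit index would take 11 bits, which matches the "more than 1200" and "11 bits" figures in the message.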

> There are also practical aspects. If you look at the rates supported by
> AMR-WB, you see that *none* of them represents an integer number of
> bytes per frame. For example, 23.85 kb/s is 59 bytes plus 5 bits per
> frame. It's definitely not the only cause, but the bottom line is that
> the payload format for AMR-WB (RFC 4867) is quite complex. Now compare
> this with the Opus RTP payload
> (http://tools.ietf.org/html/draft-spittka-payload-rtp-opus). Although
> it's not complete, it's not only much simpler, but it also makes it
> possible to decode RTP packets without having even seen the SDP or any
> out-of-band signalling.
>
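The byte-alignment point above can be illustrated the same way. The message only gives the 6.6 and 23.85 kb/s endpoints and the 23.85 kb/s example; the full list of nine AMR-WB modes below is an assumption taken from the codec specification, and the helper name is hypothetical.

```python
FRAME_MS = 20  # AMR-WB frame duration in milliseconds

# AMR-WB speech modes in kb/s (assumed from the codec specification,
# not listed in the message itself).
AMR_WB_KBPS = [6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85]


def frame_layout(kbps, frame_ms=FRAME_MS):
    """Return (whole_bytes, leftover_bits) for one frame at the given rate."""
    bits = round(kbps * frame_ms)   # kb/s times ms gives bits per frame
    return divmod(bits, 8)


layouts = {rate: frame_layout(rate) for rate in AMR_WB_KBPS}

# Average spacing between adjacent modes.
mean_step = (AMR_WB_KBPS[-1] - AMR_WB_KBPS[0]) / (len(AMR_WB_KBPS) - 1)

for rate, (n_bytes, n_bits) in layouts.items():
    print(f"{rate:5.2f} kb/s -> {n_bytes} bytes + {n_bits} bits per frame")
print(f"mean step: {mean_step:.2f} kb/s")
```

Under that assumed mode list, every mode leaves a non-zero bit remainder, 23.85 kb/s comes out as 59 bytes plus 5 bits, and the mean spacing is about 2.2 kb/s, in the same ballpark as the ~2.5 kb/s quoted earlier in the message.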

> People may have other definitions, but that's what *I* mean by Opus
> being an "Internet codec".
>
> Cheers,
>
>    Jean-Marc
>
>>> -----Original Message-----
>>> From: rtcweb-bounces at ietf.org [mailto:rtcweb-bounces at ietf.org] On Behalf Of ext Stefan Hakansson LK
>>> Sent: 30 August, 2012 14:58
>>> To: Jean-Marc Valin
>>> Cc: rtcweb at ietf.org
>>> Subject: Re: [rtcweb] Confirmation of consensus on audio codecs
>>>
>>> On 08/29/2012 03:10 PM, Jean-Marc Valin wrote:
>> On 08/29/2012 05:51 AM, Stefan Hakansson LK wrote:
>>>> ...
>>
>>>>>> That is great, but sort of underlines that there would be no
>>>>>> harm in delaying the decision until there are experiences
>>>>>> made from real world use - 'cause it would not be that long
>>>>>> till that experience has been made (Markus also brought up
>>>>>> the IPR status as a reason for waiting - I have no idea how
>>>>>> long it would take to know more about that).
>>
>> As Harald is pointing out, rtcweb implementations are going to ship
>> pretty soon.
>>>>
>>>> If that is the case (and I think and hope so), why would we need
>>>> to make it MTI before seeing it in action?
>>>>
>>>> [...]
>>>>
>> Are you expecting *another* single, standardized, royalty-free codec
>> that operates over vast ranges of bitrates and operating conditions,
>> from narrowband speech to stereo music, all with low delay, coming
>> out in the next year? If not, what are you really waiting for?
>>>>
>>>> No, I'm not expecting that. I would just prefer us to see that
>>>> Opus does indeed deliver as promised before making it MTI. If it
>>>> does, fine. If not, we'd have to discuss what to do then.
>>>>
>>>> To me it is like if you're going to place a bet, for an upcoming
>>>> big race, on a horse that has never been in an actual race, but
>>>> shows great promise. If you know that the horse is going to
>>>> participate in a few practice races soon, would you not prefer to
>>>> wait and see how it fares in those before placing the bet (given
>>>> that the odds would not change)?
>>>>
>>>> Anyway, that's my view. Let's see what the chairs say.
>>
>>>>>> As Paul suggested, I was referring to the lack of formal,
>>>>>> controlled, characterization tests. That is how other SDOs do
>>>>>> it. I don't think that is the only way to do it, but I think
>>>>>> we should at least have either such tests conducted or
>>>>>> experience from deployment and use (in a wide range of
>>>>>> conditions and device types) before making it MTI.
>>
>> Opus has had "ITU-style" testing on English (
>> http://www.opus-codec.org/comparison/GoogleTest1.pdf ) and Mandarin
>> ( http://www.opus-codec.org/comparison/GoogleTest2.pdf ). And if you
>> don't trust Google on the tests I linked to, it's also been tested
>> by Nokia:
>>
>> http://research.nokia.com/files/public/%5B16%5D_InterSpeech2011_Voice_Quality_Characterization_of_IETF_Opus_Codec.pdf
>>>>
>>>> Come on, those tests are very limited compared to a formal
>>>> characterization test. (Example:
>>>> http://www.itu.int/dms_pub/itu-t/opb/tut/T-TUT-ASC-2010-MSW-E.docx).
>>>> There is very little info on the material used, the environment,
>>>> processing and scripts and so on. And there seem to be no tests
>>>> whatsoever (at least not involving humans) with actual channels
>>>> introducing jitter and losses.
>>>>
>>>> Note: I am in no way proposing that a formal characterization
>>>> test is needed, or even the right thing to do. It is a costly and
>>>> time consuming process, and alternative approaches could prove to
>>>> be more efficient. What I am saying is that I think we should not
>>>> mandate a codec that has neither gone through that kind of formal
>>>> characterization testing nor has any field experience from actual
>>>> use on a reasonable scale (covering different conditions and
>>>> device types). It just seems wrong, especially given that we
>>>> will soon have field experience.
>>>>
>>
>> Now, unlike other SDOs, the testing did not stop there. ITU-T codecs
>> generally end up being tested with something in the order of tens
>> of minutes worth of audio. In *addition* to that kind of testing,
>> Opus also had automated testing with hundreds of years worth of
>> audio. If anything, I think other SDOs should learn something here.
>>>>
>>>> This may very well be true. I guess this comes down to how much
>>>> you trust that the quality assessment models (e.g. PEAQ, POLQA)
>>>> give the same result as human test subjects would. But I think
>>>> this sounds like a really good thing.
>>>>
>>
>> Jean-Marc
>>
>>>
>>> _______________________________________________
>>> rtcweb mailing list
>>> rtcweb at ietf.org
>>> https://www.ietf.org/mailman/listinfo/rtcweb