Re: [codec] #16: Multicast?

"Raymond (Juin-Hwey) Chen" <> Tue, 11 May 2010 18:22 UTC

From: "Raymond (Juin-Hwey) Chen" <>
To: Christian Hoene <>, 'Koen Vos' <>
Date: Tue, 11 May 2010 11:15:30 -0700

Other than potential echo issues, the biggest problem with a one-way delay longer than a few hundred ms is that such a long delay makes it very difficult to interrupt each other, resulting in the start-stop-start-stop cycles I previously talked about.  Therefore, I agree with Ben that if the lab test did not have echoes and did not involve the test subjects trying to interrupt each other, then the test results may appear more benign than what one would experience in the real world.

Note that the top curve in the first figure below is for "listening-only tests".  In that case there was no interaction or interruption at all, so if there were no echoes either, it is no wonder that the curve stayed essentially flat.  I do wonder what made the curve go down at 1300 ms; to understand this we need to know what the lab setup was for this test.  Thus, I echo Marshall's opinion that we need the original paper/contribution.

My personal experience with the delay impairment is much worse than the middle curve (MOS-CQS) would suggest and is close to the bottom curve (MOS-CQE).  Back in the early 1980s, the phone calls I made from southern California to East Asia were carried through geosynchronous satellites with a one-way delay of slightly more than 500 ms.  I absolutely hated it, because turn-taking was severely impaired and the only way to interrupt the person on the other side was to keep talking (rudely, I may say) until the other person finally stopped.  Then, starting in the late 1980s, undersea cables were used to carry my traditional circuit-switched calls to the same person in East Asia, and all of a sudden the delay was much shorter and interrupting each other felt as easy as in a face-to-face conversation.  It was a night-and-day difference!  Even in the early 2000s, when I used my cell phone to call my son's cell phone on another cellular network, I could tell that there was a significant delay that noticeably impaired our turn-taking and our ability to interrupt each other, and I didn't like it at all.  Now you know why I advocate low-delay voice communications, have been working on low-delay speech coding for two decades, and have even published a book chapter on low-delay speech coding :o)


From: [] On Behalf Of Christian Hoene
Sent: Tuesday, May 11, 2010 7:39 AM
To: 'Koen Vos'
Subject: Re: [codec] #16: Multicast?


May I present some results of the ITU-T SG12 on the perceptual effects of delay?
For many years, it was assumed that 150 ms is the boundary for interactive voice conversations (see Nobuhiko Kitawaki and Kenzo Itoh: Pure Delay Effects on Speech Quality in Telecommunications, IEEE J. on Selected Areas in Commun., Vol. 9, No. 4, pp. 586-593, May 1991). Up to 400 ms, quality is still considered acceptable (about toll quality). The ITU-T G.107 quality model reflects this opinion.
However, in recent years, new results have shown that the impact of delay on conversation quality is NOT as strong as assumed. At the ITU-T, numerous contributions have been made on this issue:
Contribution of BT "Comparison of E-Model and subjective test data for pure-delay conditions" from 2007-01-08:
MOS-CQS are results from subjective conversational tests.
MOS-CQE is the estimate from the E-model (G.107).
MOS-LQO are results from listening-only tests.
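
For readers who want to see roughly what the MOS-CQE curve looks like for pure delay, here is a minimal Python sketch of the E-model's delay term. It is an illustration under stated assumptions, not a reproduction of the BT contribution: it uses the pure-delay impairment Idd and the R-to-MOS mapping as given in the classic editions of G.107, assumes echo-free conditions (so the echo-related terms Idte and Idle are taken as zero), and assumes the commonly quoted default rating R = 93.2 for all other parameters. The function names are made up for this example.

```python
import math

# Assumption: default transmission rating with all other E-model
# parameters at their G.107 default values.
R_DEFAULT = 93.2


def idd(ta_ms: float) -> float:
    """Pure-delay impairment Idd for absolute one-way delay Ta in ms.

    Classic G.107 form: zero up to 100 ms, then a smooth increase.
    """
    if ta_ms <= 100.0:
        return 0.0
    x = math.log2(ta_ms / 100.0)
    return 25.0 * (
        (1.0 + x ** 6) ** (1.0 / 6.0)
        - 3.0 * (1.0 + (x / 3.0) ** 6) ** (1.0 / 6.0)
        + 2.0
    )


def r_to_mos(r: float) -> float:
    """Map an E-model rating R to an estimated MOS (G.107 mapping)."""
    if r <= 0.0:
        return 1.0
    if r >= 100.0:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def mos_cqe(ta_ms: float) -> float:
    """Estimated conversational MOS for pure one-way delay, echo-free."""
    return r_to_mos(R_DEFAULT - idd(ta_ms))


if __name__ == "__main__":
    # Print the estimated curve at a few one-way delays.
    for ta in (50, 100, 150, 200, 300, 400, 600, 800):
        print(f"Ta = {ta:4d} ms  Idd = {idd(ta):5.2f}  MOS-CQE = {mos_cqe(ta):.2f}")
```

Under these assumptions the model predicts no delay penalty at all below 100 ms and a steadily growing one above it, which is the behaviour the subjective MOS-CQS results are being compared against.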

Also, LM Ericsson described very interesting results in "Investigation of the influence of pure delay, packet loss and audio-video synchronization for different conversation tasks" from 2007-09-24. For example:

Overall, it seems that the importance of the 150 ms limit is greatly overestimated. Much more relaxed timing is acceptable.
Seeing these figures, I have to assume that the ITU-T G.107 standard was a plot of the telcos to make life hard for VoIP vendors. Well done...

With best regards,

 Christian Hoene


Dr.-Ing. Christian Hoene

Interactive Communication Systems (ICS), University of Tübingen

Sand 13, 72076 Tübingen, Germany, Phone +49 7071 2970532

>-----Original Message-----
>From: [] On Behalf Of Koen Vos
>Sent: Tuesday, May 11, 2010 8:23 AM
>To:; Benjamin M. Schwartz
>Subject: Re: [codec] #16: Multicast?

>Quoting Benjamin M. Schwartz:

>> Quoting Koen Vos <>:
>>> For typical VoIP applications, Moore's law has lessened the pressure
>>> to reduce bitrates, delay and complexity, and has shifted the focus to
>>> fidelity instead.

>> I think this is a typo, and you mean "lessened the pressure to
>> reduce bitrates and complexity, and has shifted the focus to
>> fidelity and delay instead".

>Not a typo: codecs have become more wasteful with delay, while
>delivering better fidelity.  G.718 evolved out of AMR-WB and has more
>than twice the delay.  Same for G.729.1 versus G.729.  This is not by

>The main rationale for codec delay being less important today is that
>faster hardware has reduced end-to-end delay in every step along the
>way.  As a result, a typical VoIP connection now operates at a flatter
>part of the "impairment-vs-delay" curve, meaning that reducing delay
>by N ms at a given fidelity gives a smaller improvement to end users
>today than it did some years ago.  Therefore, the weight on minimizing
>delay in the "codec design problem" has gone down, and the optimum
>codec operating point has naturally shifted towards higher delay, in
>favor of fidelity.

>I've mentioned before that average delay on Internet connections seems
>to be 40% to 50% lower now than just 5 years ago, which is just one
>contributor to lower end-to-end delay.  That doesn't mean high-delay
>connections don't exist - they do, for instance over dial-up or 3G.
>But in those cases it's still better to use a moderate packet rate
>(and bitrate), to minimize congestion risk.

>The confusion may come from the fact that the trade-off between
>fidelity and delay changes towards high quality levels: once fidelity
>saturates, delay gets priority.  Even more so because such high
>fidelity enables new, delay-sensitive applications like distributed
>music performances.  This is reflected in the ultra-low delay
>requirements in the requirements document.

>To summarize, the case for using sub-20 ms frame sizes with
>medium-fidelity quality is now weaker than ever, because the relative
>importance of fidelity has gone up.




