Re: [codec] Thresholds and delay.

Michael Knappe <mknappe@juniper.net> Tue, 11 May 2010 18:58 UTC

From: Michael Knappe <mknappe@juniper.net>
To: "stephen.botzko@gmail.com" <stephen.botzko@gmail.com>, "bmschwar@fas.harvard.edu" <bmschwar@fas.harvard.edu>
Date: Tue, 11 May 2010 11:56:48 -0700
Cc: "codec@ietf.org" <codec@ietf.org>
Subject: Re: [codec] Thresholds and delay.

Stephen,

Agreed that achieving latencies low enough for sidetone perception should not be a goal of the WG, but we should aim, if at all possible, for better than 250 ms one-way delay in typical (non-tandemed) deployments. The one-way delay impairment factor has its knee, where it begins rising non-linearly, somewhere between 150 and 250 ms.
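
To illustrate where such a knee might sit, here is a rough sketch. It assumes the impairment factor in question is the E-model's echo-free delay impairment Idd from ITU-T G.107, which this message does not actually name, so treat it as one possible reading rather than the intended model. The function name is made up for the example.

```python
import math


def delay_impairment_idd(one_way_delay_ms: float) -> float:
    """Echo-free delay impairment Idd as a function of one-way delay Ta (ms).

    Assumption: this follows the ITU-T G.107 E-model definition, where
    Idd is zero up to 100 ms and grows non-linearly beyond that.
    """
    if one_way_delay_ms <= 100.0:
        return 0.0
    x = math.log2(one_way_delay_ms / 100.0)
    return 25.0 * (
        (1.0 + x ** 6) ** (1.0 / 6.0)
        - 3.0 * (1.0 + (x / 3.0) ** 6) ** (1.0 / 6.0)
        + 2.0
    )


if __name__ == "__main__":
    # Print the impairment at a few one-way delays around the 150-250 ms range.
    for ta in (100, 150, 200, 250, 400):
        print(f"{ta:4d} ms one-way -> Idd = {delay_impairment_idd(ta):5.2f}")
```

Under this model the impairment stays well below one point at 150 ms and is several points by 250 ms, which is consistent with a knee somewhere in that range.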

Mike

________________________________
From: codec-bounces@ietf.org <codec-bounces@ietf.org>
To: Ben Schwartz <bmschwar@fas.harvard.edu>
Cc: codec@ietf.org <codec@ietf.org>
Sent: Tue May 11 14:19:22 2010
Subject: Re: [codec] Thresholds and delay.

>>>
In the presence of echo, round-trip delay must be kept below 30 ms to
ensure that the echo is perceived as sidetone, according to the Springer
handbook of speech processing:
>>>
Though true, I don't think this is a mainstream consideration.

VoIP phones that are capable of speakerphone operation all have acoustic echo cancelers, and those cancelers are already tuned to deal with Internet delays with other voice codecs. Certainly our phones and videoconferencing systems do not have problems with path delays of this order (hundreds of milliseconds).

From my own experience (not testing) I agree with Brian's claim that 500 ms round trip is acceptable for most conversation.

It does depend on what you are doing, and there are certainly tasks where much lower delays are needed.

Regards,
Stephen Botzko



On Tue, May 11, 2010 at 2:06 PM, Ben Schwartz <bmschwar@fas.harvard.edu> wrote:
On Tue, 2010-05-11 at 12:48 -0400, Marshall Eubanks wrote:
> As a point of order, I object to any graphs without an available paper
> behind them.

I have located the first paper mentioned by Christian Hoene at
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=81952
but of course it's paywalled.

One test in that paper told trained subjects to "Take turns reading
random numbers aloud as fast as possible", on a pair of handsets with
narrowband uncompressed audio and no echo.  Subjects were able to detect
round-trip delays down to 90 ms.  Conversational efficiency was impaired
even with round-trip delay of 100 ms.

Let me emphasize again that these delays are round-trip, not one-way,
there is no echo, and the task, while designed to expose latency, is
probably less demanding than musical performance.

In the presence of echo, round-trip delay must be kept below 30 ms to
ensure that the echo is perceived as sidetone, according to the Springer
handbook of speech processing:

(http://books.google.com/books?id=Slg10ekZBkAC&lpg=PA83&ots=wc9yM9WrCs&dq=sidetone%20delay%2030%20ms&lr&pg=PA84#v=onepage&q&f=false)

Such low delays are clearly impossible on many paths, but for Boston to
New York City (or London to Paris), ping times can be less than 18 ms,
making echo-to-sidetone conversion just barely possible for a codec with
5 ms frames.
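
The arithmetic behind "just barely possible" can be sketched as follows. This is a simplified budget, not a measurement: it assumes each direction adds exactly one frame of codec delay, with no look-ahead, jitter buffer, or processing time. That delay model and the function names are assumptions for illustration. Only the 30 ms limit, the 18 ms ping, and the 5 ms frame size come from the text above.

```python
SIDETONE_LIMIT_MS = 30.0  # round-trip limit for echo to be heard as sidetone


def round_trip_ms(ping_ms: float, frame_ms: float,
                  frames_per_direction: int = 1) -> float:
    """Network round trip plus codec framing delay in both directions.

    Assumption: each direction contributes frames_per_direction whole
    frames of delay and nothing else.
    """
    return ping_ms + 2 * frames_per_direction * frame_ms


def fits_sidetone_budget(ping_ms: float, frame_ms: float) -> bool:
    """True if the modelled round trip stays under the sidetone limit."""
    return round_trip_ms(ping_ms, frame_ms) < SIDETONE_LIMIT_MS


if __name__ == "__main__":
    # Compare a few frame sizes against an 18 ms network round trip.
    for frame in (2.5, 5.0, 10.0, 20.0):
        total = round_trip_ms(18.0, frame)
        verdict = "ok" if fits_sidetone_budget(18.0, frame) else "too slow"
        print(f"frame {frame:4.1f} ms -> round trip {total:4.1f} ms ({verdict})")
```

With an 18 ms ping, 5 ms frames give 28 ms under this model, leaving only 2 ms of margin, while 10 ms frames already exceed the limit.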

I accept Brian Rosen's claim that a slow conversation doesn't normally
suffer greatly from round-trip latencies up to 500 ms, but under some
circumstances much lower latencies are valuable.  Let's make sure
they're achievable for those who can use them.

--Ben

_______________________________________________
codec mailing list
codec@ietf.org
https://www.ietf.org/mailman/listinfo/codec