Re: [codec] Thresholds and delay.

Michael Knappe <> Wed, 12 May 2010 15:55 UTC

From: Michael Knappe <>
To: stephen botzko <>, Ben Schwartz <>
Date: Wed, 12 May 2010 08:43:30 -0700

Yes, let’s close off the AEC and sidetone discussion. My comments again from earlier in this thread:

Agreed that achieving low enough latencies for sidetone perception should not be a goal of the wg, but we should be aiming, if at all possible, for better than 250 ms one-way delay in typical (and non-tandemed) deployments. The one-way delay impairment factor has its knee, where it begins rising non-linearly, somewhere between 150 and 250 ms.
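
To make the knee concrete, here is a minimal sketch that evaluates a delay impairment curve at a few one-way delays. It assumes the Idd term of the ITU-T G.107 E-model in the form given in earlier editions of that Recommendation (zero up to 100 ms, then a smooth rise); the formula and the function name delay_impairment are not from this thread and should be checked against the Recommendation before being relied on.

```python
import math


def delay_impairment(one_way_delay_ms: float) -> float:
    """Delay impairment Idd for a given absolute one-way delay in ms.

    Assumption: this is the Idd expression from earlier editions of the
    ITU-T G.107 E-model, not something specified in this thread.
    """
    if one_way_delay_ms <= 100.0:
        # Below 100 ms the model assigns no delay impairment.
        return 0.0
    x = math.log2(one_way_delay_ms / 100.0)
    return 25.0 * (
        (1.0 + x ** 6) ** (1.0 / 6.0)
        - 3.0 * (1.0 + (x / 3.0) ** 6) ** (1.0 / 6.0)
        + 2.0
    )


if __name__ == "__main__":
    # Print the curve around the 150-250 ms region discussed above.
    for ms in (100, 150, 200, 250, 300, 400):
        print(f"{ms:4d} ms one-way -> Idd = {delay_impairment(ms):5.2f}")
```

Under this assumed curve the impairment is still small at 150 ms and grows much faster beyond roughly 200 ms, which is consistent with placing the knee between 150 and 250 ms.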



On 5/12/10 7:47 AM, "stephen botzko" <> wrote:

> It also means that the AEC should be retuned depending on the round-trip delay.

Jari is (I think) pointing out that this last statement is not true.

I agree with Jari's assessment, though I am not sure exactly what you mean by "retuning".  The far-end user will perceive the fed-back echo in various ways, and that does depend on the round-trip time.  However, the job of the AEC is to remove the echo, so there is no fed-back echo to hear.  And the algorithms which remove echo do not depend on the round-trip delay to the far end.  At least not the good ones.

Perhaps more substantively, I do not think this AEC discussion actually matters in this WG context.  We are not working on an AEC, we are working on an Internet Codec.  Even if (for the sake of argument) you accept the idea that the AEC somehow needs to be tuned to the round-trip delay, the round-trip delay varies enormously depending on the connection, and in general it is not even discoverable (esp. if gateways or SBCs are used).  Nothing we are doing in this group will change any of the above.

As far as sidetone goes, I do not understand why that keeps coming up either.

For the speakerphone use case, eliminating AECs from both ends requires two conditions:
(a) the round trip delay has to be very low (30 ms or less, from all delay sources)
(b) there has to be sufficient attenuation on the loop.

The first condition has been brought up repeatedly.  For general internet WAN connections it is clearly not met, and it is in fact difficult to meet even on more local connections.

The second condition has not been discussed much.  But if you do have low delay but do not have enough attenuation, what you get is feedback, not sidetone.  Since users have control over the speaker volume, the second condition (in general) also cannot be guaranteed.
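
The two conditions can be sketched as a simple check. This is only an illustration: the 30 ms limit is the figure given above, while the 6 dB attenuation margin and all names (LoopPath, classify_without_aec, min_attenuation_db) are hypothetical values chosen for the example, not measured or agreed numbers.

```python
from dataclasses import dataclass

# Condition (a): round-trip delay limit from the text, all delay sources included.
MAX_ROUND_TRIP_MS = 30.0


@dataclass
class LoopPath:
    round_trip_delay_ms: float   # total delay around the acoustic loop
    loop_attenuation_db: float   # net loss around the loop; positive means loss


def classify_without_aec(path: LoopPath, min_attenuation_db: float = 6.0) -> str:
    """Describe what a speakerphone call with no AEC at either end would do.

    min_attenuation_db is a hypothetical safety margin used for illustration.
    """
    if path.loop_attenuation_db < min_attenuation_db:
        # Condition (b) fails: the loop can ring or howl regardless of delay.
        return "feedback"
    if path.round_trip_delay_ms > MAX_ROUND_TRIP_MS:
        # Condition (a) fails: the return signal is heard as a distinct echo.
        return "echo"
    return "sidetone-like"


if __name__ == "__main__":
    # Raising the speaker volume lowers the loop attenuation for the same path.
    for volume_gain_db in (0.0, 10.0, 20.0):
        path = LoopPath(round_trip_delay_ms=20.0,
                        loop_attenuation_db=15.0 - volume_gain_db)
        print(volume_gain_db, classify_without_aec(path))
```

The loop over speaker volume shows why condition (b) cannot be guaranteed: the same low-delay path moves from acceptable to feedback as the user turns the volume up.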

All speakerphones that I know of are either half-duplex or use AECs.  This is true even for PSTN phones used in local-loop circuit-switched calls, which is the lowest-delay telephony connection there is.  So for telephony at least, it is clear that AECs are needed.

So I do not think continued debate on the value of sidetone is productive.  I agree with Michael's comment that "achieving low enough latencies for sidetone perception should not be a goal of the wg."

Stephen Botzko

On Wed, May 12, 2010 at 8:23 AM, Ben Schwartz <> wrote:
On Wed, 2010-05-12 at 10:01 +0200, wrote:
> Generally I believe that AECs mainly take care of the acoustic echo
> generated in the phone itself (operating on the microphone signal,
> acoustic delay up to a few ms). Do you mean that there is additional
> processing on the receiving side for the echo returning from the B
> user side?

Only in the user's head.  The psychoacoustics of echo are very different
depending on the echo time.  This means that the perceived echo-canceler
fidelity depends on the acoustic round-trip delay.  It also means that
the AEC should be retuned depending on the round-trip delay.