Re: [codec] #19: How large is the frame size depended delay / the serialization delay?

"Christian Hoene" <> Sat, 01 May 2010 17:12 UTC

I agree that serialization, processing, and implementation delay should be distinguished.
Assume a low-cost VoIP phone whose processing power is fully utilized by one call. Then the DSP/CPU needs an entire frame
duration to encode and decode each frame. Thus, the latency increases by one frame length, in addition to the serialization delay,
propagation delay, algorithmic delay, dejittering delay, echo cancelling delay, ...  Running the chips at 100% load is of course
cheaper than adding more computational resources. But is this still a relevant issue today?
I am not sure whether it always makes sense for a mobile device to run at 100% load. Of course, from an energy consumption perspective it
makes sense to run a CMOS circuit at the lowest possible frequency, as power consumption then drops quadratically. But maybe running the CPU/DSP
at a higher speed and switching to power-save mode after a frame has been decoded/encoded is equally energy efficient…
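
The trade-off can be made concrete with a toy energy model. Everything in it is an assumption for illustration: active power is taken as a dynamic part that grows quadratically with clock speed (as stated above) plus a constant static part, and the sleep power and all numeric values are made up rather than measured.

```python
def energy_per_frame_uj(speedup: float, frame_ms: float, dyn_mw: float,
                        static_mw: float, sleep_mw: float) -> float:
    """Energy in microjoules spent during one frame period.

    speedup = 1.0 is the lowest clock at which the codec needs the whole
    frame period; dyn_mw is the dynamic power at that clock (hypothetical).
    """
    if speedup < 1.0:
        raise ValueError("speedup below 1.0 would miss the frame deadline")
    active_ms = frame_ms / speedup            # time spent encoding/decoding
    sleep_ms = frame_ms - active_ms           # rest of the period in power-save
    active_mw = dyn_mw * speedup ** 2 + static_mw
    return active_mw * active_ms + sleep_mw * sleep_ms


# Illustrative values only: 20 ms frames, static power dominating.
slow = energy_per_frame_uj(1.0, 20.0, dyn_mw=10.0, static_mw=30.0, sleep_mw=1.0)
fast = energy_per_frame_uj(2.0, 20.0, dyn_mw=10.0, static_mw=30.0, sleep_mw=1.0)
print(f"100% load at low clock: {slow:.0f} uJ per frame")
print(f"2x clock, then sleep:   {fast:.0f} uJ per frame")
```

Under this model, running fast and sleeping only pays off when the static power is large compared to the dynamic power; with a small static part the low clock wins.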
Even if a gateway's DSP is fully utilized, the processing delay need not be very large. For example, take a gateway serving 10000
calls with all CPUs/DSPs at 100% load. Then the time needed for encoding/decoding one frame should be the frame duration/10000, if the
encoding/decoding is well scheduled.
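
The gateway arithmetic above can be sketched as follows. This is a minimal illustration, not a model of any real gateway: the function name, the 20 ms frame duration, and the assumption of a single fully loaded processor that handles each call's frame in one uninterrupted slot are all hypothetical.

```python
def per_frame_processing_ms(frame_ms: float, calls: int, load: float = 1.0) -> float:
    """Time one processor spends on a single call's frame.

    Assumes (hypothetically) that the processor spends the fraction `load`
    of each frame period on codec work, split evenly across `calls`.
    """
    if calls <= 0:
        raise ValueError("calls must be positive")
    if not 0.0 < load <= 1.0:
        raise ValueError("load must be in (0, 1]")
    return frame_ms * load / calls


# Low-cost phone: one call uses the whole processor, so a full frame of delay.
phone_delay = per_frame_processing_ms(frame_ms=20.0, calls=1)

# Gateway: 10000 calls share the processor, so frame duration / 10000.
gateway_delay = per_frame_processing_ms(frame_ms=20.0, calls=10000)

print(f"phone:   {phone_delay:.3f} ms")
print(f"gateway: {gateway_delay:.4f} ms")
```

This covers only the time spent processing. A frame that has to wait for its slot behind other calls can see additional queueing delay, which is why the "well scheduled" condition matters.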
Also, I do not think we should consider implementation delay, which occurs due to suboptimal implementation. For example, some years
ago we tested the RTT of two Linux softphones linked directly together using G.711. It was 400 ms. The implementation delay could be a
good performance metric to differentiate two otherwise equal products. Also, the algorithmic processing delay might be subject to
similar market optimization.
Having said this, I would nevertheless suggest including the processing delay in the measurement of the end-to-end (acoustic round)
trip time.  Those measurements should be part of the control loop that optimizes the overall conversational call quality.
Dr.-Ing. Christian Hoene
Interactive Communication Systems (ICS), University of Tübingen 
Sand 13, 72076 Tübingen, Germany, Phone +49 7071 2970532 
From: [] On Behalf Of stephen botzko
Sent: Saturday, May 01, 2010 3:31 PM
To: Koen Vos
Subject: Re: [codec] #16: Multicast?
If the frame-size multiplier is due to serialization, then I agree with Koen's assessment.  In fact on many connections the
multiplier would be less than 1. Dial-up is of course the worst case here, and on those links the multiplier ought to be close to 2.
Variations due to congestion (and on some links, polling) are (IMHO) better modeled as jitter.  
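
To see why the multiplier depends on the link, one can express the serialization delay of one packet as a fraction of the frame duration. The sketch below is only an illustration: the 40-byte header overhead assumes uncompressed IPv4/UDP/RTP headers, and the codec bit rate and link rates are example values, not figures from this thread.

```python
def serialization_ms(payload_bytes: float, link_bps: float,
                     header_bytes: int = 40) -> float:
    """Time to clock one packet onto a link of the given bit rate."""
    return (payload_bytes + header_bytes) * 8 * 1000 / link_bps


def serialization_multiplier(frame_ms: float, codec_bps: float, link_bps: float,
                             header_bytes: int = 40) -> float:
    """Serialization delay of one packet, in units of the frame duration."""
    payload_bytes = codec_bps * frame_ms / 1000 / 8
    return serialization_ms(payload_bytes, link_bps, header_bytes) / frame_ms


# Example values (assumptions): a 16 kbit/s codec with 20 ms frames.
dialup = serialization_multiplier(20.0, 16_000, 33_600)        # dial-up uplink
broadband = serialization_multiplier(20.0, 16_000, 1_000_000)  # 1 Mbit/s link

print(f"dial-up:   {dialup:.2f} frame durations")
print(f"broadband: {broadband:.3f} frame durations")
```

The sketch covers only the serialization term. Whether the buffering of the frame itself is counted in the multiplier is a matter of definition.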

Gateways are another matter, with the delays being highly dependent on the product architecture.  Interrupt latencies, context
switching, bus architectures, etc. can dominate, so it is entirely possible that reducing the frame size might actually increase the
latency (since it increases the packets-per-second load on the gateway).  So I agree with Koen on this as well.

Anecdotal models based on industry experience can be useful guides - though if we are going to use these models to drive
requirements, I'd prefer something more analytical.

Stephen Botzko
On Sat, May 1, 2010 at 2:07 AM, Koen Vos <> wrote:
Quoting "Raymond (Juin-Hwey) Chen":
> One-way delay = codec-independent delay + 3*(codec frame size) + (codec look-ahead) + (codec filtering delay if any)
>
> This formula was obtained from an experienced engineer who has been working on IP phones related fields for more than a decade,

At Skype we have 100+ years of combined VoIP experience, and a focus on minimizing delay as part of our goal to maximize quality.
The consensus among our engineers is that the multiplier is closer to 1 than to 2, at least for software VoIP applications over
typical Internet connections.  Some years ago the situation was slightly worse because dial-up was more prevalent.

> Similar 3X multiplier is also observed in VoIP gateways.  Even with a fast processor/system optimized from ground up to be
> low-delay, the measured "codec-dependent" one-way delay of such a VoIP gateway using the G.711 codec with a 5 ms frame/packet size
> is between 12 and 17 ms, or around 3X the frame size.

As I've pointed out before, that doesn't say much about how the delay increases with larger frame sizes.  Perhaps the 12~17 ms
includes a constant delay of 7 ms, and the marginal growth of delay with frame size is 1x.
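
The two readings of the gateway measurement can be compared numerically. The sketch below is illustrative only: it isolates the frame-size-dependent part of the formula, and the 7 ms constant with a 1x multiplier is the hypothetical alternative suggested above, not a measured value.

```python
def codec_dependent_delay_ms(frame_ms: float, multiplier: float,
                             constant_ms: float = 0.0) -> float:
    """Frame-size-dependent part of the one-way delay, plus an optional constant."""
    return constant_ms + multiplier * frame_ms


for frame_ms in (5.0, 10.0, 20.0):
    pure_3x = codec_dependent_delay_ms(frame_ms, multiplier=3.0)
    const_plus_1x = codec_dependent_delay_ms(frame_ms, multiplier=1.0, constant_ms=7.0)
    print(f"frame {frame_ms:4.0f} ms: 3x model = {pure_3x:4.0f} ms, "
          f"7 ms + 1x model = {const_plus_1x:4.0f} ms")
```

Both models fall inside the 12 to 17 ms range measured at a 5 ms frame size, but they diverge at larger frames, so a measurement at a single frame size cannot tell them apart.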

