Re: [codec] #16: Multicast?

"Raymond (Juin-Hwey) Chen" <> Tue, 04 May 2010 23:27 UTC

From: "Raymond (Juin-Hwey) Chen" <>
To: Christian Hoene <>
Date: Tue, 04 May 2010 16:00:42 -0700

Hi Christian,

Sorry for the delayed response. I was busy with other things and could not respond to IETF emails until now.

You wrote:
> The arguments on costs of the gateways is raised often. Thus, it is worthwhile to 
> consider in-depth.

> May I ask you for more information about the design of VoIP Gateways. Particular, I am
> interested in the partition between audio processing and protocol stack on DSP and CPU.

> Does DSP take over all codec processing? May the CPU do some parts of the computation
> before, during or after DSP does the signal processing?

[Raymond]: I asked an engineering manager who was deeply involved in the design of high-density VoIP gateways. He said that in such gateways, due to the high number of voice channels (thousands) per box, a large number of DSPs and micro-controllers are used, and they are usually structured hierarchically. The DSPs typically take care of all speech codec processing, echo cancellation, DTMF tone detection, fax, etc. The DSPs are usually divided into groups, with each group of DSPs controlled by a single micro-controller, which handles things like RTP, jitter buffering, packetization, QoS statistics, and moving the voice traffic to and from the DSPs in the group. On top of that there may be higher-performance controllers, each connected to many such groups of micro-controller + DSPs. These higher-performance controllers may handle things like call setup, UDP/IP/RTP, routing to and from internal processor groups, and routing to and from external networks/devices.
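
As a rough illustration of the hierarchy described above, here is a minimal Python sketch. The class names and all of the counts (8 DSPs per group, 16 channels per DSP, 4 groups per controller, 4 controllers per box) are hypothetical placeholders chosen only to land in the "thousands of channels" range; they do not come from any real gateway design.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DspGroup:
    """One micro-controller plus the DSPs it manages (RTP, jitter buffer, etc.)."""
    dsps: int              # hypothetical: number of DSPs in the group
    channels_per_dsp: int  # hypothetical: voice channels each DSP can run

    def channels(self) -> int:
        return self.dsps * self.channels_per_dsp


@dataclass
class Controller:
    """Higher-performance controller (call setup, routing) over several groups."""
    groups: List[DspGroup]

    def channels(self) -> int:
        return sum(group.channels() for group in self.groups)


def gateway_channels(controllers: List[Controller]) -> int:
    """Total simultaneous voice channels for the whole box."""
    return sum(controller.channels() for controller in controllers)


# Hypothetical sizing, for illustration only.
group = DspGroup(dsps=8, channels_per_dsp=16)
controller = Controller(groups=[group] * 4)
print(gateway_channels([controller] * 4))
```

The point of the sketch is only that the channel count per box is the product of the per-DSP capacity and the fan-out at each level, so anything that lowers the channels each DSP can run lowers the capacity of the whole box proportionally.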

> How do you count number of channels? Do all voice channels have the same weight 
> regardless their sampling rate?
> Say suppose, if the mixing is done for 48kHz instead of 8kHz, how many resource are we 
> allowed to consume more?

[Raymond]: I am not sure what you meant. The channel count is just counting the actual physical voice channels that the gateway can handle simultaneously; it is not a weighted sum. Are you thinking that a 48 kHz channel should be counted more than an 8 kHz channel because it requires more computational resources? Typical VoIP gateways only support 8 kHz telephone-bandwidth speech, so 48 kHz is out of the picture.  
That said, the complexity difference between speech codecs can make a big difference in channel density. Let's say a VoIP gateway supports X simultaneous voice channels running the G.711 codec. Since the complexity of G.711 PCM is next to nothing, the complexity of each voice channel is dominated by the echo canceller (EC). Now if you replace G.711 with G.729A, which takes about 10 MIPS for a full-duplex codec, the channel density can easily drop to X/2.5 per gateway, depending on the EC and other things. If you replace G.711 with G.728, which takes 30+ MIPS, the channel density can easily go down to between X/4 and X/5, or worse.
Thus, if you choose a high-complexity codec, you need to buy many more VoIP gateways to support the same number of voice channels than you would with a low-complexity codec. The cost difference is very real and can be very big.
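
To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python. It assumes a simple model in which a gateway has a fixed MIPS budget and every channel costs a fixed overhead (EC and other per-channel work) plus the codec. The 20000 MIPS budget and the 7 MIPS overhead are hypothetical values picked so that the ratios come out near the ones quoted above; G.711 is approximated as 0 MIPS and G.728 as exactly 30 MIPS. Real gateways will differ.

```python
import math

# Hypothetical figures, for illustration only.
GATEWAY_MIPS_BUDGET = 20000  # assumed total DSP MIPS in one gateway
OVERHEAD_MIPS = 7            # assumed per-channel cost of EC and other processing

# Per-channel codec cost in MIPS (G.711 treated as ~0, G.728 as 30).
CODEC_MIPS = {"G.711": 0, "G.729A": 10, "G.728": 30}


def channels_per_gateway(budget: int, overhead: int, codec: int) -> int:
    """Channels that fit in the budget when each costs overhead + codec MIPS."""
    return budget // (overhead + codec)


def gateways_needed(total_channels: int, per_gateway: int) -> int:
    """Gateways required to carry total_channels simultaneous calls."""
    return math.ceil(total_channels / per_gateway)


baseline = channels_per_gateway(GATEWAY_MIPS_BUDGET, OVERHEAD_MIPS, CODEC_MIPS["G.711"])
for name, mips in CODEC_MIPS.items():
    density = channels_per_gateway(GATEWAY_MIPS_BUDGET, OVERHEAD_MIPS, mips)
    boxes = gateways_needed(10000, density)
    print(f"{name}: {density} channels/gateway "
          f"(X/{baseline / density:.1f}), {boxes} gateways for 10000 channels")
```

Under these assumptions the density falls to roughly X/2.4 for G.729A and roughly X/5.3 for G.728, and the number of gateways needed for a fixed channel count rises accordingly.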