Re: [codec] #16: Multicast?

"Raymond (Juin-Hwey) Chen" <rchen@broadcom.com> Mon, 26 April 2010 06:50 UTC

From: "Raymond (Juin-Hwey) Chen" <rchen@broadcom.com>
To: Koen Vos <koen.vos@skype.net>
Date: Sun, 25 Apr 2010 23:49:27 -0700
Cc: "codec@ietf.org" <codec@ietf.org>
Subject: Re: [codec] #16: Multicast?

In-line...



-----Original Message-----
From: Koen Vos [mailto:koen.vos@skype.net]
Sent: Sunday, April 25, 2010 12:24 PM
To: Raymond (Juin-Hwey) Chen
Cc: codec@ietf.org
Subject: RE: [codec] #16: Multicast?



Hi Raymond,



Jitter buffers have no problem implementing a non-integer-frame delay,
because packets are queued and read non-synchronously.

[Raymond]: I am talking about an adaptive jitter buffer that dynamically tries to minimize its delay depending on the observed network jitter.  If the jitter is small, you decrease that delay; if it is large, you increase it. An engineer who actually implemented such an adaptive jitter buffer in an IP phone told me that a non-integer-frame delay made it pretty messy to implement (I didn't say it was impossible; it's just messy), so for implementation simplicity's sake, the jitter buffer delay was often chosen to be an integer number of frames. He also said that a smaller frame size gives you more frequent observations of the network jitter and thus makes the jitter estimate more responsive and accurate.
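
To illustrate the granularity point, here is a minimal Python sketch of a jitter buffer that rounds its target delay up to a whole number of frames. The function name `target_delay_ms`, the `safety` factor of 2, and the 12 ms jitter figure are assumptions made up for this illustration; they are not taken from any actual IP phone implementation.

```python
import math

def target_delay_ms(jitter_ms, frame_ms, safety=2.0):
    """Playout delay rounded up to an integer number of frames.

    `safety` is a hypothetical margin: the buffer aims to hold
    `safety * jitter_ms` of audio, then rounds up to whole frames.
    """
    wanted_ms = safety * jitter_ms
    frames = max(1, math.ceil(wanted_ms / frame_ms))
    return frames * frame_ms

# Same observed jitter, two different frame sizes.
for frame_ms in (5, 20):
    print(frame_ms, "ms frames ->", target_delay_ms(12, frame_ms), "ms buffer delay")
```

With these assumed numbers the buffer wants 24 ms of delay. Rounding up to whole 5 ms frames gives 25 ms, while rounding up to whole 20 ms frames gives 40 ms, so the smaller frame size wastes less delay on quantization.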



Processing time matters on low-end hardware - a small fraction of
today's VoIP end points.

[Raymond]: Processing time certainly matters for IP phones, and there are a lot of enterprise IP phones deployed today. I have heard that it is actually significantly cheaper for enterprises to base their entire phone systems on IP phones than on analog phones. I wouldn't be surprised if before too long the vast majority of enterprises use only IP phones.  Even consumer phones and cell phones are moving toward IP.  Eventually such devices would make up a very large percentage of VoIP end points.



And transmission delay increases (perhaps) linearly with the *packet
size*, not with the *frame size*.  For a 32 kbps codec with 5 ms
frames, packets are just 30% smaller than with a 16 kbps codec with
20 ms frames.

[Raymond]: Agreed. My previous comments on transmission delay were based on the TDM rather than the packet scenario, but I was just using that simplified TDM example to make the point that transmission delay cannot be zero, as your 1X frame size multiplier would imply.  Even given your statement above, a larger codec frame size still makes for a larger packet size, which then increases the transmission delay, so you can't say transmission delay is zero or is independent of the codec frame size.
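
As a rough sketch of the packet-size argument, the Python below computes packet size and serialization time for the two codec configurations mentioned above. The 40-byte header (uncompressed IPv4 + UDP + RTP, 20 + 8 + 12 bytes) and the 1 Mbit/s link rate are assumptions chosen for illustration, as are the function names; the exact percentages depend on which headers are counted.

```python
def payload_bytes(bitrate_bps, frame_ms):
    """Compressed payload per packet, assuming one frame per packet."""
    return bitrate_bps * frame_ms // 8000

def packet_bytes(bitrate_bps, frame_ms, header_bytes=40):
    """Payload plus an assumed fixed per-packet header."""
    return payload_bytes(bitrate_bps, frame_ms) + header_bytes

def serialization_ms(size_bytes, link_bps):
    """Time to clock one packet onto a link of the given rate."""
    return size_bytes * 8 * 1000 / link_bps

LINK_BPS = 1_000_000  # hypothetical access-link rate

for bitrate_bps, frame_ms in ((32000, 5), (16000, 20)):
    size = packet_bytes(bitrate_bps, frame_ms)
    print(bitrate_bps, "bps,", frame_ms, "ms frames:",
          size, "bytes,", serialization_ms(size, LINK_BPS), "ms on the link")
```

Under these assumptions the transmission time is small but never zero, and for a fixed codec bit-rate it grows with the frame size because the payload does.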

In any case, these are really minor details.  My key point is that your 1X multiplier for the codec frame size is simply theoretically impossible.  The rule of thumb used by IP phone engineers is around 3X the codec frame size.
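
The two multipliers under discussion can be compared with a small Python sketch. It encodes the codec-related part of the one-way delay as multiplier times frame size plus look-ahead plus filtering delay; the function name and the zero defaults for look-ahead and filtering delay are assumptions made for illustration only.

```python
def codec_related_delay_ms(frame_ms, multiplier, lookahead_ms=0, filtering_ms=0):
    """Codec-dependent part of the one-way delay, in milliseconds."""
    return multiplier * frame_ms + lookahead_ms + filtering_ms

def delay_difference_ms(large_frame_ms, small_frame_ms, multiplier):
    """Extra codec-related delay of the larger frame size."""
    return (codec_related_delay_ms(large_frame_ms, multiplier)
            - codec_related_delay_ms(small_frame_ms, multiplier))

print("1X multiplier:", delay_difference_ms(20, 5, 1), "ms")
print("3X multiplier:", delay_difference_ms(20, 5, 3), "ms")
```

With a 1X multiplier the difference between 20 ms and 5 ms frames is 15 ms; with the 3X rule of thumb it is 45 ms, before adding any difference in look-ahead or filtering delay.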



Let me ask you something: how often is G.729 used with 10 ms packets,
or Broadvoice with 5 ms packets?

[Raymond]: Not very often, but that's because network routers/switches previously didn't like to handle too many packets per second, and because the higher packet header overhead associated with a smaller packet size means the overall bit-rate would be higher than desired or allowed. So the time of small packet sizes for low-delay VoIP hasn't really come yet.  However, with the help of Moore's Law, network routers/switches are becoming much faster now, and I was told that they can handle a 5 ms packet size without problems; furthermore, the speed of backbone networks and access networks keeps increasing with time, so the bit-rate concern will also decrease with time.

Unlike processing speed and communication speed, which have improved continuously for decades, delay is one thing that will NOT improve with time, and Moore's Law cannot do anything about that!

If the IETF codec has a minimum frame size of 20 ms, we will be stuck with the longer overall delay associated with that, and Moore's Law will not help us reduce that delay in the future.  On the other hand, if in addition to a 20 ms frame size for bit-rate-sensitive applications the IETF codec also has a low-delay mode that uses a 5 ms frame size, then at least for delay-sensitive applications people have the choice of achieving a lower delay by paying the price of a higher overall bit-rate (i.e. with packet headers counted), and this higher bit-rate will be less and less of a concern as network speed keeps increasing with time.
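
The price in overall bit-rate can be estimated with a short Python sketch. It assumes one frame per packet and a 40-byte uncompressed IPv4 + UDP + RTP header; the 32 kbps codec rate and the function name are hypothetical values picked only to make the trade-off concrete.

```python
def gross_bitrate_bps(codec_bps, frame_ms, header_bytes=40):
    """Codec bit-rate plus per-packet header overhead, one frame per packet."""
    header_bps = header_bytes * 8 * 1000 // frame_ms
    return codec_bps + header_bps

CODEC_BPS = 32000  # hypothetical codec bit-rate

for frame_ms in (5, 20):
    print(frame_ms, "ms frames:", gross_bitrate_bps(CODEC_BPS, frame_ms), "bps on the wire")
```

Under these assumptions the header overhead alone is 64 kbps at 5 ms packets versus 16 kbps at 20 ms packets. That is the bit-rate cost that faster networks can absorb over time, whereas the added delay of a longer frame cannot be recovered.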

Therefore, recognizing that delay cannot be helped by Moore's Law but bit-rate can, it would be wise for the IETF codec WG to adopt a low-delay mode for the codec in order to be future-proof.





best,
koen.







Quoting "Raymond (Juin-Hwey) Chen":

> Hi Koen,
>
> My comments in-line below.
>
> Best Regards,
>
> Raymond
>
> -----Original Message-----
> From: Koen Vos [mailto:koen.vos@skype.net]
> Sent: Saturday, April 24, 2010 6:16 PM
> To: Raymond (Juin-Hwey) Chen
> Cc: codec@ietf.org
> Subject: RE: [codec] #16: Multicast?
>
> Quoting "Raymond (Juin-Hwey) Chen":
>
>> My main point, though, is not in the exact one-way delay value for a
>> codec with a 5 ms frame size, but rather that with a 5 ms frame size
>> you can get a much lower one-way delay than with a 20 ms frame size.
>
> It would be about 15 ms lower - don't know if that counts as "much" :)
>
> [Raymond]: I don't agree that it will be only 20 - 5 = 15 ms lower.
> That will be true only if your one-way delay formula below is true,
> but theoretically it cannot be.  See my comment below your formula.
>
> Also, note that for a given probability of packets arriving too late
> to be played out, the jitter buffer delay is independent of the frame
> size.
>
> [Raymond]: That may be true theoretically, but in practical
> implementations, selecting a jitter buffer delay that is not
> divisible by the packet size would make the adaptive jitter buffer
> pretty messy to implement.  If we make it divisible by the
> packet size, then a smaller packet size gives you more granularity
> to work with and can result in lower average delay as the codec
> frames go through the adaptive jitter buffer.
>
>>> - most delay comes from the network and is not codec related, and
>>> - one-way delay grows almost linearly with frame size.
>>
>> Doesn't your last line above contradict the second-to-last line?
>
> I meant that approximately:
>
>     one-way delay = codec-independent delay + frame size
>
> ("codec algorithmic delay" would be more accurate than "frame size")
>
> [Raymond]: First, I agree that codec algorithmic buffering delay is
> more accurate than frame size since it can also include the
> "look-ahead" delay and filtering delay if sub-band
> analysis/synthesis is used.  However, your formula implies that for
> the codec-related delay, the "multiplier" to be used for the codec
> frame size is only 1.  That's unrealistic and theoretically
> impossible.  For that to happen, after you wait one frame of time
> for the current frame of input audio samples to arrive at your input
> signal buffer (that's one frame of codec-related delay already), you
> need an infinitely fast processor to finish the encoding operation
> instantly, then you need an infinitely fast communication link to
> ship all the bits in the compressed frame to the decoder instantly,
> and then you need an infinitely fast processor to finish decoding
> the frame instantly and start playing back the current frame of
> audio without any delay.  That's just impossible.
>
> In reality, if the processor is just barely fast enough to implement
> the codec in real time, then you need nearly a full frame of time to
> finish the encoding and decoding operations. That makes the
> multiplier 2 already.  If your communication link is just
> barely fast enough to transmit your packets at the same speed they
> are generated without piling up unsent packets, then it takes
> another frame of time to finish transmitting the compressed bits in
> a frame to the decoder.  That makes the multiplier 3 already.
>
> Granted, in practice the processor and the communication link are
> usually faster than just barely enough, so the processing delay and
> the transmission delay can be less than 1 frame each.  However,
> there are other miscellaneous uncounted delays that tend to depend
> on the codec frame size in various ways.  Thus, a typical IP phone
> implementation would have
>
>   One-way delay = codec-independent delay + 3*(codec frame size) +
>                   (codec look-ahead) + (codec filtering delay if any).
>
> Hence, the one-way delay difference between a 20 ms and a 5 ms codec
> frame size would be 45 ms + (codec look-ahead difference) + (codec
> filtering delay difference).
>
> Consequently, for the conference bridge application, the total
> difference in one-way delay can easily be in the 90 to 100 ms range.
> When adding this delay difference to all the other codec-independent
> delay components, it is still a huge difference that the users can
> easily notice, especially since it will most likely push the total
> one-way delay significantly beyond the 150 ms limit.
>
>> I am aware of a header compression technology for VoIP over Cable
>> applications that can compress the header size to a very small
>> fraction of the original size, but it is probably not widely used.
>
> Yes, header compression works between end-points on a cable.  That's
> different from "between arbitrary Internet end points".
>
> [Raymond]: The cable operators' networks are still IP networks.  If
> the technology can work there, I don't see why it cannot work
> elsewhere in the Internet.  I know it is currently not available
> between arbitrary Internet end points.  I am just saying that
> technologies exist that can potentially be deployed in the Internet
> to compress the packet headers to a very small fraction of the
> uncompressed headers.
>
> best,
> koen.