Re: [codec] #16: Multicast?

Hi Stephen,

Thanks for sharing your broad knowledge in this.  You made some good points, and I basically agree with them.  I just have some comments in-line.

Raymond
From: stephen botzko [mailto:stephen.botzko@gmail.com]
Sent: Sunday, April 25, 2010 1:38 PM
To: Koen Vos
Cc: Raymond (Juin-Hwey) Chen; codec@ietf.org
Subject: Re: [codec] #16: Multicast?

>>>
And transmission delay increases (perhaps) linearly with the *packet size*, not with the *frame size*.  For a 32 kbps codec with 5 ms frames, packets are just 30% smaller than with a 16 kbps codec with 20 ms frames.
>>>
"Packet size" here has to include layer 2 overhead, not just IP overhead, making your argument even stronger. In the case of Ethernet, Layer-2 overhead is 38-42 bytes per packet (depending on whether a VLAN tag is present), so it is about the same as the IP/UDP/RTP overhead.  And of course there are encryption pads, VPN encapsulation, etc. that apply in many cases.

There is a floor transmission delay when you send the minimum size packets the network path allows.
[Raymond]: Agreed.

There is an incremental delay due to serialization when you send packets larger than the minimum. (At each hop you wait until you receive the last bit of the packet before you forward the first bit.)  I'd agree that a reasonable model for the incremental delay is that it scales linearly with the increase in packet size.  But the floor delay is usually too large for this incremental delay to dominate.
[Raymond]: As I mentioned in my last response to Koen, I was just using a simplified TDM example to make a point that the transmission delay cannot be zero as the 1X frame size multiplier implies.  I agree that for packet systems the delay increase will be smaller than in TDM, but it will not be zero.  This transmission delay component is indeed small compared with what you called the floor delay.  However, using the IP phone engineers' rule of thumb of 3X codec frame size, the total codec-dependent component of the delay can be quite significant when compared with the floor delay, especially for a 20 ms frame size.
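As a rough illustration of the store-and-forward point, here is a minimal Python sketch of per-hop serialization delay. The path (link rates per hop) and the packet sizes are hypothetical: 98 and 118 bytes assume one codec frame per packet, 40 bytes of IP/UDP/RTP headers, and 38 bytes of Ethernet framing. It models only serialization, not propagation, queuing, or any floor delay.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time to clock one packet onto a link of the given rate, in ms."""
    return packet_bytes * 8 / link_bps * 1000


def path_serialization_delay_ms(packet_bytes, hop_rates_bps):
    """Store-and-forward: every hop waits for the last bit before forwarding,
    so the per-hop serialization delays add up along the path."""
    return sum(serialization_delay_ms(packet_bytes, rate) for rate in hop_rates_bps)


# Hypothetical path: a 1 Mbit/s access link at each end, 100 Mbit/s in between.
HOPS_BPS = [1_000_000, 100_000_000, 100_000_000, 1_000_000]

small = path_serialization_delay_ms(98, HOPS_BPS)   # 32 kbps codec, 5 ms frames
large = path_serialization_delay_ms(118, HOPS_BPS)  # 16 kbps codec, 20 ms frames
print(f"{small:.3f} ms vs {large:.3f} ms, increment {large - small:.3f} ms")
```

On this assumed path the increment from the larger packet is a fraction of a millisecond, and it scales linearly with the extra bytes, as described above.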

And on top of that is the variable delay  (jitter) due to congestion, layer 2 retransmission, and the like.  That also will not scale linearly with frame size or packet size.

So arguments that increasing the frame size by 10 ms will increase the overall delay by 50 ms make no sense to me at all.
[Raymond]: I abandoned that 5X frame size formula many emails ago.  I was trying to make it simple, but later I realized that this over-simplified approach is not good, so several emails ago I already replaced it with one-way delay = codec-independent delay + 3*(codec frame size) + (codec look-ahead) + (codec filtering delay if any).  The main debate now is centered on whether the multiplier of the codec frame size should be 1 as Koen said or 3 as I was told by experienced IP phone engineers.  I argue that 1X is theoretically impossible.  It is interesting to note that the ITU-T uses a multiplier of 2X.  I think 2X is probably achievable for the idealized situation.  In practice, however, many nitty-gritty details get in the way of reaching that idealized situation, and little additional delays just keep getting added, resulting in a realistic real-world 3X multiplier.  With a 3X multiplier, the one-way delay difference between a 20 ms and a 5 ms codec frame size would be 45 ms + (codec look-ahead difference) + (codec filtering delay difference).  For the conference bridge application, the total difference in one-way delay will double to the 90 to 100 ms range.  That's a VERY significant difference that typical users will notice (it's like adding another cell phone call delay), especially if it pushes the total one-way delay significantly beyond the 150 ms guideline.  Therefore, I argue that for the best user experience in conference bridge calls, the IETF codec should have a low-delay mode with a small codec frame size such as 5 ms, and we should let the continually increasing speed of communication links make the header overhead bit-rate less and less of an issue in the future.  (Even now, for those people who have high-speed connections to their computers, it is already not an issue.  It is better for them to get low delay than to worry about bit-rate or packet header overhead.)
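To make the arithmetic behind the competing multipliers concrete, here is a small Python sketch of the one-way delay formula above. The function and parameter names are made up for illustration, the 50 ms codec-independent delay in the last line is an arbitrary placeholder, and the doubling for a conference bridge is taken directly from the statement above rather than derived.

```python
def one_way_delay_ms(codec_independent_ms, frame_ms, multiplier=3,
                     lookahead_ms=0.0, filtering_ms=0.0):
    """one-way delay = codec-independent delay + multiplier*(frame size)
    + (look-ahead) + (filtering delay, if any)."""
    return codec_independent_ms + multiplier * frame_ms + lookahead_ms + filtering_ms


def frame_size_penalty_ms(long_frame_ms, short_frame_ms, multiplier, bridge=False):
    """Delay difference between two frame sizes, with look-ahead and
    filtering differences left out."""
    diff = multiplier * (long_frame_ms - short_frame_ms)
    # Per the text, the difference doubles through a conference bridge.
    return 2 * diff if bridge else diff


for multiplier in (1, 2, 3):
    direct = frame_size_penalty_ms(20, 5, multiplier)
    bridged = frame_size_penalty_ms(20, 5, multiplier, bridge=True)
    print(f"{multiplier}X: direct {direct} ms, via bridge {bridged} ms")

# Placeholder codec-independent delay of 50 ms, 20 ms frames, 3X multiplier.
print(one_way_delay_ms(50, 20))
```

With 20 ms versus 5 ms frames this gives 15, 30, and 45 ms for the 1X, 2X, and 3X multipliers, and 30, 60, and 90 ms through a bridge, before any look-ahead or filtering differences are added.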

Stephen Botzko
On Sun, Apr 25, 2010 at 3:24 PM, Koen Vos <koen.vos@skype.net> wrote:
Hi Raymond,

Jitter buffers have no problem implementing a non-integer-frame delay, because packets are queued and read non-synchronously.

Processing time matters on low-end hardware - a small fraction of today's VoIP end points.  Even then, the higher coding efficiency of longer frames can be translated into lower complexity.

And transmission delay increases (perhaps) linearly with the *packet size*, not with the *frame size*.  For a 32 kbps codec with 5 ms frames, packets are just 30% smaller than with a 16 kbps codec with 20 ms frames.
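The packet-size comparison can be checked with a short Python sketch. It assumes one codec frame per packet, 40 bytes of IPv4/UDP/RTP headers (20 + 8 + 12), and, optionally, 38 bytes of Ethernet framing (the lower figure mentioned elsewhere in this thread). The function names are illustrative only.

```python
# Header sizes in bytes. IPv4 (20) + UDP (8) + RTP (12) with no options;
# the Ethernet figure assumes no VLAN tag.
IP_UDP_RTP_OVERHEAD = 20 + 8 + 12
ETHERNET_OVERHEAD = 38


def payload_bytes(bitrate_bps, frame_ms):
    """Codec payload for one frame, in bytes."""
    return bitrate_bps * frame_ms / 1000 / 8


def packet_bytes(bitrate_bps, frame_ms, include_layer2=False):
    """Size of a packet carrying exactly one codec frame."""
    overhead = IP_UDP_RTP_OVERHEAD + (ETHERNET_OVERHEAD if include_layer2 else 0)
    return payload_bytes(bitrate_bps, frame_ms) + overhead


def size_reduction(small, large):
    """Fraction by which `small` is smaller than `large`."""
    return 1 - small / large


for include_layer2 in (False, True):
    short = packet_bytes(32_000, 5, include_layer2)
    long_ = packet_bytes(16_000, 20, include_layer2)
    pct = 100 * size_reduction(short, long_)
    print(f"layer 2 counted: {include_layer2}  {short:.0f} vs {long_:.0f} bytes  ({pct:.1f}% smaller)")
```

Under these assumptions the packets are 60 and 80 bytes at the IP level (25% smaller, in the same ballpark as the figure quoted above), and 98 and 118 bytes once Ethernet framing is counted (roughly 17% smaller).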

Let me ask you something: how often is G.729 used with 10 ms packets, or Broadvoice with 5 ms packets?

best,
koen.

Quoting "Raymond (Juin-Hwey) Chen":
Hi Koen,

My comments in-line below.

Best Regards,

Raymond

-----Original Message-----
From: Koen Vos [mailto:koen.vos@skype.net]
Sent: Saturday, April 24, 2010 6:16 PM
To: Raymond (Juin-Hwey) Chen
Cc: codec@ietf.org
Subject: RE: [codec] #16: Multicast?

Quoting "Raymond (Juin-Hwey) Chen":
My main point, though, is not in the exact one-way delay value for a codec with a 5 ms frame size, but rather that with a 5 ms frame size you can get a much lower one-way delay than with a 20 ms frame size.

It would be about 15 ms lower - don't know if that counts as "much" :)

[Raymond]: I don't agree that it will be only 20 - 5 = 15 ms lower.  That will be true only if your one-way delay formula below is true, but theoretically it cannot be.  See my comment below your formula.

Also, note that for a given probability of packets arriving too late to be played out, the jitter buffer delay is independent of the frame size.

[Raymond]: That may be true theoretically, but in practical implementations, selecting a jitter buffer delay that is not divisible by the packet size would make the adaptive jitter buffer pretty messy to implement.  If we make it divisible by the packet size, then a smaller packet size gives you more granularity to work with and can result in lower average delay as the codec frames go through the adaptive jitter buffer.
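The granularity argument can be sketched in a few lines of Python. The sketch assumes the buffer delay is rounded up to a whole number of packets, which is the implementation choice described above rather than a requirement, and the target delays are made-up values, not measurements.

```python
import math


def quantized_buffer_delay_ms(target_ms, packet_ms):
    """Smallest whole number of packets that covers the target delay."""
    return math.ceil(target_ms / packet_ms) * packet_ms


# Hypothetical target delays an adaptive jitter buffer might ask for over time.
TARGETS_MS = [12, 27, 33, 48]

for packet_ms in (5, 20):
    delays = [quantized_buffer_delay_ms(t, packet_ms) for t in TARGETS_MS]
    average = sum(delays) / len(delays)
    print(f"{packet_ms} ms packets: {delays}, average {average} ms")
```

For these targets the 5 ms packets give an average buffer delay of 32.5 ms against 40 ms for 20 ms packets. A buffer that can hold a fractional number of frames would not show this difference.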

- most delay comes from the network and is not codec related, and
- one-way delay grows almost linearly with frame size.

Doesn't your last line above contradict the second-to-last line?

I meant that approximately:

   one-way delay = codec-independent delay + frame size

("codec algorithmic delay" would be more accurate than "frame size")

[Raymond]: First, I agree that codec algorithmic buffering delay is more accurate than frame size since it can also include the "look-ahead" delay and filtering delay if sub-band analysis/synthesis is used.  However, your formula implies that for the codec-related delay, the "multiplier" to be used for the codec frame size is only 1.  That's unrealistic and theoretically impossible.  For that to happen, after you wait one frame of time for the current frame of input audio samples to arrive at your input signal buffer (that's one frame of codec-related delay already), you need an infinitely fast processor to finish the encoding operation instantly, then you need an infinitely fast communication link to ship all the bits in the compressed frame to the decoder instantly, and then you need an infinitely fast processor to finish decoding the frame instantly and start playing back the current frame of audio without any delay.  That's just impossible.

In reality, if the processor is just barely fast enough to implement the codec in real time, then you need nearly a full frame of time to finish the encoding and decoding operations. That brings the multiplier to 2 already.  If your communication link is just barely fast enough to transmit your packets at the same speed they are generated without piling up unsent packets, then it takes another frame of time to finish transmitting the compressed bits in a frame to the decoder.  That brings the multiplier to 3 already.
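The build-up of the multiplier described above can be written as a toy model in Python. This is only a sketch of the argument, not a measured result: `cpu_load` and `link_utilization` are illustrative parameters, each expressed as a fraction of one frame time, and the model ignores the miscellaneous delays that real implementations add.

```python
def frame_multiplier(cpu_load, link_utilization):
    """Codec-dependent delay in units of one frame.

    1 frame        : waiting for the input samples of the frame to arrive
    cpu_load       : encode + decode time as a fraction of a frame (0..1)
    link_utilization: time to transmit one frame's bits, as a fraction
                      of a frame (0..1)
    """
    if not (0 <= cpu_load <= 1 and 0 <= link_utilization <= 1):
        raise ValueError("both fractions must lie between 0 and 1")
    return 1 + cpu_load + link_utilization


print(frame_multiplier(0, 0))        # infinitely fast processor and link
print(frame_multiplier(1, 0))        # processor just barely real-time
print(frame_multiplier(1, 1))        # processor and link both just barely fast enough
print(frame_multiplier(0.25, 0.1))   # a faster processor and link (made-up values)
```

The first three cases reproduce the 1X, 2X, and 3X multipliers from the argument above. The last shows that spare processing and link capacity pull the figure back toward 1 in this simplified model.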

Granted, in practice the processor and the communication link are usually faster than just barely enough, so the processing delay and the transmission delay can be less than 1 frame each.  However, there are other miscellaneous uncounted delays that tend to depend on the codec frame size in various ways.  Thus, a typical IP phone implementation would have

 One-way delay = codec-independent delay + 3*(codec frame size) + (codec look-ahead) + (codec filtering delay if any).

Hence, the one-way delay difference between a 20 ms and a 5 ms codec frame size would be 45 ms + (codec look-ahead difference) + (codec filtering delay difference).

Consequently, for the conference bridge application, the total difference in one-way delay can easily be in the 90 to 100 ms range. When adding this delay difference to all the other codec-independent delay components, it is still a huge difference that the users can easily notice, especially since it will most likely push the total one-way delay significantly beyond the 150 ms limit.

I am aware of a header compression technology for VoIP over Cable applications that can compress the header size to a very small fraction of the original size, but it is probably not widely used.

Yes, header compression works between end-points on a cable.  That's different from "between arbitrary Internet end points".

[Raymond]: The cable operators' networks are still IP networks.  If the technology can work there, I don't see why it cannot work elsewhere in the Internet.  I know it is currently not available between arbitrary Internet end points.  I am just saying that technologies exist that can potentially be deployed in the Internet to compress the packet headers to a very small fraction of the uncompressed headers.

best,

koen.

_______________________________________________
codec mailing list
codec@ietf.org
https://www.ietf.org/mailman/listinfo/codec