Re: [codec] #16: Multicast?

Koen Vos <koen.vos@skype.net> Tue, 27 April 2010 07:16 UTC

Message-ID: <20100427001601.28347kv06z915l4h@mail.skype.net>
Date: Tue, 27 Apr 2010 00:16:01 -0700
From: Koen Vos <koen.vos@skype.net>
To: "Raymond (Juin-Hwey) Chen" <rchen@broadcom.com>
Cc: "codec@ietf.org" <codec@ietf.org>
Subject: Re: [codec] #16: Multicast?

Hi Raymond,

Please don't get me wrong: I share your vision that, over time,
Moore's law will drive adoption of the highest possible audio quality
in IP end points.  And creating the very low-delay, full-band,
(near-)transparent and even multi-channel codec that this requires
does indeed fall within the objectives of this WG, if you ask me.

best,
koen.


Quoting "Raymond (Juin-Hwey) Chen":
> In-line...
>
> -----Original Message-----
> From: Koen Vos [mailto:koen.vos@skype.net]
> Sent: Sunday, April 25, 2010 12:24 PM
> To: Raymond (Juin-Hwey) Chen
> Cc: codec@ietf.org
> Subject: RE: [codec] #16: Multicast?
>
> Hi Raymond,
>
> Jitter buffers have no problem implementing a non-integer-frame delay,
> because packets are queued and read non-synchronously.
>
> [Raymond]: I am talking about an adaptive jitter buffer that tries to
> minimize the delay through the jitter buffer dynamically, depending
> on the observed network jitter.  If the jitter is small, you
> decrease that delay, and if it is large, you increase that delay.  An
> engineer who actually implemented such an adaptive jitter buffer in
> an IP phone told me that a non-integer-frame delay made it pretty
> messy to implement (I didn't say it was not possible; it's just
> messy), so for implementation simplicity's sake, the jitter delay
> was often chosen to be an integer number of frames.  He also said
> that a smaller frame size gives you more frequent observations of
> the network jitter and thus makes the jitter estimate more
> responsive and accurate.
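
The granularity argument above can be illustrated with a rough sketch.
It assumes a jitter buffer that rounds its target delay up to a whole
number of frames; the function name and the 23 ms target are
hypothetical and are not taken from any real IP phone implementation.

```python
import math


def quantized_jitter_delay_ms(target_ms: float, frame_ms: float) -> float:
    """Round a target jitter-buffer delay up to a whole number of frames.

    Assumption: the buffer only supports integer-frame delays, as in the
    simplified implementations described in the thread.
    """
    return math.ceil(target_ms / frame_ms) * frame_ms


# Hypothetical target delay suggested by the current jitter estimate.
target_ms = 23.0

for frame_ms in (20.0, 5.0):
    delay = quantized_jitter_delay_ms(target_ms, frame_ms)
    print(f"frame {frame_ms:4.1f} ms -> buffer delay {delay:4.1f} ms")
```

Under these assumptions the 20 ms frame size forces a 40 ms buffer
delay, while the 5 ms frame size allows 25 ms.  A buffer that supports
non-integer-frame delays would not pay this rounding cost at all.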
>
> Processing time matters on low-end hardware - a small fraction of
> today's VoIP end points.
>
> [Raymond]: Processing time certainly matters for IP phones, and
> there are a lot of enterprise IP phones deployed today.  I heard that
> it is actually significantly cheaper for enterprises to base their
> entire phone systems on IP phones than on analog phones.  I won't
> be surprised if before too long the vast majority of enterprises
> use only IP phones.  Even consumer phones and cell phones are
> moving toward IP.  Eventually IP phones would be a very large
> percentage of VoIP end points.
>
> And transmission delay increases (perhaps) linearly with the *packet
> size*, not with the *frame size*.  For a 32 kbps codec with 5 ms
> frames, packets are just 30% smaller than with a 16 kbps codec with
> 20 ms frames.
>
> [Raymond]: Agreed.  My previous comments on transmission delay were
> based on the TDM rather than the packet scenario, but I was just using
> that simplified TDM example to make the point that transmission delay
> cannot be zero, as your 1X frame size multiplier would imply.  Even
> with your statement above, a larger codec frame size still makes a
> larger packet size, which then increases the transmission delay, so
> you can't say transmission delay is zero or is independent of the
> codec frame size.
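
The packet-size comparison above (32 kbps with 5 ms frames versus
16 kbps with 20 ms frames) can be checked with a small sketch.  The
exact percentage depends on the header size, which the thread does not
state; the 40-byte uncompressed IPv4/UDP/RTP header and one frame per
packet used here are assumptions.

```python
def packet_bytes(bitrate_bps: int, frame_ms: int, header_bytes: int) -> float:
    """Size of one packet carrying a single codec frame, headers included."""
    payload_bytes = bitrate_bps * frame_ms / 1000 / 8
    return header_bytes + payload_bytes


# Assumed header: IPv4 (20) + UDP (8) + RTP (12) bytes, no header compression.
HEADER_BYTES = 40

small = packet_bytes(32_000, 5, HEADER_BYTES)   # 20-byte payload
large = packet_bytes(16_000, 20, HEADER_BYTES)  # 40-byte payload
reduction = 1 - small / large

print(f"{small:.0f} vs {large:.0f} bytes: {reduction:.0%} smaller")
```

With this header assumption the smaller packet is 60 bytes against 80,
that is 25% smaller, which is in the same range as the figure quoted
above.  A smaller assumed header gives a somewhat larger reduction.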
>
> In any case, these are really minor details.  My key point is that  
> your 1X multiplier for the codec frame size is simply theoretically  
> impossible.  The rule of thumb used by IP phone engineers is around  
> 3X codec frame size.
>
> Let me ask you something: how often is G.729 used with 10 ms packets,
> or Broadvoice with 5 ms packets?
>
> [Raymond]: Not very often, but that's because previously network
> routers/switches didn't like to handle too many packets per second,
> and the higher packet header overhead associated with a smaller
> packet size means the overall bit-rate would be higher than desired
> or allowed, so the time of small packet sizes for low-delay VoIP
> hasn't really come yet.  However, with the help of Moore's Law,
> network routers/switches are becoming much faster now, and I was
> told that they can handle a 5 ms packet size without problems;
> furthermore, the speed of backbone networks and access networks keeps
> increasing with time, so the bit-rate concern will also decrease
> with time.
>
> Unlike processing speed and communication speed, which have improved
> continuously for decades, delay is one thing that will NOT improve
> with time, and Moore's Law cannot do anything about that!
>
> If the IETF codec has a minimum frame size of 20 ms, we will be  
> stuck with the longer overall delay associated with that, and  
> Moore's Law will not help us reduce that delay in the future.  On  
> the other hand, in addition to using a 20 ms frame size for  
> bit-rate-sensitive applications, if the IETF codec also has a  
> low-delay mode that uses a 5 ms frame size, then at least for  
> delay-sensitive applications, people have a choice to achieve a  
> lower delay by paying the price of a higher overall bit-rate (i.e.  
> with packet header counted), and this higher bit-rate will be less  
> and less of a concern as the network speed keeps increasing with time.
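
The bit-rate price of a smaller frame size mentioned above can be
estimated with a short sketch.  It assumes one frame per packet, a
40-byte uncompressed IPv4/UDP/RTP header, and the same 32 kbps codec
bit-rate at both frame sizes; these numbers are illustrative and are
not taken from the thread.

```python
def overall_bitrate_kbps(codec_kbps: float, frame_ms: float,
                         header_bytes: int = 40) -> float:
    """Codec bit-rate plus packet-header overhead, one frame per packet.

    header_bytes defaults to an assumed IPv4 + UDP + RTP header (40 bytes).
    """
    # Header bits sent per millisecond equals the header bit-rate in kbps.
    header_kbps = header_bytes * 8 / frame_ms
    return codec_kbps + header_kbps


for frame_ms in (20, 5):
    total = overall_bitrate_kbps(32, frame_ms)
    print(f"{frame_ms:2d} ms frames: {total:.0f} kbps overall")
```

Under these assumptions the header overhead rises from 16 kbps to
64 kbps when going from 20 ms to 5 ms packets, so a 32 kbps codec
costs 48 kbps and 96 kbps overall, respectively.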
>
> Therefore, recognizing that delay cannot be helped by Moore's Law  
> but bit-rate can, it would be wise for the IETF codec WG to adopt a  
> low-delay mode for the codec in order to be future-proof.
>
> best,
> koen.
>
> Quoting "Raymond (Juin-Hwey) Chen":
>
>> Hi Koen,
>>
>> My comments in-line below.
>>
>> Best Regards,
>>
>> Raymond
>>
>> -----Original Message-----
>> From: Koen Vos [mailto:koen.vos@skype.net]
>> Sent: Saturday, April 24, 2010 6:16 PM
>> To: Raymond (Juin-Hwey) Chen
>> Cc: codec@ietf.org
>> Subject: RE: [codec] #16: Multicast?
>>
>> Quoting "Raymond (Juin-Hwey) Chen":
>>
>>> My main point, though, is not in the exact one-way delay value for a
>>> codec with a 5 ms frame size, but rather that with a 5 ms frame size
>>> you can get a much lower one-way delay than with a 20 ms frame size.
>>
>> It would be about 15 ms lower - don't know if that counts as "much" :)
>>
>> [Raymond]: I don't agree that it will be only 20 - 5 = 15 ms lower.
>> That will be true only if your one-way delay formula below is true,
>> but theoretically it cannot be.  See my comment below your formula.
>>
>> Also, note that for a given probability of packets arriving too late
>> to be played out, the jitter buffer delay is independent of the frame
>> size.
>>
>> [Raymond]: That may be true theoretically, but in practical
>> implementations, selecting a jitter buffer delay that is not an
>> integer multiple of the packet size would make the adaptive jitter
>> buffer pretty messy to implement.  If we make it an integer multiple
>> of the packet size, then a smaller packet size gives you more
>> granularity to work with and can result in lower average delay as the
>> codec frames go through the adaptive jitter buffer.
>>
>>>> - most delay comes from the network and is not codec related, and
>>>> - one-way delay grows almost linearly with frame size.
>>>
>>> Doesn't your last line above contradict the second-to-last line?
>>
>> I meant that approximately:
>>
>>     one-way delay = codec-independent delay + frame size
>>
>> ("codec algorithmic delay" would be more accurate than "frame size")
>>
>> [Raymond]: First, I agree that codec algorithmic buffering delay is
>> more accurate than frame size, since it can also include the
>> "look-ahead" delay and the filtering delay if sub-band
>> analysis/synthesis is used.  However, your formula implies that for
>> the codec-related delay, the "multiplier" to be used for the codec
>> frame size is only 1.  That's unrealistic and theoretically
>> impossible.  For that to happen, after you wait one frame of time
>> for the current frame of input audio samples to arrive at your input
>> signal buffer (that's one frame of codec-related delay already), you
>> need an infinitely fast processor to finish the encoding operation
>> instantly, then you need an infinitely fast communication link to
>> ship all the bits in the compressed frame to the decoder instantly,
>> and then you need an infinitely fast processor to finish decoding
>> the frame instantly and start playing back the current frame of
>> audio without any delay.  That's just impossible.
>>
>> In reality, if the processor is just barely fast enough to implement
>> the codec in real time, then you need nearly a full frame of time to
>> finish the encoding and decoding operations.  That makes the
>> multiplier 2 already.  If your communication link is just
>> barely fast enough to transmit your packets at the same speed they
>> are generated without piling up unsent packets, then it takes
>> another frame of time to finish transmitting the compressed bits in
>> a frame to the decoder.  That makes the multiplier 3 already.
>>
>> Granted, in practice the processor and the communication link are
>> usually faster than just barely enough, so the processing delay and
>> the transmission delay can be less than 1 frame each.  However,
>> there are other miscellaneous uncounted delays that tend to depend
>> on the codec frame size in various ways.  Thus, a typical IP phone
>> implementation would have
>>
>>   One-way delay = codec-independent delay + 3*(codec frame size) +
>>                   (codec look-ahead) + (codec filtering delay if any).
>>
>> Hence, the one-way delay difference between a 20 ms and a 5 ms codec
>> frame size would be 45 ms + (codec look-ahead difference) + (codec
>> filtering delay difference).
>>
>> Consequently, for the conference bridge application, the total
>> difference in one-way delay can easily be in the 90 to 100 ms range.
>> Even when added to all the other codec-independent delay components,
>> this is still a huge difference that users can easily notice,
>> especially since it will most likely push the total one-way delay
>> significantly beyond the 150 ms limit.
>>
>>> I am aware of a header compression technology for VoIP over Cable
>>> applications that can compress the header size to a very small
>>> fraction of the original size, but it is probably not widely used.
>>
>> Yes, header compression works between end-points on a cable.  That's
>> different from "between arbitrary Internet end points".
>>
>> [Raymond]: The cable operators' networks are still IP networks.  If
>> the technology can work there, I don't see why it cannot work
>> elsewhere in the Internet.  I know it is currently not available
>> between arbitrary Internet end points.  I am just saying that
>> technologies exist that can potentially be deployed in the Internet
>> to compress the packet headers to a very small fraction of the
>> uncompressed headers.
>>
>> best,
>> koen.