Re: [codec] draft test and processing plan for the IETF Codec

Roman Shpount <roman@telurix.com> Mon, 18 April 2011 19:27 UTC

In-Reply-To: <999109E6BC528947A871CDEB5EB908A0039FA920@XMB-RCD-209.cisco.com>
References: <BANLkTin69jpyXuR9z95yO3eXEnnZFVY5MA@mail.gmail.com> <2127459324.276910.1303150427218.JavaMail.root@lu2-zimbra> <999109E6BC528947A871CDEB5EB908A0039FA920@XMB-RCD-209.cisco.com>
Date: Mon, 18 Apr 2011 15:27:34 -0400
Message-ID: <BANLkTikpPTPiOvf_6bYTvuP5VKDRS-DSxA@mail.gmail.com>
From: Roman Shpount <roman@telurix.com>
To: "Michael Ramalho (mramalho)" <mramalho@cisco.com>
Cc: codec@ietf.org
Subject: Re: [codec] draft test and processing plan for the IETF Codec

What I said was that in IP phones you normally deal with audio that is very
similar to the test audio samples before they have been passed through some
type of filter. There is quite a bit of filtering going on in the microphone,
DAC, and after the DAC to actually produce the audio, but all of those are
designed so that you end up with more or less the desired audio spectrum.
After this point you can either pass it through a bandpass filter if this is
required by the codec (and a lot of codecs have a filter as a part of their
specification) or you can encode the audio directly (it is not uncommon to
have a 50-3900 Hz signal in an 8 kHz PCMU stream from an IP phone). There is
no reason why an IP phone should have only 300 to 3400 Hz in the case of
narrowband, or 50 to 7000 Hz in the case of wideband. In fact, most IP phones
have a wider audio spectrum.
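
To illustrate why this matters (a rough sketch, not anything specified in the
test plan): the Python below builds a 300-3400 Hz bandpass for 8 kHz audio and
evaluates its gain at a few frequencies. The Hamming-windowed-sinc design, the
201-tap length, and the function names are all assumptions chosen for
illustration; a real test plan would specify its own filter mask. Tones at
100 Hz and 3800 Hz, both inside the 50-3900 Hz range an IP phone may deliver,
should come out strongly attenuated, while 1000 Hz should pass essentially
unchanged.

```python
import math


def lowpass_taps(cutoff_hz, fs_hz, num_taps):
    """Hamming-windowed sinc low-pass FIR taps (num_taps should be odd)."""
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / fs_hz  # normalized cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def bandpass_taps(low_hz, high_hz, fs_hz, num_taps=201):
    """Bandpass built as the difference of two low-pass filters."""
    upper = lowpass_taps(high_hz, fs_hz, num_taps)
    lower = lowpass_taps(low_hz, fs_hz, num_taps)
    return [u - l for u, l in zip(upper, lower)]


def gain_db(taps, freq_hz, fs_hz):
    """Magnitude response of the FIR filter at freq_hz, in dB."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = -sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return 20.0 * math.log10(max(math.hypot(re, im), 1e-12))


FS_HZ = 8000  # narrowband sampling rate
narrowband = bandpass_taps(300, 3400, FS_HZ)  # assumed illustrative mask

for freq in (100, 1000, 3800):
    print(f"{freq:5d} Hz: {gain_db(narrowband, freq, FS_HZ):7.1f} dB")
```

Anything the phone captures below 300 Hz or above 3400 Hz is simply discarded
by such a filter before the encoder ever sees it.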
_____________
Roman Shpount


On Mon, Apr 18, 2011 at 3:07 PM, Michael Ramalho (mramalho) <
mramalho@cisco.com> wrote:

>  > Also, bandpass filtering is not really "pre-distorting".
>
> Why not?  It creates spectral distortion in the signal before the encoder.
>
> http://en.wikipedia.org/wiki/Distortion#Frequency_response_distortion
>
> MAR: I agree that the adjective should be “BANDPASS filtering”; as
> amplitude attenuation (i.e., amplitude distortion) is by definition desired
> in the attenuation bands.
>
>
>
> MAR: One is usually only concerned about “amplitude/frequency/phase
> distortion” in the passband. And for that reason it is sometimes desirable
> to have linear phase filters with constant group delay.
>
>
>
> MAR: Granted, a reasonably sharp bandpass filter of the type likely desired
> for a test plan will likely not have linear phase … and thus will likely
> have some phase distortion.
>
>
>
> MAR: However, the human ear is mostly insensitive to (reasonably small)
> phase distortion.
>
>
>
> MAR: What type of “signal conditioning (pre) distortion” are you concerned
> about?
>
>
>
> MAR: If you said the above in jest, I apologize for not seeing a smiley
> face.
>
>
>
> MAR: Additionally, in practice you may not know what type of bandpass
> filtering is in use prior to the codec. For example, the wideband handsets
> for hardware IP phones may need to meet defined masks (e.g., TIA-810-B).*
> Microphones also introduce non-flat passbands. By your definition, there is
> a lot of “pre-distortion” present in the test signals as well.
>
>
>
> Michael Ramalho
>
>
>
> * I think Roman stated that there is no need for such filters in IP phones,
> thus I disagree with that statement as well. One usually has to employ
> specific filters to meet frequency dependent input masks on such devices.
>
>
>
> PS – I have a 24 bit recording system at home … so I don’t like distortions
> either.
>
>
>
> *From:* codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] *On Behalf
> Of *Koen Vos
> *Sent:* Monday, April 18, 2011 2:14 PM
> *To:* Stephen Botzko
>
> *Cc:* codec@ietf.org
> *Subject:* Re: [codec] draft test and processing plan for the IETF Codec
>
>
>
> Stephen Botzko wrote:
> > I don't see how it "invalidates the conclusion", as the input signal is
> > the same for all codecs in any event.
>
> The input signal being the same doesn't preclude a bias.  The bias comes
> from the fact that the input signal is an artificial test signal designed
> to match the response of ITU codecs.
>
> As Paul said earlier: "There's no doubt that increased audio bandwidth,
> other things being equal, enhances the perception of quality".  Therefore,
> artificially preventing some codecs from delivering the bandwidth they
> would in the real world introduces a bias in the results.
>
> And I don't see what conclusions to draw from biased results.
>
> > Also, bandpass filtering is not really "pre-distorting".
>
> Why not?  It creates spectral distortion in the signal before the encoder.
>
> http://en.wikipedia.org/wiki/Distortion#Frequency_response_distortion
>
> best,
> koen.
>
>  ------------------------------
>
> *From: *"Stephen Botzko" <stephen.botzko@gmail.com>
> *To: *"Koen Vos" <koen.vos@skype.net>
> *Cc: *"Paul Coverdale" <coverdale@sympatico.ca>, codec@ietf.org
> *Sent: *Monday, April 18, 2011 10:34:22 AM
> *Subject: *Re: [codec] draft test and processing plan for the IETF Codec
>
> in-line
>
> On Mon, Apr 18, 2011 at 12:48 PM, Koen Vos <koen.vos@skype.net> wrote:
>
> Stephen Botzko wrote:
> > If you simply want to know which "sounds better" to the user,
>
> That's probably the best you can hope for yes.
>
>
> > then perhaps bandpass filtering gets in the way.
>
> Correct.
>
>
>
> > If you want to see if there is an underlying difference in
> > intelligibility or user tolerance for the coding artifacts, then the
> > bandpass filtering might be useful, since it controls for the known
> > preference that users have for wider frequency response.
>
> Sounds like an interesting academic study.  You should also look into any
> long-term health effects (so you can argue for a 5 year test plan!).
>
> One thing we know for sure though: pre-distorting test signals creates a
> bias in the results and thus invalidates any conclusion from the test.
>
>
> I don't think this is particularly academic, such filtering seems to show
> up in most test plans I've seen. I don't see how it "invalidates the
> conclusion", as the input signal is the same for all codecs in any event.
>
> Also, bandpass filtering is not really "pre-distorting".
>
> Best,
> Stephen Botzko
>
>
> best,
> koen.
>
>  ------------------------------
>
> *From: *"Stephen Botzko" <stephen.botzko@gmail.com>
>
>
> *To: *"Koen Vos" <koen.vos@skype.net>
>
> *Cc: *"Paul Coverdale" <coverdale@sympatico.ca>, codec@ietf.org
> *Sent: *Monday, April 18, 2011 4:18:05 AM
>
>
> *Subject: *Re: [codec] draft test and processing plan for the IETF Codec
>
> in-line
> Stephen Botzko
>
> On Mon, Apr 18, 2011 at 3:27 AM, Koen Vos <koen.vos@skype.net> wrote:
>
> Hi Paul,
>
>
> > I think where this discussion is going is that we need to be more
> > precise in defining what we mean by "NB", "WB", "SWB" and "FB" if
> > we want to make meaningful comparisons between codecs.
>
> The discussion so far was about whether to pre-distort test signals by
> bandpass filtering.
>
>
>
> I think this might depend on what you want to learn from the test.
>
> If you simply want to know which "sounds better" to the user, then perhaps
> bandpass filtering gets in the way.
>
> If you want to see if there is an underlying difference in
> intelligibility or user tolerance for the coding artifacts, then the
> bandpass filtering might be useful, since it controls for the known
> preference that users have for wider frequency response.
>
>
>
>
> I don't see what the name of a codec's mode has to do with meaningful
> comparisons.  It's the sampling rate that matters: what happens when a
> VoIP application swaps one codec for another while leaving all else the
> same.  So where possible you want to compare codecs running at equal
> sampling rates.  That gives a clear grouping of codecs for 8, 16 and
> 48 kHz (some call these NB, WB and FB).
>
> The open question is what to do in between 16 and 48 kHz.  Opus accepts
> 24 kHz signals, other codecs use 32 kHz (and they all call it SWB).
> Here you could either compare directly, which puts the 32 kHz codecs at
> an advantage.  Or you could run Opus in FB mode by upsampling the 32
> kHz signal to 48 kHz, as Jean-Marc suggested for 32 and 64 kbps.
>
>
> best,
> koen.
>
>
> ----- Original Message -----
> From: "Paul Coverdale" <coverdale@sympatico.ca>
> To: "Koen Vos" <koen.vos@skype.net>
> Cc: codec@ietf.org, "Anisse Taleb" <anisse.taleb@huawei.com>
>
> Sent: Sunday, April 17, 2011 5:40:33 PM
> Subject: RE: [codec] draft test and processing plan for the IETF Codec
>
> Hi Koen,
>
> There's no doubt that increased audio bandwidth, other things being equal,
> enhances the perception of quality (well, up to the point where the input
> signal spectrum itself runs out of steam). I think where this discussion is
> going is that we need to be more precise in defining what we mean by "NB",
> "WB", "SWB" and "FB" if we want to make meaningful comparisons between
> codecs. In fact, the nominal -3 dB passband bandwidth of G.722 is actually a
> minimum of 50 to 7000 Hz; you can go up to 8000 Hz and still meet the
> anti-aliasing requirement.
>
> Regards,
>
> ...Paul
>
> >-----Original Message-----
> >From: Koen Vos [mailto:koen.vos@skype.net]
> >Sent: Sunday, April 17, 2011 1:44 AM
> >To: Paul Coverdale
> >Cc: codec@ietf.org; Anisse Taleb
> >Subject: Re: [codec] draft test and processing plan for the IETF Codec
> >
> >Hi Paul,
> >
> >> The filtering described in the test plan [..] is there to establish
> >> a common bandwidth (and equalization characteristic in some cases)
> >> for the audio chain (be it NB, WB, SWB) so that subjects can focus
> >> on comparing the distortion introduced by each of the codecs in the
> >> test, without confounding it with bandwidth effects.
> >
> >I believe it would be a mistake to test with band-limited signals, for
> >these reasons:
> >
> >1. Band-limited test signals are atypical of real-world usage.  People
> >in this WG have always emphasized that we should test with realistic
> >scenarios (like network traces for packet loss), and the proposal goes
> >against that philosophy.
> >
> >2. Band limiting the input hurts a codec's performance.  In the Google
> >test for instance, Opus-WB@20 kbps outperformed the LP7 anchor --
> >surely that wouldn't happen if Opus ran on an LP7 signal.  That makes
> >the proposed testing procedure less relevant for deciding whether this
> >codec will be of value on the Internet.
> >
> >3. Audio bandwidth matters to end users.  Real-life experiments show
> >that codecs with more bandwidth boost user ratings and call durations.
> >(E.g. see slides 2, 3 of
> >http://www.ietf.org/proceedings/77/slides/codec-3.pdf)
> >So if a codec scores higher "just" because it encodes more bandwidth,
> >that's still a real benefit to users.  And the testing procedure
> >proposed already reduces the impact of differing bandwidths, by using
> >MOS scores without pairwise comparisons.
> >
> >4. Testing with band-limited signals risks perpetuating crippled codec
> >design.  In order to do well in the tests, a codec designer would be
> >"wise" to downsample the input or otherwise optimize towards the
> >artificial test signals.  This actually lowers the performance for
> >real-world signals, and usually adds complexity.  And as long as
> >people design codecs with a band-limited response, they'll argue to
> >test with one as well.  Let's break this circle.
> >
> >I also found it interesting how the chosen bandwidths magically match
> >those of ITU standards, while potentially hurting Opus.  For instance,
> >Opus-SWB has only 12 kHz bandwidth, but would still be tested with a
> >14 kHz signal.
> >
> >best,
> >koen.
> >
> >
> >----- Original Message -----
> >From: "Paul Coverdale" <coverdale@sympatico.ca>
> >To: "Koen Vos" <koen.vos@skype.net>
> >Cc: codec@ietf.org, "Anisse Taleb" <anisse.taleb@huawei.com>
> >Sent: Saturday, April 16, 2011 6:25:04 PM
> >Subject: RE: [codec] draft test and processing plan for the IETF Codec
> >
> >Hi Koen and Jean-Marc,
> >
> >The filtering described in the test plan is not meant to be for
> >anti-aliasing; it is there to establish a common bandwidth (and equalization
> >characteristic in some cases) for the audio chain (be it NB, WB, SWB) so
> >that subjects can focus on comparing the distortion introduced by each
> >of the codecs in the test, without confounding it with bandwidth
> >effects.
> >
> >Regards,
> >
> >...Paul
> >
> >>-----Original Message-----
> >>From: Koen Vos [mailto:koen.vos@skype.net]
> >>Sent: Saturday, April 16, 2011 4:07 PM
> >>To: Paul Coverdale
> >>Cc: codec@ietf.org; Anisse Taleb
> >>Subject: Re: [codec] draft test and processing plan for the IETF Codec
> >>
> >>Paul Coverdale wrote:
> >>> You mean that VoIP applications have no filtering at all, not even
> >>> anti-aliasing?
> >>
> >>The bandpass filter in the test plan runs on the downsampled signal,
> >>so it's not an anti-aliasing filter.
> >>
> >>Also, the plan's bandpass for narrowband goes all the way up to Nyquist
> >>(4000 Hz), whereas for wideband it goes only to 7000 Hz.  So if the
> >>bandpass filters were to somehow deal with aliasing, they are not being
> >>used consistently.
> >>
> >>I presume the resamplers in the plan use proper anti-aliasing filters
> >>representative of those in VoIP applications (and described in
> >>Jean-Marc's post).
> >>
> >>best,
> >>koen.
> >>
> >>
> >>----- Original Message -----
> >>From: "Paul Coverdale" <coverdale@sympatico.ca>
> >>To: "Koen Vos" <koen.vos@skype.net>, "Anisse Taleb"
> >><anisse.taleb@huawei.com>
> >>Cc: codec@ietf.org
> >>Sent: Saturday, April 16, 2011 4:42:06 AM
> >>Subject: RE: [codec] draft test and processing plan for the IETF Codec
> >>
> >>Hi Koen,
> >>
> >>You mean that VoIP applications have no filtering at all, not even
> >>anti-aliasing?
> >>
> >>...Paul
> >>
> >>>-----Original Message-----
> >>>From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf
> >>>Of Koen Vos
> >>>Sent: Saturday, April 16, 2011 1:04 AM
> >>>To: Anisse Taleb
> >>>Cc: codec@ietf.org
> >>>Subject: Re: [codec] draft test and processing plan for the IETF Codec
> >>>
> >>>Hi Anisse,
> >>>
> >>>I noticed your plan tests with band-limited signals: Narrowband signals
> >>>are filtered from 300-4000 Hz, Wideband from 50-7000 Hz, Superwideband
> >>>from 50-14000 Hz.
> >>>
> >>>However, VoIP applications have no such band-pass filters (which degrade
> >>>quality and add complexity).  So results will be more informative to the
> >>>WG and potential adopters of the codec if the testing avoids band-pass
> >>>filtering as well.  We want test conditions to mimic the real world as
> >>>closely as possible.
> >>>
> >>>Instead of band-pass filtering, tests on speech could use a simple
> >>>high-pass filter with a cutoff around 50 Hz, as many VoIP applications
> >>>do indeed have such a filter.
> >>>
> >>>best,
> >>>koen.
> >>>
> >>>
> >>
> >
>
>
> _______________________________________________
> codec mailing list
> codec@ietf.org
> https://www.ietf.org/mailman/listinfo/codec