Re: [codec] McGill university speech database

Michael Knappe <mknappe@juniper.net> Fri, 08 October 2010 16:06 UTC

From: Michael Knappe <mknappe@juniper.net>
To: Anisse Taleb <anisse.taleb@huawei.com>, "codec@ietf.org" <codec@ietf.org>
Date: Fri, 08 Oct 2010 09:01:05 -0700

Anisse,

Agreed that language diversity in the source material is a necessity; I am
working on obtaining additional language samples. Environmental simulation
is also highly desirable. There will certainly be environmental diversity in
the listening environment in the uncontrolled, wide-scale MUSHRA testing that
we are working on setting up, but let's also look at a few representative
conditions in the samples themselves. It would be nice to add representative
noise and reverb to the samples in the semi-formal controlled testing,
although I am concerned about the rapid increase in test case permutations
and listener fatigue given the limited subject pool size we'll have
available. Your suggestions here are certainly welcome. I also agree that
PLC-stressing packet loss conditions need to be included.
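
To put a rough number on the permutation concern, here is a minimal sketch in
Python. Every condition list below is a made-up placeholder, not an agreed
test plan; the sketch only shows how the count multiplies as factors are
added.

```python
from itertools import product

# Hypothetical condition lists, for illustration only (not an agreed test plan).
languages = ["English", "French", "Mandarin"]
environments = ["clean", "background noise", "reverberant room"]
loss_rates_percent = [0, 5, 10]
codecs_under_test = ["codec A", "codec B", "reference"]


def enumerate_conditions(*factors):
    """Return every combination of the given condition lists."""
    return list(product(*factors))


conditions = enumerate_conditions(
    languages, environments, loss_rates_percent, codecs_under_test
)
print(len(conditions))  # 3 * 3 * 3 * 3 = 81 combinations
```

Even with only three values per factor, that is 81 combinations per source
sentence, so some pruning or balanced subsetting would likely be needed with a
small listener pool.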

Cheers,

Mike


On 10/8/10 8:22 AM, "Anisse Taleb" <anisse.taleb@huawei.com> wrote:

> Dear Michael,
> 
> Just a few comments.
> 
> The TSP database is indeed quite extensive; however, if I am not mistaken, it
> contains English-language material only (?).
> 
> Regarding the balancing of items, one of the main use cases of this codec is
> conversational applications; as such, the testing should include as many
> speech items as possible in different conditions, including noisy, reverberant
> rooms, error conditions/packet losses and more. Furthermore, it is well known
> that codec performance may depend on the language of the items used in
> testing, and it is not unheard of for a codec to pass a quality performance
> requirement in one language and fail utterly in others. The need
> for language diversity in testing is even more important given the intended
> wide distribution of the codec.
> 
> Kind regards,
> /Anisse
> 
> -----Original Message-----
> From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of
> Michael Knappe
> Sent: Wednesday, September 29, 2010 11:55 PM
> To: codec@ietf.org
> Subject: [codec] McGill university speech database
> 
> Just received permission from Dr. Peter Kabal at McGill University to use
> their 1400-utterance, 24-talker (12 male, 12 female) Harvard
> sentence speech database for our codec testing efforts in the IETF codec WG.
> It includes the original 48 kHz files. My preference is just to put the McGill
> link to the 539 MB CD ISO image file up on the codec wiki. If that sounds OK
> with everyone, I will get final permission to post the link from Dr. Kabal and
> get it up on the wiki ASAP.
> 
> Next step is to work on getting rights to representative music content. A
> cappella vocals (e.g. Tom's Diner), orchestral crescendos, castanets, solo
> violin, jazz trumpet/ensembles, rock/electronica, etc. would all be good to
> include for testing. Please reply with any suggestions.
> 
> Thanks to Jean-Marc for the pointer to Dr. Kabal and the speech database!
> 
> Cheers,
> 
> Mike