Re: [codec] McGill university speech database

Anisse,

Agreed that language diversity in the source material is a necessity; I am
working on obtaining additional language samples. Environmental simulation
is also highly desirable. There will certainly be environmental diversity in
the listening environment in the uncontrolled, wide-scale MUSHRA testing that
we are working on setting up, but let's also look at a few representative
conditions in the samples themselves. It would be nice to include
representative additive noise and reverberation in the semi-formal controlled
testing, although I am concerned about the rapid increase in test case
permutations and about listener fatigue given the limited subject pool we'll
have available. Your suggestions here are certainly welcome. I also agree that
packet loss conditions that stress the packet loss concealment (PLC) need to
be included.
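
To put rough numbers on the permutation concern, here is a minimal Python
sketch. Every factor level, item count and timing figure in it is a made-up
placeholder for illustration, not something proposed on this list, and it
naively assumes each combination is rated separately by every listener (a real
MUSHRA trial groups several systems on one screen, so actual session time
would differ).

```python
from itertools import product

# Hypothetical factor levels, chosen only to illustrate the growth.
FACTORS = {
    "language": ["English", "language_2", "language_3"],
    "environment": ["clean", "noise", "reverb"],
    "packet_loss_pct": [0, 5, 10],
    "codec_condition": ["candidate_low", "candidate_high", "reference", "anchor"],
}


def enumerate_conditions(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


def listening_minutes(num_conditions, items_per_condition, seconds_per_rating):
    """Rough per-listener time if each listener rates every item once."""
    return num_conditions * items_per_condition * seconds_per_rating / 60


conditions = enumerate_conditions(FACTORS)
print(len(conditions))  # 3 * 3 * 3 * 4 = 108 combinations
print(listening_minutes(len(conditions), items_per_condition=2,
                        seconds_per_rating=20))  # 72.0 minutes
```

Adding one more level to any factor multiplies the total, which is why a
reduced or fractional design may be worth considering for the controlled test.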

Cheers,

Mike

On 10/8/10 8:22 AM, "Anisse Taleb" <anisse.taleb@huawei.com> wrote:

> Dear Michael,
> 
> Just a few comments.
> 
> The TSP database is indeed quite extensive; however, if I am not mistaken, it
> contains English-language material only (?).
> 
> Regarding the balancing of items, one of the main use cases of this codec is
> conversational applications; as such, the testing should include as many
> speech items as possible in different conditions, including noisy, reverberant
> rooms, error conditions/packet losses and more. Furthermore, it is well known
> that codec performance may depend on the language of the items used in
> testing, and it is not unheard-of for a codec to pass a quality performance
> requirement in one language and utterly fail in others. The need
> for language diversity in testing is even more important given the intended
> wide distribution of the codec.
> 
> Kind regards,
> /Anisse
> 
> -----Original Message-----
> From: codec-bounces@ietf.org [mailto:codec-bounces@ietf.org] On Behalf Of
> Michael Knappe
> Sent: Wednesday, September 29, 2010 11:55 PM
> To: codec@ietf.org
> Subject: [codec] McGill university speech database
> 
> Just received permission from Dr. Peter Kabal at McGill University to use
> their 1400-utterance, 24-talker (12 male, 12 female) Harvard
> sentence speech database for our codec testing efforts in the IETF codec WG.
> It includes the original 48 kHz files. My preference is just to put the McGill
> link to the 539 MB CD ISO image file up on the codec wiki. If that sounds OK
> with everyone, I will get final permission to post the link from Dr. Kabal and
> get it up on the wiki asap.
> 
> Next step is to work on getting rights to representative music content. A
> cappella vocals (e.g. Tom's Diner), orchestral crescendos, castanets, solo
> violin, jazz trumpet/ensembles, rock/electronica, etc. would all be good to
> include for testing. Please reply with any suggestions.
> 
> Thanks to Jean-Marc for the pointer to Dr. Kabal and the speech database!
> 
> Cheers,
> 
> Mike
> _______________________________________________
> codec mailing list
> codec@ietf.org
> https://www.ietf.org/mailman/listinfo/codec