[Speechsc] FW: MRCPV2 Verification: Insufficient/bad speech

Nik Waldron <nik.waldron@kaz-group.com> Fri, 21 November 2008 00:49 UTC

Date: Fri, 21 Nov 2008 11:47:35 +1100
MIME-Version: 1.0
To: speechsc@ietf.org
Cc:
From: Nik Waldron <nik.waldron@kaz-group.com>
X-Mailer: Microsoft Outlook v 11.00.8217, MSOC v 2.00.4007.00
Message-ID: <OF7095386F.6E2C2490-ON4B257508.00044624@kaz-group.com>
X-MIMETrack: Serialize by Router on AUKGHB01/KAZGROUP/AU(Release 6.5.2|June 01, 2004) at 11/21/2008 11:47:39 AM, Serialize complete at 11/21/2008 11:47:39 AM
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5579
Subject: [Speechsc] FW: MRCPV2 Verification: Insufficient/bad speech

A resubmission of my previous question, which was accidentally sent as
Rich Text.

_____________________________________________
From: Nik Waldron 
Sent: Friday, November 21, 2008 9:52 AM
To: 'speechsc@ietf.org'
Subject: MRCPV2 Verification: Insufficient/bad speech

My question regards the training phase in a speaker verification session:
My reading of the draft RFC is that a training session may be initiated by
a VERIFY call, and that the speech data streamed via RTP is then used to
build the model.  Implicit in the design is that when a speaker is deemed
to have finished a turn in the dialog (say answering a voice enrolment
question), the verification system will return a VERIFICATION-COMPLETE.  I
am currently assuming that this is a detected "end-of-speech" condition.
Further VERIFY calls may then be used to refine the speaker model.

My confusion arises when speech has been detected but is for some reason
unsuitable for completing training (e.g. there is not enough speech, or
the speech is too noisy).  How should the verifier respond?

In the use case I am working on, the desired behaviour is for the client
to VERIFY again on the same session and pass through more speech data, but
I can't find a suitable error code/completion cause (the generic "error"
completion cause seems to suit a serious internal error only).  I was
expecting "insufficient data" and/or "bad quality data" type feedback to
the client.
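
To make the desired client behaviour concrete, here is a minimal sketch
of the retry loop. It is not taken from the draft: the cause names
"insufficient-data" and "bad-quality-data" are hypothetical stand-ins for
the feedback I was expecting, and send_verify is an assumed callable that
issues one VERIFY on the existing session and returns the resulting
VERIFICATION-COMPLETE event.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical completion causes; the draft does not define these names.
RETRYABLE_CAUSES = {"insufficient-data", "bad-quality-data"}


@dataclass
class VerificationComplete:
    """Stand-in for a VERIFICATION-COMPLETE event (illustrative only)."""
    completion_cause: str  # "success", "error", or a hypothetical cause above


def train_until_usable(
    send_verify: Callable[[], VerificationComplete],
    max_turns: int = 3,
) -> Tuple[bool, int]:
    """Issue VERIFY on the same session until the speech is accepted.

    Returns (trained, turns_used). Stops early on any cause that is
    neither success nor one of the retryable causes.
    """
    for turn in range(1, max_turns + 1):
        event = send_verify()
        if event.completion_cause == "success":
            return True, turn
        if event.completion_cause not in RETRYABLE_CAUSES:
            # e.g. the generic "error" cause: treat as fatal, do not retry
            return False, turn
        # Retryable: prompt the speaker again and send another VERIFY
    return False, max_turns
```

With distinct retryable causes, the client can tell "ask the speaker for
more speech" apart from "the session has failed", which the generic
"error" cause alone does not allow.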

Alternatively, is the timing of the VERIFICATION-COMPLETE message not tied
to dialogue turns?

I have up to now assumed that an MRCPv2 client (e.g. a VXML browser) does
not have its own mechanism for detecting the end of a dialogue turn, and
hence needs notification in this case.

Can someone let me know which interpretation is correct, or what the
correct behaviour of an MRCPv2 server should be?

Regards,


NIK WALDRON
_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www.ietf.org/mailman/listinfo/speechsc
Supplemental web site:
<http://www.standardstrack.com/ietf/speechsc>