[Speechsc] Speaker Verification - Insufficient or Noisy Speech
Nik Waldron <nik.waldron@kaz-group.com> Mon, 12 January 2009 00:07 UTC
To: speechsc@ietf.org
From: Nik Waldron <nik.waldron@kaz-group.com>
Date: Mon, 12 Jan 2009 11:06:42 +1100
Cc: Nik Waldron <nik.waldron@kaz-group.com>
Subject: [Speechsc] Speaker Verification - Insufficient or Noisy Speech

I sent an email previously requesting information on how a speaker
verification system implementing MRCPv2 should cope when insufficient
or poor-quality speech arrives on the RTP audio stream. It seemed to me
that this was an area of some deficiency in the specification. I
received no feedback other than one response saying that, to the
responder's knowledge, there were no other implementers of Speaker
Verification.

Below I outline the MRCPv2 exchanges for a training operation:

C->S: MRCP/2.0 207 START-SESSION 314161
      Channel-Identifier:32AECB23433801@speakverify
      Repository-URI:http://www.example.com/voiceprintdbase/
      Voiceprint-Mode:train
      Voiceprint-Identifier:johnsmith.voiceprint

S->C: MRCP/2.0 82 314161 200 COMPLETE
      Channel-Identifier:32AECB23433801@speakverify

C->S: MRCP/2.0 76 VERIFY 314162
      Channel-Identifier:32AECB23433801@speakverify

S->C: MRCP/2.0 85 314162 200 IN-PROGRESS
      Channel-Identifier:32AECB23433801@speakverify

The end-point detector shows insufficient data (which is buffered) or
bad signal quality (a poor SNR, for example). Note that no
START-OF-INPUT has been sent, although speech has begun.

S->C: MRCP/2.0 140 VERIFICATION-COMPLETE 314162 COMPLETE
      Channel-Identifier:32AECB23433801@speakverify
      Completion-Cause:002 no-input-timeout

This is undesirable from my perspective, since it gives the client the
impression that no data has been received (untrue in the
insufficient-data case) and provides no distinction between this and
the "bad data" case. This information might be useful to a call-flow
designer in an IVR system.
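
To make the client-side ambiguity concrete, here is a minimal Python
sketch of how a client might read the Completion-Cause header from the
VERIFICATION-COMPLETE event above. The helper name
parse_completion_cause is hypothetical, and the parsing is deliberately
simplistic; it is an illustration, not a full MRCPv2 parser.

```python
import re

# Hypothetical helper, not part of any MRCPv2 library: it pulls the
# Completion-Cause header out of the raw text of an MRCPv2 message.
_CAUSE_RE = re.compile(
    r"^Completion-Cause:\s*(\d{3})\s+(\S+)\s*$",
    re.MULTILINE | re.IGNORECASE,
)


def parse_completion_cause(message: str):
    """Return (code, name) from the Completion-Cause header, or None."""
    match = _CAUSE_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


# The event from the exchange above, with CRLF line endings.
event = (
    "MRCP/2.0 140 VERIFICATION-COMPLETE 314162 COMPLETE\r\n"
    "Channel-Identifier:32AECB23433801@speakverify\r\n"
    "Completion-Cause:002 no-input-timeout\r\n"
    "\r\n"
)

# In the scenario described above, "too little speech" and "poor signal"
# would both surface as this same value, so a call flow cannot branch
# on the difference.
print(parse_completion_cause(event))  # (2, 'no-input-timeout')
```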

I also note that text-independent verifiers may require several turns'
worth of data for a verification. Several rounds of "no input" timeouts
would surely be confusing to the client, yet this class of verifiers
may be unable to generate an nlsml+xml response on the nth dialog turn.

The enrolment might then continue:

C->S: MRCP/2.0 76 VERIFY 314163
      Channel-Identifier:32AECB23433801@speakverify

S->C: MRCP/2.0 85 314163 200 IN-PROGRESS
      Channel-Identifier:32AECB23433801@speakverify

S->C: MRCP/2.0 96 START-OF-INPUT 314163 IN-PROGRESS
      Channel-Identifier:32AECB23433801@speakverify

S->C: MRCP/2.0 131 VERIFICATION-COMPLETE 314163 COMPLETE
      Channel-Identifier:32AECB23433801@speakverify
      Completion-Cause:000 success

C->S: MRCP/2.0 76 VERIFY 314164
      Channel-Identifier:32AECB23433801@speakverify

S->C: MRCP/2.0 85 314164 200 IN-PROGRESS
      Channel-Identifier:32AECB23433801@speakverify

S->C: MRCP/2.0 96 START-OF-INPUT 314164 IN-PROGRESS
      Channel-Identifier:32AECB23433801@speakverify

S->C: MRCP/2.0 131 VERIFICATION-COMPLETE 314164 COMPLETE
      Channel-Identifier:32AECB23433801@speakverify
      Completion-Cause:000 success

C->S: MRCP/2.0 81 END-SESSION 314174
      Channel-Identifier:32AECB23433801@speakverify

S->C: MRCP/2.0 82 314174 200 COMPLETE
      Channel-Identifier:32AECB23433801@speakverify

Since I received no responses (perhaps due to being close to the
holiday season), I will venture a proposal for extending the RFC to
include the bad-signal cases (+ indicates an addition, * a
modification):

  +------------+--------------------------+---------------------------+
  | Cause-Code | Cause-Name               | Description               |
  +------------+--------------------------+---------------------------+
  | 000        | success                  | VERIFY or                 |
  |            |                          | VERIFY-FROM-BUFFER        |
  |            |                          | request completed         |
  |            |                          | successfully. The verify  |
  |            |                          | decision can be           |
  |            |                          | "accepted", "rejected",   |
  |            |                          | or "undecided".           |
  | 001        | error                    | VERIFY or                 |
  |            |                          | VERIFY-FROM-BUFFER        |
  |            |                          | request terminated        |
  |            |                          | prematurely due to a      |
  |            |                          | verification resource or  |
  |            |                          | system error.             |
  | 002        | no-input-timeout         | VERIFY request completed  |
  |            |                          | with no result due to a   |
  |            |                          | no-input-timeout.         |
  | 003        | too-much-speech-timeout  | VERIFY request completed  |
  |            |                          | with no result due to too |
  |            |                          | much speech.              |
  | 004        | speech-too-early         | VERIFY request completed  |
  |            |                          | with no result due to     |
  |            |                          | spoke too soon.           |
+ | 005        | insufficient-speech      | VERIFY or                 |
+ |            |                          | VERIFY-FROM-BUFFER        |
+ |            |                          | request completed         |
+ |            |                          | successfully but had      |
+ |            |                          | insufficient speech to    |
+ |            |                          | complete. More speech     |
+ |            |                          | will complete the current |
+ |            |                          | incremental operation.    |
+ | 006        | bad-speech               | VERIFY or                 |
+ |            |                          | VERIFY-FROM-BUFFER        |
+ |            |                          | request completed         |
+ |            |                          | unsuccessfully; the       |
+ |            |                          | speech quality was too    |
+ |            |                          | poor.                     |
* | 007        | buffer-empty             | VERIFY-FROM-BUFFER        |
  |            |                          | request completed with no |
  |            |                          | result due to empty       |
  |            |                          | buffer.                   |
* | 008        | out-of-sequence          | Verification operation    |
  |            |                          | failed due to             |
  |            |                          | out-of-sequence method    |
  |            |                          | invocations. For example  |
  |            |                          | calling VERIFY before     |
  |            |                          | QUERY-VOICEPRINT.         |
* | 009        | repository-uri-failure   | Failure accessing         |
  |            |                          | Repository URI.           |
* | 010        | repository-uri-missing   | Repository-uri is not     |
  |            |                          | specified.                |
* | 011        | voiceprint-id-missing    | Voiceprint-identification |
  |            |                          | is not specified.         |
* | 012        | voiceprint-id-not-exist  | Voiceprint-identification |
  |            |                          | does not exist in the     |
  |            |                          | voiceprint repository.    |
  +------------+--------------------------+---------------------------+
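
As an illustration of why the two extra codes could matter to a
call-flow designer, the following Python sketch maps the cause codes,
numbered as in the table above, to a next step in an IVR dialog. The
enum, the function next_action, and the action names are all
hypothetical; they only show that insufficient-speech and bad-speech
could drive different prompts once the client can tell them apart.

```python
from enum import IntEnum


class ProposedCause(IntEnum):
    """Completion-Cause codes as numbered in the proposal above."""
    SUCCESS = 0
    ERROR = 1
    NO_INPUT_TIMEOUT = 2
    TOO_MUCH_SPEECH_TIMEOUT = 3
    SPEECH_TOO_EARLY = 4
    INSUFFICIENT_SPEECH = 5      # proposed addition
    BAD_SPEECH = 6               # proposed addition
    BUFFER_EMPTY = 7             # renumbered by the proposal
    OUT_OF_SEQUENCE = 8          # renumbered by the proposal
    REPOSITORY_URI_FAILURE = 9   # renumbered by the proposal
    REPOSITORY_URI_MISSING = 10  # renumbered by the proposal
    VOICEPRINT_ID_MISSING = 11   # renumbered by the proposal
    VOICEPRINT_ID_NOT_EXIST = 12 # renumbered by the proposal


# Hypothetical call-flow actions; the names are illustrative only.
_CALLER_ACTIONS = {
    ProposedCause.SUCCESS: "continue",
    ProposedCause.NO_INPUT_TIMEOUT: "prompt-caller-to-speak",
    ProposedCause.TOO_MUCH_SPEECH_TIMEOUT: "prompt-for-shorter-answer",
    ProposedCause.SPEECH_TOO_EARLY: "prompt-to-wait",
    ProposedCause.INSUFFICIENT_SPEECH: "prompt-for-more-speech",
    ProposedCause.BAD_SPEECH: "prompt-to-repeat-clearly",
}


def next_action(cause_code: int) -> str:
    """Pick a call-flow action; resource and setup errors abort the dialog."""
    return _CALLER_ACTIONS.get(ProposedCause(cause_code), "abort")


print(next_action(2))  # prompt-caller-to-speak
print(next_action(5))  # prompt-for-more-speech
print(next_action(6))  # prompt-to-repeat-clearly
```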

Alternatively, the new entries could be appended for compatibility. The
only disadvantage of doing so would be that entries would not be
grouped in the table by category.
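
The compatibility trade-off between the two numbering options can be
checked with a short Python sketch. It assumes the existing table
numbers its entries contiguously from 000 to 010 in the order shown
(which is what the * markers above imply); the helper names are
hypothetical.

```python
# Cause names in their existing order (assumed to be codes 000..010).
EXISTING = [
    "success", "error", "no-input-timeout", "too-much-speech-timeout",
    "speech-too-early", "buffer-empty", "out-of-sequence",
    "repository-uri-failure", "repository-uri-missing",
    "voiceprint-id-missing", "voiceprint-id-not-exist",
]
NEW = ["insufficient-speech", "bad-speech"]


def number(names):
    """Assign consecutive cause codes starting at 0."""
    return {name: code for code, name in enumerate(names)}


def renumbered(old, new):
    """Names whose code differs between two numbering schemes."""
    return [name for name in old if old[name] != new[name]]


existing = number(EXISTING)
inserted = number(EXISTING[:5] + NEW + EXISTING[5:])  # grouped by category
appended = number(EXISTING + NEW)                     # appended at the end

# Inserting shifts six existing codes; appending shifts none.
print(len(renumbered(existing, inserted)))  # 6
print(len(renumbered(existing, appended)))  # 0
print(appended["insufficient-speech"], appended["bad-speech"])  # 11 12
```

Under these assumptions, a client written against the existing codes
would misread six of them if the new entries were inserted, whereas
appending leaves every existing code unchanged.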

I'll happily accept any corrections to my understanding, in case I have
misread the spec, or feedback on my suggestions.
NIK WALDRON