[Speechsc] Continuous speech recognition in MRCP
Tomáš Valenta <tomas.valenta@speechtech.cz> Wed, 09 March 2011 15:44 UTC
Message-ID: <4D77A084.7050506@speechtech.cz>
Date: Wed, 09 Mar 2011 16:45:08 +0100
From: Tomáš Valenta <tomas.valenta@speechtech.cz>
Organization: SpeechTech, s.r.o.
List-Id: Speech Services Control Working Group <speechsc.ietf.org>
Dear SpeechSC list members,

At our company we are implementing TTS and ASR solutions using MRCP. For ASR we would like to use the protocol not only for recognition of short utterances based on a simple grammar, following the scheme

  C->S: RECOGNIZE
  S->C: IN-PROGRESS
  S->C: START-OF-INPUT
  S->C: RECOGNITION-COMPLETE (result)

but also for continuous speech recognition (e.g. minutes or tens of minutes, dictation, ...) with immediate results. Unfortunately, there is no such approach in the MRCPv2 specification draft.

We thought about using the following scheme:

  C->S: RECOGNIZE (continuous)
  S->C: IN-PROGRESS
  S->C: START-OF-INPUT
  S->C: IN-PROGRESS (partial_result_1)
  ...
  S->C: IN-PROGRESS (partial_result_n)
  C->S: STOP

Imagine a dictation application in which the user can immediately see what he said. The recognizer could be located on a remote machine.

The (continuous) parameter of the RECOGNIZE request could be a type of grammar (in fact, a built-in language model specification) or a header value.

Do you find this approach a clean solution? Or don't you think continuous speech recognition should be part of the MRCPv2 specification?

Kindest regards and thanks for comments,
Tomas Valenta

PS. Originally we started discussing this topic with Arsen Chaloyan (author of UniMRCP) here:
https://groups.google.com/d/topic/unimrcp/pSaDbhHPh3M/discussion
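
To make the proposed continuous scheme above concrete, here is a minimal client-side sketch in Python. It is not an MRCP implementation and uses no real MRCP library: the class ContinuousSession, its state names, and its methods are hypothetical, and the message names are simply taken from the scheme in this message. It only tracks which message is allowed next and collects the partial results.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContinuousSession:
    """Hypothetical client-side model of one continuous RECOGNIZE request."""

    state: str = "IDLE"
    partials: List[str] = field(default_factory=list)

    def send_recognize(self) -> str:
        # C->S: RECOGNIZE (continuous)
        if self.state != "IDLE":
            raise RuntimeError("RECOGNIZE is only allowed from IDLE")
        self.state = "PENDING"
        return "RECOGNIZE (continuous)"

    def on_server_message(self, name: str, partial: Optional[str] = None) -> None:
        if name == "IN-PROGRESS" and self.state == "PENDING":
            # S->C: IN-PROGRESS (request accepted)
            self.state = "ACCEPTED"
        elif name == "START-OF-INPUT" and self.state == "ACCEPTED":
            # S->C: START-OF-INPUT (speech detected)
            self.state = "LISTENING"
        elif name == "IN-PROGRESS" and self.state == "LISTENING" and partial is not None:
            # S->C: IN-PROGRESS (partial_result_k)
            self.partials.append(partial)
        else:
            raise ValueError(f"unexpected {name!r} in state {self.state!r}")

    def send_stop(self) -> str:
        # C->S: STOP ends the otherwise open-ended recognition
        if self.state not in ("ACCEPTED", "LISTENING"):
            raise RuntimeError("STOP is only allowed on an active request")
        self.state = "STOPPED"
        return "STOP"


# Walk through the proposed exchange with two made-up partial results.
session = ContinuousSession()
session.send_recognize()
for name, partial in [
    ("IN-PROGRESS", None),
    ("START-OF-INPUT", None),
    ("IN-PROGRESS", "hello"),
    ("IN-PROGRESS", "hello world"),
]:
    session.on_server_message(name, partial)
session.send_stop()
print(session.state, session.partials)
```

A dictation front end could redraw its text field each time a partial result is appended, which is the "immediate results" behaviour described above. Whether partial results should be carried on IN-PROGRESS messages at all is exactly the open question of this message, so the sketch should be read as an illustration of the proposal only.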