Re: [Speechsc] Continuous speech recognition in MRCP
Eric Burger <eburger@standardstrack.com> Wed, 09 March 2011 21:37 UTC
From: Eric Burger <eburger@standardstrack.com>
In-Reply-To: <4D77A084.7050506@speechtech.cz>
Date: Wed, 09 Mar 2011 16:38:45 -0500
Message-Id: <19596725-73B1-4234-9349-5C440BAF960C@standardstrack.com>
References: <4D77A084.7050506@speechtech.cz>
To: Tomáš Valenta <tomas.valenta@speechtech.cz>
Cc: speechsc@ietf.org
Subject: Re: [Speechsc] Continuous speech recognition in MRCP
That looks sensible. Right now that extension would be out of the current MRCP charter, but if there is interest, I'm sure folks could get together to extend MRCPv2, either as an individual submission or with rechartering of the work group.

Will you be in Prague for IETF 80? It's under 1.5 hours away from Plzen...

On Mar 9, 2011, at 10:45 AM, Tomáš Valenta wrote:

> Dear SpeechSC list members,
>
> in our company we are implementing TTS and ASR solutions using MRCP. For ASR we would like to use the protocol not only for recognition of short utterances based on simple grammar; scheme
>
> C->S: RECOGNIZE
> S->C: IN-PROGRESS
> S->C: START-OF-INPUT
> S->C: RECOGNITION-COMPLETE (result)
>
> but also for continuous speech recognition (e.g. minutes or tens of minutes, dictations, ...) with immediate results. Unfortunately there is no such approach in MRCPv2 specification draft. We thought about using following scheme:
>
> C->S: RECOGNIZE (continuous)
> S->C: IN-PROGRESS
> S->C: START-OF-INPUT
> S->C: IN-PROGRESS (partial_result_1)
> ...
> S->C: IN-PROGRESS (partial_result_n)
> C->S: STOP
>
> Imagine an application for writing dictation so that user can see immediately what he said. The recognizer could be located on a remote machine.
>
> The (continuous) parameter to the RECOGNIZE request could be a type of grammar (built-in language model specification, in fact) or a header value. Do you find this approach a clean solution? Or do not you think continuous speech recognition should be part of MRCPv2 specification?
>
> Kindest regards and thanks for comments,
> Tomas Valenta
>
> PS. Originally we started discussing this topic with Arsen Chaloyan (author of UniMRCP) here:
> https://groups.google.com/d/topic/unimrcp/pSaDbhHPh3M/discussion
>
> _______________________________________________
> Speechsc mailing list
> Speechsc@ietf.org
> https://www.ietf.org/mailman/listinfo/speechsc
> Supplemental web site:
> <http://www.standardstrack.com/ietf/speechsc>
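
For illustration, the message order in the proposed continuous scheme can be sketched as a small simulation. This is only a rough model of the exchange quoted above, not MRCPv2 and not any real MRCP library: the Message class and the continuous_session and transcript functions are hypothetical names, and the "continuous" marker is carried as a plain string because the proposal leaves open whether it would be a grammar type or a header value.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class Message:
    """One line of the exchange (hypothetical model, not an MRCP API)."""
    direction: str                 # "C->S" or "S->C"
    name: str                      # e.g. "RECOGNIZE", "IN-PROGRESS", "STOP"
    payload: Optional[str] = None  # partial result text or request parameter


def continuous_session(partial_results: Iterable[str]) -> Iterator[Message]:
    """Yield messages in the order given in the proposed continuous scheme."""
    yield Message("C->S", "RECOGNIZE", "continuous")
    yield Message("S->C", "IN-PROGRESS")
    yield Message("S->C", "START-OF-INPUT")
    # The server keeps reporting partial results while the speaker talks.
    for text in partial_results:
        yield Message("S->C", "IN-PROGRESS", text)
    # The client, not the server, ends the session.
    yield Message("C->S", "STOP")


def transcript(messages: List[Message]) -> str:
    """Join the partial results a dictation client would display."""
    return " ".join(
        m.payload
        for m in messages
        if m.direction == "S->C" and m.name == "IN-PROGRESS" and m.payload is not None
    )


if __name__ == "__main__":
    for msg in continuous_session(["partial_result_1", "partial_result_2"]):
        suffix = f" ({msg.payload})" if msg.payload else ""
        print(f"{msg.direction}: {msg.name}{suffix}")
```

The main difference from the short-utterance scheme is visible in the last step: the session ends with a client STOP rather than a server RECOGNITION-COMPLETE.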