RE: [speechsc] Fallback <audio> and 003 uri-failure

"Shanmugham, Saravanan" <sarvi@cisco.com> Tue, 16 May 2006 20:32 UTC

Subject: RE: [speechsc] Fallback <audio> and 003 uri-failure
Date: Tue, 16 May 2006 13:32:24 -0700
Message-ID: <03772D1EC8DE624A863058C75874A75CF00457@vtg-um-e2k6.sj21ad.cisco.com>
From: "Shanmugham, Saravanan" <sarvi@cisco.com>
To: Andrew Wahbe <awahbe@voicegenie.com>, "Carter, Jerry" <jerry.carter@nuance.com>
Cc: "IETF SPEECHSC (E-mail)" <speechsc@ietf.org>, David R Oran <oran@cisco.com>, "Burger, Eric" <EBurger@cantata.com>

I agree this may be useful information for the server to send to the
client.
But I do not believe that it is urgent enough that we try to address it
in this specification.
 
Like I said earlier, I like the suggestion from Dave Oran on how to
approach this problem.
Details about TTS speech markup compilation errors, recognizer grammar
compilation errors, etc. could be returned in the message body of the
associated *-COMPLETE event or an appropriate EXCEPTION event.
 
The format of the message body (possibly XML) needs to be well defined
if we expect this to work interoperably between all clients and servers.
I suggest interested parties write a draft proposing a format for this
error message body, and also for the EXCEPTION event if they see a need
for it.
 
Thanks,
Sarvi


________________________________

	From: Andrew Wahbe [mailto:awahbe@voicegenie.com] 
	Sent: Tuesday, May 16, 2006 1:12 PM
	To: Carter, Jerry
	Cc: IETF SPEECHSC (E-mail); David R Oran; Burger, Eric
	Subject: Re: [speechsc] Fallback <audio> and 003 uri-failure
	
	
	I agree that this isn't a strange thing to ask for. I think the
	question here is not whether this is needed but when it should be
	added, i.e. does this go into MRCP v2 or some future draft? It could
	even be a short draft that just specifies new headers to enable the
	feature and to communicate the failed URIs.
	
	Andrew
	
	Carter, Jerry wrote: 

		This question is probably best answered directly by members of
		the UI Designer community such as Blade Kotelly, but the
		responses that I've gotten fall into two classes.
		
		The first group notes that since VoiceXML 2.x does not have this
		capability, current applications must be designed without this
		information.  I call this the "If I can't do it, I don't even
		want to think about it" response.
		
		The second group notes that knowing exactly what errors occurred
		and by inference what the caller heard, allows the application
		server to better tailor subsequent interactions by replaying
		certain portions or by producing prompts that avoid troublesome
		servers.  Transmitting this information to the application
		server is done by the client.  Few if any directed dialog
		applications support this capability today, but the next
		generation of NL-focused state-based dialog engines being
		developed and deployed by AT&T, IBM, Nuance, and others support
		far more flexible applications.
		
		
		  

			-----Original Message-----
			From: Burger, Eric [mailto:EBurger@cantata.com]
			Sent: Tuesday, May 16, 2006 12:33 PM
			To: David R Oran; Jerry Carter
			Cc: IETF SPEECHSC (E-mail)
			Subject: RE: [speechsc] Fallback <audio> and 003 uri-failure
			
			Would a client actually *do* anything with the enhanced
			information? Sure it is a nice-to-have, but does any real
			application care to know anything besides "something went
			wrong"?
			
			-----Original Message-----
			From: David R Oran [mailto:oran@cisco.com]
			Sent: Tuesday, May 09, 2006 8:34 AM
			To: Jerry Carter
			Cc: IETF SPEECHSC (E-mail); Andrew Wahbe; Dave Burke
			Subject: Re: [speechsc] Fallback <audio> and 003 uri-failure
			
			
			On May 8, 2006, at 5:43 PM, Jerry Carter wrote:
			
			    

				Experience leads me to believe that the simpler approach
				does not work for arbitrary SSML documents.  Because
				URIs may point to streaming data or return content based
				on cookies, SSML may use the same URI to reference
				different data.  Having explicit marker events and error
				reporting (either as events or messages) may allow the
				client to determine the exact URI instance that failed
				-- for those rare cases where this is necessary.
				
				      

			Why can't the client re-reference the URI itself to see if
			it's an aliasing problem? Strikes me this is a general issue
			with any content indirection scheme and it isn't the job of
			MRCP to provide the forensics. On the other hand, giving the
			client a pointer into the SSML document for where the
			synthesizer "gave up" would seem to be useful and not much
			of a burden on the server. If we do something like this,
			it's important to not go *too* far and wind up with
			something complex like compiler tracebacks syntactically and
			semantically mandated by MRCP. Here's one possible approach:
			
			1) we allow some opaque (to MRCP) data to be returned on the
			   URI failure event.
			2) we suggest to people the server can use this to provide
			   forensics to the client for where in parsing/playing the
			   SSML the server barfed and possibly why.
			
			Dave.
			
			
			    

				  C->S: MRCP/2.0 489 SPEAK 543257
				        Channel-Identifier:32AECB23433802@speechsynth
				        Content-Type:application/ssml+xml
				        Content-Length:???
				
				        <?xml version="1.0"?>
				        <speak version="1.0"
				            xmlns="http://www.w3.org/2001/10/synthesis"
				            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
				            xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
				              http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
				            xml:lang="en-US" xml:base="http://www.example.com/">
				          <audio src="uri.wav"> <!-- invalid URI -->
				            <mark name="inside first"/>
				            <audio src="uri.wav"/> <!-- valid URI -->
				          </audio>
				          <mark name="before second"/>
				          <audio src="uri.wav"/>
				        </speak>
				
				  S->C: MRCP/2.0 28 543257 200 IN-PROGRESS
				        Channel-Identifier:32AECB23433802@speechsynth
				
				  S->C: MRCP/2.0 543257 407 IN-PROGRESS
				        Channel-Identifier:32AECB23433802@speechsynth
				        Completion-Cause:009 uri resolution problem
				        Failed-URI-Cause:404
				        Failed-URI:http://www.example.com/uri.wav
				
				  S->C: MRCP/2.0 SPEECH-MARKER 543257 IN-PROGRESS
				        Channel-Identifier:32AECB23433802@speechsynth
				        Speech-Marker:timestamp=;inside first
				
				  S->C: MRCP/2.0 SPEECH-MARKER 543257 IN-PROGRESS
				        Channel-Identifier:32AECB23433802@speechsynth
				        Speech-Marker:timestamp=;before second
				
				  S->C: MRCP/2.0 SPEAK-COMPLETE 543257 COMPLETE
				        Channel-Identifier:32AECB23433802@speechsynth
				        Completion-Cause:000 normal
				
				
				
				On May 8, 2006, at 4:39 PM, Dave Burke wrote:
				
				At this late stage, I think changing the message
				exchange pattern is too incisive (I also quite like the
				patter...). Though, I understand your concern for a
				proliferation of events.... With that in mind, how about
				taking a variation of what's been discussed for grammars
				and go with:
				
				1. Allow Failed-URI to appear multiple times in
				   SPEAK-COMPLETE (and reports failed <audio>s) with
				   return type 003 uri-failure. These headers cannot be
				   combined to one comma separated list because commas
				   are valid reserved URI tokens.
				
				2. Combine the reason in the Failed-URI as you suggested
				   so that we can have multiple Failed-URIs.
				
				This is sufficient for the MRCP client to detect what
				has been played and what hasn't and is consistent with
				SSML.
				
				Dave
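
Point 1 in the proposal above can be illustrated with a minimal Python sketch. It assumes a header block in which Failed-URI is simply repeated once per failed <audio>; the helper names (parse_headers, failed_uris) and the sample URIs are hypothetical and do not come from any MRCP draft. The first sample URI contains a comma to show why a comma-separated list would be ambiguous.

```python
from typing import List, Tuple


def parse_headers(block: str) -> List[Tuple[str, str]]:
    """Split a header block into ordered (name, value) pairs.

    Repeated header names are kept as separate entries instead of
    being merged, so each Failed-URI survives intact.
    """
    pairs = []
    for line in block.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only; URIs contain further colons.
        name, _, value = line.partition(":")
        pairs.append((name.strip(), value.strip()))
    return pairs


def failed_uris(block: str) -> List[str]:
    """Return every Failed-URI value in the order it appeared."""
    return [v for n, v in parse_headers(block) if n.lower() == "failed-uri"]


# Hypothetical SPEAK-COMPLETE headers with two failed URIs.
example = """\
Channel-Identifier:32AECB23433802@speechsynth
Completion-Cause:003 uri-failure
Failed-URI:http://www.example.com/a,b.wav
Failed-URI:http://www.example.com/baduri.wav
"""

uris = failed_uris(example)
print(uris)

# Joining with commas and splitting again would yield three pieces,
# not two, because the first URI legitimately contains a comma.
naive = ",".join(uris).split(",")
print(len(uris), len(naive))
```

Keeping one header per URI avoids having to define an escaping rule for commas inside URIs.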
				
				----- Original Message -----
				From: "Carter, Jerry" <jerry.carter@nuance.com>
				To: "Andrew Wahbe" <awahbe@voicegenie.com>;
				    "Dave Burke" <david.burke@voxpilot.com>
				Cc: <speechsc@ietf.org>
				Sent: Monday, May 08, 2006 8:34 PM
				Subject: RE: [speechsc] Fallback <audio> and 003 uri-failure
				
				
				        

				I agree that the current text is clear.  As described in
				section 5, there is a single response delivered for each
				message.  Unfortunately, as this case and a similar
				analysis for grammar definitions shows [1], error
				handling is an area of weakness in the -09 draft.
				
				There are two solutions that come to mind.
				
				* Add additional events which can be used for error
				  reporting.  This seems to be the direction that you
				  and Dave are endorsing.
				
				* Alternatively, relax the single response requirement
				  so that requests follow a natural progression from
				  PENDING to IN-PROGRESS to COMPLETE.  Each request
				  would generate exactly one COMPLETE response.  This
				  final response might be preceded by zero or more
				  IN-PROGRESS messages which would in turn be preceded
				  by zero or more PENDING messages.
				
				I fear the events approach leads to a proliferation of
				events and confuses the semantics of the language.
				Conversely, the clear progression in message handling
				states is easily described by adding a paragraph or two
				to section 5.
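
The relaxed progression described in the second bullet above (zero or more PENDING, then zero or more IN-PROGRESS, then exactly one COMPLETE) can be checked with a short Python sketch. This is only an illustration of the proposed rule, not text from the draft; the function name is_valid_progression is hypothetical.

```python
from typing import Sequence

# Rank of each request state in the proposed progression.
ORDER = {"PENDING": 0, "IN-PROGRESS": 1, "COMPLETE": 2}


def is_valid_progression(states: Sequence[str]) -> bool:
    """Check a list of response states for one request.

    Valid means: only known states, never moving backwards, and
    exactly one COMPLETE, which must be the final response.
    """
    if not states or any(s not in ORDER for s in states):
        return False
    if states[-1] != "COMPLETE" or list(states).count("COMPLETE") != 1:
        return False
    ranks = [ORDER[s] for s in states]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


print(is_valid_progression(["PENDING", "IN-PROGRESS", "IN-PROGRESS", "COMPLETE"]))
print(is_valid_progression(["IN-PROGRESS", "PENDING", "COMPLETE"]))
```

A client following this rule would treat the request as finished only when the single COMPLETE response arrives.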
				
				
				[1] http://www1.ietf.org/mail-archive/web/speechsc/current/msg01797.html
				
				
				          

				-----Original Message-----
				From: Andrew Wahbe [mailto:awahbe@voicegenie.com]
				Sent: Monday, May 08, 2006 3:08 PM
				To: Dave Burke
				Cc: Carter, Jerry; speechsc@ietf.org
				Subject: Re: [speechsc] Fallback <audio> and 003 uri-failure
				
				I was going to give a similar reply but I wanted to
				reference the restriction text in the spec.
				Unfortunately, I haven't been able to find it though the
				term "response" does imply it... of course PENDING and
				IN-PROGRESS could be interpreted as a kind of
				"provisional" response.... though the examples paint a
				different picture (only 1 response to each request).
				
				Section 5.3 says:
				
				   After receiving and interpreting the request message
				   for a method, the server resource responds with an
				   MRCPv2 response message.
				
				and
				
				   A PENDING or IN-PROGRESS status indicates that
				   further Event messages may be delivered with that
				   request-id.
				
				Perhaps the limit of one response to each request should
				be stated explicitly somewhere (sorry if I missed it).
				
				Andrew
				
				Dave Burke wrote:
				
				That works if we change the second response to an event
				as suggested by Andrew (the MRCP message exchange
				pattern rightly restricts one response to each request).
				
				Dave
				
				----- Original Message -----
				From: "Carter, Jerry" <jerry.carter@nuance.com>
				To: "Dave Burke" <david.burke@voxpilot.com>;
				    <speechsc@ietf.org>
				Sent: Monday, May 08, 2006 4:45 PM
				Subject: RE: [speechsc] Fallback <audio> and 003 uri-failure
				
				
				Would not notification along these lines be appropriate?
				If so, perhaps adding this example to the specification
				would be useful.
				
				  C->S: MRCP/2.0 489 SPEAK 543257
				        Channel-Identifier:32AECB23433802@speechsynth
				        Content-Type:application/ssml+xml
				        Content-Length:???
				
				        <?xml version="1.0"?>
				        <speak version="1.0"
				            xmlns="http://www.w3.org/2001/10/synthesis"
				            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
				            xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
				              http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
				            xml:lang="en-US" xml:base="http://www.example.com/">
				          <audio src="baduri.wav"> <!-- invalid URI -->
				            <audio src="gooduri.wav"/> <!-- valid URI -->
				          </audio>
				        </speak>
				
				
				  S->C: MRCP/2.0 28 543257 200 IN-PROGRESS
				        Channel-Identifier:32AECB23433802@speechsynth
				
				  S->C: MRCP/2.0 543260 407 IN-PROGRESS
				        Channel-Identifier:32AECB23433802@speechsynth
				        Completion-Cause:009 uri resolution problem
				        Failed-URI-Cause:404
				        Failed-URI:http://www.example.com/baduri.wav
				
				  S->C: MRCP/2.0 79 SPEAK-COMPLETE 543257 COMPLETE
				        Channel-Identifier:32AECB23433802@speechsynth
				        Completion-Cause:000 normal
				
				
				Dave Burke wrote:
				
				If I have some SSML along the lines of
				
				<speak>
				  <audio src="baduri.wav"> <!-- invalid URI -->
				    <audio src="gooduri.wav"/> <!-- valid URI -->
				  </audio>
				</speak>
				
				will I get a SPEAK-COMPLETE with 000 normal or 003
				uri-failure?
				
				SSML requires that processing continues but that the
				hosting environment be notified. It would be useful to
				clarify that this is indeed the case with the basicsynth
				/ speechsynth and that 003 uri-failure will be returned.
				Without this, the most trivial of media server
				applications (i.e. playing announcements) is not
				possible to be implemented robustly.
				
				Dave
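
The fallback behaviour asked about above can be sketched in Python. This is a simplified model, not a synthesizer: it ignores text and namespaces, the fetch_ok callback stands in for a real URI fetch, and reporting 003 uri-failure when any <audio> fails is the outcome this message asks for, not something this thread confirms the spec requires. The function name render is hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin


def render(elem, base, fetch_ok, played, failed):
    """Walk the SSML tree in document order.

    For each <audio>: if its src can be fetched, record it as played;
    otherwise record the failure and continue into its fallback
    content, so processing carries on after the error.
    """
    for child in elem:
        if child.tag == "audio":
            uri = urljoin(base, child.get("src"))
            if fetch_ok(uri):
                played.append(uri)
            else:
                failed.append(uri)
                render(child, base, fetch_ok, played, failed)
        else:
            render(child, base, fetch_ok, played, failed)


ssml = """<speak>
  <audio src="baduri.wav">
    <audio src="gooduri.wav"/>
  </audio>
</speak>"""

played, failed = [], []
render(
    ET.fromstring(ssml),
    "http://www.example.com/",
    lambda uri: not uri.endswith("baduri.wav"),  # pretend only baduri.wav fails
    played,
    failed,
)

# Assumed policy: any failed URI turns the completion cause into 003.
cause = "003 uri-failure" if failed else "000 normal"
print(played, failed, cause)
```

Under this policy the fallback audio is still played, and the client learns from the completion cause and the failed list that the primary source was not.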
				                

	
				_______________________________________________
				Speechsc mailing list
				Speechsc@ietf.org
				https://www1.ietf.org/mailman/listinfo/speechsc
				
				
	

_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc