Re: [Dmsp] Comments on draft-engelsma-dmsp-01.txt
Chris Cross <xcross@us.ibm.com> Tue, 21 March 2006 20:24 UTC
In-Reply-To: <330A23D8336C0346B5C1A5BB19666647027B03B6@ATLANTIS.Brooktrout.com>
Subject: Re: [Dmsp] Comments on draft-engelsma-dmsp-01.txt
To: dmsp@ietf.org
Message-ID: <OF498838B1.C546D371-ON85257138.0069AE33-85257138.007020C2@us.ibm.com>
From: Chris Cross <xcross@us.ibm.com>
Date: Tue, 21 Mar 2006 15:24:44 -0500
MIME-Version: 1.0
List-Id: Distributed Multimodal Synchronization Protocol <dmsp.ietf.org>
Eric,

Thanks for your comments. It takes a bit of work to wade through a spec
this size, and I appreciate the effort.

"Burger, Eric" <EBurger@cantata.com> wrote on 03/20/2006 08:48:35 PM:

> Section 3.
>
> Binary encoding - blech. Will anyone use XML if all of the normative
> text describes the binary encoding? Conversely, given how much easier it
> is to generate, parse, and debug XML, would it not be better to have the
> normative text use XML, and have the mapping of tags to binary values in
> the appendix?

This is just the first draft. There are different ways to organize the
spec, e.g., whether we create separate chapters or RFCs for each
encoding. It is not our intent to have one encoding serve as the
normative specification for the others. This first draft springs from
the fact that we have implemented the binary encoding but not the XML
one.

> Seems very VoiceXML-centric, to the point it may only work with
> VoiceXML. Is that OK?

That's an accurate observation. We are looking to support a "dialog
level" programming model. The alternative is to work at the level of
speech engines, which is covered by MRCP. I'm open to generalizing as
long as we can hang on to a dialog-level abstraction and continue to
support a VoiceXML server as an endpoint.

> User-Agent field in SIG_INIT: says for advertising capabilities, but it
> is just a string identifying the GUA. A better mechanism is to advertise
> capabilities.

Open to suggestions here. The intent is to provide an efficient
one-turn init event.

> RESULT: is there any reason not to simply tunnel EMMA or NLSML?

See section 4.2.2.9, Extended Recognition Result Type. The
interpretation is part of the payload of the EVT_RECO_RESULTEX event.
EMMA is anticipated to be one of the types.

> Translating the real result into a DMSP result will be error-prone and
> is guaranteed to not supply what the application desires. What is the
> use case? It is not a VoiceXML browser in the handset; that is what
> MRCPv2 is for.
> It is inconceivable that it is a network-based VoiceXML
> browser using a handset ASR engine; if the handset has the power to run
> ASR, it most likely has the power to run a VoiceXML browser.
>
> For that matter, what does the GUA do with recognition results? Is it to
> populate fields or to help in low-confidence situations? If the former,
> then it isn't worth having confidence scores - there should not be more
> than one value. If the latter, what does the interaction look like? I am
> asking, because presumably the VoiceXML interpreter will go into its "I
> did not get that" portion of the form. I am assuming that the goal is to
> allow the user to visually pick from a list of results. I was thinking
> that it might be more compact to have the GUA send the VUA the correct
> pick by reference, but that is too much state to carry around (which
> pick of which result are we referring to?). Thus the current model where
> the GUA pushes down the result string is a good way to go.

Don't assume that the application author will only want to handle
n-best results in the voice modality. He may prompt the user with "What
did you say?" and pop up a list to choose from. The same argument goes
for the interpretation and/or recognition results. There are all kinds
of creative things the GUA can do with that information.

MRCP by definition does not support dialog-level application
programming, so your assertion that there won't be VoiceXML in a
handset is incorrect. DMSP is designed to support a couple of broad use
cases: Interaction Manager and peer-to-peer configurations. The latter
includes an X+V multimodal browser where the VoiceXML is rendered by a
remote VoiceXML server. Turn your assertion around: are there devices
that could support a VoiceXML interpreter but not ASR/TTS?

> SIG_VXML_START: which is not really going to be used, SIG_INIT or
> SIG_VXML_START?

SIG_VXML_START was an optimization developed for a low-bandwidth link.
From the description: "The SIG_INIT message is used when fine-grained
control of which events the client will listen is needed, and latency
is not an issue."

> Can Dispatch: Which is more likely, a series of "can you do this?" or
> "what can you do?" If the latter, then it would be better to have a
> single OPTIONS message. If the former, then the mechanism as described
> is OK.

OK :)

> Get/Set Cookies security and privacy considerations

Understood, there is work to do here. The point is that we'll need to
propagate session information when distributing to a VoiceXML server
(or other dialog-level VUA).

> Strings: most of the strings are or will need to be Unicode. For
> example, arbitrary form text data can easily be non-Western. Likewise,
> expect International URI's to end up as Unicode or UTF-16. If every byte
> counts, then I would offer selecting the charset in SIG_INIT or
> SIG_VXML_START, with a default to UTF-8.

Every byte counts, so UTF-8 is probably the default. Maybe string
encoding is part of the initial session negotiation?

> DOM keydown, keyup, keypress events: I don't have the DOM reference
> handy. Do these refer to actual keyboard presses or ink strokes? If so,
> who would use a key-by-key protocol for a distributed, web-oriented
> stimulus protocol?

Others in the multimodal community, such as some OMA members, have
pressed for this level of granularity (no pun intended). I don't think
a key-by-key protocol is practical on a real network, and it is
generally not necessary in dialog-level interaction.

> General: Much easier to build parsers that have all of the fixed-length
> data items up front. Take Table 36, for example. Having the Error Code
> follow Correlation means I can immediately figure out the status without
> having to parse the Node and Location fields. I might not care,
> depending on the error. If I do care, there is no harm in having the
> Error Code up front.

Good suggestion.
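Eric's point about field ordering is easy to see in a parser sketch.
The layout below is hypothetical (the field names, widths, and the
length-prefixed UTF-8 string encoding are illustrative assumptions, not
taken from the draft): with the fixed-width Error Code placed right
after Correlation, the receiver can check the status before deciding
whether to decode the variable-length Node and Location fields at all.

```python
import struct

def parse_error_event(buf: bytes):
    """Parse a hypothetical DMSP error payload laid out as:
    correlation (4 bytes) | error_code (2 bytes) |
    node (2-byte length + UTF-8) | location (2-byte length + UTF-8).
    Because the fixed-width fields come first, we can inspect the
    error code without touching the variable-length strings."""
    correlation, error_code = struct.unpack_from("!IH", buf, 0)
    if error_code == 0:  # success: skip the strings entirely
        return {"correlation": correlation, "error_code": 0}
    offset = 6
    fields = {}
    for name in ("node", "location"):
        (length,) = struct.unpack_from("!H", buf, offset)
        offset += 2
        fields[name] = buf[offset:offset + length].decode("utf-8")
        offset += length
    return {"correlation": correlation, "error_code": error_code, **fields}

# Build a sample payload and parse it.
node, loc = "vxml-form-1".encode("utf-8"), "line 42".encode("utf-8")
payload = struct.pack("!IH", 7, 503)
payload += struct.pack("!H", len(node)) + node
payload += struct.pack("!H", len(loc)) + loc
print(parse_error_event(payload))
```

If the error code had trailed the strings instead, every receiver would
pay the string-decoding cost even for errors it intends to ignore.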
> Need to explain how a loop could occur (Section 4.4)

We have a use case that illustrates this that I will dig up.

> _______________________________________________
> Dmsp mailing list
> Dmsp@ietf.org
> https://www1.ietf.org/mailman/listinfo/dmsp
_______________________________________________
Dmsp mailing list
Dmsp@ietf.org
https://www1.ietf.org/mailman/listinfo/dmsp