Re: [Speechsc] Last Call: <draft-ietf-speechsc-mrcpv2-24.txt> (Media Resource Control Protocol Version 2 (MRCPv2)) to Proposed Standard

Dan Burnett <dburnett@voxeo.com> Mon, 31 October 2011 01:25 UTC

On Apr 18, 2011, at 3:10 AM, Slawomir Testowy wrote:

> Hi!
> 
> Some other comments:
> 
>> 4.2. Managing Resource Control Channels
> 
>> When the client wants to add a media processing resource to the
>> session, it issues a SIP re-INVITE transaction.
> 
> Is it possible to allocate more than one resource of the same type
> during one SIP dialog? Example in 4.2 shows how to allocate
> synthesizer and recognizer, but does not specify if there may be e.g.
> more than one synthesizer.

No, it is not possible.  There is no way, for example, to specify which recognizer is being deleted when you deallocate one.  I will clarify this restriction in the specification.
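
As an illustration only, here is a minimal sketch of how a client might enforce this restriction locally before sending a re-INVITE. The class and method names are hypothetical and are not defined by MRCPv2; the resource type strings are just examples.

```python
class SessionResources:
    """Tracks which resource types one SIP dialog has allocated (illustrative)."""

    def __init__(self):
        self._allocated = set()

    def allocate(self, resource_type: str) -> None:
        key = resource_type.lower()
        # At most one resource of each type per dialog.
        if key in self._allocated:
            raise ValueError(f"{key} is already allocated in this dialog")
        self._allocated.add(key)

    def deallocate(self, resource_type: str) -> None:
        # Unambiguous only because each type appears at most once.
        self._allocated.remove(resource_type.lower())

    def allocated(self):
        return sorted(self._allocated)


session = SessionResources()
session.allocate("speechsynth")
session.allocate("speechrecog")
```

With this bookkeeping, a second `allocate("speechsynth")` raises an error instead of producing a request that could not later be deallocated unambiguously.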

> 
> 
> 
>> 6.2.9. Content-Encoding
>> 
>> 
>>  The content-encoding entity-header is used as a modifier to the
>>  media-type.  When present, its value indicates what additional
>>  content encoding has been applied to the entity-body, and thus what
>>  decoding mechanisms must be applied in order to obtain the media-type
>>  referenced by the content-type header field.  Content-encoding is
>>  primarily used to allow a document to be compressed without losing
>>  the identity of its underlying media type.  Note that the SDP session
>>  can be used to determine accepted encodings (see Section 7).  This
>>  header field MAY occur on all messages.
> 
> Section 7 describes usage of OPTIONS method of SIP and Accept-Encoding
> header is returned by SIP response, not SDP answer, so I guess "Note that
> the SDP session can be used" should be changed to "Note that the SIP
> session can be used".

Good catch.  I will make this change.

> 
>>  When a CONTROL request to jump backward is issued to a currently
>> speaking synthesizer resource, and the target jump point is before
>>  the start of the current "SPEAK" request, the current "SPEAK" request
>>  MUST restart from the beginning of its speech data and the response
>>  to the CONTROL request MUST contain this header field with a value of
>>  "true" indicating a restart.
> 
> Why are requests sometimes surrounded by quotation marks (like "SPEAK")
> and sometimes not (like the CONTROL request)? This happens throughout the
> specification. This may be a minor nit, but it makes the whole paper look like a
> "draft" :)

This is an artifact of having had multiple editors over the years :)  I will correct this.

> 
>> 8.4.7. Prosody-Parameters
> 
>> The prosody parameter headers in the "SET-PARAMS" or "SPEAK" request
>> only apply if the speech data is of type text/plain and does not use
>> a speech markup format.
> 
> Why is that so? Why is it not true for Voice-Parameters?

Technically they are similar in that both can be specified within SSML.  The distinction is a subtle one and reflects common practice in voice output design: a designer is more likely to want to specify a default voice as a header than a default prosody.  A voice is commonly needed for a document as a whole (even though it can be changed within a document), while prosody typically applies only to specific text (even though one could change it for the document as a whole).

> Is it true for CONTROL (i.e. current SPEAK must be text/plain)?

The distinction between Prosody and Voice parameters applies in the case of CONTROL as well.
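
A rough sketch of the rule as quoted above: prosody headers take effect only for text/plain speech data, while voice headers are assumed not to carry that restriction. The function name and the prefix-based header matching are illustrative assumptions, not something the specification defines.

```python
def effective_defaults(content_type: str, headers: dict) -> dict:
    """Return the voice/prosody header defaults that would take effect."""
    # Ignore media type parameters such as "; charset=utf-8".
    media_type = content_type.split(";", 1)[0].strip().lower()
    is_plain_text = media_type == "text/plain"

    result = {}
    for name, value in headers.items():
        lname = name.lower()
        if lname.startswith("voice-"):
            # Assumed: voice defaults apply whatever the media type.
            result[name] = value
        elif lname.startswith("prosody-") and is_plain_text:
            # Prosody defaults apply only to plain text without markup.
            result[name] = value
    return result


defaults = {"Voice-gender": "female", "Prosody-rate": "fast"}
plain = effective_defaults("text/plain", defaults)
ssml = effective_defaults("application/ssml+xml", defaults)
```

Here `plain` keeps both headers, while `ssml` keeps only the voice header.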

> 
> Specification does not say anything about it.
> 
>> 8.4.16. Load-Lexicon
>> 
>> 
>>  This header field is used to indicate whether a lexicon has to be
>>  loaded or unloaded.  The default value for this header field is
>>  "true".  This header field MAY be specified in a DEFINE-LEXICON
>>  method.
> 
> I propose rewording this paragraph to explicitly state that "true" means
> "load lexicon" and "false" means "unload lexicon".

This is a good clarification.  I will add it.
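
For illustration, a minimal sketch of the intended reading: "true" means load, "false" means unload, and an absent header defaults to "true". The function name and the dictionary representation of headers are assumptions made for the example.

```python
def lexicon_action(headers: dict) -> str:
    """Map the Load-Lexicon header of a DEFINE-LEXICON request to an action."""
    value = "true"  # default when the header is absent
    for name, raw in headers.items():
        if name.lower() == "load-lexicon":
            value = raw.strip().lower()

    if value == "true":
        return "load"
    if value == "false":
        return "unload"
    raise ValueError(f"unexpected Load-Lexicon value: {value!r}")


action_default = lexicon_action({})
action_unload = lexicon_action({"Load-Lexicon": "false"})
```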

> 
> 
> 
> Thanks.
> Slawek Testowy
> 
> 
> 
> 
> 2011/3/16 The IESG <iesg-secretary@ietf.org>:
>> 
>> The IESG has received a request from the Speech Services Control WG
>> (speechsc) to consider the following document:
>> - 'Media Resource Control Protocol Version 2 (MRCPv2)'
>>  <draft-ietf-speechsc-mrcpv2-24.txt> as a Proposed Standard
>> 
>> The IESG plans to make a decision in the next few weeks, and solicits
>> final comments on this action. Please send substantive comments to the
>> ietf@ietf.org mailing lists by 2011-04-13. (This allows an additional two
>> weeks for review since the document is large and the review period overlaps
>> the Prague IETF meeting). Exceptionally, comments may be
>> sent to iesg@ietf.org instead. In either case, please retain the
>> beginning of the Subject line to allow automated sorting.
>> 
>> The file can be obtained via
>> http://datatracker.ietf.org/doc/draft-ietf-speechsc-mrcpv2/
>> 
>> IESG discussion can be tracked via
>> http://datatracker.ietf.org/doc/draft-ietf-speechsc-mrcpv2/
>> 
>> 
>> 
>> No IPR declarations have been submitted directly on this I-D.
>> _______________________________________________
>> Speechsc mailing list
>> Speechsc@ietf.org
>> https://www.ietf.org/mailman/listinfo/speechsc
>> Supplemental web site:
>> <http://www.standardstrack.com/ietf/speechsc>
>> 