Re: [Speechsc] RAI review of draft-ietf-speechsc-mrcpv2-19

Arsen Chaloyan <achaloyan@yahoo.com> Tue, 14 July 2009 09:44 UTC

Hi,
Two small issues remain that you may want to fix.

- Section 9.4.30
Media type should be "text/plain" instead of "plain/text"
http://www.w3.org/TR/2004/REC-speech-synthesis-20040907/

- Find BARGE-IN-OCCURED and replace it with BARGE-IN-OCCURRED.
There are 3 such occurrences in the text.
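
Both fixes are mechanical, so a short script could apply them. A minimal sketch, assuming the draft is available as plain text; the function name and the sample string are made up for illustration and are not taken from the draft:

```python
import re


def fix_draft_text(text):
    """Apply both corrections; return (fixed_text, n_spelling_fixes, n_media_type_fixes)."""
    # Correct the misspelled identifier (one R) to the intended spelling (two Rs).
    fixed, n_spelling = re.subn(r"BARGE-IN-OCCURED\b", "BARGE-IN-OCCURRED", text)
    # Swap the reversed media type into type/subtype order.
    fixed, n_media = re.subn(r"\bplain/text\b", "text/plain", fixed)
    return fixed, n_spelling, n_media


# Hypothetical sample text, not a quotation from the draft.
sample = "BARGE-IN-OCCURED is sent by the client.\nThe media type is plain/text.\n"
fixed, n_spelling, n_media = fix_draft_text(sample)
print(fixed)
print(n_spelling, n_media)
```

Running it on the real draft should report 3 spelling fixes if the count above is right.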

Thanks,
Arsen.




________________________________
From: Roni Even <Even.roni@huawei.com>
To: sarvi@cisco.com; dburnett@voxeo.com
Cc: speechsc@ietf.org; oran@cisco.com; rai@ietf.org
Sent: Wednesday, July 8, 2009 12:40:17 AM
Subject: [Speechsc] RAI review of draft-ietf-speechsc-mrcpv2-19

 
Hi,
I was assigned to do a RAI review of the draft. The draft looks ready for
publication to me. I have some comments, mostly editorial.

The only issue I see that is not purely editorial is the issue of the different
parameters like confidence threshold and sensitivity level (see comments 11, 13,
15, 16 and 17). I think that some clarification of the semantics and the scale
(for example, are the values linearly spaced?), as well as of when they are
useful, will be helpful to implementers.

1.       In Figure 1, expand the abbreviations TTS, ASR, SV and SI,
and explain how they relate to the media resource types in section 3.1.
2.       In Figure 1 there is a SIP dialog between the MRCPv2
client and the media source/sink. What is this dialog? In section 4 I only saw
a dialog between the client and the server.
3.       In section 3.2 you have “For example: sip:mrcpv2@example.net”
twice, one after the other.
 
4.       In the example in section 4.2 you have
“a=cmid:1”. cmid is specified later in the document, so maybe you
can add a reference to where it is specified.
 
5.       In the example in section 4.2 and in the
following examples you have “m=audio 49170 RTP/AVP 0 96” but do not
have an rtpmap attribute mapping 96 (a dynamic payload type number) to a
media encoding name.
 
6.       In section 4.3: “Also note that
more that one media session can be associated with a single resource if need
be, but this scenario is not useful for the current set of resources”.
There is a typo: the second “that” should be “than”. I
am also not sure whether the current syntax in this document can support this mode.
 
 
7.       In section 4.3: “The formatting
of the"cmid" attribute in SDP RFC3388 [RFC4566]”. I think you
meant SDP grouping and need the reference to RFC 3388.
 
 
8.       In section 5.1: “The message-length
field specifies the length of the message, including the start-line”. Is
the length in bytes? No unit is specified.
 
9.       In section 6.3.1 there is a typo: you have
“Verfication” instead of “Verification”. It appears twice in the
section.
 
10.   In the example in section 7 you have
“m=audio 0 RTP/AVP 0 1 3”. Payload type 1 was deleted from the IANA
registry, so maybe use another payload type number.
 
11.   In sections 9.4.1, 9.4.2 and 9.4.3
you specify confidence threshold, sensitivity level and speed vs accuracy. What
is the scale here; is it linear between 0 and 1? What is the absolute value of
the number: if you receive the same confidence level from two recognizers, are
they the same (e.g. when using a context block to switch servers)? For
speed vs accuracy, how does the client know the relation between the
value and the number of available sessions, since this seems to be the reason
for using this parameter?
 
12.   In 9.4.9, 10.4.8 and 11.4.11, what
are the values for media-type-value? You also mention audio and video, but it
looks to me that this document only discusses voice.
 
13.   In 9.4.35 and 9.4.36, what is the
scale for the consistency here? How does one know what close means? What is the
consistency between different recognizers?
 
14.   In section 9.6.3.3 in the example
(figure 2) confidence should be 0.75 and not 75
 
15.   In section 10.4.1 it is not clear
how you measure the sensitivity in order to specify it. Is it based on some SNR
translated to a 0 to 1 scale?
 
16.   In 11.4.6 there is the same issue with the
scale: how does the client know how to set a value when working with different
speaker verification servers?
 
17.   In 11.5.2.9 you state that the
verification-score is not a probability, so what is it? How can the client decide
whether, for example, 0 is a good score for specifying the threshold? I also
noticed that the values in the example in section 11.5.2.10 are very precise,
like 0.98514. Is this the expected precision? The examples here and in section
11.11 do not show the threshold. If the threshold is required for this flow, why
not show it in the example?
 
18.   In section 12.3 the suggestion is to
use SRTP as the mandatory interoperability mode. If the reason for mandating
SRTP is to have a common mode, you should also decide on a key exchange mechanism. I
suggest you look at http://tools.ietf.org/html/draft-ietf-avt-srtp-not-mandatory-02 for a discussion of media security.
 
19.   In section 13.7.2 you specify the attribute
resource as session level, yet in the example in section 4.2 it is a media level
attribute. The same goes for the channel attribute.
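
Comment 5 can be checked mechanically across all the examples. A minimal sketch, assuming each SDP example is available as a string; the function name is hypothetical, the check ignores per-media-section scoping, and the telephone-event mapping in the usage lines is only an illustrative assumption, not something the draft states:

```python
def dynamic_payloads_missing_rtpmap(sdp):
    """Return the dynamic payload types (96-127) offered on RTP m= lines
    that have no a=rtpmap attribute anywhere in the SDP."""
    offered = set()
    mapped = set()
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            fields = line[2:].split()
            if len(fields) > 3 and fields[2].startswith("RTP/"):
                offered.update(int(f) for f in fields[3:] if f.isdigit())
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            parts = line[len("a=rtpmap:"):].split()
            if parts and parts[0].isdigit():
                mapped.add(int(parts[0]))
    return sorted(pt for pt in offered if 96 <= pt <= 127 and pt not in mapped)


# The m= line quoted in comment 5, with only the static payload type mapped.
as_in_draft = "m=audio 49170 RTP/AVP 0 96\r\na=rtpmap:0 PCMU/8000\r\n"
# Same offer with an assumed mapping for 96 added.
with_mapping = as_in_draft + "a=rtpmap:96 telephone-event/8000\r\n"

print(dynamic_payloads_missing_rtpmap(as_in_draft))
print(dynamic_payloads_missing_rtpmap(with_mapping))
```

The first call reports payload type 96 as unmapped; the second reports nothing.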
 
Thanks
 
Roni Even