[Speechsc] AD review of draft-ietf-speechsc-mrcpv2-20

Robert Sparks <rjsparks@nostrum.com> Tue, 29 September 2009 15:05 UTC

From: Robert Sparks <rjsparks@nostrum.com>
To: speechsc@ietf.org
Date: Tue, 29 Sep 2009 10:06:47 -0500
Cc: draft-ietf-speechsc-mrcpv2@tools.ietf.org, speechsc-chairs@tools.ietf.org

Hi Folks -

I'm working on moving MRCPv2 along. I've found several things so far
that I'd like to discuss and/or have the document address before we take
the document into IETF last call.

This is a large and complex document. Apologies that my review has taken
so long.

After talking with Eric and Dave, I'm sending these all in one message
instead of splitting them into several threads at the beginning. When you
reply to a particular point, it would be useful to me if you adjusted the
subject line to indicate which point you are replying to.

These are not listed in any particular order. Nits are grouped at the end.

Thanks!

RjS
----------------------------------------------------------------------------------------------
(The following apply to revision -20)

1 The Introduction points to RFC4313 for a discussion on why MRCPv2
   does not use RTSP and details on alternatives, but I don't find that
   discussion in 4313. Was that discussion captured somewhere? If so,
   please point to that. Otherwise, modify this text.

2 The SIP examples throughout the draft need to be adjusted to reflect
   correct syntax and intended use. There are several aspects of the SIP
   messages, in particular, that are currently in error. Consider
   showing partial SIP headers focusing only on what's important to the
   example as an alternative to showing full messages (that will have to
   be carefully reviewed). Some examples of issues that need to be
   corrected (this list is not exhaustive):
     2.1 Several responses are missing "received=" in their Via header
       fields
     2.2 The o= line in answers (as in offer/answer) must be different
       from the o= line in the offer.
     2.3 The branch parameter values need to be reviewed very carefully -
       the first example incorrectly reuses the branch from the INVITE
       in an ACK to a 200 OK. Then the _next transaction_ also reuses
       the branch.
     2.4 There is a to-tag in the OPTIONS request on page 44

3 The MRCP examples need to be similarly reviewed:
     3.1 Are all the content-lengths correct? (I think the one in the 2nd
       message on page 59 isn't.)
     3.2 It's ok, probably even a good idea, to elide values that are not
       important to understanding an example, but please be consistent -
       the first message on page 65, for example, has an explicit
       (probably incorrect) MRCP length, but elides the mime-body length.
     3.3 The example in section 9.14 shows two RECOGNITION-COMPLETE
       messages to the same RECOGNIZE. Were these intended to show two
       alternate possible responses? If so, the document should make
       that clearer.
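
The content-length question in 3.1 can be checked mechanically. The
following is a minimal sketch in Python, assuming a generic layout of
header fields, a blank line (CRLF CRLF), then the body. It checks only the
Content-Length header field, not any length carried on the start line. The
function name and the sample messages are invented for illustration and
are not copied from the draft.

```python
def content_length_matches(message: bytes) -> bool:
    """Return True if the declared Content-Length equals the body's octet count."""
    head, separator, body = message.partition(b"\r\n\r\n")
    if not separator:
        return False  # no blank line between header fields and body
    for line in head.split(b"\r\n"):
        name, colon, value = line.partition(b":")
        if colon and name.strip().lower() == b"content-length":
            # Lengths are counted in octets, so compare against len() of bytes.
            return int(value.strip().decode("ascii")) == len(body)
    return False  # no Content-Length header field present


# Invented sample messages: the body "hello" is 5 octets long.
good = b"Content-Type: text/plain\r\nContent-Length: 5\r\n\r\nhello"
bad = b"Content-Type: text/plain\r\nContent-Length: 7\r\n\r\nhello"
print(content_length_matches(good), content_length_matches(bad))
```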

4 Returning a SIP 501 at the end of section 4.3 is not the right thing
   to do. 501 means the responding element does not implement the
   method. You are probably looking for 488 Not Acceptable Here.

5 The ABNF needs to be run through an ABNF checker. There are production
   rules and terminals missing - they either need to be defined or
   pointers to where they are defined need to be added.

6 The document occasionally mentions an MRCP proxy (there is a 503
   Proxy Timeout code even), but I can't find where such proxies are
   defined. Page 32 also talks about intermediaries.

7 Some additional discussion around connection establishment and
   sharing/reuse is probably needed:
     7.1 Where does an element look in a peer's certificate to determine
       it's reached the peer it intended to reach?
     7.2 What happens if a connection gets closed?
     7.3 Must events come over the connection the request was sent on?
     7.4 There should be some guidance on only reusing connections when
       the identity of the peer matches what was confirmed when the
       connection was opened. (Specifically, if it's possible for an
       MRCPv2 server to host services for more than one domain, you
       don't want to blindly reuse the connection you made to talk to A
       to talk to B just because DNS aimed you to the same address/port
       to reach them.)
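
The concern in 7.4 can be made concrete with a minimal sketch in Python of
a connection cache keyed on the verified peer identity as well as the
address and port, so that a connection opened for A is never handed out
for B. The class name, host names, address, and port are hypothetical, and
the "connection" is a placeholder string rather than a real socket.

```python
from typing import Dict, Optional, Tuple

# Cache key: (identity verified when the connection was opened, address, port).
Key = Tuple[str, str, int]


class ConnectionCache:
    """Toy cache that reuses a connection only for the identity it was verified for."""

    def __init__(self) -> None:
        self._connections: Dict[Key, object] = {}

    def store(self, identity: str, address: str, port: int, conn: object) -> None:
        self._connections[(identity, address, port)] = conn

    def lookup(self, identity: str, address: str, port: int) -> Optional[object]:
        # Same address and port is not enough: the identity must match too.
        return self._connections.get((identity, address, port))


cache = ConnectionCache()
cache.store("a.example.com", "192.0.2.10", 32416, "connection verified for A")

# DNS for B resolves to the same address and port, but no reuse happens.
print(cache.lookup("a.example.com", "192.0.2.10", 32416))
print(cache.lookup("b.example.com", "192.0.2.10", 32416))
```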

8 Section 6.1.2 should be explicit about what it means by "empty header
   field".

9 With respect to the URI indirection mechanisms defined in the draft:
     9.1 Much of the text assumes these URIs will be HTTP/HTTPS. But other
       parts of the text, and the syntax, go out of their way to allow
       arbitrary URI types. Please help look for places where the
       recommendations and requirements stated only make sense when the
       URI is HTTP or HTTPS.
     9.2 There's currently no discussion about authenticating the
       requester seeking access to the resource pointed to by one of
       these URIs. Security considerations should call out that if the
       URI leaks, the content leaks. There should probably also be more
       explicit discussion of how long a server should be expected to
       hold onto the state indicated by such a URI (how long can a
       client expect it to be there, and when does a server decide a
       client or set of clients is mounting a state exhaustion attack?),
       whether it should allow multiple accesses from a single client,
       whether it should allow accesses from multiple clients, and what
       it means to a client if the attempt to access the resource fails.

10 Why is there both a "Fetch Hint" and an "Audio Fetch Hint"? Why does
   the syntax allow for extensibility in the values for those fields?

11 On page 111, the document talks about timing between audio flows and
   RECOGNIZE methods. It claims there are "a number of mechanisms" for
   dealing with the race conditions. Would it be possible to list a few
   of these as informative examples? You might also consider pointing
   out that the delta between the start of an audio flow (or the point
   in an ongoing stream that you intended to start RECOGNIZEing) and the
   receipt of a RECOGNIZE command could be quite large if TCP is
   reacting to congestion. The prohibition at the end of the paragraph
   ("MUST NOT buffer anything it receives beforehand.") seems odd.
   What's the rationale for it? Finally - did the group consider
   indicating RTP timestamps in the RECOGNIZE request to indicate where
   to start recognition as one of the mechanisms pointed to above?

12 Why is the record semantic defined in 10.4.7 different from the one
   in 9.4.8/9.4.22 (specifically, by providing a way to request a server
   store something somewhere other than on that server)? Why does this
   section allow an arbitrary URI scheme to be passed in here? What is
   an implementation supposed to do if it doesn't know the scheme? What
   does it do if an attempt to use a URI with a scheme it recognizes
   results in failure? The security considerations section should
   discuss how this might be abused by providing a URI that points at a
   victim.

13 What should an element do if it receives a status code that it
   doesn't recognize? If that's not already specified in the document,
   it should be added.

14 Consider additional clarification around "Note that "GET-PARAMS"
   returns header values that apply to the whole session and not values
   that have a request level scope."

15 How are parameters like "Confidence Threshold" and "Sensitivity
   Level" interoperable? Would you expect .5 to mean the same thing to
   two different implementations? I'm guessing that the intent is that
   the server gets to interpret these values in an
   implementation-specific way, and the utility of these knobs is that
   you tune them over time to a given server. If that's right, the text
   should explicitly point that out.

16 Something I'm still trying to think through and would like other
   folks to comment on - apologies if I've missed where this is treated
   already: Can a server ever issue a reINVITE affecting an MRCPv2
   session (to change codecs for example)? If so, are there any places
   in the text that need to call that out?

17 On page 15, there's a requirement that "There MUST be one SDP m-line
   for each MRCPv2 resource to be used in the session. " This looks like
   it would prevent offering things like alternates, v4 and v6, etc. Is
   this what's intended?

18 Nits
     18.1 Section 3 paragraph 2 sentence 1: SIP is not the "session
       management protocol"
     18.2 The word "pipe" is used ("control pipe", "audio pipe" for
       example) with no definition and there are well-defined terms that
       could be used instead.
     18.3 Paragraph spanning pages 15 and 16 - I suggest explicitly noting
       that the reINVITE receives an error response.
     18.4 There is an unnatural break in the flow of the prose on page 16
       when the text shifts from an overview of the protocol to giving
       an example. Suggest breaking the example into a subsection to
       make it clear what you're intending.
     18.5 Please use the terms "header" and "header
       field" consistently and align the use of those phrases with
       the definitions in section 2.1 of RFC 5322.
     18.6 The conditional language in 6.1.1 is hard to follow. In
       particular, the paragraph starting "If both error 404 and
       another" is awkward. Please consider clarifying these clauses.
     18.7 typo on page 43: "veriifcation"
     18.8 The string "ECMAScript" is used once with no definition.
     18.9 The term "kill-on-barge-in" is used without any definition.
       Please add a reference or a definition.
     18.10 Page 121 says: "The Personal-Grammar-URI,"..."is created"... . I
       think you meant to say the resource indicated by that URI is
       created.
     18.11 Consider using "octet" instead of "byte". In places where you
       are describing lengths, consider talking about whether leading 0s
       have meaning (it would probably be good to explicitly call out
       that you don't want such a string to be interpreted as base-8).
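
To show why 18.11 matters, here is a minimal sketch in Python of the
difference between reading a zero-padded length as decimal and as base-8.
The function name and the sample value are hypothetical. The point is only
that a parser which infers the base from a leading 0 would silently get a
different number.

```python
def parse_length(text: str) -> int:
    """Parse a length field as decimal; leading zeros carry no meaning."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a decimal length: {text!r}")
    return int(text, 10)


padded = "0123"
print(parse_length(padded))  # decimal reading: 123
print(int(padded, 8))        # base-8 reading of the same string: 83
```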