[Speechsc] AD review of draft-ietf-speechsc-mrcpv2-20

Robert Sparks <rjsparks@nostrum.com> Tue, 29 September 2009 15:05 UTC

Message-Id: <862ADFEF-C942-4945-8252-48BE7A7D420F@nostrum.com>
From: Robert Sparks <rjsparks@nostrum.com>
To: speechsc@ietf.org
Content-Type: text/plain; charset="US-ASCII"; format="flowed"; delsp="yes"
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v936)
Date: Tue, 29 Sep 2009 10:06:47 -0500
Received-SPF: pass (nostrum.com: 173.71.53.15 is authenticated by a trusted mechanism)
Cc: draft-ietf-speechsc-mrcpv2@tools.ietf.org, speechsc-chairs@tools.ietf.org
Subject: [Speechsc] AD review of draft-ietf-speechsc-mrcpv2-20
Precedence: list

Hi Folks -

I'm working on moving MRCPv2 along. I've found several things so far
that I'd like to discuss and/or have the document address before we take
the document into IETF last call.

This is a large and complex document. Apologies that my review has
taken so long.

After talking with Eric and Dave, I'm sending these all in one message
instead of splitting them into several threads at the beginning. When you
reply to a particular point, it would be useful to me if you adjusted the
subject line to indicate which point you are replying to.

These are not listed in any particular order. Nits are grouped at the
end.

Thanks!

RjS
----------------------------------------------------------------------------------------------
(The following apply to revision -20)

1 The Introduction points to RFC4313 for a discussion on why MRCPv2
does not use RTSP and details on alternatives, but I don't find that
discussion in 4313. Was that discussion captured somewhere? If so,
please point to that. Otherwise, modify this text.

2 The SIP examples throughout the draft need to be adjusted to reflect
correct syntax and intended use. There are several aspects of the SIP
messages, in particular, that are currently in error. Consider
showing partial SIP headers focusing only on what's important to the
example as an alternative to showing full messages (that will have to
be carefully reviewed). Some examples of issues that need to be
corrected (this is not exhaustive):
2.1 Several responses are missing "received=" in their Via header fields
2.2 The o= line in answers (as in offer/answer) must be different
from the o= line in the offer.
2.3 The branch parameter values need to be reviewed very carefully -
the first example incorrectly reuses the branch from the INVITE
in an ACK to a 200 OK. Then the _next transaction_ also reuses
the branch.
2.4 There is a to-tag in the OPTIONS request on page 44
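
To illustrate 2.3: in RFC 3261, an ACK to a 2xx response is its own
transaction, so it gets a fresh branch value, as does any later
request. The following is only a minimal sketch of that idea. The
helper name new_branch and the random-token length are assumptions
made for illustration, not anything taken from the draft.

```python
import secrets

# RFC 3261 requires branch values to begin with this magic cookie.
MAGIC_COOKIE = "z9hG4bK"


def new_branch() -> str:
    """Return a fresh Via branch value for a new client transaction.

    Hypothetical helper: the cookie is followed by 16 random hex digits.
    """
    return MAGIC_COOKIE + secrets.token_hex(8)


# Each of these is a separate transaction, so each gets its own branch.
invite_branch = new_branch()
ack_to_200_branch = new_branch()   # ACK to a 200 OK does not reuse the INVITE branch
next_request_branch = new_branch()
```

An example message built this way would show three different branch
values across the INVITE, the ACK to the 200 OK, and the next request.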

3 The MRCP examples need to be similarly reviewed
3.1 Are all the content-lengths correct? (I think the one in the 2nd
message on page 59 isn't.)
3.2 It's ok, probably even a good idea, to elide values that are not
important to understanding an example, but please be consistent -
the first message on page 65, for example, has an explicit
(probably incorrect) MRCP length, but elides the mime-body length.
3.3 The example in section 9.14 shows two RECOGNITION-COMPLETE
messages to the same RECOGNIZE. Were these intended to show two
alternate possible responses? If so, the document should make
that clearer.
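
For 3.1, a check along the following lines could be run over each
example. This is a rough sketch under stated assumptions: it treats a
message as a header block, a blank line (CRLF CRLF), and a body, and it
only compares the Content-Length header field with the number of octets
in the body. It does not model the message length carried on the MRCP
start line. The function name and the sample message are made up for
illustration.

```python
from typing import Tuple


def declared_and_actual_length(message: bytes) -> Tuple[int, int]:
    """Return (declared Content-Length, actual body length in octets)."""
    head, separator, body = message.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no blank line between header fields and body")
    for line in head.split(b"\r\n"):
        name, colon, value = line.partition(b":")
        if colon and name.strip().lower() == b"content-length":
            # Interpret the value as decimal, never octal.
            return int(value.strip().decode("ascii"), 10), len(body)
    raise ValueError("no Content-Length header field")


# Hypothetical message whose declared length does not match its body.
sample = b"Content-Type: text/plain\r\nContent-Length: 7\r\n\r\nhello"
declared, actual = declared_and_actual_length(sample)
mismatch = declared != actual
```

Any example for which the two numbers differ either needs its length
fixed or the value elided.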

4 Returning a SIP 501 at the end of section 4.3 is not the right thing
to do. 501 means the responding element does not implement the
method. You are probably looking for 488 Not Acceptable Here.

5 This needs to be run through an ABNF checker. There are production
rules and terminals missing - they either need to be defined or
pointers to where they are defined need to be added.

6 The document occasionally mentions an MRCP proxy (there is even a 503
Proxy Timeout code), but I can't find where such proxies are
defined. Page 32 also talks about intermediaries.

7 Some additional discussion around connection establishment and
sharing/reuse is probably needed
7.1 Where does an element look in a peer's certificate to determine
that it has reached the peer it intended to reach?
7.2 What happens if a connection gets closed?
7.3 Must events come over the connection the request was sent on?
7.4 There should be some guidance on only reusing connections when
the identity of the peer matches what was confirmed when the
connection was opened. (Specifically, if it's possible for an
MRCPv2 server to host services for more than one domain, you
don't want to blindly reuse the connection you made to talk to A
to talk to B just because DNS aimed you to the same address/port
to reach them.)
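
One way to picture the guidance asked for in 7.4 is a connection cache
keyed on the identity that was verified when the connection was opened,
not just on address and port. This is a minimal sketch. The class name,
the method names, and the use of a plain string for the verified
identity are all assumptions for illustration, and how that identity is
extracted from the certificate is the open question in 7.1.

```python
from typing import Dict, Optional, Tuple


class ConnectionCache:
    """Reuse a connection only for the identity verified at connect time."""

    def __init__(self) -> None:
        # Key: (verified identity, address, port). Value: the connection.
        self._connections: Dict[Tuple[str, str, int], object] = {}

    def store(self, verified_identity: str, address: str, port: int,
              connection: object) -> None:
        key = (verified_identity.lower(), address, port)
        self._connections[key] = connection

    def lookup(self, intended_identity: str, address: str,
               port: int) -> Optional[object]:
        # A matching address and port is not enough: the identity must match too.
        key = (intended_identity.lower(), address, port)
        return self._connections.get(key)


cache = ConnectionCache()
connection_to_a = object()  # stand-in for a real TLS connection
cache.store("a.example.com", "192.0.2.10", 6075, connection_to_a)

reused_for_a = cache.lookup("a.example.com", "192.0.2.10", 6075)
reused_for_b = cache.lookup("b.example.com", "192.0.2.10", 6075)  # None
```

With this keying, a request for B opens (and verifies) a new connection
even though DNS resolved B to the same address and port as A.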

8 Section 6.1.2 should be explicit about what it means by "empty header
field"

9 With respect to the URI indirection mechanisms defined in the draft:
9.1 Much of the text assumes these URIs will be HTTP/HTTPS. But other
parts of the text, and the syntax, go out of their way to allow
arbitrary URI types. Please help look for places where the
recommendations and requirements stated only make sense when the
URI is HTTP or HTTPS.
9.2 There's currently no discussion about authenticating the
requester seeking access to the resource pointed to by one of
these URIs. Security considerations should call out that if the
URI leaks, the content leaks. There should probably also be more
explicit discussion of how long a server should be expected to
hold onto the state indicated by such a URI (how long can a
client expect it to be there, and when does a server decide a
client or set of clients is mounting a state exhaustion attack?),
whether it should allow multiple accesses from a single client,
whether it should allow accesses from multiple clients, and what
it means to a client if the attempt to access the resource fails.

10 Why is there both a "Fetch Hint" and an "Audio Fetch Hint"? Why does
the syntax allow for extensibility in the values for those fields?

11 On page 111, the document talks about timing between audio flows and
RECOGNIZE methods. It claims there are "a number of mechanisms" for
dealing with the race conditions. Would it be possible to list a few
of these as informative examples? You might also consider pointing
out that the delta between the start of an audio flow (or the point
in an ongoing stream that you intended to start RECOGNIZEing) and the
receipt of a RECOGNIZE command could be quite large if TCP is
reacting to congestion. The prohibition at the end of the paragraph
("MUST NOT buffer anything it receives beforehand.") seems odd.
What's the rationale for it? Finally - did the group consider
indicating RTP timestamps in the RECOGNIZE request to indicate where
to start recognition as one of the mechanisms pointed to above?

12 Why is the record semantic defined in 10.4.7 different from the one
in 9.4.8/9.4.22 (specifically, by providing a way to request a server
store something somewhere other than on that server)? Why does this
section allow an arbitrary URI scheme to be passed in here? What is
an implementation supposed to do if it doesn't know the scheme? What
does it do if an attempt to use a URI with a scheme it recognizes
results in failure? The security considerations section should
discuss how this might be abused by providing a URI that points at a
victim.

13 What should an element do if it receives a status code that it
doesn't recognize? If that's not already specified in the document,
it should be added.
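
As one possible answer to 13, SIP (RFC 3261) treats an unrecognized
final response as the x00 code of the same class. Whether MRCPv2 should
adopt the same rule is for the group to decide. The sketch below only
shows what that rule would look like. The function name and the small
set of "known" codes are assumptions made for illustration.

```python
from typing import Set


def effective_status(code: int, known_codes: Set[int]) -> int:
    """Map an unrecognized 3-digit status code to the x00 code of its class."""
    if not 100 <= code <= 999:
        raise ValueError("status code must have exactly three digits")
    if code in known_codes:
        return code
    return (code // 100) * 100


# Hypothetical set of codes this element understands.
known = {200, 404, 503}
recognized = effective_status(404, known)    # handled as 404
unrecognized = effective_status(499, known)  # handled as 400
```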

14 Consider additional clarification around "Note that "GET-PARAMS"
returns header values that apply to the whole session and not values
that have a request level scope."

15 How are parameters like "Confidence Threshold" and "Sensitivity
Level" interoperable? Would you expect .5 to mean the same thing to
two different implementations? I'm guessing that the intent is that
the server gets to interpret these values in an
implementation-specific way, and the utility of these knobs is that
you tune them over time to a given server. If that's right, the text
should explicitly point that out.

16 Something I'm still trying to think through and would like other
folks to comment on - apologies if I've missed where this is treated
already: Can a server ever issue a reINVITE affecting an MRCPv2
session (to change codecs for example)? If so, are there any places
in the text that need to call that out?

17 On page 15, there's a requirement that "There MUST be one SDP m-line
for each MRCPv2 resource to be used in the session. " This looks like
it would prevent offering things like alternates, v4 and v6, etc. Is
this what's intended?

18 Nits
18.1 Section 3 paragraph 2 sentence 1: SIP is not the "session
management protocol"
18.2 The word "pipe" is used ("control pipe", "audio pipe" for
example) with no definition and there are well-defined terms that
could be used instead.
18.3 Paragraph spanning pages 15 and 16 - I suggest explicitly noting
that the reINVITE receives an error response.
18.4 There is an unnatural break in the flow of the prose on page 16
when the text shifts from an overview of the protocol to giving
an example. Suggest breaking the example into a subsection to
make it clear what you're intending.
18.5 Please use the terms "header" and "header
field" consistently and align the use of those phrases with
the definitions in section 2.1 of RFC 5322.
18.6 The conditional language in 6.1.1 is hard to follow. In
particular, the paragraph starting "If both error 404 and
another" is awkward. Please consider clarifying these clauses.
18.7 typo on page 43: "veriifcation"
18.8 The string "ECMAScript" is used once with no definition.
18.9 The term "kill-on-barge-in" is used without any definition.
Please add a reference or a definition.
18.10 Page 121 says: "The Personal-Grammar-URI,"..."is created"... . I
think you meant to say the resource indicated by that URI is
created.
18.11 Consider using "octet" for "byte". In places where you are
describing lengths, consider talking about whether leading 0s
have meaning (it would probably be good to explicitly call out
that you don't want such a string to be interpreted base-8).
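
To make the leading-zero concern in 18.11 concrete: a parser that
guesses the base from a leading 0 would read "0010" as 8, not 10. The
sketch below shows a strict decimal reading. The function name is an
assumption for illustration, and accepting leading zeros at all (rather
than rejecting them) is a choice the document would need to make.

```python
def parse_length(text: str) -> int:
    """Parse a length field as decimal digits; leading zeros carry no meaning."""
    # Check for ASCII digits explicitly: str.isdigit() also accepts
    # non-ASCII digit characters, which a length field should not contain.
    if not text or not all(ch in "0123456789" for ch in text):
        raise ValueError("length must consist of ASCII decimal digits only")
    return int(text, 10)


as_decimal = parse_length("0010")  # 10
as_octal = int("0010", 8)          # 8, the misreading to rule out
```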
