[clue] AD Review: draft-ietf-clue-protocol-13

Adam Roach <adam@nostrum.com> Fri, 02 June 2017 00:48 UTC

Return-Path: <adam@nostrum.com>
X-Original-To: clue@ietfa.amsl.com
Delivered-To: clue@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id B1760129B65; Thu, 1 Jun 2017 17:48:39 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.88
X-Spam-Level:
X-Spam-Status: No, score=-1.88 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RP_MATCHES_RCVD=-0.001, T_SPF_HELO_PERMERROR=0.01, T_SPF_PERMERROR=0.01] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id IuZ8MWBFz4vf; Thu, 1 Jun 2017 17:48:36 -0700 (PDT)
Received: from nostrum.com (raven-v6.nostrum.com [IPv6:2001:470:d:1130::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 15C2B12945A; Thu, 1 Jun 2017 17:48:35 -0700 (PDT)
Received: from Svantevit.roach.at (cpe-70-122-154-80.tx.res.rr.com [70.122.154.80]) (authenticated bits=0) by nostrum.com (8.15.2/8.15.2) with ESMTPSA id v520mWCl039437 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NO); Thu, 1 Jun 2017 19:48:33 -0500 (CDT) (envelope-from adam@nostrum.com)
X-Authentication-Warning: raven.nostrum.com: Host cpe-70-122-154-80.tx.res.rr.com [70.122.154.80] claimed to be Svantevit.roach.at
From: Adam Roach <adam@nostrum.com>
To: clue@ietf.org
Cc: clue-chairs@ietf.org, draft-ietf-clue-protocol@tools.ietf.org
Message-ID: <a30828ea-1db8-fccd-9c2b-ddc0a1dcb08d@nostrum.com>
Date: Thu, 01 Jun 2017 19:48:30 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:52.0) Gecko/20100101 Thunderbird/52.1.0
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="------------4531E6A4C5A1DCEBF2B7F286"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/clue/kGxXt-BzS8xwF3owNt044clyGfY>
Subject: [clue] AD Review: draft-ietf-clue-protocol-13
X-BeenThere: clue@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: CLUE - ControLling mUltiple streams for TElepresence <clue.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/clue>, <mailto:clue-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/clue/>
List-Post: <mailto:clue@ietf.org>
List-Help: <mailto:clue-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/clue>, <mailto:clue-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 02 Jun 2017 00:48:40 -0000

CLUE working group --

I have completed my AD review for the CLUE protocol document. Based on 
my reading, I do not think it is yet ready for IETF last call.

Most of my comments below are feedback that the document authors should 
treat as normal last call comments. The feedback that I consider to 
block progressing the document, in my role as AD, is explicitly marked 
with the prefix "BLOCKER", and these will need to be resolved in a new 
version of the document before progressing it further. Note that it is 
entirely possible that something I have marked "BLOCKER" may stem from 
an error on my part; so recognize that these are not demands for change, 
as much as a need to have things either fixed in the document or 
explained to me.

Title: The rule of thumb is that all but a small handful of well-known 
acronyms need to be expanded in titles and abstracts. I recognize that 
"CLUE" is a bit tortured, as acronyms go, but the title of this document 
is, broadly speaking, opaque. Please change it to something meaningful, 
such as "Protocol for Controlling Multiple Streams for Telepresence (CLUE)"

General: There are three instances of excessively long lines in the 
document.

General: Please number and caption the figures in this document. Also, 
please refer to the figure numbers when pointing to them (e.g., the 
first paragraph of section 6.1 should read something like: "As soon as 
the sub-state machine of the MP (Figure 1) is activated..."."

BLOCKER: General: There are several mentions of timeouts and retry 
thresholds in the text and its corresponding state machines; however, 
the document neither defines nor cites a document as defining what these 
timeout and retry values are. These need to be defined and described. If 
the timer and retry scheme allows the two ends of the connection to have 
different values for timeouts and number of retries, then there need to 
be additional error procedures that allow the MC and MP state machines 
to stay in sync (if the timer/retry values can be different, it's 
possible for one state machine to transition to "terminated," while the 
other is still active, and you need messaging to clean this up). The 
remainder of this comment is non-blocking: Related to this, the document 
frequently refers to retries as "expiring" (e.g., "retry expired" on the 
state diagrams). That doesn't really make sense unless "retry" is the 
name of a timer rather than a counter; I think you mean to say 
"exhausted" or something similar.

General: This document defines six message types, all of which have at 
least two names, many of which have more. It would be a lot easier to 
keep track of what is being described if these were kept consistent. I 
suggest choosing one term for each concept and sticking with it. In 
other words, please pick only one name from each of the following lines, 
and do not use any of the others:

 1. OPTIONS, options
 2. OPTIONS RESPONSE, optionsResponse
 3. ADVERTISEMENT, ADV, advertisement
 4. ADVERTISEMENT ACKNOWLEDGEMENT, ADV ACK, ACK, NACK, ack
 5. CONFIGURE, CONF, CONF+ACK, configure
 6. CONFIGURE RESPONSE, CONF RESPONSE, configureResponse

The terminology section appears to be in alphabetical order, except for 
"Capture Encoding" (which was presumably "Media Capture Encoding" in 
some earlier version). Please fix.

The definitions for "Endpoint", "MCU", and "Media Stream/Stream" vary 
from the definition given in draft-ietf-clue-framework. Is this intentional?

The first paragraph of section 4 mentions the framework and data model 
documents without citations. There should probably be citations to the 
corresponding documents.

Section 4 contains the following text:

    The CLUE protocol represents the mechanism
    for the exchange of CLUE information between CLUE Participants

This is sufficiently circular as to be basically meaningless. Suggest: 
"...for the exchange of telepresence information between..."

Section 5: The versioning scheme in here is rather perplexing. Is there 
some technical reason the protocol restricts major version numbers to a 
single digit and minor versions to a single digit? Strongly suggest that 
this should be expanded to allow multiple digits for both the major and 
minor versions.

Section 5: It should be made clear that the XML Schema excerpts in 
section 5 are non-normative. In particular, I recommend adding the 
following text to the end of the first paragraph of section 5: "This 
section includes non-normative excerpts of the schema to aid in 
describing it."

Section 5: If the single-digit version numbers are maintained (and, 
again, I strongly recommend against this), the definition of versionType 
appears to be wrong: it allows a major version digit of "0", which would 
seem to be precluded by the way versions are currently defined.

Section 5: The XML definition allows zero or more <clueId> elements to 
appear in a message. If more than one is allowed, the document should 
explain how multiple IDs are handled. If they are not, then the schema 
and/or text needs to prohibit having more than one.

Section 5: The description for <sequenceNr> says that a 402 will be sent 
in the case of an "unexpected" sequence number. This needs 
clarification: is this for case where a sequence number gap is detected? 
A repeated sequence number? A number that is too small? All of the 
above? At least describe this with the 402 error code, and point to that 
description from here.

Section 5: The description for <v> would benefit from the addition of 
"This document describes version 1.0".

Section 5.1: "The OPTIONS message is sent by the CP which is the CI to 
the CP which is the CR as soon as the CLUE data channel is ready." 
Although it's a problem in other parts of the document too, the dense 
use of nearly identical two-letter acronyms in here makes it quite hard 
to read. I had to keep consulting the definitions of those acronyms to 
decode this sentence (as well as several similar ones). Consider just 
expanding these terms (as well as "MP" and "MC") instead of using them 
in prose.

Section 5.1 starts to wobble back and forth between referring to 
elements with and without angle brackets (e.g., it uses both 
"supportedVersions" and "<supportedVersions>"). Please pick one and 
stick with it.

BLOCKER: Section 5.1: The description of <supportedVersions> describes a 
scheme in which multiple supported versions can be listed; and, if the 
list is omitted, it implies that only the version described in <v> is 
supported. This text does not define (nor does any other text that I can 
find) what <v> should be set to when <supportedVersions> is used. 
Intuitively, it seems that <v> should be set to the largest minor 
version of the smallest major version advertised in <supportedVersions>, 
but that (or whatever the correct answer is) needs to be clearly spelled 
out.

BLOCKER: Section 5.2: "If the responseCode is of the type 2xx the 
response MUST also include..." -- you can't use "2xx" in a normative 
statement without first defining what it means. As you don't use the 
"#xx" format anywhere else, I suggest rephrasing: "If the <responseCode> 
is between 200 and 299 inclusive, the response MUST also include..."

Sections 5.2, 5.4, 5.6: It seems really odd that the document defines a 
base clueMessageType that all messages derive from, but then leaves all 
the response types (optionsResponse, ack, configureResponse) to 
repeatedly and independently add <responseCode>,<responseString>, and 
the sequence number of the corresponding message over and over again. I 
would strongly suggest adding something like:

    <!-- CLUE RESPONSE TYPE -->
    <xs:complexType name="clueResponseType">
    <xs:complexContent>
    <xs:extension base="clueMessageType">
    <xs:sequence>
    <xs:element name="responseCode" type="responseCodeType"/>
    <xs:element name="reasonString" type="xs:string" minOccurs="0"/>
    <xs:element name="requestSequenceNr" type="xs:positiveInteger"/>
    <xs:any namespace="##other" processContents="lax" minOccurs="0"/>
    </xs:sequence>
    </xs:extension>
    </xs:complexContent>
    </xs:complexType>

...and then defining those three response types as being extensions of 
"clueResponse" instead of "clueMessage", like this:

    <!-- ADV ACK MESSAGE TYPE -->
    <xs:complexType name="advAcknowledgementMessageType">
    <xs:complexContent>
    <xs:extension base="clueResponseType">
    <xs:anyAttribute namespace="##other" processContents="lax"/>
    </xs:extension>
    </xs:complexContent>
    </xs:complexType>

Regardless of how you do this, any section that adds "responseCode" and 
"reasonString" needs to point to section 5.7 in its description to 
explain how those fields are populated and interpreted.

Section 5.5: This section allows a boolean flag in a <configure> message 
to acknowledge an <advertisement>. Is this intended to always be handled 
like a 200? Consider: if you define a 201 response code in the future, 
will implementations be unable to convey its meaning in a CONF + ACK? 
Given that the document defines a class of codes for success rather than 
a simple success flag in general, it seems that this <ack> element 
should carry a success response code rather than just a boolean. If you 
decide to keep the boolean, be very clear that it is to be treated as a 
200 rather than any other potential success code; and that conveying any 
other kind of success requires a separate <ack> message.

Section 5.5: the final paragraph mentions the <captureEncodings> element 
-- it would be helpful to add something like "see 
[I-D.ietf-clue-datamodel] for the definition of <captureEncodings>."

BLOCKER: Section 5.7 indicates that there is a class of response codes, 
starting with "1", which are used to indicate "delayed or incomplete" 
responses. The document does not describe any protocol behavior for this 
class of response. The description of the meaning of this class (and the 
obvious parallels to HTTP and SIP) imply that some subsequent response 
associated with the same request will be arriving at some point in the 
future. If that's the intention, this document needs a *lot* more text 
(and corresponding adjustments to the state machines) to explain how 
these 100-class codes are handled. In practice, since the current 
version of the document does not define nor make use of 100-class codes, 
I suggest that the most reasonable path forward is to remove discussion 
of codes starting with "1" from the first paragraph of section 5.7, 
instead adding a paragraph immediately following it that says something 
like:

   This document does not define response codes starting with "1", and such
   response codes are not allowed to appear in major version 1 of the CLUE
   protocol. The range from 100 to 199 inclusive is reserved for future 
major
   versions of the protocol to define response codes for delayed or 
incomplete
   operations if necessary. Response codes starting with "5" through "9" are
   reserved for future major versions of the protocol to define new 
classes of
   response, and are not allowed in major version 1 of the CLUE protocol.
   Response codes starting with "0" are not allowed.

Section 5.7: "The response codes and strings defined for use with CLUE 
are as follows" - this strongly implies that the descriptions given in 
this table are the only ones that are allowed in CLUE <reasonString> 
element, and that any other messages should presumably be treated as an 
error. Surely that's not what you mean. Suggest rephrasing to indicate 
that the "Description" text can be sent in the <reasonString>, but that 
implementations can (and are encouraged to) include more specific 
descriptions of the error condition, if possible.

Section 5.7: I can't figure out how an implementation could ever send a 
"300" response code. If the XML syntax is incorrect, that's a 301. If 
the message contains an invalid value, that's a 302. And those are the 
only two conditions that are described for 300. I think you need to give 
"300" a bit more thought -- if you can't come up with a more generic 
description (e.g., "low-level request error"), you probably want to 
remove it.

Section 5.7: for 403, be clear which identifier is meant. Do you mean 
"The <clueId> used in the message is not valid..."?

Section 5.7: for 404, please be clear that you're talking about the 
sequence number rather than just using "number" without qualification.

Section 5.7: The description for 405 uses the acronym "MCC" without 
expanding it. Please expand it.

BLOCKING: The state machines in section 6 and its subsections don't have 
transitions for all possible messages that could arrive in a state. This 
can cause interop issues. Please add text that clearly indicates whether 
such messages do or do not cause a transition. (This might be as simple 
as "messages not shown for a state do not cause the state to change," 
but only if you carefully check that this is true -- for example, what 
should an MP state machine do do if it gets a "CONF + ACK" in the state 
"WAIT FOR CONF"?)

Section 6: The fifth paragraph says "CLUE channel" where it means "CLUE 
data channel."

Section 6: The eighth paragraph says:

    The CP moves from the ACTIVE state to the IDLE one when the sub-state
    machines that have been activated are (both) in the relative
    TERMINATED state (see sections Section 6.1 and Section 6.2).

The "both" in this paragraph is confusing, since it's possible to have 
only one or the other machine running. Please rephrase.

Section 6.2 describes a state machine that starts in a state called 
"WAIT FOR ADV." This state does not appear to be timer-supervised, 
meaning that implementations of this state machine can stay in this 
state literally forever. Is that the intention?

Section 6.2 also describes the possibility of sending a <ack> and 
<configure> separately or as a combined message. I would have expected 
to see some discussion here about why implementations might choose one 
behavior over the other.

Section 7: The second sentence of the second paragraph needs a verb.

Section 7, paragraph 3: this claims that versions are a non-negative 
*integer* rather than a *digit*. As I mention above, this seems to be 
the right way to handle versions, but it is decidedly at odds with text 
elsewhere in the document and in the schema. Regardless of how you 
choose to treat versions, the text needs to be consistent.

Section 7, final paragraph: replace "Clue" with "CLUE."

Section 8 contains a paragraph starting with "In that case, the new 
information..." but it's not clear which case it's referring to. Please 
replace "In that case" with a description of the case under consideration.

Section 8 contains the phrase "Similarly to what said before..." -- this 
is ungrammatical and needs to be rephrased.

Section 8 indicates that extensions need to indicate "the standard 
version of the protocol the extension refers to." Given that there is 
compatibility within a major version of a protocol, I think this means 
to say "the major standard version of the protocol that the extension 
refers to."

Section 8 has a paragraph starting "For that reason..." -- it's not 
clear what reason is being referred to here. Please clarify.

Section 8 contains schema that contains a <version> element. It should 
be clarified that this is the *protocol* version, not the *extension* 
version (or, if it's the extension version, that needs to be spelled out 
too, but I think you'll need a new namespace for that...?)

BLOCKING: Section 8.1 is an example section, which are non-normative in 
IETF documents. It contains 2119-style normative language, however. 
These normative statements need to be moved out of the example section 
(and probably into section 8). The remainder of this comment is 
non-blocking: I also find the "SHOULD" in this section to be highly 
perplexing. Can you explain the rationale behind requiring schema, but 
not requiring any description of what the schema *means?)

BLOCKING: The example in section 8.1 includes the following:

    xmlns="clue-info-extension-myVideoExtensions"

I'm pretty certain that namespaces are required to be identified by URIs 
rather than arbitrary strings.

Section 8.1: The final paragraph also has normative language in it, 
although it appears to be reiterating requirements from elsewhere in the 
document. I suggest lowercasing "MUST" in this paragraph.

Section 8.1: The final paragraph mentions the use of <options> and 
<optionsResponse> to negotiate the extension. An example demonstrating 
this negotiation would be extremely useful.

The schema in section 9 contains:

<xs:import namespace="urn:ietf:params:xml:ns:clue-info"
schemaLocation="data-model-schema-17.xsd"/>

If you do not intend to bake this "-17" into the document (and I can't 
imagine you do), please add an RFC editor note to change it to something 
else upon publication.

Also, the schema in section 9 is (with rare exception) unindented. This 
makes it *very* hard to read. Please consider formatting it with 
conventional XML indentation. (This also applies to the schema excerpts 
earlier in the document.)

Has there been any automated tool-based checking that the examples in 
section 10 to verify that they conform to the schema in section 9 (and 
the schemata it imports)?

Section 10.2: Please expand the acronym "MCCs" in the section title 
(keep in mind that this appears in the table of contents, where it needs 
to makes sense).

Section 10 in general: While it does consume a lot of space, I don't 
think that defining six rather different message types and then showing 
only *one* type in the examples is very illustrative of the protocol. I 
would *STRONGLY* suggest adding at least a response message, and ideally 
the example section should contain at least one example for each of the 
six message types.

Section 11, paragraph 4: Replace "Clue" with "CLUE."

Section 12.1 has a strange double-double quote around the URN name, and 
section 12.3 repeats this for the MIME type.

All subsections of section 12: please update all registrant contact 
information to point to the IESG (iesg@ietf.org) rather than the CLUE 
working group and one of the authors.

Section 12.4.1: These descriptions will appear in an IANA registry, 
where the phrase "in this document" will have no context and be rather 
nonsensical. Please rephrase.

Sections 13 through 23 should include an RFC editor note asking for 
removal before publication.

/a