Re: [clue] AD Review: draft-ietf-clue-protocol-13

Adam Roach <adam@nostrum.com> Mon, 10 July 2017 20:35 UTC

Return-Path: <adam@nostrum.com>
X-Original-To: clue@ietfa.amsl.com
Delivered-To: clue@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 8583C1318A9; Mon, 10 Jul 2017 13:35:15 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.88
X-Spam-Level:
X-Spam-Status: No, score=-1.88 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RP_MATCHES_RCVD=-0.001, T_SPF_HELO_PERMERROR=0.01, T_SPF_PERMERROR=0.01] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id sov4wXsCZzBq; Mon, 10 Jul 2017 13:35:11 -0700 (PDT)
Received: from nostrum.com (raven-v6.nostrum.com [IPv6:2001:470:d:1130::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 2F6F712F24E; Mon, 10 Jul 2017 13:35:11 -0700 (PDT)
Received: from Orochi.local (99-152-146-228.lightspeed.dllstx.sbcglobal.net [99.152.146.228]) (authenticated bits=0) by nostrum.com (8.15.2/8.15.2) with ESMTPSA id v6AKZ7pQ091562 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NO); Mon, 10 Jul 2017 15:35:08 -0500 (CDT) (envelope-from adam@nostrum.com)
X-Authentication-Warning: raven.nostrum.com: Host 99-152-146-228.lightspeed.dllstx.sbcglobal.net [99.152.146.228] claimed to be Orochi.local
To: Simon Pietro Romano <spromano@unina.it>
Cc: clue@ietf.org, clue-chairs@ietf.org, draft-ietf-clue-protocol@tools.ietf.org
References: <a30828ea-1db8-fccd-9c2b-ddc0a1dcb08d@nostrum.com> <4523923C-254E-45B1-AC8E-B301834E0876@unina.it>
From: Adam Roach <adam@nostrum.com>
Message-ID: <df86d7b3-f45b-4e71-7e5b-8c3c1c7fe496@nostrum.com>
Date: Mon, 10 Jul 2017 15:35:02 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:52.0) Gecko/20100101 Thunderbird/52.2.1
MIME-Version: 1.0
In-Reply-To: <4523923C-254E-45B1-AC8E-B301834E0876@unina.it>
Content-Type: multipart/alternative; boundary="------------E08E3FEB9875DF153AC78B95"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/clue/4Ifv3qlSklqeFz1TWnfUuxIWvBM>
Subject: Re: [clue] AD Review: draft-ietf-clue-protocol-13
X-BeenThere: clue@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: CLUE - ControLling mUltiple streams for TElepresence <clue.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/clue>, <mailto:clue-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/clue/>
List-Post: <mailto:clue@ietf.org>
List-Help: <mailto:clue-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/clue>, <mailto:clue-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jul 2017 20:35:16 -0000

For my own planning, when might I expect to see a revised edition of 
this document?

Thanks!

/a

On 6/12/17 08:14, Simon Pietro Romano wrote:
> Hello Adam,
>
> thanks a lot for your thorough review and sorry for answering that 
> late. I am currently abroad for a conference. Together with Roberta, 
> we’ll work on your comments and send back our answers in the next few 
> days.
>
> Cheers,
>
> Simon
>   _\\|//_
> ( O-O )
>       ~~~~~~~~~~~~~~~~~~~~~~o00~~(_)~~00o~~~~~~~~~~~~~~~~~~~~~~~~
> Simon Pietro Romano
>  Universita' di Napoli Federico II
>    Computer Engineering Department
>            Phone: +39 081 7683823 -- Fax: +39 081 7683816
>  e-mail: spromano@unina.it <mailto:spromano@unina.it>
>
>   <<Molti mi dicono che lo scoraggiamento è l'alibi degli
>  idioti. Ci rifletto un istante; e mi scoraggio>>. Magritte.
>                    oooO
>        ~~~~~~~~~~~~~~~~~~~~~~~( )~~~ Oooo~~~~~~~~~~~~~~~~~~~~~~~~~
>                \ (            (   )
>                                 \_)          ) /
>                          (_/
>
>
>
>> On 02 Jun 2017, at 02:48, Adam Roach <adam@nostrum.com 
>> <mailto:adam@nostrum.com>> wrote:
>>
>> CLUE working group --
>>
>> I have completed my AD review for the CLUE protocol document. Based 
>> on my reading, I do not think it is yet ready for IETF last call.
>>
>> Most of my comments below are feedback that the document authors 
>> should treat as normal last call comments. The feedback that I 
>> consider to block progressing the document, in my role as AD, is 
>> explicitly marked with the prefix "BLOCKER", and these will need to 
>> be resolved in a new version of the document before progressing it 
>> further. Note that it is entirely possible that something I have 
>> marked "BLOCKER" may stem from an error on my part; so recognize that 
>> these are not demands for change, as much as a need to have things 
>> either fixed in the document or explained to me.
>>
>> Title: The rule of thumb is that all but a small handful of 
>> well-known acronyms need to be expanded in titles and abstracts. I 
>> recognize that "CLUE" is a bit tortured, as acronyms go, but the 
>> title of this document is, broadly speaking, opaque. Please change it 
>> to something meaningful, such as "Protocol for Controlling Multiple 
>> Streams for Telepresence (CLUE)"
>>
>> General: There are three instances of excessively long lines in the 
>> document.
>>
>> General: Please number and caption the figures in this document. 
>> Also, please refer to the figure numbers when pointing to them (e.g., 
>> the first paragraph of section 6.1 should read something like: "As 
>> soon as the sub-state machine of the MP (Figure 1) is activated..."."
>>
>> BLOCKER: General: There are several mentions of timeouts and retry 
>> thresholds in the text and its corresponding state machines; however, 
>> the document neither defines nor cites a document as defining what 
>> these timeout and retry values are. These need to be defined and 
>> described. If the timer and retry scheme allows the two ends of the 
>> connection to have different values for timeouts and number of 
>> retries, then there need to be additional error procedures that allow 
>> the MC and MP state machines to stay in sync (if the timer/retry 
>> values can be different, it's possible for one state machine to 
>> transition to "terminated," while the other is still active, and you 
>> need messaging to clean this up). The remainder of this comment is 
>> non-blocking: Related to this, the document frequently refers to 
>> retries as "expiring" (e.g., "retry expired" on the state diagrams). 
>> That doesn't really make sense unless "retry" is the name of a timer 
>> rather than a counter; I think you mean to say "exhausted" or 
>> something similar.
>>
>> General: This document defines six message types, all of which have 
>> at least two names, many of which have more. It would be a lot easier 
>> to keep track of what is being described if these were kept 
>> consistent. I suggest choosing one term for each concept and sticking 
>> with it. In other words, please pick only one name from each of the 
>> following lines, and do not use any of the others:
>>
>>  1. OPTIONS, options
>>  2. OPTIONS RESPONSE, optionsResponse
>>  3. ADVERTISEMENT, ADV, advertisement
>>  4. ADVERTISEMENT ACKNOWLEDGEMENT, ADV ACK, ACK, NACK, ack
>>  5. CONFIGURE, CONF, CONF+ACK, configure
>>  6. CONFIGURE RESPONSE, CONF RESPONSE, configureResponse
>>
>> The terminology section appears to be in alphabetical order, except 
>> for "Capture Encoding" (which was presumably "Media Capture Encoding" 
>> in some earlier version). Please fix.
>>
>> The definitions for "Endpoint", "MCU", and "Media Stream/Stream" vary 
>> from the definition given in draft-ietf-clue-framework. Is this 
>> intentional?
>>
>> The first paragraph of section 4 mentions the framework and data 
>> model documents without citations. There should probably be citations 
>> to the corresponding documents.
>>
>> Section 4 contains the following text:
>>
>>     The CLUE protocol represents the mechanism
>>     for the exchange of CLUE information between CLUE Participants
>>
>> This is sufficiently circular as to be basically meaningless. 
>> Suggest: "...for the exchange of telepresence information between..."
>>
>> Section 5: The versioning scheme in here is rather perplexing. Is 
>> there some technical reason the protocol restricts major version 
>> numbers to a single digit and minor versions to a single digit? 
>> Strongly suggest that this should be expanded to allow multiple 
>> digits for both the major and minor versions.
>>
>> Section 5: It should be made clear that the XML Schema excerpts in 
>> section 5 are non-normative. In particular, I recommend adding the 
>> following text to the end of the first paragraph of section 5: "This 
>> section includes non-normative excerpts of the schema to aid in 
>> describing it."
>>
>> Section 5: If the single-digit version numbers are maintained (and, 
>> again, I strongly recommend against this), the definition of 
>> versionType appears to be wrong: it allows a major version digit of 
>> "0", which would seem to be precluded by the way versions are 
>> currently defined.
>>
>> Section 5: The XML definition allows zero or more <clueId> elements 
>> to appear in a message. If more than one is allowed, the document 
>> should explain how multiple IDs are handled. If they are not, then 
>> the schema and/or text needs to prohibit having more than one.
>>
>> Section 5: The description for <sequenceNr> says that a 402 will be 
>> sent in the case of an "unexpected" sequence number. This needs 
>> clarification: is this for case where a sequence number gap is 
>> detected? A repeated sequence number? A number that is too small? All 
>> of the above? At least describe this with the 402 error code, and 
>> point to that description from here.
>>
>> Section 5: The description for <v> would benefit from the addition of 
>> "This document describes version 1.0".
>>
>> Section 5.1: "The OPTIONS message is sent by the CP which is the CI 
>> to the CP which is the CR as soon as the CLUE data channel is ready." 
>> Although it's a problem in other parts of the document too, the dense 
>> use of nearly identical two-letter acronyms in here makes it quite 
>> hard to read. I had to keep consulting the definitions of those 
>> acronyms to decode this sentence (as well as several similar ones). 
>> Consider just expanding these terms (as well as "MP" and "MC") 
>> instead of using them in prose.
>>
>> Section 5.1 starts to wobble back and forth between referring to 
>> elements with and without angle brackets (e.g., it uses both 
>> "supportedVersions" and "<supportedVersions>"). Please pick one and 
>> stick with it.
>>
>> BLOCKER: Section 5.1: The description of <supportedVersions> 
>> describes a scheme in which multiple supported versions can be 
>> listed; and, if the list is omitted, it implies that only the version 
>> described in <v> is supported. This text does not define (nor does 
>> any other text that I can find) what <v> should be set to when 
>> <supportedVersions> is used. Intuitively, it seems that <v> should be 
>> set to the largest minor version of the smallest major version 
>> advertised in <supportedVersions>, but that (or whatever the correct 
>> answer is) needs to be clearly spelled out.
>>
>> BLOCKER: Section 5.2: "If the responseCode is of the type 2xx the 
>> response MUST also include..." -- you can't use "2xx" in a normative 
>> statement without first defining what it means. As you don't use the 
>> "#xx" format anywhere else, I suggest rephrasing: "If the 
>> <responseCode> is between 200 and 299 inclusive, the response MUST 
>> also include..."
>>
>> Sections 5.2, 5.4, 5.6: It seems really odd that the document defines 
>> a base clueMessageType that all messages derive from, but then leaves 
>> all the response types (optionsResponse, ack, configureResponse) to 
>> repeatedly and independently add <responseCode>,<responseString>, and 
>> the sequence number of the corresponding message over and over again. 
>> I would strongly suggest adding something like:
>>
>>     <!-- CLUE RESPONSE TYPE -->
>>     <xs:complexType name="clueResponseType">
>>     <xs:complexContent>
>>     <xs:extension base="clueMessageType">
>>     <xs:sequence>
>>     <xs:element name="responseCode" type="responseCodeType"/>
>>     <xs:element name="reasonString" type="xs:string" minOccurs="0"/>
>>     <xs:element name="requestSequenceNr" type="xs:positiveInteger"/>
>>     <xs:any namespace="##other" processContents="lax" minOccurs="0"/>
>>     </xs:sequence>
>>     </xs:extension>
>>     </xs:complexContent>
>>     </xs:complexType>
>>
>> ...and then defining those three response types as being extensions 
>> of "clueResponse" instead of "clueMessage", like this:
>>
>>     <!-- ADV ACK MESSAGE TYPE -->
>>     <xs:complexType name="advAcknowledgementMessageType">
>>     <xs:complexContent>
>>     <xs:extension base="clueResponseType">
>>     <xs:anyAttribute namespace="##other" processContents="lax"/>
>>     </xs:extension>
>>     </xs:complexContent>
>>     </xs:complexType>
>>
>> Regardless of how you do this, any section that adds "responseCode" 
>> and "reasonString" needs to point to section 5.7 in its description 
>> to explain how those fields are populated and interpreted.
>>
>> Section 5.5: This section allows a boolean flag in a <configure> 
>> message to acknowledge an <advertisement>. Is this intended to always 
>> be handled like a 200? Consider: if you define a 201 response code in 
>> the future, will implementations be unable to convey its meaning in a 
>> CONF + ACK? Given that the document defines a class of codes for 
>> success rather than a simple success flag in general, it seems that 
>> this <ack> element should carry a success response code rather than 
>> just a boolean. If you decide to keep the boolean, be very clear that 
>> it is to be treated as a 200 rather than any other potential success 
>> code; and that conveying any other kind of success requires a 
>> separate <ack> message.
>>
>> Section 5.5: the final paragraph mentions the <captureEncodings> 
>> element -- it would be helpful to add something like "see 
>> [I-D.ietf-clue-datamodel] for the definition of <captureEncodings>."
>>
>> BLOCKER: Section 5.7 indicates that there is a class of response 
>> codes, starting with "1", which are used to indicate "delayed or 
>> incomplete" responses. The document does not describe any protocol 
>> behavior for this class of response. The description of the meaning 
>> of this class (and the obvious parallels to HTTP and SIP) imply that 
>> some subsequent response associated with the same request will be 
>> arriving at some point in the future. If that's the intention, this 
>> document needs a *lot* more text (and corresponding adjustments to 
>> the state machines) to explain how these 100-class codes are handled. 
>> In practice, since the current version of the document does not 
>> define nor make use of 100-class codes, I suggest that the most 
>> reasonable path forward is to remove discussion of codes starting 
>> with "1" from the first paragraph of section 5.7, instead adding a 
>> paragraph immediately following it that says something like:
>>
>>   This document does not define response codes starting with "1", and 
>> such
>>   response codes are not allowed to appear in major version 1 of the CLUE
>>   protocol. The range from 100 to 199 inclusive is reserved for 
>> future major
>>   versions of the protocol to define response codes for delayed or 
>> incomplete
>>   operations if necessary. Response codes starting with "5" through 
>> "9" are
>>   reserved for future major versions of the protocol to define new 
>> classes of
>>   response, and are not allowed in major version 1 of the CLUE protocol.
>>   Response codes starting with "0" are not allowed.
>>
>> Section 5.7: "The response codes and strings defined for use with 
>> CLUE are as follows" - this strongly implies that the descriptions 
>> given in this table are the only ones that are allowed in CLUE 
>> <reasonString> element, and that any other messages should presumably 
>> be treated as an error. Surely that's not what you mean. Suggest 
>> rephrasing to indicate that the "Description" text can be sent in the 
>> <reasonString>, but that implementations can (and are encouraged to) 
>> include more specific descriptions of the error condition, if possible.
>>
>> Section 5.7: I can't figure out how an implementation could ever send 
>> a "300" response code. If the XML syntax is incorrect, that's a 301. 
>> If the message contains an invalid value, that's a 302. And those are 
>> the only two conditions that are described for 300. I think you need 
>> to give "300" a bit more thought -- if you can't come up with a more 
>> generic description (e.g., "low-level request error"), you probably 
>> want to remove it.
>>
>> Section 5.7: for 403, be clear which identifier is meant. Do you mean 
>> "The <clueId> used in the message is not valid..."?
>>
>> Section 5.7: for 404, please be clear that you're talking about the 
>> sequence number rather than just using "number" without qualification.
>>
>> Section 5.7: The description for 405 uses the acronym "MCC" without 
>> expanding it. Please expand it.
>>
>> BLOCKING: The state machines in section 6 and its subsections don't 
>> have transitions for all possible messages that could arrive in a 
>> state. This can cause interop issues. Please add text that clearly 
>> indicates whether such messages do or do not cause a transition. 
>> (This might be as simple as "messages not shown for a state do not 
>> cause the state to change," but only if you carefully check that this 
>> is true -- for example, what should an MP state machine do do if it 
>> gets a "CONF + ACK" in the state "WAIT FOR CONF"?)
>>
>> Section 6: The fifth paragraph says "CLUE channel" where it means 
>> "CLUE data channel."
>>
>> Section 6: The eighth paragraph says:
>>
>>     The CP moves from the ACTIVE state to the IDLE one when the sub-state
>>     machines that have been activated are (both) in the relative
>>     TERMINATED state (see sections Section 6.1 and Section 6.2).
>>
>> The "both" in this paragraph is confusing, since it's possible to 
>> have only one or the other machine running. Please rephrase.
>>
>> Section 6.2 describes a state machine that starts in a state called 
>> "WAIT FOR ADV." This state does not appear to be timer-supervised, 
>> meaning that implementations of this state machine can stay in this 
>> state literally forever. Is that the intention?
>>
>> Section 6.2 also describes the possibility of sending a <ack> and 
>> <configure> separately or as a combined message. I would have 
>> expected to see some discussion here about why implementations might 
>> choose one behavior over the other.
>>
>> Section 7: The second sentence of the second paragraph needs a verb.
>>
>> Section 7, paragraph 3: this claims that versions are a non-negative 
>> *integer* rather than a *digit*. As I mention above, this seems to be 
>> the right way to handle versions, but it is decidedly at odds with 
>> text elsewhere in the document and in the schema. Regardless of how 
>> you choose to treat versions, the text needs to be consistent.
>>
>> Section 7, final paragraph: replace "Clue" with "CLUE."
>>
>> Section 8 contains a paragraph starting with "In that case, the new 
>> information..." but it's not clear which case it's referring to. 
>> Please replace "In that case" with a description of the case under 
>> consideration.
>>
>> Section 8 contains the phrase "Similarly to what said before..." -- 
>> this is ungrammatical and needs to be rephrased.
>>
>> Section 8 indicates that extensions need to indicate "the standard 
>> version of the protocol the extension refers to." Given that there is 
>> compatibility within a major version of a protocol, I think this 
>> means to say "the major standard version of the protocol that the 
>> extension refers to."
>>
>> Section 8 has a paragraph starting "For that reason..." -- it's not 
>> clear what reason is being referred to here. Please clarify.
>>
>> Section 8 contains schema that contains a <version> element. It 
>> should be clarified that this is the *protocol* version, not the 
>> *extension* version (or, if it's the extension version, that needs to 
>> be spelled out too, but I think you'll need a new namespace for that...?)
>>
>> BLOCKING: Section 8.1 is an example section, which are non-normative 
>> in IETF documents. It contains 2119-style normative language, 
>> however. These normative statements need to be moved out of the 
>> example section (and probably into section 8). The remainder of this 
>> comment is non-blocking: I also find the "SHOULD" in this section to 
>> be highly perplexing. Can you explain the rationale behind requiring 
>> schema, but not requiring any description of what the schema *means?)
>>
>> BLOCKING: The example in section 8.1 includes the following:
>>
>>     xmlns="clue-info-extension-myVideoExtensions"
>>
>> I'm pretty certain that namespaces are required to be identified by 
>> URIs rather than arbitrary strings.
>>
>> Section 8.1: The final paragraph also has normative language in it, 
>> although it appears to be reiterating requirements from elsewhere in 
>> the document. I suggest lowercasing "MUST" in this paragraph.
>>
>> Section 8.1: The final paragraph mentions the use of <options> and 
>> <optionsResponse> to negotiate the extension. An example 
>> demonstrating this negotiation would be extremely useful.
>>
>> The schema in section 9 contains:
>>
>> <xs:import namespace="urn:ietf:params:xml:ns:clue-info"
>> schemaLocation="data-model-schema-17.xsd"/>
>>
>> If you do not intend to bake this "-17" into the document (and I 
>> can't imagine you do), please add an RFC editor note to change it to 
>> something else upon publication.
>>
>> Also, the schema in section 9 is (with rare exception) unindented. 
>> This makes it *very* hard to read. Please consider formatting it with 
>> conventional XML indentation. (This also applies to the schema 
>> excerpts earlier in the document.)
>>
>> Has there been any automated tool-based checking that the examples in 
>> section 10 to verify that they conform to the schema in section 9 
>> (and the schemata it imports)?
>>
>> Section 10.2: Please expand the acronym "MCCs" in the section title 
>> (keep in mind that this appears in the table of contents, where it 
>> needs to makes sense).
>>
>> Section 10 in general: While it does consume a lot of space, I don't 
>> think that defining six rather different message types and then 
>> showing only *one* type in the examples is very illustrative of the 
>> protocol. I would *STRONGLY* suggest adding at least a response 
>> message, and ideally the example section should contain at least one 
>> example for each of the six message types.
>>
>> Section 11, paragraph 4: Replace "Clue" with "CLUE."
>>
>> Section 12.1 has a strange double-double quote around the URN name, 
>> and section 12.3 repeats this for the MIME type.
>>
>> All subsections of section 12: please update all registrant contact 
>> information to point to the IESG (iesg@ietf.org) rather than the CLUE 
>> working group and one of the authors.
>>
>> Section 12.4.1: These descriptions will appear in an IANA registry, 
>> where the phrase "in this document" will have no context and be 
>> rather nonsensical. Please rephrase.
>>
>> Sections 13 through 23 should include an RFC editor note asking for 
>> removal before publication.
>>
>> /a
>>
>