Re: [clue] AD Review: draft-ietf-clue-signaling-11

Adam Roach <adam@nostrum.com> Mon, 10 July 2017 20:34 UTC

Return-Path: <adam@nostrum.com>
X-Original-To: clue@ietfa.amsl.com
Delivered-To: clue@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2CF9212F26D; Mon, 10 Jul 2017 13:34:03 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.881
X-Spam-Level:
X-Spam-Status: No, score=-1.881 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RP_MATCHES_RCVD=-0.001, T_SPF_HELO_PERMERROR=0.01, T_SPF_PERMERROR=0.01] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Z4b92a0XZMfz; Mon, 10 Jul 2017 13:34:00 -0700 (PDT)
Received: from nostrum.com (raven-v6.nostrum.com [IPv6:2001:470:d:1130::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id D345412F287; Mon, 10 Jul 2017 13:33:59 -0700 (PDT)
Received: from Orochi.local (99-152-146-228.lightspeed.dllstx.sbcglobal.net [99.152.146.228]) (authenticated bits=0) by nostrum.com (8.15.2/8.15.2) with ESMTPSA id v6AKXnmD091417 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NO); Mon, 10 Jul 2017 15:33:52 -0500 (CDT) (envelope-from adam@nostrum.com)
X-Authentication-Warning: raven.nostrum.com: Host 99-152-146-228.lightspeed.dllstx.sbcglobal.net [99.152.146.228] claimed to be Orochi.local
To: Robert Hansen <rohanse2@cisco.com>, clue@ietf.org
Cc: clue-chairs@ietf.org, draft-ietf-clue-signaling@tools.ietf.org
References: <0b69d2f1-11e1-8fd1-d4a1-2faacc0a8528@nostrum.com> <5168f94a-811e-897a-3c83-f44b17a11568@cisco.com>
From: Adam Roach <adam@nostrum.com>
Message-ID: <f42b90de-824c-a742-a8c7-2f14841279d1@nostrum.com>
Date: Mon, 10 Jul 2017 15:33:44 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:52.0) Gecko/20100101 Thunderbird/52.2.1
MIME-Version: 1.0
In-Reply-To: <5168f94a-811e-897a-3c83-f44b17a11568@cisco.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/clue/Xyq7BeLLW3mF_nz1n2bJ_OLW42g>
Subject: Re: [clue] AD Review: draft-ietf-clue-signaling-11
X-BeenThere: clue@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: CLUE - ControLling mUltiple streams for TElepresence <clue.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/clue>, <mailto:clue-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/clue/>
List-Post: <mailto:clue@ietf.org>
List-Help: <mailto:clue-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/clue>, <mailto:clue-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 10 Jul 2017 20:34:03 -0000

For my own planning purposes, when might we expect to see a revised version?

Thanks!

/a

On 6/7/17 12:22, Robert Hansen wrote:
> Hi Adam,
>
> Thank you for the very detailed review - I'll find the time to review 
> it, respond and post a revised draft ASAP.
>
> Rob
>
> On 03/06/17 01:14, Adam Roach wrote:
>> CLUE working group --
>>
>> I have completed my AD review for the CLUE signaling document. The 
>> document is generally in good shape, but I think we'll need another 
>> revision before putting it in front of the IETF for last call 
>> (especially due to the apparent incomplete removal of support for 
>> specifying multiple CLUE groups per session).
>>
>> Most of my comments below are feedback that the document authors 
>> should treat as normal last call comments. The feedback that I 
>> consider to block progressing the document, in my role as AD, is 
>> explicitly marked with the prefix "BLOCKER", and these will need to 
>> be resolved in a new version of the document before progressing it 
>> further. Note that it is entirely possible that something I have 
>> marked "BLOCKER" may stem from an error on my part; so recognize that 
>> these are not demands for change, as much as a need to have things 
>> either fixed in the document or explained to me.
>>
>> Title: The rule of thumb is that all but a small handful of 
>> well-known acronyms need to be expanded in titles and abstracts. I 
>> recognize that "CLUE" is a bit tortured, as acronyms go, but the 
>> title of this document is, broadly speaking, opaque. Please change it 
>> to something meaningful, such as "Session Signaling for Controlling 
>> Multiple Streams for Telepresence (CLUE)"
>>
>> BLOCKER: General: It is clear from reading this version of the 
>> document that earlier versions contained the notion of multiple 
>> "a=group:CLUE" lines in a single SDP description. This version 
>> appears to have tried to remove all related text, but there are still 
>> enough mentions that talk about CLUE groups in a way that implies 
>> that there can be multiples so as to cause confusion. These need to 
>> be cleaned up. I call the specific instances out on a 
>> section-by-section basis below. I'm mentioning it up here since it's 
>> really only one blocker issue with a bunch of instances.
>>
>> General: The _protocol_ document has a host of different terms for 
>> each kind of CLUE message (e.g., "ADVERTISEMENT," "ADV," and 
>> "advertisement" for the same operation). This document exacerbates 
>> the situation by introducing yet more variations, such as 
>> "Advertisement" and "Configure". Please coordinate with 
>> draft-ietf-clue-protocol to use a consistent set of names for these 
>> operations between the two documents.
>>
>> Introduction: The convention I see in this document is to write 
>> defined terms with initial caps; please replace "encoding group" with 
>> "Encoding Group."
>>
>> Section 3: Please add a reference to RFC3840 (e.g.: 'The "sip.clue" 
>> media feature tag [RFC3840] indicates...")
>>
>> Section 4.2: "Presence of the data channel in a CLUE group..." 
>> implies there can be more than one group. Replace "a" with "the."
>>
>> Section 4.3: ...its "mid" value MUST be included in a CLUE group' 
>> implies there can be more than one group. Replace "a" with "the."
>>
>> Section 4.4.1: "...in a CLUE group as defined above." implies there 
>> can be more than one group. Replace "a" with "the."
>>
>> Section 4.4.1: '..."m=" lines in the same CLUE group in the SDP 
>> message...' very strongly implies there can be more than one group. 
>> Rephrase, perhaps along the lines of "...CLUE-controlled "m=" lines 
>> in the SDP message..."
>>
>> 4.4.2: 'These "m=" lines are CLUE-controlled and hence MUST include 
>> their "mid" in the CLUE group corresponding to the CLUE group of the 
>> Encoding they wish to receive.' is getting pretty explicit about the 
>> presence of multiple CLUE groups. Fix.
>>
>> 4.5.2.1, first sentence: Replace "If the recipient is a 
>> CLUE-capable..." with "If the recipient of an offer is a 
>> CLUE-capable..."
>>
>> BLOCKER: Section 4.5.2.2: For avoidance of doubt, this section should 
>> clearly indicate what the answer should do with CLUE-controlled lines 
>> that it has no intention of receiving (for sendonly) or sending (for 
>> recvonly). I believe the expectation here is to set the port to zero 
>> (rather than, e.g., setting the direction to inactive). The document 
>> should explicitly state this behavior: if implementations make 
>> different choices between port-zero and inactive and don't expect the 
>> other behavior, you can end up with incompatibilities.
>>
>> Section 4.5.3.1, paragraph 2: My recollection is that telling 
>> implementors not to send media is frequently misinterpreted to mean 
>> that they don't have to send/receive RTCP either. This causes all 
>> kinds of grief. It will probably head off issues if this section is 
>> phrased more like "...MAY choose not to send RTP on the 
>> non-CLUE-controlled channels (although RTCP is still sent and 
>> received as normal) during the period..."
>>
>> Section 4.5.4.1: 'Subsequent offer/answer exchanges MAY add 
>> additional "m=" lines...' -- this should probably also mention "and 
>> activate inactive ones."
>>
>> Section 4.5.4.1: 'Subsequent offer/answer exchanges MAY also 
>> deactivate "m=" lines for CLUE-controlled media.' -- again, the 
>> interpretation of "deactivate" may be different between implementors. 
>> Please be clear about whether this means "a=inactive", port=0, or both.
>>
>> Section 4.5.4.1: The final paragraph talks about "deactivating" 
>> non-CLUE media. Again, this should be explicit about what is meant.
>>
>> Section 4.5.4.2: 'If, in an ongoing non-CLUE call, an SDP 
>> offer/answer exchange completes with both sides having included a 
>> data channel "m=" line in their SDP and with the "mid" for that 
>> channel in corresponding CLUE groups..." implies that there can be 
>> more than one CLUE group. Fix.
>>
>> Section 4.5.4.3: "...include the data channel in a matching CLUE 
>> group..." implies there can be more than one group. Replace "a 
>> matching" with "the."
>>
>> Section 4.5.4.3: "Any active "m=" lines still included in a CLUE 
>> group..." implies there can be more than one group. Replace "a" with 
>> "the."
>>
>> Section 4.5.4.3: "Note that this is distinct from cases where the 
>> CLUE protocol negotiation fails, or an error occurs in the CLUE 
>> protocol; see [I-D.ietf-clue-protocol] for details of media and state 
>> preservation in this circumstance." -- I carefully scrubbed the CLUE 
>> protocol document to try to determine what this is referring to. 
>> Please change it to "see [I-D.ietf-clue-protocol] section X.Y.Z", but 
>> replacing "X.Y.Z" with the section that provides the details you 
>> allude to.
>>
>>
>> BLOCKER: Compare the normative statements in paragraph 2 of Section 5.3:
>>
>>    Generally, implementations that receive messages for which they have
>>    incomplete information SHOULD wait until they have the corresponding
>>    information they lack before sending messages to make changes related
>>    to that information.  For example, an answerer that receives a new
>>    SDP offer with three new "a=sendonly" CLUE "m=" lines for which it
>>    has received no CLUE Advertisement providing the corresponding
>>    capture information SHOULD include corresponding "a=inactive" lines
>>    in its answer, and SHOULD make a new SDP offer with "a=recvonly" when
>>    and if a new Advertisement arrives with Captures relevant to those
>>    Encodings.
>>
>> With the normative statements in section 4.5.2.2:
>>
>>    If the initial offer contained "a=recvonly" CLUE-controlled media
>>    lines the recipient SHOULD include corresponding "a=sendonly" CLUE-
>>    controlled media lines for accepted Encodings
>>    ...
>>    If the initial offer contained "a=sendonly" CLUE-controlled media
>>    lines the recipient MAY include corresponding "a=recvonly" CLUE-
>>    controlled media lines
>>
>> 5.3 says "SHOULD set a=inactive" in the exact same circumstances 
>> 4.5.2.2 says "SHOULD set a=sendonly". Please pick one expected 
>> behavior and make sure both sections agree. Ideally, you would 
>> refactor this so that the normative statement is made in only one 
>> location.
>>
>>
>> Section 7 appears to be oddly silent on the use of RTCP MUX. I 
>> suspect that the intention is that CLUE will generally multiplex RTCP 
>> when using BUNDLE? I would expect this to be called out.
>>
>> Section 7 offers that the use of BUNDLE has the advantage of reducing 
>> the number of ICE candidates that need to be collected. This is true 
>> only in the case that some m= section are marked as bundle-only; 
>> this, in turn, implies that a bundled offer is going to contain one 
>> or more m= sections that will fail if the remote endpoint does not 
>> implement BUNDLE. This interacts particularly badly with the text in 
>> section 4.5.1 that allows, under certain circumstances, that 'the 
>> initial offer MAY contain "m=" lines for CLUE-controlled media.', 
>> since you can end up with the whole CLUE session falling apart if the 
>> bundle-only lines are not chosen carefully. This is a relatively 
>> complicated issue that I wouldn't expect impementors to get correct 
>> without some guidance on the use of bundle-only, and in particular 
>> its interaction with CLUE-controlled lines. Please add text to 
>> Section 7 that discusses these issues.
>>
>> Section 8: Please insert a page break before the ladder diagram so 
>> that the entity names don't appear on a page by themselves.
>>
>> Section 8: "In this case Bob is the Channel Initiator..." this isn't 
>> clear (and, in fact, it's counterintuitive to me) -- perhaps there 
>> should be some text indicating *why* Bob is the Channel Initiator.
>>
>> General, but surfaced in section 8: The procedures described in this 
>> document virtually guarantee that every CLUE call that is established 
>> will result in glare (response code 491) behavior. This might cause 
>> the operations folks some heartburn, as it means that their error 
>> counts will spike once CLUE is deployed. Further, without fairly 
>> advanced analysis of the callflow, this will make it impossible to 
>> distinguish "expected" CLUE-induced 491s from the oddball actual 
>> glare conditions usually signaled by 491. Has any consideration been 
>> given to avoiding this situation (e.g., by having the called party 
>> wait on the order of one second before attempting to negotiate its 
>> encodings)?
>>
>>
>> Section 8 contains the following text:
>>
>>    Bob also sends his SDP answer as part of SIP 200 OK 2. Alongside his
>>    original audio, video and CLUE "m=" lines he includes two active
>>    recvonly "m= "lines and a zeroed "m=" line for the third.
>>
>> This should probably be qualified to indicate that these are the same 
>> m-sections as were present in the offer (as opposed to creating new 
>> sections).
>>
>>
>> Section 8: Replace "Having received this Alice..." with "Having 
>> received this offer, Alice..."
>>
>> Section 9: "From the lack of the data channel and grouping 
>> framework..." -- it should be noted that implementations that end up 
>> in communication with normal non-CLUE WebRTC implementations might 
>> get a datachannel but no CLUE group. The quoted text implies that the 
>> presence or absence of a datachannel can be used to determine CLUE 
>> support, which might cause implementors to rely on that exclusively. 
>> Please change the text to indicate that the CLUE group should be used 
>> as the sole determinant of CLUE support (at least, as far as SDP 
>> signaling is concerned).
>>
>> Section 10: It is rather unusual to include authors in the 
>> acknowledgements section. For each of Rob Hansen, Paul Kyzivat, and 
>> Christian Groves, I suggest removing the individual's name from 
>> either the Acknowledgements section or from the authors list.
>>
>> Section 11.2: "This specification registers a new media feature tag 
>> in the SIP [RFC3264] tree..." This should cite RFC3261 for SIP rather 
>> than RFC3264.
>>
>> Section 11.2: Please indicate "iesg@ietf.org" as the contact for this 
>> registration.
>>
>>
>> BLOCKING: Section 12 indicates that no security mechanisms are 
>> mandatory to use "due to the issues addressed in [RFC7202]." This 
>> vastly misconstrues the intent and text of RFC7202, which (roughly 
>> speaking) says "we don't mandate security for RTP in general because 
>> it is used in a vast and varied array of applications." CLUE is a 
>> very specific, very narrow application of RTP that does not suffer 
>> from the generality that makes a one-size-fits-all solution for RTP 
>> infeasible. In fact, RFC7202 is quite clear that its guidance applies 
>> exclusively to RTP, and not to protocols that *use* RTP: "Documents 
>> that define an interoperable class of applications using RTP are 
>> subject to [RFC3365], and thus need to specify MTI security 
>> mechanisms." If there are specific arguments put forth in RFC7202 
>> that the working group believes also apply to this document, please 
>> reiterate them here. The current citation to RFC7202 is problematic, 
>> though, since it says the opposite of what this security section 
>> implies.
>>
>> If I understand it correctly, though, the argument to put forth here 
>> is that implementations MAY choose not to secure legacy 
>> (non-CLUE-controlled) media lines because SIP was specified prior to 
>> RFC3365, and was therefore not bound by its requirements; and, as a 
>> consequence, many existing SIP endpoints are incapable of using 
>> DTLS-SRTP. This rationale is orthogonal to the guidance in RFC7202, 
>> and should not cite it.
>>
>> The use of "DTLS" to refer to "DTLS-SRTP" is also confusing, as they 
>> are different things.
>>
>> I propose changing the paragraph as follows:
>>
>>    This attack can be prevented by ensuring that the media recipient 
>> intends
>>    to receive the media packets.  As such, all CLUE-capable devices MUST
>>    support key negotiation and receiver intent assurance via DTLS-SRTP
>>    [RFC5763] on CLUE-controlled RTP "m=" lines.  As specified in
>>    [I-D.ietf-clue-framework], all CLUE-controlled RTP streams must be
>>    secured and implemented using mechanisms such as SRTP [RFC3711].  
>> CLUE
>>    implementations MAY choose not to require the use of SRTP to 
>> secure legacy
>>    (non-CLUE-controlled) media for backwards compatibility with older
>>    SIP clients that are incapable of supporting SRTP.
>>
>> Section 12: "To prevent this, SIP signaling SHOULD always be 
>> encrypted using TLS..." While I agree with the statement, I think 
>> it's a bit outside the purview of this document to specify this. 
>> Suggest: "...SIP signaling used to set up CLUE sessions SHOULD..."
>>
>> /a
>>
>