[clue] AD Review: draft-ietf-clue-signaling-11

Adam Roach <adam@nostrum.com> Sat, 03 June 2017 00:14 UTC

From: Adam Roach <adam@nostrum.com>
To: clue@ietf.org
Cc: clue-chairs@ietf.org, draft-ietf-clue-signaling@tools.ietf.org
Message-ID: <0b69d2f1-11e1-8fd1-d4a1-2faacc0a8528@nostrum.com>
Date: Fri, 02 Jun 2017 19:14:51 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:52.0) Gecko/20100101 Thunderbird/52.1.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/clue/qTA6_Eb91XyVWob9cbYUDV1G0Dg>
Subject: [clue] AD Review: draft-ietf-clue-signaling-11
Precedence: list

CLUE working group --

I have completed my AD review for the CLUE signaling document. The 
document is generally in good shape, but I think we'll need another 
revision before putting it in front of the IETF for last call 
(especially due to the apparent incomplete removal of support for 
specifying multiple CLUE groups per session).

Most of my comments below are feedback that the document authors should 
treat as normal last call comments. The feedback that I consider to 
block progressing the document, in my role as AD, is explicitly marked 
with the prefix "BLOCKER", and these will need to be resolved in a new 
version of the document before progressing it further. Note that it is 
entirely possible that something I have marked "BLOCKER" may stem from 
an error on my part; so recognize that these are not demands for change, 
as much as a need to have things either fixed in the document or 
explained to me.

Title: The rule of thumb is that all but a small handful of well-known 
acronyms need to be expanded in titles and abstracts. I recognize that 
"CLUE" is a bit tortured, as acronyms go, but the title of this document 
is, broadly speaking, opaque. Please change it to something meaningful, 
such as "Session Signaling for Controlling Multiple Streams for 
Telepresence (CLUE)"

BLOCKER: General: It is clear from reading this version of the document 
that earlier versions contained the notion of multiple "a=group:CLUE" 
lines in a single SDP description. This version appears to have tried to 
remove all related text, but there are still enough mentions that talk 
about CLUE groups in a way that implies that there can be multiples so 
as to cause confusion. These need to be cleaned up. I call the specific 
instances out on a section-by-section basis below. I'm mentioning it up 
here since it's really only one blocker issue with a bunch of instances.

General: The _protocol_ document has a host of different terms for each 
kind of CLUE message (e.g., "ADVERTISEMENT," "ADV," and "advertisement" 
for the same operation). This document exacerbates the situation by 
introducing yet more variations, such as "Advertisement" and 
"Configure". Please coordinate with draft-ietf-clue-protocol to use a 
consistent set of names for these operations between the two documents.

Introduction: The convention I see in this document is to write defined 
terms with initial caps; please replace "encoding group" with "Encoding 
Group."

Section 3: Please add a reference to RFC3840 (e.g.: 'The "sip.clue" 
media feature tag [RFC3840] indicates...")

Section 4.2: "Presence of the data channel in a CLUE group..." implies 
there can be more than one group. Replace "a" with "the."

Section 4.3: ...its "mid" value MUST be included in a CLUE group' 
implies there can be more than one group. Replace "a" with "the."

Section 4.4.1: "...in a CLUE group as defined above." implies there can 
be more than one group. Replace "a" with "the."

Section 4.4.1: '..."m=" lines in the same CLUE group in the SDP 
message...' very strongly implies there can be more than one group. 
Rephrase, perhaps along the lines of "...CLUE-controlled "m=" lines in 
the SDP message..."

4.4.2: 'These "m=" lines are CLUE-controlled and hence MUST include 
their "mid" in the CLUE group corresponding to the CLUE group of the 
Encoding they wish to receive.' is getting pretty explicit about the 
presence of multiple CLUE groups. Fix.

4.5.2.1, first sentence: Replace "If the recipient is a CLUE-capable..." 
with "If the recipient of an offer is a CLUE-capable..."

BLOCKER: Section 4.5.2.2: For avoidance of doubt, this section should 
clearly indicate what the answer should do with CLUE-controlled lines 
that it has no intention of receiving (for sendonly) or sending (for 
recvonly). I believe the expectation here is to set the port to zero 
(rather than, e.g., setting the direction to inactive). The document 
should explicitly state this behavior: if implementations make different 
choices between port-zero and inactive and don't expect the other 
behavior, you can end up with incompatibilities.

Section 4.5.3.1, paragraph 2: My recollection is that telling 
implementors not to send media is frequently misinterpreted to mean that 
they don't have to send/receive RTCP either. This causes all kinds of 
grief. It will probably head off issues if this section is phrased more 
like "...MAY choose not to send RTP on the non-CLUE-controlled channels 
(although RTCP is still sent and received as normal) during the period..."

Section 4.5.4.1: 'Subsequent offer/answer exchanges MAY add additional 
"m=" lines...' -- this should probably also mention "and activate 
inactive ones."

Section 4.5.4.1: 'Subsequent offer/answer exchanges MAY also deactivate 
"m=" lines for CLUE-controlled media.' -- again, the interpretation of 
"deactivate" may be different between implementors. Please be clear 
about whether this means "a=inactive", port=0, or both.

Section 4.5.4.1: The final paragraph talks about "deactivating" non-CLUE 
media. Again, this should be explicit about what is meant.

Section 4.5.4.2: 'If, in an ongoing non-CLUE call, an SDP offer/answer 
exchange completes with both sides having included a data channel "m=" 
line in their SDP and with the "mid" for that channel in corresponding 
CLUE groups..." implies that there can be more than one CLUE group. Fix.

Section 4.5.4.3: "...include the data channel in a matching CLUE 
group..." implies there can be more than one group. Replace "a matching" 
with "the."

Section 4.5.4.3: "Any active "m=" lines still included in a CLUE 
group..." implies there can be more than one group. Replace "a" with "the."

Section 4.5.4.3: "Note that this is distinct from cases where the CLUE 
protocol negotiation fails, or an error occurs in the CLUE protocol; see 
[I-D.ietf-clue-protocol] for details of media and state preservation in 
this circumstance." -- I carefully scrubbed the CLUE protocol document 
to try to determine what this is referring to. Please change it to "see 
[I-D.ietf-clue-protocol] section X.Y.Z", but replacing "X.Y.Z" with the 
section that provides the details you allude to.


BLOCKER: Compare the normative statements in paragraph 2 of Section 5.3:

    Generally, implementations that receive messages for which they have
    incomplete information SHOULD wait until they have the corresponding
    information they lack before sending messages to make changes related
    to that information.  For example, an answerer that receives a new
    SDP offer with three new "a=sendonly" CLUE "m=" lines for which it
    has received no CLUE Advertisement providing the corresponding
    capture information SHOULD include corresponding "a=inactive" lines
    in its answer, and SHOULD make a new SDP offer with "a=recvonly" when
    and if a new Advertisement arrives with Captures relevant to those
    Encodings.

With the normative statements in section 4.5.2.2:

    If the initial offer contained "a=recvonly" CLUE-controlled media
    lines the recipient SHOULD include corresponding "a=sendonly" CLUE-
    controlled media lines for accepted Encodings
    ...
    If the initial offer contained "a=sendonly" CLUE-controlled media
    lines the recipient MAY include corresponding "a=recvonly" CLUE-
    controlled media lines

5.3 says "SHOULD set a=inactive" in the exact same circumstances 4.5.2.2 
says "SHOULD set a=sendonly". Please pick one expected behavior and make 
sure both sections agree. Ideally, you would refactor this so that the 
normative statement is made in only one location.


Section 7 appears to be oddly silent on the use of RTCP MUX. I suspect 
that the intention is that CLUE will generally multiplex RTCP when using 
BUNDLE? I would expect this to be called out.

Section 7 offers that the use of BUNDLE has the advantage of reducing 
the number of ICE candidates that need to be collected. This is true 
only in the case that some m= section are marked as bundle-only; this, 
in turn, implies that a bundled offer is going to contain one or more m= 
sections that will fail if the remote endpoint does not implement 
BUNDLE. This interacts particularly badly with the text in section 4.5.1 
that allows, under certain circumstances, that 'the initial offer MAY 
contain "m=" lines for CLUE-controlled media.', since you can end up 
with the whole CLUE session falling apart if the bundle-only lines are 
not chosen carefully. This is a relatively complicated issue that I 
wouldn't expect impementors to get correct without some guidance on the 
use of bundle-only, and in particular its interaction with 
CLUE-controlled lines. Please add text to Section 7 that discusses these 
issues.

Section 8: Please insert a page break before the ladder diagram so that 
the entity names don't appear on a page by themselves.

Section 8: "In this case Bob is the Channel Initiator..." this isn't 
clear (and, in fact, it's counterintuitive to me) -- perhaps there 
should be some text indicating *why* Bob is the Channel Initiator.

General, but surfaced in section 8: The procedures described in this 
document virtually guarantee that every CLUE call that is established 
will result in glare (response code 491) behavior. This might cause the 
operations folks some heartburn, as it means that their error counts 
will spike once CLUE is deployed. Further, without fairly advanced 
analysis of the callflow, this will make it impossible to distinguish 
"expected" CLUE-induced 491s from the oddball actual glare conditions 
usually signaled by 491. Has any consideration been given to avoiding 
this situation (e.g., by having the called party wait on the order of 
one second before attempting to negotiate its encodings)?


Section 8 contains the following text:

    Bob also sends his SDP answer as part of SIP 200 OK 2.  Alongside his
    original audio, video and CLUE "m=" lines he includes two active
    recvonly "m= "lines and a zeroed "m=" line for the third.

This should probably be qualified to indicate that these are the same 
m-sections as were present in the offer (as opposed to creating new 
sections).


Section 8: Replace "Having received this Alice..." with "Having received 
this offer, Alice..."

Section 9: "From the lack of the data channel and grouping framework..." 
-- it should be noted that implementations that end up in communication 
with normal non-CLUE WebRTC implementations might get a datachannel but 
no CLUE group. The quoted text implies that the presence or absence of a 
datachannel can be used to determine CLUE support, which might cause 
implementors to rely on that exclusively. Please change the text to 
indicate that the CLUE group should be used as the sole determinant of 
CLUE support (at least, as far as SDP signaling is concerned).

Section 10: It is rather unusual to include authors in the 
acknowledgements section. For each of Rob Hansen, Paul Kyzivat, and 
Christian Groves, I suggest removing the individual's name from either 
the Acknowledgements section or from the authors list.

Section 11.2: "This specification registers a new media feature tag in 
the SIP [RFC3264] tree..." This should cite RFC3261 for SIP rather than 
RFC3264.

Section 11.2: Please indicate "iesg@ietf.org" as the contact for this 
registration.


BLOCKING: Section 12 indicates that no security mechanisms are mandatory 
to use "due to the issues addressed in [RFC7202]." This vastly 
misconstrues the intent and text of RFC7202, which (roughly speaking) 
says "we don't mandate security for RTP in general because it is used in 
a vast and varied array of applications." CLUE is a very specific, very 
narrow application of RTP that does not suffer from the generality that 
makes a one-size-fits-all solution for RTP infeasible. In fact, RFC7202 
is quite clear that its guidance applies exclusively to RTP, and not to 
protocols that *use* RTP: "Documents that define an interoperable class 
of applications using RTP are subject to [RFC3365], and thus need to 
specify MTI security mechanisms." If there are specific arguments put 
forth in RFC7202 that the working group believes also apply to this 
document, please reiterate them here. The current citation to RFC7202 is 
problematic, though, since it says the opposite of what this security 
section implies.

If I understand it correctly, though, the argument to put forth here is 
that implementations MAY choose not to secure legacy 
(non-CLUE-controlled) media lines because SIP was specified prior to 
RFC3365, and was therefore not bound by its requirements; and, as a 
consequence, many existing SIP endpoints are incapable of using 
DTLS-SRTP. This rationale is orthogonal to the guidance in RFC7202, and 
should not cite it.

The use of "DTLS" to refer to "DTLS-SRTP" is also confusing, as they are 
different things.

I propose changing the paragraph as follows:

    This attack can be prevented by ensuring that the media recipient intends
    to receive the media packets.  As such, all CLUE-capable devices MUST
    support key negotiation and receiver intent assurance via DTLS-SRTP
    [RFC5763] on CLUE-controlled RTP "m=" lines.  As specified in
    [I-D.ietf-clue-framework], all CLUE-controlled RTP streams must be
    secured and implemented using mechanisms such as SRTP [RFC3711].  CLUE
    implementations MAY choose not to require the use of SRTP to secure legacy
    (non-CLUE-controlled) media for backwards compatibility with older
    SIP clients that are incapable of supporting SRTP.

Section 12: "To prevent this, SIP signaling SHOULD always be encrypted 
using TLS..." While I agree with the statement, I think it's a bit 
outside the purview of this document to specify this. Suggest: "...SIP 
signaling used to set up CLUE sessions SHOULD..."

/a

Re: [clue] AD Review: draft-ietf-clue-signaling-11 Adam Roach
[clue] AD Review: draft-ietf-clue-signaling-11 Adam Roach
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Robert Hansen
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Adam Roach
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Rob Hansen (rohanse2)
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Roni Even
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Adam Roach
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Adam Roach
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Paul Kyzivat
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Adam Roach
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Adam Roach
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Rob Hansen (rohanse2)
Re: [clue] AD Review: draft-ietf-clue-signaling-11 Roni Even