Re: [clue] AD Review: draft-ietf-clue-signaling-11

Adam Roach <adam@nostrum.com> Tue, 26 September 2017 15:28 UTC

From: Adam Roach <adam@nostrum.com>
To: "Rob Hansen (rohanse2)" <rohanse2@cisco.com>, "clue@ietf.org" <clue@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/clue/2hI68tHew3SSgHblM8oG2K1WiSQ>

Rob --

Just a quick ping to check when we might expect to see a revised version 
of the document.

/a

On 8/24/17 7:29 PM, Adam Roach wrote:
> Thanks! Responses inline.
>
> On 8/20/17 21:40, Rob Hansen (rohanse2) wrote:
>> Section 4.5.4.3: "Note that this is distinct from cases where the 
>> CLUE protocol negotiation fails, or an error occurs in the CLUE 
>> protocol; see [I-D.ietf-clue-protocol] for details of media and state 
>> preservation in this circumstance." -- I carefully scrubbed the CLUE 
>> protocol document to try to determine what this is referring to. 
>> Please change it to "see [I-D.ietf-clue-protocol] section X.Y.Z", but 
>> replacing "X.Y.Z" with the section that provides the details you 
>> allude to.
>>
>> [Rob] I believe when I wrote this the plan was that call preservation 
>> actions in the event of a protocol error/failure would be addressed 
>> as part of the protocol document, but that this section had not yet 
>> been written, and that remains the case. Simon, is this something you 
>> have planned, or can you point me at the relevant section?
>
> Simon is on vacation for (I think) at least another week or so; but I 
> agree that this may need some coordination. See also my earlier 
> response to Roni.
>
>> BLOCKER: Compare the normative statements in paragraph 2 of Section 5.3:
>>
>>      Generally, implementations that receive messages for which they have
>>      incomplete information SHOULD wait until they have the corresponding
>>      information they lack before sending messages to make changes related
>>      to that information.  For example, an answerer that receives a new
>>      SDP offer with three new "a=sendonly" CLUE "m=" lines for which it
>>      has received no CLUE Advertisement providing the corresponding
>>      capture information SHOULD include corresponding "a=inactive" lines
>>      in its answer, and SHOULD make a new SDP offer with "a=recvonly" when
>>      and if a new Advertisement arrives with Captures relevant to those
>>      Encodings.
>>
>> With the normative statements in section 4.5.2.2:
>>
>>      If the initial offer contained "a=recvonly" CLUE-controlled media
>>      lines the recipient SHOULD include corresponding "a=sendonly" CLUE-
>>      controlled media lines for accepted Encodings
>>      ...
>>      If the initial offer contained "a=sendonly" CLUE-controlled media
>>      lines the recipient MAY include corresponding "a=recvonly" CLUE-
>>      controlled media lines
>>
>> 5.3 says "SHOULD set a=inactive" in the exact same circumstances 
>> 4.5.2.2 says "SHOULD set a=sendonly". Please pick one expected 
>> behavior and make sure both sections agree. Ideally, you would 
>> refactor this so that the normative statement is made in only one 
>> location.
>>
>> [Rob] I don't think these sections are in conflict. The quoted 
>> paragraph from section 5.3 refers to cases where the SDP offer 
>> includes "a=sendonly" lines, whereas the "SHOULD set a=sendonly" text 
>> in 4.5.2.2 is about the SDP *answer* including "a=sendonly" lines in 
>> response to the offerer's "a=recvonly" lines. It's the paragraph 
>> below that corresponds to the quoted 5.3 paragraph: it says that the 
>> SDP *answer* MAY include "a=recvonly" in its response or MAY wait, 
>> and then references section 5.3, which contains the quoted 
>> recommendation that implementations should wait and send a 
>> subsequent SDP. We ended up with this approach because, even though 
>> in most cases implementations should wait until they receive the 
>> information about the encodings and their contents via the CLUE 
>> channel, there are some valid use-cases where implementations will 
>> know this up-front and hence can avoid the need for multiple SDP 
>> exchanges.
>
> Ah, okay. I see what you're getting at here. I think the problem, 
> then, is that the language in 5.3 isn't really normative per se (or, 
> rather, it shouldn't be normative), as much as it is illustrative. 
> (This is reinforced by the phrasing "For example...") I would propose:
>
>     For example, an answerer that receives a new
>     SDP offer with three new "a=sendonly" CLUE "m=" lines for which it
>     has received no CLUE Advertisement providing the corresponding
>     capture information would typically include corresponding "a=inactive"
>     lines in its answer, and make a new SDP offer with "a=recvonly" only
>     when and if a new Advertisement arrives with Captures relevant to
>     those Encodings.
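
To make the behaviour discussed above concrete, here is a minimal sketch of how an answerer might pick the direction attribute for a single CLUE-controlled "m=" line. It is illustrative only and not text from the draft: the function name `answer_direction` and the `have_advertisement` flag are hypothetical, and the sketch covers only the typical case, not the "MAY include a=recvonly up-front" option from 4.5.2.2.

```python
def answer_direction(offer_direction: str, have_advertisement: bool) -> str:
    """Pick the direction for one CLUE-controlled "m=" line in an SDP answer.

    Hypothetical helper: models only the typical behaviour discussed in
    the thread, not every option the draft permits.
    """
    if offer_direction == "sendonly":
        # Without a CLUE Advertisement describing the Captures, the answerer
        # typically answers "inactive" and re-offers "recvonly" later.
        return "recvonly" if have_advertisement else "inactive"
    if offer_direction == "recvonly":
        # 4.5.2.2: the recipient SHOULD answer "sendonly" for accepted Encodings.
        return "sendonly"
    if offer_direction == "inactive":
        return "inactive"
    raise ValueError(f"unexpected direction: {offer_direction!r}")
```

With no Advertisement received, `answer_direction("sendonly", False)` gives "inactive"; the later re-offer with "a=recvonly" would be triggered separately once an Advertisement arrives.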
>
>
>> General, but surfaced in section 8: The procedures described in this 
>> document virtually guarantee that every CLUE call that is established 
>> will result in glare (response code 491) behavior. This might cause 
>> the operations folks some heartburn, as it means that their error 
>> counts will spike once CLUE is deployed. Further, without fairly 
>> advanced analysis of the callflow, this will make it impossible to 
>> distinguish "expected" CLUE-induced 491s from the oddball actual 
>> glare conditions usually signaled by 491. Has any consideration been 
>> given to avoiding this situation (e.g., by having the called party 
>> wait on the order of one second before attempting to negotiate its 
>> encodings)?
>>
>> [Rob] I definitely agree that glare is much more likely at the start 
>> of a CLUE call. There was quite a bit of discussion in the group on 
>> the pros and cons of introducing an asymmetry into the call messaging 
>> to avoid (or reduce the frequency of) glare, and how best to do so, 
>> but the final conclusion was not to do so and to rely on SIP's 
>> mechanisms to resolve it.
>
> Sure. What I'd like to have positive confirmation on is: did the 
> working group specifically consider the operational aspects of this 
> decision? I agree that it works from a protocol perspective. I'm just 
> worried that it will give operators unnecessary difficulty.
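
For reference, the SIP glare-resolution mechanism mentioned above is the retry timer that RFC 3261 section 14.1 defines for a UAC that receives a 491 to a re-INVITE. A minimal sketch follows; the function name `glare_retry_delay` and its parameters are hypothetical, and only the two timer intervals come from RFC 3261.

```python
import random
from typing import Optional


def glare_retry_delay(owns_call_id: bool,
                      rng: Optional[random.Random] = None) -> float:
    """Return the wait, in seconds, before retrying a re-INVITE after a 491.

    Hypothetical helper. The intervals follow RFC 3261 section 14.1 and
    are chosen in units of 10 ms.
    """
    rng = rng or random.Random()
    if owns_call_id:
        # The UAC generated the Call-ID: wait between 2.1 and 4 seconds.
        return rng.randint(210, 400) / 100
    # Otherwise: wait between 0 and 2 seconds.
    return rng.randint(0, 200) / 100
```

Because the two intervals do not overlap, the side that did not generate the Call-ID always retries first, which is what lets the glare resolve without further collisions.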
>
>> Section 10: It is rather unusual to include authors in the 
>> acknowledgements section. For each of Rob Hansen, Paul Kyzivat, and 
>> Christian Groves, I suggest removing the individual's name from 
>> either the Acknowledgements section or from the authors list.
>>
>> [Rob] The authors list hasn't really been updated since the initial 
>> stages. Looking at other docs like the framework one I can see 
>> they've been revised a fair bit. For now I've left the authors as-is 
>> and removed the duplicate names from the acknowledgements, but will 
>> reach out to Paul and Roni for guidance here.
>
> Thanks. Either resolution makes sense to me, and I suspect that the 
> current author list is correct.
>
>> Section 8: "In this case Bob is the Channel Initiator..." this isn't 
>> clear (and, in fact, it's counterintuitive to me) -- perhaps there 
>> should be some text indicating *why* Bob is the Channel Initiator.
>>
>> [Rob] I've made explicit that, when the SCTP over DTLS channel is 
>> negotiated, Bob ends up the client and hence the Channel Initiator. 
>> However, when I went to double-check that this was how the initiator 
>> role was assigned, I couldn't actually find anything in the protocol 
>> or datachannel document that defines who ends up with the Initiator 
>> role. That definitely seems like something that we need to fix... 
>> (unless I'm just failing to find it). Simon, is this something 
>> you're planning to address?
>
> The reason this seems counter-intuitive to me is that it is backwards 
> from how RTCWEB (JSEP) works in the general case. To be clear, for 
> datachannels, the TLS client is selected by the "a=setup" attribute; 
> and JSEP implementations are required (MUST) to put "a=setup:actpass" 
> in their offers, and expected (SHOULD) to put "a=setup:active" in 
> their answers. The rationale here is: the way ICE ends up working, the 
> answerer will have the first opportunity to send a packet, so this 
> reduces overall setup time by ~1/2-RTT.
>
> Of course, CLUE is free to do this however it wants [1]; but doing it 
> opposite from RTCWEB is likely to confuse people beyond just me. I 
> think you'd also need a reasonably good rationale, as a naïve analysis 
> of CLUE suggests that doing it the way you currently have in your 
> examples will generally impose an additional 1/2-RTT delay on 
> datachannel establishment. But I freely admit that I haven't spent a 
> lot of time thinking about the low-level details, and could be 
> overlooking something.
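
The role selection described above can be sketched as a small function. This is an illustration under stated assumptions, not text from any of the drafts: the name `dtls_client` is hypothetical, and the logic follows the usual reading of "a=setup" (RFC 4145), in which the endpoint that ends up "active" initiates the connection and therefore acts as the DTLS client.

```python
def dtls_client(offer_setup: str, answer_setup: str) -> str:
    """Return which side ("offerer" or "answerer") acts as the DTLS client.

    Hypothetical helper: the "active" endpoint initiates the DTLS
    handshake; "actpass" in the offer lets the answerer choose.
    """
    if offer_setup == "actpass":
        if answer_setup == "active":
            return "answerer"   # the JSEP-style outcome described above
        if answer_setup == "passive":
            return "offerer"
    elif offer_setup == "active" and answer_setup == "passive":
        return "offerer"
    elif offer_setup == "passive" and answer_setup == "active":
        return "answerer"
    raise ValueError(
        f"invalid a=setup combination: {offer_setup!r}/{answer_setup!r}"
    )
```

With the JSEP-style values ("a=setup:actpass" in the offer, "a=setup:active" in the answer), the answerer is the DTLS client; an answer of "a=setup:passive" makes the offerer the client instead.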
>
> /a
>
> ____
> [1] Subject to the constraints in 
> <https://tools.ietf.org/html/draft-ietf-mmusic-sctp-sdp-11>, sections 
> 10 - 11