Re: [MMUSIC] Continuous Nomination (Was: Faster ICE by role reversal?)

Dan Wing <dwing@cisco.com> Sat, 08 November 2014 00:35 UTC

From: Dan Wing <dwing@cisco.com>
In-Reply-To: <545D4A84.1000608@jitsi.org>
Date: Fri, 07 Nov 2014 16:35:29 -0800
To: Emil Ivov <emcho@jitsi.org>
Archived-At: http://mailarchive.ietf.org/arch/msg/mmusic/guHG67zNcBQszc2mpKzUdihGZBI
Cc: Jonathan Lennox <jonathan@vidyo.com>, Ari Keränen <ari.keranen@ericsson.com>, mmusic <mmusic@ietf.org>

On Nov 7, 2014, at 2:41 PM, Emil Ivov <emcho@jitsi.org> wrote:

> Hey all,
> 
> In the context of the ongoing discussion on two potential ICE improvements, I wanted to spin off this thread on the idea of continuous nomination.
> 
> So first of all, as I already mentioned in Toronto, I am very much in favour of a mechanism that would allow ICE processing to yield more than a single nominated pair. Back then the discussion was about MPRTP but there would likely be other use cases as well.
> 
> In that context, it seems perfectly valid to me that nomination among all validated pairs can occur multiple times within a session. Again, I find this to be a very valid and useful feature.
> 
> I am however bothered by the "ICE never ends" suggestions made by the document below and more specifically by the fact that candidates may continue to be trickled forever. I think that's both unnecessary and a problem.
> 
> It is a problem because
> 
> a) being able to know when the remote agent is done trickling is one of the very reasons we started the work on trickle ICE. This allows an agent to quickly and deterministically detect and declare failure when it occurs. Not having this would require implementations to use magic timers of the sort: "Surely if I couldn't connect in .. say 20 seconds ... then the situation must be hopeless ... I think".
> 
> b) being able to continuously trickle new candidates forever and ever means that they would have to be paired with all of the peer's *initial* candidates. This means that ICE agents would have to maintain all gathered candidates even when they are not working for them.
> 
> More importantly however, the whole concept is likely unnecessary because an ICE restart only sounds scary, and it doesn't actually need to be ;).
> 
> A restart is non-disruptive as media can keep flowing while we are doing ICE processing.
> 
> 5245 already supports pair reuse and we can easily port that concept to multipath ICE.
> 
> The one thing that we might be missing today is the syntax to allow for a restart without using offer/answer.

For MICE and some other use cases, I agree that the need to signal (in SDP) is the 'scary' part of an ICE restart.

> This would be trivial to resolve: we just need to allow trickling a restart command. Sending your ufrag and pwd in a trickle update should do the job.

That is a nice approach -- move the 'ICE restart' signaling to an (existing, working) media path.
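
To make that concrete, here is a minimal Python sketch of how a receiver might treat a trickle update carrying new credentials as a restart. The message shape and all names (TrickleUpdate, RemoteIceState, apply_trickle_update) are hypothetical and not taken from any draft. It assumes a change in either ufrag or pwd counts as a restart, and that the currently selected pair is tracked elsewhere so media keeps flowing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrickleUpdate:
    """Hypothetical trickle message: credentials plus zero or more candidates."""
    ufrag: str
    pwd: str
    candidates: List[str] = field(default_factory=list)


@dataclass
class RemoteIceState:
    """What we remember about the peer for the current ICE generation."""
    ufrag: str
    pwd: str
    generation: int = 0
    candidates: List[str] = field(default_factory=list)


def apply_trickle_update(state: RemoteIceState, update: TrickleUpdate) -> bool:
    """Apply a trickle update; return True if it was treated as a restart."""
    # Assumption: new credentials in a trickle update signal a restart.
    restart = (update.ufrag, update.pwd) != (state.ufrag, state.pwd)
    if restart:
        state.ufrag, state.pwd = update.ufrag, update.pwd
        state.generation += 1
        # Start from a clean remote candidate list; the peer is expected to
        # re-trickle any working candidates it wants to preserve.
        state.candidates = []
    for candidate in update.candidates:
        if candidate not in state.candidates:
            state.candidates.append(candidate)
    return restart
```

An update with unchanged credentials only appends candidates; an update with new credentials bumps the generation and rebuilds the remote candidate list from what the peer re-sends.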


> 
> To put that into perspective, let's look at the following use cases:
> 
> I. You have used the new multipath ICE and you have successfully discovered a couple of working routes: one going through your Wi-Fi interface and another one going through 4G. During your session, you move closer to your Wi-Fi access point so you detect that RTT and packet loss start looking better on your Wi-Fi interface.
> 
> You decide to switch from 4G to Wi-Fi.
> 
> *No ICE restart* would be necessary here because no new paths would need to be negotiated. This would be a native feature of the new multipath ICE.
> 
> II. You start on Wi-Fi in the presence of a 4G interface that you keep for backup. Multipath ICE comes up with pairs over both your interfaces. I don't believe this requires an ICE restart either. You just switch between pre-existing paths, and one of them expires as pointed out in Justin's document below.
> 
> III. You set up a session at a point in time where 4G is your only working interface. Your Wi-Fi interface then comes up and you want to include it in the game. Here's what you do:
> 
> a) You gather host, srflx and relay candidates for the new interface.
> b) You gather candidates on the old interface that haven't been valid before but that are worth trying again (this is non-essential)
> 
> d) you then trickle a new ufrag and pwd to trigger a restart.

If that trickling happens over the SIP/WebRTC/XMPP signaling channel, MICE would prefer not to require that SDP signaling step, or at least to allow it to be done later.

-d


> b) you batch trickle all currently working candidates that you'd wish to preserve (you can omit any candidates you no longer wish to use, for whatever reason).
> c) you trickle the newly gathered candidates.
> 
> The remote party, your peer, would then do the following
> 
> a) Gather all candidates that didn't work before but that deserve a second chance. This time *this is essential*. SRFLX candidates on your peer might not have worked with your 4G connection but they have every chance of working this time. There are a number of other scenarios where candidates would be worth retrying.
> 
> b) Your peer would then trickle back to you all the working candidates it wishes to preserve (as long as they are still valid)
> c) It trickles all candidates it gathered for the occasion, that it wishes to retry.
> 
> Connectivity checks then proceed as usual. It might be worth paying special attention to the in-use candidates. Implementations might want to either confirm that data is still flowing there, or start the checks with them. This would determine how urgently ICE processing needs to yield results and it may imply changes in nomination strategies.
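
One way to read "start the checks with them" is to order the check list so that in-use pairs go first and the rest follow by priority. A rough Python sketch, assuming each pair already carries a computed priority; Pair, in_use and order_checks are made-up names for illustration only:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Pair:
    """Hypothetical candidate pair with a precomputed priority."""
    local: str
    remote: str
    priority: int
    in_use: bool = False  # True if media is currently flowing on this pair


def order_checks(pairs: List[Pair]) -> List[Pair]:
    """Put in-use pairs first, then the remaining pairs by descending priority."""
    # False sorts before True, so "not in_use" places in-use pairs at the front.
    return sorted(pairs, key=lambda p: (not p.in_use, -p.priority))
```

Whether in-use pairs should be checked first or merely confirmed by observing media is a policy choice; the sketch only shows the first option.
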
> 
> Thoughts?
> 
> Emil
> 
> 
> On 7.11.14, 0:44, Justin Uberti wrote:
>> Agreed.
>> 
>> I wrote up a slightly more fleshed-out proposal at
>> https://docs.google.com/document/d/1P1XPCRJKBkSjwCzIIEUJmp7V694_FzJQe-fvN8bk-Xw/edit
>> 
>> On Thu, Nov 6, 2014 at 3:29 PM, Martin Thomson <martin.thomson@gmail.com
>> <mailto:martin.thomson@gmail.com>> wrote:
>> 
>>    On 6 November 2014 13:47, Justin Uberti <juberti@google.com
>>    <mailto:juberti@google.com>> wrote:
>>    > I looked into how MICE works; I think that continuous nomination might be
>>    > able to effectively support that use case, as well as the case that Brandon
>>    > was interested in, namely picking the route that uses the best set of TURN
>>    > servers.
>> 
>> 
>>    I've always believed that continuous ICE operation was the only way to
>>    ensure proper reliability.  Multipath or otherwise.  You just need to
>>    work out when it is safe to completely stop on any given pair,
>>    assuming that you don't want to be testing the complete mesh all the
>>    time.  But you also might want multiple paths available, or maybe
>>    backup paths.
>> 
>> 
> 
> -- 
> https://jitsi.org