[rtcweb] Filling in details on "trickle ICE"

Eric Rescorla <ekr@rtfm.com> Fri, 24 August 2012 15:15 UTC

I've been working on Firefox's trickle ICE implementation and have had
to make a number of decisions along the way. That suggests to me that
we may need an update to RFC 5245 to specify exactly how trickle ICE
behaves. Here are some of the things we need to resolve.


WHEN CHECKS START
RFC 5245 currently specifies that connectivity checks start upon
receiving the offer/answer (See Section 5):

   When an agent receives an initial offer, it will check if the offerer
   supports ICE, determine its own role, gather candidates, prioritize
   them, choose default candidates, encode and send an answer, and for
   full implementations, form the check lists and begin connectivity
   checks.

However, if you are trickling all your candidates, then this algorithm
is no longer appropriate, since the check list will be empty when the
offer/answer arrives. We therefore need a new rule for when checks
start. There seem to be a number of natural options, including:

(a) As soon as any check list is non-empty, start that media stream.
(b) As soon as the first media stream has a non-empty check list,
    start that media stream.
(c) As soon as all check lists are non-empty, start the first
    media stream.

Note that S 5.7.4 specifies that you are supposed to start with the
first media stream, which is what makes (b) and (c) compelling.
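
To make the three options concrete, here is a minimal Python sketch.
The function name streams_to_start, the policy labels, and the
representation of a check list as a plain list of pair names are all
assumptions made for illustration; none of this comes from RFC 5245.

```python
from typing import List


def streams_to_start(policy: str, check_lists: List[List[str]]) -> List[int]:
    """Return the indices of media streams whose checks may start now.

    check_lists[i] holds the candidate pairs known so far for stream i
    (hypothetical representation; real check lists carry pair state).
    """
    non_empty = [i for i, pairs in enumerate(check_lists) if pairs]
    if policy == "a":
        # (a) any stream with a non-empty check list may start
        return non_empty
    if policy == "b":
        # (b) only the first stream, once its own check list is non-empty
        return [0] if check_lists and check_lists[0] else []
    if policy == "c":
        # (c) the first stream, but only once every check list is non-empty
        all_ready = bool(check_lists) and len(non_empty) == len(check_lists)
        return [0] if all_ready else []
    raise ValueError(f"unknown policy: {policy!r}")
```

With a candidate known only for the second stream, (a) starts that
stream immediately, while (b) and (c) both keep waiting for the first
stream.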


WHEN ICE ENDS
S 7.1.3.3 describes the conditions required to update the check list
based on transaction completion:

   Regardless of whether the check was successful or failed, the
   completion of the transaction may require updating of check list and
   timer states.

   If all of the pairs in the check list are now either in the Failed or
   Succeeded state:

   o  If there is not a pair in the valid list for each component of the
      media stream, the state of the check list is set to Failed.


Consider the following case:

- Alice and Bob are on different networks that both use RFC 1918
  space.
- Alice sends Bob 10.0.0.10 as her first candidate. This just
  happens to correspond to an existing host on Bob's network.
- Bob creates a check list consisting solely of 10.0.0.10 and starts
  checks.
- Because 10.0.0.10 is a real host on Bob's network, it generates an ICMP
  error and per S 7.1.3.1 Bob marks the transaction as Failed.

Unless I have missed something, given the algorithm above, the check
list now consists solely of Failed candidates and the valid list is
empty, so the media stream is marked as Failed. If this is the only
media stream, then according to S 8.1.2 the entire ICE process is
now marked as Failed.

What's letting this happen is a race condition between candidates
arriving and checks failing. I suspect that there are others, so we
probably need some way to handle these cases.
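
The race can be reproduced with a minimal Python sketch of the quoted
S 7.1.3.3 rule. The names PairState and update_check_list_state are
hypothetical, and the sketch models only the "Failed" outcome, with
every other case reported as "Running".

```python
from enum import Enum
from typing import Iterable, Set


class PairState(Enum):
    FROZEN = "Frozen"
    WAITING = "Waiting"
    IN_PROGRESS = "In-Progress"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"


FINISHED = {PairState.SUCCEEDED, PairState.FAILED}


def update_check_list_state(pair_states: Iterable[PairState],
                            components: Set[int],
                            valid_components: Set[int]) -> str:
    """Apply the quoted rule after a transaction completes.

    components: component IDs of the media stream.
    valid_components: component IDs that have a pair on the valid list.
    """
    states = list(pair_states)
    # "If all of the pairs in the check list are now either in the
    # Failed or Succeeded state" ...
    if states and all(s in FINISHED for s in states):
        # ... and some component has no pair in the valid list,
        # the check list is set to Failed.
        if not components <= valid_components:
            return "Failed"
    return "Running"


# Bob has only heard about 10.0.0.10, and that check has failed.
print(update_check_list_state([PairState.FAILED], {1}, set()))
```

If Alice's second candidate had already been paired when the first
check failed, the list would contain a pair that is still Waiting and
the rule would not fire.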


WHEN NEW TRICKLE CANDIDATES CAN BE PROCESSED
A related question is when new trickle candidates can be processed.
Say that a new candidate is received when a stream already has a
nominated candidate on the valid list. Do we add the new candidate to
the check list? Note that S 8.1.2 would ordinarily have us stopping
checks on all other extant candidate pairs.


HOW DO OTHER MEDIA STREAMS GET UNFROZEN
Say I finish checks on one stream successfully, but not on another
stream. Ordinarily, I would start to unfreeze other check lists, but
what if those are empty? How do they get unfrozen when candidates
come in?


INTERACTION WITH NON-TRICKLE ICE
I'm concerned about the potential interactions between trickle ICE
and non-trickle ICE. Obviously, an extant ICE implementation can
only handle getting all the candidates at once, so in order to
have interop we need to hold all the candidates until we get the
"null" finished candidate indication. But this means that in the
case where the non-trickle endpoint is the offerer, checks are not
performed simultaneously in both directions. Consider the following
case:

- Alice is a non-trickle endpoint
- Bob is a trickle endpoint
- Alice and Bob are behind standard address-dependent filtering NATs
- Bob has two STUN servers but one is down.
- The JS holds off on sending Bob's answer to Alice until all the
  candidates are in (which, as I say above, I think is mandatory).

I believe we get the following behavior.


Alice      Alice NAT       Bob NAT      Bob        Bob STUN 1   Bob STUN 2

OFFER --------------------------------->
                                        BINDING REQUEST ->
                                        BINDING REQUEST --------------->
                                        <- BINDING RESPONSE
                                        onicecandidate(...)
           X <------------------------- CONNECTIVITY CHECK
                                        BINDING REQUEST (RT) ---------->
           X <------------------------- CONNECTIVITY CHECK (RT)
                                        ...
                                        [Give up on STUN 2]
                                        onicecandidate(null)
                     /----------------- ANSWER
                    /                   [Give up on CONNECTIVITY CHECK;
<------------------/                     ICE fails]


The key thing here is that if the STUN timers for connectivity checks
and for initial binding requests are similar, then Bob's checks can
fail before Alice has time to open up a pinhole, because Alice cannot
send any checks of her own until the answer arrives. This doesn't
happen in regular ICE because the checks are synchronized in each
direction.
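
The hold-until-null behavior described at the start of this section
can be sketched in Python as follows. The class name CandidateBuffer,
its method on_ice_candidate, and the send callback are hypothetical
stand-ins for whatever the JS layer does with onicecandidate; the
candidate strings are placeholders.

```python
from typing import Callable, List, Optional


class CandidateBuffer:
    """Hold trickled candidates until the null indication arrives."""

    def __init__(self, send: Callable[[List[str]], None]) -> None:
        self._send = send              # delivers the full candidate set
        self._pending: List[str] = []  # candidates gathered so far
        self._sent = False

    def on_ice_candidate(self, candidate: Optional[str]) -> None:
        if self._sent:
            return  # gathering already finished; ignore late callbacks
        if candidate is None:
            # null means gathering is done: release everything at once
            self._sent = True
            self._send(list(self._pending))
        else:
            self._pending.append(candidate)


sent: List[List[str]] = []
buf = CandidateBuffer(sent.append)
buf.on_ice_candidate("candidate-from-stun-1")  # nothing is sent yet
buf.on_ice_candidate(None)                     # answer can now go out
```

Nothing reaches the peer until the slowest gathering step gives up,
which is the delay that lets Bob's early checks time out in the
scenario above.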


I suspect that this list is incomplete, but I think it's enough to
suggest that there may be some details in trickle ICE that need to
at least be documented and perhaps investigated/resolved.

To the chairs: can you provide some guidance as to how to proceed with
this discussion, what forum it should be discussed in, etc.?

Thanks,
-Ekr