[rtcweb] Congestion Control proposal

Randell Jesup <randell-ietf@jesup.org> Thu, 06 October 2011 15:33 UTC

Return-Path: <randell-ietf@jesup.org>
X-Original-To: rtcweb@ietfa.amsl.com
Delivered-To: rtcweb@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 3469321F8BDB for <rtcweb@ietfa.amsl.com>; Thu, 6 Oct 2011 08:33:18 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.573
X-Spam-Level:
X-Spam-Status: No, score=-2.573 tagged_above=-999 required=5 tests=[AWL=0.026, BAYES_00=-2.599]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id qnNhcUwSqgur for <rtcweb@ietfa.amsl.com>; Thu, 6 Oct 2011 08:33:17 -0700 (PDT)
Received: from r2-chicago.webserversystems.com (r2-chicago.webserversystems.com [173.236.101.58]) by ietfa.amsl.com (Postfix) with ESMTP id F19B421F8BA9 for <rtcweb@ietf.org>; Thu, 6 Oct 2011 08:33:16 -0700 (PDT)
Received: from pool-173-49-141-165.phlapa.fios.verizon.net ([173.49.141.165] helo=[192.168.1.12]) by r2-chicago.webserversystems.com with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.69) (envelope-from <randell-ietf@jesup.org>) id 1RBpzm-0006wM-Tt; Thu, 06 Oct 2011 10:36:27 -0500
Message-ID: <4E8DCA06.5060506@jesup.org>
Date: Thu, 06 Oct 2011 11:32:22 -0400
From: Randell Jesup <randell-ietf@jesup.org>
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:6.0.1) Gecko/20110830 Thunderbird/6.0.1
MIME-Version: 1.0
To: Harald Alvestrand <harald@alvestrand.no>, "rtcweb@ietf.org" <rtcweb@ietf.org>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: Primary Hostname - r2-chicago.webserversystems.com
X-AntiAbuse: Original Domain - ietf.org
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - jesup.org
X-Source:
X-Source-Args:
X-Source-Dir:
Subject: [rtcweb] Congestion Control proposal
X-BeenThere: rtcweb@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Real-Time Communication in WEB-browsers working group list <rtcweb.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/rtcweb>, <mailto:rtcweb-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/rtcweb>
List-Post: <mailto:rtcweb@ietf.org>
List-Help: <mailto:rtcweb-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/rtcweb>, <mailto:rtcweb-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 06 Oct 2011 15:33:18 -0000

Situation:

We have a UDP connection, either direct or via a TURN server.

On that UDP connection, we've negotiated DTLS.

We're running multiple RTP streams in both directions using DTLS-SRTP.

We also are running multiple data channels, both reliable and unreliable,
over DTLS (perhaps UDP-DTLS-SCTP).

We want to congestion-control all the network data, preferably in as
unified fashion as possible.

Google disclosed an algorithm for doing "predictive" bottleneck-buffer
queue-length sensing based on packet arrival times compared to RTP
timestamps.

Their algorithm should work well in most cases on a single channel, but
with multiple channels there may be inequities and hunting, plus the
application may want to have input into which streams get the available
bits.

I propose an extended version of Google's algorithm, based around computing
the available bandwidth and buffer-fullness across either all the media
channels, or the media channels and the data channels.


Constraints/assumptions:

1) variable number of media streams (SSRCs), audio and/or video
2) The streams may use different timestamp sample rates
3) The streams may have timestamp rates that are not synchronized;
    i.e. they may drift both relative to NTP and to each other.
4) Media streams may start and stop at any point
5) Audio streams may use DTX/VAD and so have gaps in sending
6) Video streams may use variable frame rates
7) Video streams may have major variation in bandwidth from frame
    to frame (iframes, motion or lack thereof, etc).
8) All media streams in a direction might be inactive or in DTX/VAD at
    any given time, for an indeterminate period
9) Media streams will have limits on their ability to adjust bandwidth
    use.  In some cases a renegotiation of streams and/or codecs may be
    required to meet the limitations of the bottleneck.
10) If the other end doesn't implement this congestion control mechanism
     we must adapt bandwidth based on normal RTCP

Goals: (roughly in order of importance)
1) Consistently low-delay
    Note that we don't say "staying within bandwidth" here; the goal
    is the end-user experience (delay&  loss), not the mechanism
    affecting delay and loss (bandwidth).
2) Low but not necessarily 0 loss.
3) Good use of available bandwidth (within 'fair-share' bounds).
4) Rapid adaptation to changes in bandwidth available
5) Good startup pattern
6) Stable algorithm; few bad "corner cases" and quick recovery from
    them.
7) Good performance in the face of true random loss.
8) Good performance/recovery when the browser is active (or another browser
    on the same access/bottleneck link).
9) Bandwidth must be controlled to the level of not allowing the JS app to
    use it in a denial-of-service attack.


The general algorithm is to take each incoming packet from each SSRC
source; convert the timestamp to an estimated (source) NTP time it was sent
at, including compensation for clock drift rates; and then using the stream
of source packets, sizes, and time of sending estimate the slope of the
incoming rate compared to the sending rate.  Effectively, we consider all
the media packets to be one, big stream using an algorithm effectively
similar to Google's, giving us more data points.  The merged packets will
look rather more like a video stream with a collection of small and large
packets than an audio stream.

To communicate the bandwidth available, we have two main options:

1) run the algorithm on the recipient side, and send the target value to
    the sender in the equivalent of TMMBR.  (We can't use normal TMMBR since
    that's for one SSRC, and we want it to apply to all SSRCs being sent.)
2) collect and process the arrival data, and communicate to the sender the
    slope of the line, the incoming bitrate, number of packets, and loss
    rate (anything else?).  The sender will calculate how it should respond.

Both of these require a new RTCP feedback message, though the first is very
closely associated with TMMBR.  I've had good experiences in the past with
case #2.


Concerns:

Stability of this merged stream is a concern; we'll need to do some good
real-world and simulated tests (see Testing below).


Startup:

There are issues both with starting too low (quality is poor at the start
and takes a while to get better), and with starting too high (you
immediately can get into a delay situation and have to go into recovery).
In general, starting too low is better than starting too high, as when we
ramp up past the bottleneck we will not be too far over the limit and so
recovery will be easier.  In general, the bottleneck link is often
upstream, though this is moderating as high-bandwidth broadband becomes
more common.

Options include:
   a) Start at fixed value
   b) Start at N% below the value from the last call with this person
      (probably requires the App to help here)
   c) Start at N% below the value from the last call with anyone
   d) Have the other side tell us the rate they successfully received in the
      last call.  Start the call at the lower of the last-call reception
      rate from the far end or our last-call sending rate, minus n%
   e) probe bandwidth at or before the start of the call using a packet
      train

'a' will be quite sub-optimal for most cases, though if the value is
low enough it won't be over-bandwidth.  Some applications (games, etc)
may want to limit the starting bandwidth to avoid negative interactions
with other dataflows.

'b' has a problem that if the last call had a much higher bottleneck
bandwidth (especially on the remote downstream), the new call may be
over-bandwidth, perhaps badly.  This won't be the norm, but may happen
especially with mobile/wifi at either end.

'c' has a problem that if the last callee had a much higher bottleneck
bandwidth (especially on the remote downstream), the new call may be
over-bandwidth, perhaps badly.  If the caller's upstream bandwidth is high
compared to typical downstream bandwidth, then the likelihood of starting
over-bandwidth is high.

'd' has the advantage of selecting an appropriate value regardless of
whether the bottleneck is on upstream or downstream.  The downside of 'd'
is that the bottlenecks may have changed since the last call in which case
we may start over-bandwidth.  The historical data could be transferred
via pre-call signalling.

'e' allows direct measurement of the current call bottlenecks.  However 'e'
can be misled if a mid-call router that isn't the bottleneck buffers the
packet train while housekeeping or doing other operations and then releases
them (it can collapse the signal from an earlier bottleneck).  It would
need to be combined with starting on the low side until a valid measurement
is completed.  Single measurements are imprecise with packet trains.

Also, we need to consider the impact on other existing PeerConnections -
for example, when you add a new person to a full-mesh conference, you'll want
to start their bandwidth at about 1/Nth of the former total outgoing
bandwidth, and reduce the outgoing bandwidth of those other channels to
provide the bandwidth for the new participant.

Obviously there is no obvious perfect solution, and combinations of
solutions may give the best overall experience.  In this situation, it
makes sense to leave the actual choice up to the implementations or even
the application, and make sure that they have the information needed to
make the choice, such as the last-call bandwidth from the other end.
(I don't believe that information would be a privacy leakage of any note.)


Data channels:

If we use SCTP-DTLS-UDP, we can rely on SCTP's congestion-control, perhaps
modified to "play nice" with the media streams, and to have some amount of
known priority relative to them.  (For example, if the media streams need
to back off and the data streams are using a consistent amount of data, it
could reduce the amount or rate data is sent.)  We also could pre-allocate
some amount of bandwidth to data, and so long as it's not using it media
could.  Note that SCTP's congestion control is supposed to be in some
manner pluggable, so we could modify/replace it fairly easily.

Worst case: it acts as a separate, TCP-friendly flow we compete with.

If we have to use TCP-over-UDP, then that will be congestion-controlled for
each individual flow, but the non-reliable channels will need separate
congestion control.  Simplest would be to add RTP-like timestamps to the
non-reliable data channels so they could pulled into the framework of the
RTP channels as if they were bursty RTP data.

Even if the data flows are separately congestion controlled, in many uses
they will be non-continuous flows (discrete messages as opposed to data or
file transfers).  These sorts of channels have different needs and
different impacts on the media channels from continuous flows: they're
bursty; low delay is important (especially on the initial packet for a
burst, which is often the only packet), and without a steady stream
congestion control will have little effect - and will have little impact on
the media flows.  For small bursts of data traffic, the "right" solution
may be to simply ignore them up to some limit, and above that limit start
to subtract the data channel usage from the available bandwidth for
variable-rate encoders.

The problem here will be defining "small".  This will need to be tested and
simulated so we know the impacts of different data usage patterns.  I'll
throw a straw-man proposal out there: if the increase in data usage over
the last 1/2 second is below 20% of the total estimated channel bandwidth,
no adjustment to the media channels shall be done, and normal congestion
control mechanisms will be allowed to operate with no interference.  Above
that value, the media channels if possible will be adjusted down in
bandwidth by some amount related to the increase in data channel bandwidth.

The reason for using the increase in bandwidth here is that for steady
flows, normal congestion control should operate well; the problem is when
you have a spike in bandwidth use - the latency in reaction may be longer
than the duration of the burst, and hurt end-to-end delay in the meantime.
So when there's a sudden jump in data bandwidth, you temporarily offset
media down to keep delay and buffering under control.  You could also
increase media bitrates if a steady flow suddenly drops, though normal
congestion control mechanisms should use the now-available bandwidth
quickly on their own.



JS Interface:

We need to tell the application about bandwidth changes, and give it the
option of making choices, both in the allocation of bits among the various
streams of data, and of the actual streams themselves (adding or shutting
down streams, or switching modes (mono/stereo, codecs possibly, etc)).  If
the application doesn't care to make a choice we'll do it for them.  We'll
also want the JS application to be able to affect the congestion-control
status of the data channels, so it could take bits from the data channels
without engendering a tug-of-war and temporary over-bandwidth/buffering.

There are inherent delays to some of these actions by the app, and in an
over-bandwidth case we need to reduce bandwidth *now*, so we may want to
have the "I'm over-bandwidth" case start by automatically reducing
bandwidth if possible, and inform the JS app of the reduction and let it
rebalance or change the details or nature of the call.

We could use some good straw-men proposals for an API here to the JS code.

Security comes in slightly here in the avoidance-of-DoS issue.  If we
reduce bandwidth on a notification from the far side first, then tell the
app, then even if the app goes rogue and hikes the bandwidth instead of
dropping it we'll just cut it back again.  We may need some extra levers
and limits, but I think this is all handleable.


Testing:

We'll want to experiment with both real-world packet traces and with
simulated scenarios, since as an active algorithm pure network traces only
go so far when testing.  We need to test with cross traffic on our side
of the bottleneck going to the far side, traffic that stays on our side or
stays on the other side, with appearance of new bottleneck nodes, with
route changes in mid-call, interface changes, wireless, wireless-in-motion
going into and out-of good signal range, competing traffic hosted by the
same browser, etc.


Extensions:

Can we generalize this to provide guidance when we're receiving connections
from multiple sources?

This would be gated on knowing if the bottleneck is in a "shared" section
of the path or not.  (*Usually* bottlenecks are on upstream, but that might
be changing.)


-- 
Randell Jesup
randell-ietf@jesup.org