[AVTCORE] RTP circuit breakers practical experience feedback

Fri, 07 August 2015 14:29 UTC

Date: Fri, 07 Aug 2015 10:29:04 -0400
Simon Perreault <sperreault@jive.com>
We implemented draft-ietf-avtcore-rtp-circuit-breakers-10 and put it in
production here at Jive. It is exposed to all kinds of traffic:
intra-DC, inter-DC, SIP peer to SIP peer, and wild public Internet
traffic. I obviously can't discuss traffic volume. Here are some lessons learned:

- The "media timeout" circuit breaker is unusable in practice because at
least one popular B2BUA implementation we routinely talk to (not going
to name names) lies in its RTCP reports. In one particular instance, it
sends RTCP with non-increasing maximum sequence numbers even though it
is receiving the media we send it. We have found no way to work around
this in general apart from fingerprinting the implementation of the
remote endpoint (which we are *not* considering, for obvious reasons).

- The "RTCP timeout" circuit breaker is unusable in practice because it
is common to stop receiving RTCP when an RTP relay we are sending to
changes its destination from an RTCP-sending receiver to a
non-RTCP-sending receiver. There is no indication at the signalling
level that this is happening because what happens beyond the RTP relay
is invisible to us. We have found no way to work around this in general.

- The "congestion" circuit breaker has never triggered. I don't really
know how to interpret this: am I supposed to rejoice or is the circuit
breaker simply useless in practice?

In addition, the text describing the algorithm leaves much to the
programmer's interpretation. I've discussed this with the authors and I
expect a revision with tighter text.
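
As one possible reading of the "congestion" check, here is a minimal Python sketch. It assumes the TCP throughput equation from RFC 5348 with b = 1 and t_RTO = 4 * RTT, and it treats the multiplier `factor` as a tunable parameter. The function names and the default of 10 are illustrative assumptions, not text taken from the draft.

```python
import math


def tcp_friendly_rate(s: float, rtt: float, p: float, b: int = 1) -> float:
    """Estimated TCP throughput in bytes/second (RFC 5348 equation).

    s   -- packet size in bytes
    rtt -- round-trip time in seconds (must be > 0)
    p   -- loss event rate in [0, 1]
    b   -- packets acknowledged per ACK (assumed 1 here)
    """
    if p <= 0.0:
        # No reported loss: the equation puts no upper bound on the rate.
        return math.inf
    t_rto = 4.0 * rtt  # assumed simplification of the retransmission timeout
    denom = (rtt * math.sqrt(2 * b * p / 3)
             + t_rto * 3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p * p))
    return s / denom


def congestion_breaker_exceeded(send_rate: float, s: float, rtt: float,
                                p: float, factor: float = 10.0) -> bool:
    """True if send_rate (bytes/second) is above factor times the TCP estimate.

    `factor` is a hypothetical parameter for this sketch.
    """
    return send_rate > factor * tcp_friendly_rate(s, rtt, p)
```

Under these assumptions, an 8000 bytes/second stream with 200-byte packets, 100 ms RTT and 10% loss stays below the limit, so a check of this shape only fires on low-rate audio under very heavy loss.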

- We have not implemented the "media usability" circuit breaker as we
were unable to find a good practical criterion for usability. I'd be
surprised if anyone else did.
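
To make the "media timeout" failure mode from the first item concrete, here is a minimal Python sketch of such a check. It assumes the breaker trips after a configurable number of consecutive RTCP reports whose extended highest sequence number does not increase while we are sending media. The class name, field names and the threshold of 3 are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MediaTimeoutBreaker:
    threshold: int = 3                  # assumed number of stalled reports
    last_ext_seq: Optional[int] = None  # highest value reported so far
    stalled_reports: int = 0            # consecutive non-increasing reports

    def on_report(self, ext_highest_seq: int, sent_media: bool) -> bool:
        """Process one RTCP SR/RR report block; return True if the breaker trips."""
        stalled = (
            sent_media
            and self.last_ext_seq is not None
            and ext_highest_seq <= self.last_ext_seq
        )
        self.stalled_reports = self.stalled_reports + 1 if stalled else 0
        if self.last_ext_seq is None or ext_highest_seq > self.last_ext_seq:
            self.last_ext_seq = ext_highest_seq
        return self.stalled_reports >= self.threshold


# A peer that keeps reporting the same sequence number trips the breaker
# on the fourth report, even if it is actually receiving our media.
lying_peer = MediaTimeoutBreaker()
results = [lying_peer.on_report(100, sent_media=True) for _ in range(4)]
# results == [False, False, False, True]
```

The sender has only the reported value to go on, which is why a remote endpoint that misreports it is indistinguishable from a genuinely dead path.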
