[AVTCORE] Slight rewording on Selective Forwarding Middlebox

Magnus Westerlund <magnus.westerlund@ericsson.com> Mon, 30 March 2015 11:14 UTC

Archived-At: <http://mailarchive.ietf.org/arch/msg/avt/2w2lvFYTmomQLLBrkJ96V5KBXGU>


Based on last week's discussion with David Benham, I have attempted to
rewrite the SFM section to be more neutral about whether one must have
independent SSRC spaces or not. Please review whether you think this is
better, or whether you have further suggestions for changes.



3.7.  Selective Forwarding Middlebox

   Another method for handling media in the RTP mixer is to "project",
   or make available, all potential RTP sources (SSRCs) into a per-
   endpoint, independent RTP session.  The middlebox can select which of
   the potential sources that are currently actively transmitting media
   will be sent to each of the endpoints.  This is similar to the media
   switching Mixer but has some important differences in RTP details.

          +-A---------+             +-Middlebox-----------------+
          | +-RTP1----|             |-RTP1------+       +-----+ |
          | | +-Video-|             |-Video---+ |       |     | |
          | | |    AV1|------------>|---------+-+------>|     | |
          | | |       |<------------|BV1 <----+-+-------|  S  | |
          | | |       |<------------|CV1 <----+-+-------|  W  | |
          | | |       |<------------|DV1 <----+-+-------|  I  | |
          | | |       |<------------|EV1 <----+-+-------|  T  | |
          | | |       |<------------|FV1 <----+-+-------|  C  | |
          | | +-------|             |---------+ |       |  H  | |
          | +---------|             |-----------+       |     | |
          +-----------+             |                   |  M  | |
                                    |                   |  A  | |
          +-B---------+             |                   |  T  | |
          | +-RTP2----|             |-RTP2------+       |  R  | |
          | | +-Video-|             |-Video---+ |       |  I  | |
          | | |    BV1|------------>|---------+-+------>|  X  | |
          | | |       |<------------|AV1 <----+-+-------|     | |
          | | |       |<------------|CV1 <----+-+-------|     | |
          | | |       | :    :    : |: :  : : : : :  : :|     | |
          | | |       |<------------|FV1 <----+-+-------|     | |
          | | +-------|             |---------+ |       |     | |
          | +---------|             |-----------+       |     | |
          +-----------+             |                   |     | |
                                    :                   :     : :
                                    :                   :     : :
          +-F---------+             |                   |     | |
          | +-RTP6----|             |-RTP6------+       |     | |
          | | +-Video-|             |-Video---+ |       |     | |
          | | |    FV1|------------>|---------+-+------>|     | |
          | | |       |<------------|AV1 <----+-+-------|     | |
          | | |       | :    :    : |: :  : : : : :  : :|     | |
          | | |       |<------------|EV1 <----+-+-------|     | |
          | | +-------|             |---------+ |       |     | |
          | +---------|             |-----------+       +-----+ |
          +-----------+             +---------------------------+

                 Figure 17: Selective Forwarding Middlebox

   In the six-endpoint conference depicted above (Figure 17), one can
   see that endpoint A is aware of five incoming SSRCs, BV1-FV1.  If
   this middlebox intends to behave similarly to the mixer in
   Section 3.6.2, which provides the endpoints with the two latest
   speaking endpoints, then only two out of these five SSRCs need to
   transmit media to A concurrently.  As the middlebox selects the
   sources in the different RTP sessions that transmit media to the
   endpoints, each RTP stream requires rewriting of certain RTP header
   fields when being projected from one session into another.  In
   particular, the sequence number needs to be consecutively
   incremented based on the packets actually being transmitted in each
   RTP session.  Therefore, the RTP sequence number offset will change
   each time a source is turned on in an RTP session.  The timestamp
   (possibly offset) stays the same.
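
   The sequence number handling described above can be sketched as
   follows.  This is only an illustrative sketch, not part of the
   draft text: the class SequenceRewriter and its method names are
   hypothetical, one instance is assumed per projected stream in each
   outgoing RTP session, and timestamp handling is left out.

```python
class SequenceRewriter:
    """Hypothetical helper: keeps the outgoing RTP sequence numbers of one
    projected stream consecutive while the source is switched on and off."""

    def __init__(self, first_out_seq=0):
        # Next sequence number to use towards the receiving endpoint.
        self.next_out_seq = first_out_seq & 0xFFFF
        # Offset (out - in, modulo 2^16); None until the next forwarded packet.
        self.offset = None

    def turn_on(self):
        # The source is (re)selected: the offset must be recomputed, because
        # packets that were not forwarded left a gap in the incoming numbers.
        self.offset = None

    def rewrite(self, in_seq):
        if self.offset is None:
            self.offset = (self.next_out_seq - in_seq) & 0xFFFF
        out_seq = (in_seq + self.offset) & 0xFFFF
        # Advance only if this packet is not older than what was already sent
        # (16-bit serial number comparison, tolerates mild reordering).
        if ((out_seq - self.next_out_seq) & 0xFFFF) < 0x8000:
            self.next_out_seq = (out_seq + 1) & 0xFFFF
        return out_seq
```

   In this sketch, a source whose packets 5002-5009 were not forwarded
   continues without a gap on the outgoing side once it is turned on
   again, because the offset is recomputed at that point.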

   The RTP sessions can be considered independent, and the SSRC numbers
   used can then also be handled independently, thereby bypassing the
   requirement for SSRC collision detection and avoidance.  This will
   require tools such as remapping tables between the RTP sessions.
   However, using independent RTP sessions is not required; the
   switching behavior can also be performed with a common SSRC space.
   In this case, however, collision detection and handling become a
   different problem.  Therefore, it is up to the implementation to use
   a single common SSRC space or separate ones.

   Using separate SSRC spaces has some implications.  For example, the
   RTP stream that is being sent by endpoint B to the middlebox (BV1)
   may use an SSRC value of 12345678.  When that RTP stream is sent to
   endpoint F by the middlebox, it can use any SSRC value, e.g.,
   87654321.  As a result, each endpoint may have a different view of
   the application usage of a particular SSRC.  Any RTP-level identity
   information, such as SDES items, also needs to have the referenced
   SSRC updated if the included SDES items are intended to be global.
   Thus, the application must not use SSRCs as references to RTP streams
   when communicating with other peers directly.  This also affects loop
   detection, which will fail to work as there is no common namespace
   and no common identities across the different legs in the
   communication session at the RTP level.  Instead, this responsibility
   falls onto higher layers.
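
   A remapping table for separate SSRC spaces could look roughly like
   the sketch below.  It is an assumption-laden illustration only: the
   class SsrcMapper, its method names, and the use of endpoint labels
   such as "F" as session keys are all hypothetical, and projected
   SSRCs are simply drawn at random within each receiving session.

```python
import random


class SsrcMapper:
    """Hypothetical per-middlebox table mapping a source SSRC to the SSRC
    used for it in each receiving endpoint's independent RTP session."""

    def __init__(self, rng=None):
        self._rng = rng or random.Random()
        self._map = {}   # (dest_endpoint, source_ssrc) -> projected SSRC
        self._used = {}  # dest_endpoint -> SSRCs already taken in that session

    def reserve(self, dest, ssrc):
        # Mark an SSRC as taken in dest's session, e.g. the endpoint's own.
        self._used.setdefault(dest, set()).add(ssrc)

    def project(self, dest, source_ssrc):
        # Return the SSRC used towards dest, allocating one on first use.
        key = (dest, source_ssrc)
        if key not in self._map:
            used = self._used.setdefault(dest, set())
            candidate = self._rng.getrandbits(32)
            while candidate in used:
                candidate = self._rng.getrandbits(32)
            used.add(candidate)
            self._map[key] = candidate
        return self._map[key]

    def original(self, dest, projected_ssrc):
        # Reverse lookup, e.g. for translating RTCP feedback sent by dest.
        for (d, source_ssrc), projected in self._map.items():
            if d == dest and projected == projected_ssrc:
                return source_ssrc
        return None
```

   The reverse lookup is what would let the middlebox translate
   SSRC references in feedback from endpoint F back to the value that
   endpoint B actually uses.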

   The middlebox is also responsible for receiving any RTCP codec
   control requests coming from an endpoint and deciding whether it can
   act on the request locally or needs to translate the request into the
   RTP session/transport leg that contains the media source.  Both
   endpoints and the middlebox need to implement conference-related
   codec control functionalities to provide a good experience.  Commonly
   used are Full Intra Request (FIR), to request that the media source
   provide switching points between the sources, and Temporary Maximum
   Media Bit-rate Request (TMMBR), to enable the middlebox to aggregate
   congestion control responses towards the media source so that it can
   adjust its bit-rate (obviously only in case the limitation is not in
   the source-to-middlebox link).
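
   One simple way to picture the TMMBR aggregation is sketched below.
   The policy shown (forward the lowest limit among the endpoints that
   currently receive the source) is an assumption made for
   illustration, not something the text above mandates; the function
   name aggregate_tmmbr and its parameters are hypothetical, and
   per-packet overhead is ignored.

```python
def aggregate_tmmbr(requests, active_receivers):
    """Hypothetical aggregation of bit-rate limits for one media source.

    requests: dict mapping receiver id -> requested maximum bit rate (bps).
    active_receivers: ids of endpoints currently being sent this source.
    Returns the limit to signal towards the source, or None if no active
    receiver has requested a limit.
    """
    limits = [bps for receiver, bps in requests.items()
              if receiver in active_receivers]
    return min(limits) if limits else None
```

   A middlebox that forwards scalable or simulcast streams may prefer
   to drop layers for a constrained receiver rather than lower the
   source bit-rate for everyone; that choice is outside this sketch.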

   The selective forwarding middlebox has been introduced in recently
   developed videoconferencing systems in conjunction with, and to
   capitalize on, scalable video coding as well as simulcasting.  An
   example of scalable video coding is Annex G of H.264, but other
   codecs, including H.264 AVC and VP8, also exhibit scalability, albeit
   only in the temporal dimension.  In both scalable coding and
   simulcast cases the video signal is represented by a set of two or
   more bitstreams, providing a corresponding number of distinct
   fidelity points.  The middlebox selects which parts of a scalable
   bitstream (or which bitstream, in the case of simulcasting) to
   forward to each of the receiving endpoints.  The decision may be
   driven by a number of factors, such as available bit rate, desired
   layout, etc.  Contrary to transcoding MCUs, these "Selective
   Forwarding Units" (SFUs) have extremely low delay, and provide
   features that are typically associated with high-end systems
   (personalized layout, error localization) without any signal
   processing at the middlebox.  They are also capable of scaling to a
   large number of concurrent users, and--due to their very low delay--
   can also be cascaded.
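
   As an illustration of the forwarding decision, the sketch below
   picks a fidelity point from the available bit rate alone.  It is a
   deliberately minimal example: the function name select_fidelity is
   hypothetical, each fidelity point is assumed to be described only
   by its total bit rate, and other factors mentioned above, such as
   the desired layout, are not modelled.

```python
def select_fidelity(points_bps, available_bps):
    """Hypothetical selection of a fidelity point for one receiver.

    points_bps: total bit rate (bps) of each fidelity point, i.e. of a
        simulcast bitstream or of a scalable layer plus the layers it
        depends on.
    available_bps: bit rate available towards the receiving endpoint.
    Returns the index of the highest-rate point that fits, or None if
    even the lowest point exceeds the available bit rate.
    """
    fitting = [(rate, index) for index, rate in enumerate(points_bps)
               if rate <= available_bps]
    return max(fitting)[1] if fitting else None
```

   No signal processing is involved: the middlebox only decides which
   of the already-encoded packets to forward.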

   This version of the middlebox also puts different requirements on the
   endpoint when it comes to decoder instances and handling of the RTP
   streams providing media.  As each projected SSRC can, at any time,
   provide media, the endpoint either needs to be able to handle as many
   decoder instances as there are streams received by the middlebox, or
   have efficient switching of decoder contexts in a more limited set of
   actual decoder instances to cope with the switches.  The application
   also gets more responsibility to update how the media provided is to
   be presented to the user.

   Note that this topology could potentially be seen as a media
   translator that includes an on/off logic as part of its media
   translation.  The topology has the property that all SSRCs present in
   the session are visible to an endpoint.  It also has mixer aspects,
   as the streams it provides are not merely translated versions;
   instead, they have a conceptual property assigned to them and can be
   turned on or off as well as being fully or partially delivered.
   Thus, this topology appears to be a hybrid between the translator
   and mixer models.

   The differences between a selective forwarding middlebox and a
   switching mixer (Section 3.6.2) are minor, and they share most
   properties.  The above requirement of having a large number of
   decoding instances, or of efficient switching of decoder contexts,
   is one point of difference.  The other is how the identification is
   performed: the mixer uses the CSRC to provide information on what is
   included in a particular RTP stream that represents a particular
   concept.  Selective forwarding gets the source information through
   the SSRC and instead has to use other mechanisms to make clear the
   stream's current purpose.


Magnus Westerlund

Services, Media and Network features, Ericsson Research EAB/TXM
Ericsson AB                 | Phone  +46 10 7148287
Färögatan 6                 | Mobile +46 73 0949079
SE-164 80 Stockholm, Sweden | mailto: magnus.westerlund@ericsson.com