Re: [rtcweb] Fwd: I-D Action: draft-rosenberg-rtcweb-rtpmux-00.txt

Colin Perkins <csp@csperkins.org> Mon, 25 July 2011 12:50 UTC


On 24 Jul 2011, at 16:44, Matthew Kaufman wrote:
> I've been looking for technical reasons and/or hard requirements that would argue against doing RTP multiplexing as proposed in this draft.
> 
> So far my conclusion is that the RTP specifications recommend against this, but do not prohibit it.
> 
> Supporting evidence:
> RFC 3550 Section 2.2 notes that multiplexing A&V together may be a problem if you want to receive one and not the other (presumably in the multicast case) and refers to Section 5.2 for more support.
> 
> RFC 3550 Section 5.2 says "Separate audio and video streams SHOULD NOT be carried in a single RTP session and demultiplexed based on the payload type or SSRC fields." and provides five reasons. Note the "SHOULD NOT" as opposed to "MUST NOT".
> 
> The same section says that all five reasons apply to multiplexing on payload type, but only the last two apply to multiplexing on SSRC. These remaining reasons are:
> 
>   4. An RTP mixer would not be able to combine interleaved streams of
>      incompatible media into one stream.
> 
> Not relevant to the RTCWEB peer-to-peer case as the browser is not an RTP mixer. Also not relevant to any RTCWEB use cases that I can identify. And if there is an RTP mixer at one end, it can simply not agree to use multiplexing.

This would work if you are directly talking to a mixer. It precludes the cases where you are talking indirectly to a mixer via a gateway, unless you define additional signalling to determine that there is a mixer on the remote side of the gateway. 

>   5. Carrying multiple media in one RTP session precludes: the use of
>      different network paths or network resource allocations if
>      appropriate;
> 
> Appears to not be relevant to the RTCWEB case where a unicast path will be negotiated using ICE.

An RTCWeb client using ICE could negotiate different network paths for the audio and video flows, and could use network-based QoS mechanisms, if flows were sent separately. Those mechanisms work just fine in unicast.

>      reception of a subset of the media if desired, for
>      example just audio if video would exceed the available bandwidth;
> 
> Appears to not be relevant to anything but the multicast cases. Any unicast case, including RTCWEB, can simply not have that component of the stream sent to it.

Relevant for proxies which do in-network adaptation.

>      and receiver implementations that use separate processes for the
>      different media, whereas using separate RTP sessions permits
>      either single- or multiple-process implementations.
> 
> Appears to not be relevant to the RTCWEB use case. Additionally, if it is necessary for there to be separate processes or even devices at the far end this can be solved by again simply not agreeing to use multiplexing.
> 
> In all my reading today I have not been able to find anything more concrete than the "SHOULD NOT" in section 5.2 of RFC3550. PLEASE follow up if you are aware of any other relevant specifications that would argue against using SSRC to multiplex audio and video streams over a single RTP session between a pair of compatible endpoints that agree to do so.


   Timing and Sequence Number Space:  An RTP SSRC is defined to identify
      a single timing and sequence number space.  Interleaving multiple
      payload types would require different timing spaces if the media
      clock rates differ and would require different sequence number
      spaces to tell which payload type suffered packet loss.  Using
      multiple clock rates in a single RTP session is problematic, as
      discussed in [I-D.ietf-avtext-multiple-clock-rates].  This can be
      avoided by partitioning the SSRC space between the two sessions,
      but that causes other problems as discussed below.
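
A minimal sketch of the sequence number point above, assuming a receiver that tracks one sequence number space per SSRC. The function name, SSRC values and payload type numbers are illustrative only, and reordering is ignored for brevity.

```python
from collections import defaultdict

def count_gaps(packets):
    """Count missing sequence numbers per SSRC.

    packets: iterable of (ssrc, seq, payload_type) in arrival order.
    Sequence numbers are 16 bits, so gaps are computed modulo 65536.
    """
    last_seq = {}
    lost = defaultdict(int)
    for ssrc, seq, _payload_type in packets:
        if ssrc in last_seq:
            lost[ssrc] += (seq - last_seq[ssrc] - 1) % 65536
        last_seq[ssrc] = seq
    return dict(lost)

# One SSRC interleaving audio (PT 0) and video (PT 96): sequence number 3
# is missing, but nothing says whether the lost packet was audio or video.
shared = [(0x1111, 1, 0), (0x1111, 2, 96), (0x1111, 4, 0)]

# One SSRC per medium: the gap is attributable to the video source.
split = [(0x1111, 1, 0), (0x2222, 10, 96), (0x2222, 12, 96), (0x1111, 2, 0)]

shared_loss = count_gaps(shared)
split_loss = count_gaps(split)
```

In the shared case the receiver only learns that SSRC 0x1111 lost one packet; in the split case the loss is recorded against 0x2222 alone.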

   RTCP Reception Reports:  RTCP sender reports and receiver reports can
      only describe one timing and sequence number space per SSRC, and
      do not carry a payload type field.  Multiplexing sessions based on
      the payload type breaks RTCP.  This can be avoided by partitioning
      the SSRC space between the two sessions, but that causes other
      problems as discussed below.
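
To make the missing payload type field concrete, here is a sketch of the 24-octet reception report block from RFC 3550 Section 6.4.1, packed with Python's struct module. The helper names are hypothetical; the field layout follows the RFC.

```python
import struct

def pack_report_block(ssrc, fraction_lost, cumulative_lost,
                      ext_highest_seq, jitter, lsr, dlsr):
    """Pack one RTCP reception report block (network byte order)."""
    # Fraction lost (8 bits) and cumulative lost (24 bits) share one word.
    loss_word = ((fraction_lost & 0xFF) << 24) | (cumulative_lost & 0xFFFFFF)
    return struct.pack("!IIIIII", ssrc, loss_word, ext_highest_seq,
                       jitter, lsr, dlsr)

def unpack_report_block(data):
    """Return the fields of a report block as a dict."""
    ssrc, loss_word, ext_seq, jitter, lsr, dlsr = struct.unpack("!IIIIII", data)
    return {
        "ssrc": ssrc,
        "fraction_lost": loss_word >> 24,
        "cumulative_lost": loss_word & 0xFFFFFF,
        "ext_highest_seq": ext_seq,
        "jitter": jitter,
        "lsr": lsr,
        "dlsr": dlsr,
    }

block = pack_report_block(0x1111, 12, 34, 70000, 5, 0, 0)
fields = unpack_report_block(block)
```

Every field is keyed by the SSRC being reported on. There is nowhere to say which payload type the loss or jitter figures refer to.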

   Scalability:  RTP was built with media scalability in mind.
      The simplest way of achieving separation between different
      scalability layers is placing them in different RTP sessions, and
      using the same SSRC and CNAME in each session to bind them
      together.  This is most commonly done in multicast, and not
      particularly applicable to RTC-Web, but gatewaying of such a
      session would then require more alterations and likely stateful
      translation.

   RTP Retransmission in Session Multiplexing mode:  RTP Retransmission
      [RFC4588] does have a mode for session multiplexing.  This would
      not be the main mode used in RTC-Web, but for interoperability and
      reduced cost in translation, support for different RTP sessions is
      beneficial.

   Forward Error Correction:  "An RTP Payload Format for Generic
      Forward Error Correction" [RFC2733] and its update [RFC5109] can
      only be used on media formats that produce RTP packets smaller
      than half the MTU if the FEC flow and the media flow being
      protected are to be sent in the same RTP session.  This is due to
      "RTP Payload for Redundant Audio Data" [RFC2198], and because the
      SSRC value of the original flow is recovered from the SSRC field
      of the FEC packets.  Anything that wants to use these formats
      with RTP payloads that are close to the MTU therefore needs to
      put the FEC data in a separate RTP session from the original
      transmissions.  The usage of this type of FEC data has not been
      decided on in RTCWEB.

   SSRC Allocation and Collision:  The SSRC identifier is a random 32-
      bit number that is required to be globally unique within an RTP
      session, and that is reallocated to a new random value if an SSRC
      collision occurs between participants.  If two or more RTP
      sessions share a transport layer flow, there is no guarantee that
      their choice of SSRC values will be distinct, and there is no way
      in standard RTP to signal which SSRC values are used by which RTP
      session.  RTP is explicitly a group-based communication protocol,
      and new participants can join an RTP session at any time; these
      new participants may choose SSRC values that conflict with the SSRC
      values used in any of the multiplexed RTP sessions.  This problem
      can be avoided by partitioning the SSRC space, and signalling how
      the space is to be subdivided, but this is not backwards
      compatible with any existing RTP system.  In addition, subdividing
      the SSRC space makes it difficult to gateway between multiplexed
      RTP sessions and standard RTP sessions: the standard sessions may
      use parts of the SSRC space reserved in the multiplexed RTP
      sessions, requiring the gateway to rewrite RTCP packets, as well
      as the SSRC and CSRC list in RTP data packets.  Rewriting RTCP is
      a difficult task, especially when one considers extensions such as
      RTCP XR.
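
A sketch of the partitioning problem, assuming a purely hypothetical scheme in which the top bit of the SSRC selects the session (audio below 0x80000000, video above). Nothing in RTP defines such a split; the names and ranges are made up for illustration.

```python
import random

# Hypothetical partition of the 32-bit SSRC space by media type.
AUDIO_RANGE = range(0x00000000, 0x80000000)
VIDEO_RANGE = range(0x80000000, 0x100000000)

def pick_ssrc(allowed, in_use, rng=random):
    """Pick a random SSRC inside `allowed` that is not already in use."""
    while True:
        ssrc = rng.randrange(allowed.start, allowed.stop)
        if ssrc not in in_use:
            return ssrc

def session_of(ssrc):
    """Map an SSRC to a session under the hypothetical partition."""
    return "audio" if ssrc in AUDIO_RANGE else "video"

def gateway_must_rewrite(legacy_ssrc, legacy_media):
    """True if a standard endpoint's SSRC lands in the wrong partition.

    A standard RTP endpoint picks from the full 32-bit space, so roughly
    half of its choices fall in the range reserved for the other medium.
    """
    return session_of(legacy_ssrc) != legacy_media

audio_ssrc = pick_ssrc(AUDIO_RANGE, in_use=set())
clash = gateway_must_rewrite(0x90000000, "audio")
no_clash = gateway_must_rewrite(0x10000000, "audio")
```

Whenever `gateway_must_rewrite` is true, the gateway has to translate the SSRC in RTP headers, CSRC lists and every RTCP packet that mentions it.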

   Conflicting RTCP Report Types:  The extension mechanisms used in RTCP
      depend on separation of RTP sessions for different media types.
      For example, the RTCP Extended Report block for VoIP is suitable
      for conversational audio, but clearly not useful for video.  This
      may cause unusable or unwanted reports to be generated for some
      streams, wasting capacity and confusing monitoring systems.  While
      this problem may be unlikely for VoIP reports, it may be an issue
      for the more detailed media-agnostic reports, which are sometimes
      used for different media types.  This also makes the
      implementation of RTCP more complex, since partitioning the SSRC
      space by media type needs to be done not only on the media
      processing side, but also on the RTCP reporting side.

   RTCP Reporting and Scheduling:  The RTCP reporting interval and its
      packet scheduling will be affected if several RTP sessions are
      multiplexed onto the same transport layer flow.  The reporting
      interval is determined by the session bandwidth, and the reporting
      interval chosen for a high-rate video session will be different to
      the interval chosen by a low-rate VoIP session.  If such sessions
      are multiplexed, then participants in one session will see the
      SSRC values of the other session.  This will cause them to
      overestimate the number of participants in the session by a factor
      of two, thus doubling their RTCP reporting interval, and making
      their feedback less timely.  In the worst case, when an RTP
      session with very low RTCP bandwidth is multiplexed with an RTP
      session with high RTCP bandwidth, this may cause repeated RTCP
      timer reconsideration, leading to the members of the low bandwidth
      session timing out.  Participants in an RTP session configured
      with high bandwidth (and short RTCP reporting interval) will see
      RTCP reports from participants in the low bandwidth session much
      less often than expected, potentially causing them to repeatedly
      time out and re-create state for those participants.  The split of
      RTCP bandwidth between senders and receivers (where at least 25%
      of the RTCP bandwidth is allocated to senders) will be disrupted
      if a session with few senders (e.g., a VoIP session) is
      multiplexed with a session with many senders (e.g., a video
      session).  These issues can be resolved if the partition of the
      SSRC space is signalled, but this is not backwards compatible with
      any existing RTP system.  The partition would require
      re-implementing a large part of the RTCP processing to take the
      individual sessions into account.
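
The effect on the reporting interval can be sketched with a simplified, deterministic version of the RFC 3550 interval calculation. It omits the randomisation, the initial-packet halving and timer reconsideration. The bandwidth, packet size and membership figures are assumptions chosen for the example.

```python
def rtcp_interval(members, senders, rtcp_bw, we_sent, avg_rtcp_size,
                  t_min=5.0):
    """Deterministic RTCP interval in seconds.

    rtcp_bw is in octets per second and avg_rtcp_size in octets.
    """
    n = members
    bw = rtcp_bw
    # When senders are at most 25% of the members, they get a dedicated
    # 25% of the RTCP bandwidth and receivers share the remaining 75%.
    if senders <= members * 0.25:
        if we_sent:
            bw = rtcp_bw * 0.25
            n = senders
        else:
            bw = rtcp_bw * 0.75
            n = members - senders
    return max(t_min, avg_rtcp_size * n / bw)

# Assumed audio session: 400 octets/s of RTCP bandwidth, 100-octet average
# RTCP packets, 200 members of which 2 are senders.
alone = rtcp_interval(200, 2, 400, False, 100)

# The same receiver also counting the SSRCs of a multiplexed session of
# equal size sees twice as many members and senders.
multiplexed = rtcp_interval(400, 4, 400, False, 100)
```

With these numbers a receiver reports about every 66 seconds when the session is on its own and about every 132 seconds once it also counts the other session's SSRCs.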

   Sampling Group Membership:  The mechanism defined in RFC2762 to
      sample the group membership, allowing participants to keep less
      state, assumes a single flat 32-bit SSRC space, and breaks if the
      SSRC space is shared between several RTP sessions.


-- 
Colin Perkins
http://csperkins.org/