[rtcweb] Review of draft-ietf-rtcweb-rtp-usage

Jim Spring <jmspring@gmail.com> Thu, 08 May 2014 14:49 UTC


4.5.  RTP and RTCP Multiplexing

       ....

   Note that the use of RTP and RTCP multiplexed onto a single
   transport-layer flow ensures that there is occasional traffic sent on
   that port, even if there is no active media traffic.  This can be
   useful to keep NAT bindings alive, and is the recommend method for
   application level keep-alives of RTP sessions [RFC6263].

[JS] In the MUX case, this may be the method RFC6263 recommends for
keeping NAT bindings alive, but for WebRTC we have also discussed
using STUN connectivity checks
[draft-ietf-rtcweb-stun-consent-freshness].  It seems a bit odd to
specify multiple methods.  If we adopt
draft-ietf-rtcweb-stun-consent-freshness, can the above be removed, or
a section added to note the new draft?
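
For readers less familiar with the multiplexing being discussed, here is
a minimal sketch of how a receiver might tell RTP from RTCP on a single
flow. It assumes the commonly used RFC5761-style check on the second
byte (values 192-223 treated as RTCP); the function name and return
labels are hypothetical and not taken from the draft.

```python
def classify_muxed_packet(datagram: bytes) -> str:
    """Classify a datagram received on a multiplexed RTP/RTCP flow."""
    # Both RTP and RTCP carry version 2 in the top two bits of byte 0.
    if len(datagram) < 2 or (datagram[0] >> 6) != 2:
        return "unknown"  # too short, or not RTP/RTCP version 2
    # RTCP packet types (e.g. 200 = SR, 201 = RR) sit in the second byte.
    # Assumption: RTP payload types 64-95 are not in use, so the range
    # 192-223 cannot be an RTP packet with the marker bit set.
    if 192 <= datagram[1] <= 223:
        return "rtcp"
    return "rtp"
```

A STUN message (first byte 0x00 or 0x01) falls into the "unknown" branch
here, so a real demultiplexer would need a separate check for it.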


4.7.  Symmetric RTP/RTCP

[JS] General question / comment - most other sections of the document
distinguish between a WebRTC client talking to another WebRTC client
and one talking to a legacy client.  This section does not.  Are there
concerns that a legacy client will not support Symmetric RTP/RTCP per
RFC4961?


4.8.  Choice of RTP Synchronisation Source (SSRC)

   Implementations are REQUIRED to support signalled RTP synchronisation
   source (SSRC) identifiers, using the "a=ssrc:" SDP attribute defined
   in Section 4.1 and Section 5 of [RFC5576].

[JS] This section appears to mandate SDP for signaling, whereas other
sections use SDP only as an example.  Recommend reworking this so that it
does not require SDP specifics.
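
To make the SDP dependency concrete, here is a minimal sketch of parsing
the "a=ssrc:" attribute in the RFC5576 form
"a=ssrc:<ssrc-id> <attribute>[:<value>]". The function name and the
returned tuple layout are hypothetical, chosen only for illustration.

```python
from typing import Optional, Tuple


def parse_ssrc_attribute(line: str) -> Optional[Tuple[int, str, Optional[str]]]:
    """Parse 'a=ssrc:<ssrc-id> <attribute>[:<value>]' into (ssrc, name, value)."""
    prefix = "a=ssrc:"
    if not line.startswith(prefix):
        return None  # some other SDP line
    ssrc_text, sep, attribute = line[len(prefix):].strip().partition(" ")
    # The SSRC identifier is a decimal integer; reject anything else.
    if not sep or not (ssrc_text.isascii() and ssrc_text.isdigit()):
        return None
    name, colon, value = attribute.partition(":")
    return int(ssrc_text), name, (value if colon else None)
```

For example, "a=ssrc:11111 cname:user3@example.com" yields the SSRC
11111, the attribute name "cname", and the value "user3@example.com".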


7.1.  Boundary Conditions and Circuit Breakers

   In the absence of a concrete congestion control algorithm, all WebRTC
   implementations MUST implement the RTP circuit breaker algorithm that
   is described in [I-D.ietf-avtcore-rtp-circuit-breakers].

[JS] At IETF 89, my understanding was that there were concerns about the
use of circuit breakers and their impact on call quality, even in cases of
very minimal packet loss. I may be missing history/context, but should
circuit breakers be a "MUST"?


10.  Signalling Considerations

   RTP Profile:  The name of the RTP profile to be used in session.  The
      RTP/AVP [RFC3551] and RTP/AVPF [RFC4585] profiles can interoperate
      on basic level, as can their secure variants RTP/SAVP [RFC3711]
      and RTP/SAVPF [RFC5124].  The secure variants of the profiles do
      not directly interoperate with the non-secure variants, due to the
      presence of additional header fields for authentication in SRTP
      packets and cryptographic transformation of the payload.  WebRTC
      requires the use of the RTP/SAVPF profile, and this MUST be
      signalled if SDP is used.  Interworking functions might transform
      this into the RTP/SAVP profile for a legacy use case, by
      indicating to the WebRTC end-point that the RTP/SAVPF is used, and
      limiting the usage of the "a=rtcp-fb:" attribute to indicate a
      trr-int value of 4 seconds.

[JS] Another example assuming SDP for signaling. RFC5124 calls out other
possible signaling options as well.
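
As an illustration of what the SDP-specific text implies, here is a
minimal sketch that reads the profile from an SDP media line and builds
the "a=rtcp-fb:" trr-int attribute mentioned above. It assumes the
RFC4585 convention that trr-int is given in milliseconds and that the
"*" wildcard payload type is used; all function names are hypothetical.

```python
def media_profile(m_line: str) -> str:
    """Return the <proto> field of an SDP 'm=<media> <port> <proto> <fmt>...' line."""
    fields = m_line.strip().split()
    if len(fields) < 4 or not fields[0].startswith("m="):
        raise ValueError(f"not a valid SDP media line: {m_line!r}")
    return fields[2]


def uses_savpf(m_line: str) -> bool:
    """True if the media line signals the RTP/SAVPF profile."""
    return media_profile(m_line) == "RTP/SAVPF"


def trr_int_attribute(seconds: float) -> str:
    """Build an 'a=rtcp-fb' line with trr-int (assumed to be in milliseconds)."""
    return f"a=rtcp-fb:* trr-int {int(round(seconds * 1000))}"
```

With these helpers, the 4 second value from the quoted text would be
signalled as "a=rtcp-fb:* trr-int 4000".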


11.  WebRTC API Considerations

[JS] General note - this section, as well as Section 12, contains passages
where the grammar seems a bit off. Sometimes whole paragraphs come across as
a bit awkward. Due to time constraints, I will call out only the ones that
immediately stood out. I'm happy to help with some text rewrite, but not
until later this week/early next week.

Figure 1 on Page 31 and Figure 2 on Page 32 should be centered.

Specific corrections:

   The same MediaStreamTrack can also be included in multiple
   MediaStreams, thus multiple sets of MediaStreams can implicitly need
   to use the same synchronisation base.  To ensure that this works in
   all cases, and don't forces a end-point to change synchronisation
   base and CNAME in the middle of a ongoing delivery of any packet
   streams, which would cause media disruption; all MediaStreamTracks
   and their associated SSRCs originating from the same end-point needs
   to be sent using the same CNAME within one RTCPeerConnection.  This
   is motivating the strong recommendation in Section 4.9 to only use a
   single CNAME.


[JS]

   The same MediaStreamTrack can also be included in multiple
   MediaStreams, thus multiple sets of MediaStreams can implicitly need
   to use the same synchronisation base.  To ensure that this works in
   all cases, and *doesn't force an* end-point to change synchronisation
   base and CNAME in the middle of *the* delivery of any *ongoing* packet
   streams, which would cause media disruption; all MediaStreamTracks
   and their associated SSRCs originating from the same end-point *need*
   to be sent using the same CNAME within one RTCPeerConnection.  This
   is motivating the strong recommendation in Section 4.9 to only use a
   single CNAME.

-----

      The requirement on using the same CNAME for all SSRCs that
      originates from the same end-point, does not require middleboxes
      that forwards traffic from multiple end-points to only use a
      single CNAME.

 [JS]

      The requirement on using the same CNAME for all SSRCs that
      *originate* from the same end-point does not require *a middlebox*
      that forwards traffic from multiple end-points to only use a
      single CNAME.

-----

   Different CNAMEs normally need to be used for different
   RTCPeerConnection instances, as specified in Section 4.9.  Having two
   communication sessions with the same CNAME could enable tracking of a
   user or device across different services (see Section 4.4.1 of
   [I-D.ietf-rtcweb-security] for details).  A web application can
   request that the CNAMEs used in different RTCPeerConnection within a
   same-orign context to be the same, this allow for synchronization of
   the endpoint's RTP packet streams across the different
   RTCPeerConnections.

   [JS]

   Different CNAMEs normally need to be used for different
   RTCPeerConnection instances, as specified in Section 4.9.  Having two
   communication sessions with the same CNAME could enable tracking of a
   user or device across different services (see Section 4.4.1 of
   [I-D.ietf-rtcweb-security] for details).  A web application can
   request that the CNAMEs used in different RTCPeerConnection *objects
   (within a same-orign context) be* the same, this allow for synchronization of
   the endpoint's RTP packet streams across the different
   RTCPeerConnections.

 -----

      Note: The motivation for supporting reception of multiple CNAMEs
      are to allow for forward compatibility with any future changes....

   [JS]

      Note: The motivation for supporting reception of multiple CNAMEs
      *is* to allow for forward compatibility with any future changes....


-----

      To separate media with different purposes:  An end-point might want
      to send RTP packet streams that have different purposes on
      different RTP sessions, to make it easy for the peer device to
      distinguish them.  For example, some centralised multiparty
      conferencing systems display the active speaker in high
      resolution, but show low resolution "thumbnails" of other
      participants.  Such systems might configure the end-points to send
      simulcast high- and low-resolution versions of their video using
      separate RTP sessions, to simplify the operation of the RTP
      middlebox.  In the WebRTC context this is currently possible to
      accomplished by establishing multiple WebRTC MediaStreamTracks
      that have the same media source in one (or more)
      RTCPeerConnection.

  [JS]

      To separate media with different purposes:  An end-point might want
      to send RTP packet streams that have different purposes on
      different RTP sessions, to make it easy for the peer device to
      distinguish them.  For example, some centralised multiparty
      conferencing systems display the active speaker in high
      resolution, but show low resolution "thumbnails" of other
      participants.  Such systems might configure the end-points to send
      simulcast high- and low-resolution versions of their video using
      separate RTP sessions, to simplify the operation of the RTP
      middlebox.  *In the WebRTC context this is currently possible
      by establishing* multiple WebRTC MediaStreamTracks
      that have the same media source in one (or more)
      RTCPeerConnection.


-----

      Experience with the Mbone tools (experimental RTP-
      based multicast conferencing tools from the late 1990s) has showed
      that RTCP reception quality reports for third parties can usefully
      be presented to the users in a way that helps them understand
      asymmetric network problems, and the approach of using separate
      RTP sessions prevents this.

[JS]

      Experience with the Mbone tools (experimental RTP-
      based multicast conferencing tools from the late 1990s) has showed
      that RTCP reception quality reports for third parties *can
      be presented to users* in a way that helps them understand
      asymmetric network problems, and the approach of using separate
      RTP sessions prevents this.


   -----


      There are various methods of implementation for the middlebox.  If
      implemented as a standard RTP mixer or translator, a single RTP
      session will extend across the middlebox and encompass all the
      end-points in one multi-party session.  Other types of middlebox
      might use separate RTP sessions between each end-point and the
      middlebox.  A common aspect is that these RTP middleboxes can use
      a number of tools to control the media encoding provided by a
      WebRTC end-point.  This includes functions like requesting
      breaking the encoding chain and have the encoder produce a so
      called Intra frame.  Another is limiting the bit-rate of a given
      stream to better suit the mixer view of the multiple down-streams.
      Others are controlling the most suitable frame-rate, picture
      resolution, the trade-off between frame-rate and spatial quality.
      The middlebox gets the significant responsibility to correctly
      perform congestion control, source identification, manage
      synchronisation while providing the application with suitable
      media optimizations.  The middlebox is also has to be a trusted
      node when it comes to security, since it manipulates either the
      RTP header or the media itself (or both) received from one end-
      point, before sending it on towards the end-point(s), thus they
      need to be able to decrypt and then encrypt it before sending it
      out.

[JS]

      There are various methods of implementation for the middlebox.  If
      implemented as a standard RTP mixer or translator, a single RTP
      session will extend across the middlebox and encompass all the
      end-points in one multi-party session.  Other types of *middleboxes*
      might use separate RTP sessions between each end-point and the
      middlebox.  A common aspect is that these RTP middleboxes can use
      a number of tools to control the media encoding provided by a
      WebRTC end-point.  This includes functions like requesting *the
      breaking of the* encoding chain and have the encoder produce a so
      called Intra frame.  Another is limiting the bit-rate of a given
      stream to better suit the mixer view of the multiple down-streams.
      Others are controlling the most suitable frame-rate, picture
      resolution, the trade-off between frame-rate and spatial quality.
      The middlebox *has* the responsibility to correctly
      perform congestion control, source identification, manage
      synchronisation while providing the application with suitable
      media optimizations.  The middlebox also has to be a trusted
      node when it comes to security, since it manipulates either the
      RTP header or the media itself (or both) received from one end-
      point, before sending it on towards the *other* end-point(s), thus they
      need to be able to decrypt and then *re-encrypt the stream* before
      sending it out.

-----

      For cryptographic verification of the source
      SRTP would require additional security mechanisms, for example
      TESLA for SRTP [RFC4383], that are not part of the base WebRTC
      standards.

[JS NOTES] - This is the first I've seen mention of TESLA and RFC4383 in
regard to WebRTC security. My gut tells me that, rather than referencing a
particular doc here, a relevant section of
[draft-ietf-rtcweb-security-arch] should be cited.