Re: [MMUSIC] [avtext] framemarking: add frame size info

Hi Miguel,

This was discussed at IETF 97 during the AVTEXT session on Frame Marking.
See the slides and minutes.
https://datatracker.ietf.org/meeting/97/session/avtext

The recommended and agreed solution was to use RID rather than add frame size
in the Frame Marking header extension.

Thanks,
Mo

From: mmusic <mmusic-bounces@ietf.org> on behalf of Miguel París Díaz <mparisdiaz@gmail.com>
Date: Monday, March 27, 2017 at 3:43 AM
To: Jonathan Lennox <jonathan@vidyo.com>
Cc: "avtext@ietf.org" <avtext@ietf.org>, "mmusic@ietf.org" <mmusic@ietf.org>
Subject: Re: [MMUSIC] [avtext] framemarking: add frame size info

Hello again,
is anybody considering this proposal, or does nobody see the benefits?

Kind regards!!

2016-11-10 15:18 GMT+01:00 Miguel París Díaz <mparisdiaz@gmail.com>:
Hello,
in the new draft of sdp-simulcast an "RTP Aspect" section [1] has been added, which explains how the media is handled at the RTP level.

Specifically, the Media-Switching Mixer section [2] makes the same points I raised:

   This section discusses the behavior in cases where the RTP middlebox
   behaves like the Media-Switching Mixer (Section 3.6.2) in RTP
   Topologies [RFC7667].  The fundamental aspect here is that the media
   sources delivered from the middlebox will be the mixer's conceptual
   or functional ones.  For example, one media source may be the main
   speaker in high resolution video, while a number of other media
   sources are thumbnails of each participant.

   The above results in that the RTP stream produced by the mixer is one
   that switches between a number of received incoming RTP streams for
   different media sources and in different simulcast versions.  The
   mixer selects the media source to be sent as one of the RTP streams,
   and then selects among the available simulcast streams for the most
   appropriate one.  The selection criteria include available bandwidth
   on the mixer to receiver path and restrictions based on the
   functional usage of the RTP stream delivered to the receiver.  An
   example of the latter, is that it is unnecessary to forward a full HD
   video to a receiver if the display area is just a thumbnail.  Thus,
   restrictions may exist to not allow some simulcast streams to be
   forwarded for some of the mixer's media sources.

In our case, to provide this feature we currently have to depayload the RTP packets and apply a different parser for each codec to read the frame size, which reduces the scalability of the system and complicates the implementation considerably.
Because of that, I think that carrying frame size (width and height) in the Frame Marking RTP header extension would make this kind of use case easy and efficient to implement (in the same way that the audio-level header extension spares the middlebox from analysing the audio itself).
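
To make the cost concrete, here is a minimal sketch of what such per-codec parsing involves, for VP8 only. It assumes the RTP payload layout of RFC 7741 and the key-frame header of RFC 6386; the function name `vp8_keyframe_size` is hypothetical and is not taken from any implementation discussed in this thread. The dimensions are only present in key frames, so inter frames yield nothing.

```python
from typing import Optional, Tuple


def vp8_keyframe_size(payload: bytes) -> Optional[Tuple[int, int]]:
    """Return (width, height) if this RTP payload starts a VP8 key frame.

    Hypothetical helper: assumes `payload` is the RTP payload with the
    RTP header and any header extensions already stripped.
    """
    if not payload:
        return None
    first = payload[0]
    offset = 1
    starts_partition = bool(first & 0x10)  # S bit
    partition_id = first & 0x07            # PID field
    if first & 0x80:                       # X bit: extension byte present
        if len(payload) <= offset:
            return None
        ext = payload[offset]
        offset += 1
        if ext & 0x80:                     # I bit: PictureID present
            if len(payload) <= offset:
                return None
            # M bit selects a 15-bit (2 bytes) or 7-bit (1 byte) PictureID
            offset += 2 if payload[offset] & 0x80 else 1
        if ext & 0x40:                     # L bit: TL0PICIDX present
            offset += 1
        if ext & 0x30:                     # T or K bit: TID/KEYIDX byte
            offset += 1
    # The frame header only appears at the start of the first partition.
    if not starts_partition or partition_id != 0:
        return None
    header = payload[offset:offset + 10]
    if len(header) < 10:
        return None
    if header[0] & 0x01:                   # P bit set: not a key frame
        return None
    if header[3:6] != b"\x9d\x01\x2a":     # key-frame start code
        return None
    # 14 bits of size plus 2 bits of scaling, little-endian
    width = int.from_bytes(header[6:8], "little") & 0x3FFF
    height = int.from_bytes(header[8:10], "little") & 0x3FFF
    return width, height
```

An equivalent parser would be needed for every other codec the middlebox forwards (H.264 SPS, VP9 uncompressed header, and so on), which is the overhead described above.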

I am adding the MMUSIC group to the thread, because I think this should also be discussed in the context of the simulcast case.

Best!!

Refs
[1] https://tools.ietf.org/html/draft-ietf-mmusic-sdp-simulcast-06#section-7.2
[2] https://tools.ietf.org/html/draft-ietf-mmusic-sdp-simulcast-06#section-7.2.1

2016-08-30 12:37 GMT+02:00 Miguel París Díaz <mparisdiaz@gmail.com>:
I assume that the media distributor has the information from the SDP (it performs the SDP negotiation with each "client"), but the point is that encoders may change the video size depending on the available bandwidth, the complexity of the video source, etc., unless the sender forces the encoders' configuration to a fixed frame size...

2016-08-26 19:07 GMT+02:00 Jonathan Lennox <jonathan@vidyo.com>:
(As an individual.)

In the latest version of simulcast the media distributor would need the RID values, not the PT values, but the idea is the same — it needs the SDP.

Note that if the media distributor doesn’t have information from the SDP it can’t reliably identify the frame marking header extension at all, since header extension IDs are negotiated. So I’m not sure how much benefit there is to putting the size in the header extension.
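
As an illustration of that dependency, the sketch below shows how a middlebox might look up the negotiated ID from `a=extmap` lines in the SDP. It is a simplified assumption-laden example: the helper name `extmap_ids` is hypothetical, the URI is the one I understand the frame marking draft to register, and a real implementation would track IDs per media section rather than across the whole SDP.

```python
import re
from typing import Dict

# Assumed URI for the frame marking extension; check the draft before relying on it.
FRAME_MARKING_URI = "urn:ietf:params:rtp-hdrext:framemarking"

# a=extmap:<id>["/"<direction>] <URI> [<extension attributes>]
_EXTMAP = re.compile(r"^a=extmap:(\d+)(?:/\S+)?\s+(\S+)")


def extmap_ids(sdp: str) -> Dict[str, int]:
    """Map each header extension URI found in the SDP to its negotiated ID.

    Hypothetical helper: ignores m= section boundaries for brevity.
    """
    ids: Dict[str, int] = {}
    for line in sdp.splitlines():
        match = _EXTMAP.match(line.strip())
        if match:
            ids[match.group(2)] = int(match.group(1))
    return ids


sdp = "\r\n".join([
    "m=video 49170 RTP/AVP 98",
    "a=extmap:1/sendonly urn:ietf:params:rtp-hdrext:ssrc-audio-level",
    "a=extmap:3 " + FRAME_MARKING_URI,
])
frame_marking_id = extmap_ids(sdp).get(FRAME_MARKING_URI)
```

Without this mapping, the ID byte in the RTP header extension cannot be tied to frame marking, whatever fields the extension itself carries.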

That said, if we envision a scenario where encoders might be frequently changing their video size (in response to available network bandwidth, or the like), it might be useful for encoders to be able to indicate the current size they’re encoding without needing to send updated SDP all the time.

On Aug 26, 2016, at 12:52 PM, Paul E. Jones <paulej@packetizer.com> wrote:

Miguel,

You make the assumption that the media distributor will not see the SDP, I suppose.  While certainly a valid model, I'll admit that I had personally assumed any media forwarding function would see the SDP (or at least be told the PT values and any relevant flow information similar to what RFC 6236 provides) and would thus know which PT values correspond to what video resolutions if simulcast is employed.

Paul

------ Original Message ------
From: "Miguel París Díaz" <mparisdiaz@gmail.com>
To: avtext@ietf.org
Sent: 8/25/2016 10:12:48 AM
Subject: [avtext] framemarking: add frame size info

Hello,
it would be great to have frame size (width and height) info in the Frame Marking RTP header extension [1].

Why?
For example, when using simulcast in an SFU, selecting the stream by size would ease application development and improve the user experience.
Application developers don't usually have deep knowledge of media parameters such as bitrate, but they do know which video size has to be rendered in the GUI, which may depend on the client where the app is running: a mobile, a PC with a 13" screen, a PC with a 27" screen, etc.

Taking a videoconference app as an example, if a participant selects another participant to be rendered as the main video, the app could ask the SFU to select the video quality that best matches an 800x600 size.
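
A minimal sketch of such a selection, assuming the SFU already knows the current size of each simulcast layer by some means. The function name `pick_layer`, the layer identifiers and the sizes are all hypothetical, and "best match" is taken here to mean the smallest layer that covers the requested size, falling back to the largest available one.

```python
from typing import Dict, Tuple


def pick_layer(layers: Dict[str, Tuple[int, int]], want_w: int, want_h: int) -> str:
    """Pick the simulcast layer that best fits the requested render size.

    Hypothetical policy: prefer the smallest layer that is at least as
    large as the request in both dimensions; otherwise use the largest.
    """
    if not layers:
        raise ValueError("no simulcast layers available")
    covering = [
        (w * h, layer_id)
        for layer_id, (w, h) in layers.items()
        if w >= want_w and h >= want_h
    ]
    if covering:
        return min(covering)[1]
    return max((w * h, layer_id) for layer_id, (w, h) in layers.items())[1]


# Illustrative layer table; identifiers and sizes are made up.
layers = {"q": (320, 240), "h": (640, 480), "f": (1280, 960)}
main_video = pick_layer(layers, 800, 600)   # selects "f"
thumbnail = pick_layer(layers, 160, 120)    # selects "q"
```

Other policies (closest area, never upscale, bandwidth caps) would fit the same interface; the point is only that the application can express its request in pixels.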

What do you think about this idea?

Thanks and best regards!!

Refs
[1] https://tools.ietf.org/html/draft-ietf-avtext-framemarking-02

--
Miguel París Díaz
------------------------------------------------------------------------
Computer/Software engineer.
Researcher and architect in http://www.kurento.org
http://twitter.com/mparisdiaz
------------------------------------------------------------------------
_______________________________________________
avtext mailing list
avtext@ietf.org
https://www.ietf.org/mailman/listinfo/avtext
