Re: [MMUSIC] Proposal for what bundle should say about demux

Colin, 

I've been deliberately merging layers A, B, and C into one in my discussion of this, focusing on the media stack code that has to deal with all those layers. The reason I have been doing that is that people just seemed confused about what all the parts of RTP are, such as RTP sessions. But most people seemed to understand which bits in the various packets they could look at and what data had to end up in the right spot in the end. 

At a high level, I would draw the layers a little differently than you, in that many of the implementations I work with do the jitter buffering after the data is decoded. This allows the implementations to do better packet loss concealment and to play rate games where they speed up or slow down playback very slightly to smoothly adapt the jitter buffer depth. 

But regardless of the details of the order, the jitter buffer needs to look at the SSRC and if the SSRC has changed, it probably needs to retrain the jitter buffer. 
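
As a rough illustration of that last point, here is a minimal Python sketch of a jitter buffer that retrains when the SSRC changes. The class name, the fields, and the choice to simply flush on a change are all assumptions made for the example; real implementations adapt their depth estimates in far more elaborate ways.

```python
class JitterBuffer:
    """Toy jitter buffer that retrains when the SSRC changes (illustrative only)."""

    def __init__(self):
        self.ssrc = None        # SSRC of the source currently being buffered
        self.packets = {}       # sequence number -> payload
        self.retrain_count = 0  # how many times we had to start over

    def push(self, ssrc, seq, payload):
        if self.ssrc is not None and ssrc != self.ssrc:
            # New source: the old sequence space and timing statistics no
            # longer apply, so drop what we have and retrain (assumed policy).
            self.packets.clear()
            self.retrain_count += 1
        self.ssrc = ssrc
        self.packets[seq] = payload
```

The only thing the sketch is meant to show is that the SSRC check lives inside the buffer, wherever that buffer happens to sit relative to the decoder.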

Let's talk a bit more about your diagram and what happens when Alice creates an Offer that does not use bundle and has two audio codecs. When RTP packet #1 gets sent to Alice and arrives before the answer, something in layer B/C needs to look at the PT value to decide which codec it goes to. Alice knows nothing about what SSRC the other side is using. Now let's say that packet #2 arrives (again before the answer) and packet #2 is on the same port, with the same PT, but has a different SSRC. None of the implementations I work with for audio would allocate a new PB or CR. Instead they would pass packet #2 through the same codec instance that packet #1 had used and expect things like the jitter buffer to note the SSRC change and act accordingly. For DSP-based audio systems, where the actual codec instances are loaded into the DSP and have a reserved set of resources allocated for them, this is a very common design. 
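
To make that scenario concrete, here is a small Python sketch, assuming a fixed table of codec instances keyed by (port, PT) that is built from the offer. The names `CodecInstance`, `build_codec_table`, and `route`, and the port number in the usage note, are hypothetical; the point is only that the SSRC plays no part in choosing the instance.

```python
class CodecInstance:
    """Stand-in for a pre-allocated codec slot (e.g. one loaded into a DSP)."""

    def __init__(self, name):
        self.name = name
        self.last_ssrc = None
        self.ssrc_changes = 0

    def decode(self, ssrc, payload):
        # The instance is reused across sources; it only notes the change
        # so that downstream pieces (jitter buffer, etc.) can react.
        if self.last_ssrc is not None and ssrc != self.last_ssrc:
            self.ssrc_changes += 1
        self.last_ssrc = ssrc
        return (self.name, payload)


def build_codec_table(offer):
    """offer: iterable of (port, payload_type, codec_name) taken from the offer."""
    return {(port, pt): CodecInstance(name) for port, pt, name in offer}


def route(table, port, pt, ssrc, payload):
    """Pick the codec instance from port and PT alone; the SSRC is not a key."""
    instance = table.get((port, pt))
    if instance is None:
        return None  # nothing offered for this port/PT combination
    return instance.decode(ssrc, payload)
```

With an offer of PCMU (PT 0) and PCMA (PT 8) on port 5004, packets #1 and #2 with PT 0 and different SSRCs both land on the same PCMU instance, and no new instance is allocated for the second SSRC.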

There is probably more variety in video applications, but many conference bridges act this way. 

So I have no problem if things split on SSRC and whatever processing can be done knowing only that is done. That probably includes RTCP, FEC, RTX, who knows. But before the packet can be decoded by the codec, something needs to consider the PT and other things, such as the port when not doing bundle, to make this work. Keep in mind that things like PLI are much more likely to be sent at the layer near the codec than at layer B, so layers B and C are not as cleanly separated as one might wish. 

That's a long way of saying I was talking about the aggregate demux in the media stack, which includes whatever happens at layers B and C. If layer B only uses the SSRC, that's fine with me. I am trying to say what happens in the combination of layers B and C put together.
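
For reference, the aggregate decision I have in mind is the four steps from my original proposal (quoted at the bottom of this message). A minimal Python sketch follows, assuming packets are plain dicts and the three lookup tables are filled in from signaling. The function name, the `ext_id` field, and the table shapes are all made up for illustration.

```python
def select_pipeline(packet, explicit_map, pt_map, ssrc_map):
    """Return the pipeline for a packet, or None if no mapping is known yet.

    packet:       dict with "pt" and "ssrc", optionally "ext_id" (hypothetical
                  identifier carried by some future explicit mechanism)
    explicit_map: ext_id -> pipeline                        (step 1)
    pt_map:       PT -> pipeline, unique PTs only           (step 2)
    ssrc_map:     SSRC -> pipeline, only SSRCs we were told (step 3)
    """
    ext_id = packet.get("ext_id")
    if ext_id is not None and ext_id in explicit_map:
        return explicit_map[ext_id]
    if packet["pt"] in pt_map:
        return pt_map[packet["pt"]]
    if packet["ssrc"] in ssrc_map:
        return ssrc_map[packet["ssrc"]]
    # Step 4: unknown mapping; the caller may discard or buffer the packet.
    return None
```

Because the PT check comes before the SSRC check, a packet whose unique PT and SSRC disagree follows the PT, which is the behavior the proposal argues for in the SSRC collision case.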

What terminology should I be using to avoid causing confusion?

On May 27, 2013, at 7:57 AM, Colin Perkins <csp@csperkins.org> wrote:

> I don't agree with the phrasing about "packet processing pipelines", but can't tell if this is a terminology disagreement or a more fundamental disconnect. The way I see the demultiplexing logically working is:
> 
>                    |
>                    | packets
>    +--             v
>    |           +----------+
>    |           |UDP socket|
> A  |           +----------+
>    |        RTP ||  |  |
>    |   and RTCP ||  |  +------> SCTP
>    |            ||  +---------> STUN/ICE
>    +--          ||
>    +--          ||
>    |       split by SSRC
>    |       ||   ||   ||
>    |       ||   ||   ||
> B  |      +--+ +--+ +--+
>    |      |PB| |PB| |PB| Playout buffer, process RTCP, FEC, etc.
>    |      +--+ +--+ +--+
>    +--      |   |     |
>    +--      |  /      |
>    |        +---+     |
>    |         /  |     |
> C  |      +--+ +--+ +--+
>    |      |CR| |CR| |CR| Codecs and rendering
>    |      +--+ +--+ +--+
>    +--
> 
> If your algorithm is for demultiplexing at layer C, i.e., to figure out what codec and rendering pipeline to use, then I think we're in agreement apart from terminology. 
> 
> For layer B, I believe the SSRC is the right thing to use to demultiplex, and fits with RTP and RTCP. This is where RTCP is processed, playout de-jitter buffering happens, FEC is processed, NACKs are sent, etc. It's logically independent of the decoding and rendering process since you can start filling your de-jitter buffer for an SSRC before you figure out if/how/where you're going to render that SSRC. 
> 
> For layer A, there was a clear-cut way of doing this with RTP, RTCP, and STUN. I haven't looked at SCTP enough to know how that affects things. I do think it's a logically separate issue, and should be documented separately from BUNDLE though, since the same issues arise with non-bundled sessions.
> 
> An implementation might merge these together, of course, but to avoid confusion the standards should be clear what level they're considering.
> 
> Colin
> 
> 
> 
> On 27 May 2013, at 14:15, Cullen Jennings (fluffy) wrote:
>> Great - sounds like we agree this algorithm will work.
>> 
>> On May 27, 2013, at 6:41 AM, Colin Perkins <csp@csperkins.org> wrote:
>> 
>>> I'm not sure I agree.
>>> 
>>> As I said in my previous message to the list, if we are agreed that the m= lines in a BUNDLE group form a single RTP session, then I believe we need unique payload types across all m= lines. In this case, BUNDLE can simply say that regular RTP source demultiplexing based on the SSRC has to be performed, then the payload type can be used to match sources to m= lines for those applications that care about doing so. 
>>> 
>>> If we're not agreed that the m= lines in a BUNDLE group form a single RTP session, then we have a lot more to discuss...
>>> 
>>> Colin
>>> 
>>> 
>>> 
>>> On 23 May 2013, at 19:02, Cullen Jennings (fluffy) wrote:
>>>> Here is my proposal for roughly what the bundle draft should say about this demux topic:
>>>> 
>>>> The application will decide which packet processing pipeline to pass a given RTP/RTCP packet to based on what the application knows:
>>>> 
>>>> 1) If future RFCs define new things (like an RTP header extension) that explicitly specify the mapping, check whether that future RFC is in use and, if so, use it to form the mapping 
>>>> 
>>>> 2) If the PT uniquely identifies a mapping, use that to form the mapping
>>>> 
>>>> 3) If the application knows the SSRC the other side will use, use that to form the mapping 
>>>> 
>>>> 4) If there is no way to know which pipeline to pass it to, the packet MAY be discarded or the application MAY decide to buffer it until the mapping is known 
>>>> 
>>>> This is trivial to implement. It meets the requirements for Plan A, Plan B, UCIF, CLUE, and so on. 
>>>> 
>>>> We could swap the order of steps 2 and 3. My thinking for this order was that the only time the order makes any difference is if the PT is unique and indicates a different mapping than the SSRC. The only way this could happen is with an SSRC collision, so the PT is the one that would be correct, not the SSRC. If someone feels strongly that the order of steps 2 and 3 should be the other way around, I can live with that.
>>>> 
>>>> 
>>>> _______________________________________________
>>>> mmusic mailing list
>>>> mmusic@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/mmusic
>>> 
>>> 
>>> 
>>> -- 
>>> Colin Perkins
>>> http://csperkins.org/