Re: [AVTCORE] Comments on draft-lennox-rtcweb-rtp-media-type-mux-00

Magnus Westerlund <magnus.westerlund@ericsson.com> Thu, 10 November 2011 14:40 UTC

Message-ID: <4EBBE238.20103@ericsson.com>
Date: Thu, 10 Nov 2011 15:39:52 +0100
From: Magnus Westerlund <magnus.westerlund@ericsson.com>
To: Harald Alvestrand <harald@alvestrand.no>
References: <4EB7B054.3000706@ericsson.com> <4EB7B4D8.5050003@alvestrand.no> <4EBBADD5.4020501@ericsson.com> <4EBBC481.5010504@alvestrand.no>
In-Reply-To: <4EBBC481.5010504@alvestrand.no>
Cc: Jonathan Lennox <jonathan@vidyo.com>, Jonathan Rosenberg <jonathan.rosenberg@skype.net>, IETF AVTCore WG <avt@ietf.org>
Subject: Re: [AVTCORE] Comments on draft-lennox-rtcweb-rtp-media-type-mux-00

Hi,

Comments inline on both topics.

On 2011-11-10 13:33, Harald Alvestrand wrote:
> This response has two parts:
>
> - One discussion about the scenarios Magnus is positing in his note
> (first half)
> - One discussion about evaluating the approaches proposed (second half)
>
> People who want to comment on just one of them may want to fork the
> discussion.
>
> On 11/10/2011 11:56 AM, Magnus Westerlund wrote:
>> On 2011-11-07 11:37, Harald Alvestrand wrote:
>>
>> The reason for that behavior is that they will think this issue is a
>> non-issue: short-term thinking rather than long-term implications.
>> They will want to avoid the small but real risk of establishing
>> more NAT pinholes. Also, to be certain that this does not occur, you
>> either have zero possibility to use the non-bundled session to a
>> central node, or you configure your central node to never use bundling.
>> I think more than one developer will look at this and ignore the
>> issue, because if you don't think it through it does look simple on
>> the surface: you simply translate the SSRCs if you end up in a
>> collision.
>
> I still don't get this argument. Can you describe in more detail a
> situation where you think trouble will occur, which does not involve a
> transport translator?

The most likely case, I think, is the legacy gateway. Yes, the purpose
of the fallback is to avoid it. However, if you have a SIP SBG that
needs to do ICE termination anyway, you might decide that it looks
simpler to have bundled media on the WebRTC side, with only one
instance of ICE termination, and to do the media separation into two
RTP sessions in the SBG, than to have two ICE terminations and not do
the session separation.

But I do agree, there is nothing forcing an implementation to do this.
I only fear that implementors, in their search for features to
differentiate themselves, will put something on the spec sheet for an SBG
to a SIP or IMS system that says:

- Support WebRTC Bundle of media types, improved NAT traversal performance!


>
>>
>> When it comes to using a common session. I agree that for media mixing
>> central nodes (RTP mixer), the RTP session is anyway almost individual
>> between each end-point and end-point. The only information you forward
>> between the legs are the CSRC CNAMEs to ensure the possibility to bind
>> the information to the right sources.
> Not even that, for the case of central nodes that don't do source mixing.
>>
>> But, when looking at the cheapest central node solution, the transport
>> relay (transport translator), the common RTP session becomes
>> interesting as it limits the central node's processing to basically a
>> packet forwarder. This gives you on the order of 1000 times higher
>> capacity in the number of streams the central node can handle compared
>> to one that performs media operations given the same general purpose
>> hardware. Thus having a common RTP session and not having to change
>> anything within the packets, not even having to perform decryption and
>> encryption cycles in the relay is a big bonus.

> Can you give some reference for the claim of "1000 times higher capacity"?

I can give you the motivation for why I believe 1000 is the right
ballpark figure.

If you do a pure packet forwarder, what you have to do is:
- Receive the packet from a given UDP flow, look up the corresponding
outgoing UDP flows, and forward it out on all of these flows.

- If you have a media mixing functionality then you will have to:
   - Receive the RTP packet
   - Decrypt the RTP packet
   - Decode the RTP packet
   - Perform up to one mix per destination with this media and others
   - Encode the mix
   - Encrypt the mix

That is at least five different steps (decryption, decoding, mixing,
encoding, encryption), each of which costs a number of CPU operations
per byte of data. G.719, for example, has a complexity figure of
roughly 18 MIPS. How many RTP packets can you forward using 18 MIPS?
Quite a lot.

- If it is a non-mixing entity, like an RTP mixer that switches media
streams, i.e. sends them out using the mixer's SSRCs, then you still
have the decryption and encryption cycle, even if the media mixing,
encoding and decoding go away. So this is clearly less complex, but
still heavier than just forwarding the packet.
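
To make the forwarder case concrete, here is a minimal sketch of the
per-packet work in a transport relay. The class and method names are
hypothetical and the sockets are left out; the only point is that the
relay does a table lookup and a copy, and never looks inside the packet.

```python
from typing import List, Set, Tuple

# A UDP flow is identified here by (IP address, port). Illustrative only.
Flow = Tuple[str, int]


class TransportRelay:
    """Hypothetical relay: forwards datagrams between flows unchanged."""

    def __init__(self) -> None:
        self.members: Set[Flow] = set()

    def join(self, flow: Flow) -> None:
        # Register a participant's UDP flow with the relay.
        self.members.add(flow)

    def route(self, source: Flow, packet: bytes) -> List[Tuple[Flow, bytes]]:
        # Look up the outgoing flows and hand back the same bytes for each.
        # No decryption, decoding, mixing, encoding, or encryption happens.
        return [(dest, packet) for dest in sorted(self.members) if dest != source]
```

A mixer or a switching mixer would have to add at least a decrypt and an
encrypt step around `route`, which is where the cost difference comes from.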

>
> Still, I think we may be talking about CPU time per packet well below a
> millisecond for the middle case.

Yes, but if your goal for how many streams a single media range server 
should handle is in the region of 100000 or more, then it does matter.

>
> I have serious problems with evaluating
> draft-westerlund-avtcore-transport-multiplexing, because I still
> disagree with its problem description.
>
> That problem description is described in section 3.1 as "allow
> separation of streams that have different purposes, for example
> different media types".
>
> I agree fully with the part before the comma; I disagree fully with the
> part after the comma. The idea that the *infrastructure needs to
> dictate* that different media types have different purposes is very far
> from self-evident, and I believe it to be false.

I agree that the application should have the freedom to choose. And the
goal of the RTP Multiplexing architecture document is to provide
guidance to the application in making those choices.

And I think my guidance so far is that separate media types do have an
advantage in using separate RTP sessions. However, it is clear that if
the trade-off is between additional UDP flows and having to build more
elaborate demultiplexing of the incoming RTP packets based on the
payload type, I would also have chosen a single RTP session.

But if we remove that consideration from the equation, isn't it still a
reasonably good choice to separate media types from each other, for the
sole reason that the processing of the different media types simply is
different?
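
For reference, the payload-type demultiplexing I mention for the single
RTP session case could look roughly like the sketch below. The payload
type is the low 7 bits of the second byte of the RTP header; the mapping
from payload type to media kind is a made-up example of what signaling
might have negotiated, and the function names are hypothetical.

```python
# Hypothetical negotiated mapping from dynamic payload types to media kinds.
PT_TO_MEDIA = {96: "audio", 97: "video"}


def rtp_payload_type(packet: bytes) -> int:
    """Return the 7-bit payload type from a fixed RTP header."""
    if len(packet) < 12:
        raise ValueError("shorter than the fixed RTP header")
    if packet[0] >> 6 != 2:
        raise ValueError("not RTP version 2")
    # The top bit of the second byte is the marker bit; mask it off.
    return packet[1] & 0x7F


def classify(packet: bytes) -> str:
    """Pick the media processing chain for a packet in a shared session."""
    return PT_TO_MEDIA.get(rtp_payload_type(packet), "unknown")
```

With separate RTP sessions this step is not needed, since the session
itself already tells the receiver which processing chain applies.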

>> And I think it is important we do consider what happens when WebRTC is
>> successful. I fear the desire to interoperate will kill off the RTP
>> session completely as a useful tool. Instead all RTP users will end up
>> with a single RTP session and we have to re-invent mechanisms for
>> separation when the shortcoming of not having a layer of
>> multiplexing for media stream purposes becomes apparent.
>>
>> I know that the window for this is closing, but I think we should at
>> least have some reconsideration if the direction proposed in Quebec
>> really was the most appropriate.
>
> In summary: I believe that pursuing multiplexing of RTP sessions across
> a single transport connection is a worthwhile pursuit, and the
> multiplexing shim seems a reasonable approach to make this work.
>
> I believe that the RTP users are capable of evaluating for themselves
> when using multiple RTP sessions is appropriate, and that the current
> shortcoming of SDP that forces them to use multiple RTP sessions when
> the media they are sending is classified under different top level MIME
> types should be addressed - and that
> draft-homberg-mmusic-sdp-bundle-negotiation addresses this in a fashion
> that does not do damage to the rest of the SDP/RTP ecosystem.


Then my argument towards WebRTC is that it should choose the RTP
shim-based multiplexing also as the realization of the case when you
have multiple media types in one RTP session. The shim-based solution
clearly can support that as well, both in the media plane and in the
signaling.

The reason is that if WebRTC chooses the single RTP session over a
lower-layer transport as the only implemented choice, then application
developers who could have use for multiple RTP sessions will still be
locked into two choices if they want to interoperate with WebRTC
end-points:

A) Using multiple media types in one RTP session over a single UDP flow

B) Using multiple RTP sessions over independent UDP flows.

Using the shim, in contrast, would enable independent RTP sessions over
a single UDP flow. It would thus be capable of handling all the use
cases, including those towards gateways and transport translators where
one leg doesn't have the same capabilities as the other when it comes to
using multiple RTP sessions over a single transport. This prevents a
lock-in effect to a potentially problematic solution, even if one does
want to run multiple media types in a single RTP session.
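
As a rough illustration of what the shim buys you, here is a sketch of
several RTP sessions sharing one UDP flow. Note that the one-byte
session identifier and its position at the end of the datagram are
assumptions made for this example only; the actual format is whatever
the draft specifies. The function names are hypothetical.

```python
from typing import Tuple


def shim_wrap(session_id: int, rtp_packet: bytes) -> bytes:
    """Tag a packet with the RTP session it belongs to (assumed layout)."""
    if not 0 <= session_id <= 255:
        raise ValueError("session id must fit in one byte")
    return rtp_packet + bytes([session_id])


def shim_unwrap(datagram: bytes) -> Tuple[int, bytes]:
    """Split a received datagram into (session id, original packet)."""
    if not datagram:
        raise ValueError("empty datagram")
    return datagram[-1], datagram[:-1]
```

A gateway towards a leg without shim support would only need to strip
the identifier and pick the matching UDP flow; the RTP packet itself,
including its SSRC, stays untouched.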

Cheers

Magnus Westerlund

----------------------------------------------------------------------
Multimedia Technologies, Ericsson Research EAB/TVM
----------------------------------------------------------------------
Ericsson AB                | Phone  +46 10 7148287
Färögatan 6                | Mobile +46 73 0949079
SE-164 80 Stockholm, Sweden| mailto: magnus.westerlund@ericsson.com
----------------------------------------------------------------------