Re: [MMUSIC] 3D format negotiation in draft-greevenbosch-mmusic-signal-3d-format

"Pedro Capelastegui" <capelastegui@dit.upm.es> Wed, 16 November 2011 14:03 UTC

From: Pedro Capelastegui <capelastegui@dit.upm.es>
To: 'Bert Greevenbosch' <Bert.Greevenbosch@huawei.com>, 'IETF - MMUSIC' <mmusic@ietf.org>
References: <002801cc9a45$69afeda0$3d0fc8e0$@dit.upm.es> <002901cc9a46$771e6290$655b27b0$@dit.upm.es> <46A1DF3F04371240B504290A071B4DB6231D5D12@szxeml509-mbs.china.huawei.com>
In-Reply-To: <46A1DF3F04371240B504290A071B4DB6231D5D12@szxeml509-mbs.china.huawei.com>
Date: Wed, 16 Nov 2011 15:03:24 +0100

Hi Bert,

Thanks for answering. Answers inline.

Regards,
Pedro

> -----Original Message-----
> From: Bert Greevenbosch [mailto:Bert.Greevenbosch@huawei.com]
> Sent: Tuesday, November 15, 2011 9:00 AM
> To: Pedro Capelastegui; 'IETF - MMUSIC'
> Subject: RE: [MMUSIC] 3D format negotiation in draft-greevenbosch-mmusic-signal-3d-format
> 
> Hi Pedro,
> 
> Sorry for my late response; I have been quite busy lately.
> 
> Thanks for your e-mail, and for thinking with us about how to move forward
> with the 3D drafts.
> 
> To summarise your proposal:
> (1) Point at multiple possible video streams in a single media "m=" line
> (i.e. a group of video streams per one media line; I'll call that a "media
> group").
> (2) Stipulate that per media group, only one video stream can be chosen.
> (3) In one of the media groups, use the "depend" attribute to indicate
> which video streams are associated with which video streams from the other
> media group(s).
> (4) Use "3DS" grouping + "mid" attribute to label the media groups; needed
> for the "depend" attribute.
> (5) Use the payload type to differentiate the streams within a media
> group.

Your summary is correct. However, I'd like to comment on the terminology.
I think it's best to refer to the "m=" lines as "media descriptions", and to
the configurations associated with each payload type number as "media
formats", since this matches the vocabulary used in other RFCs.

Your use of "video streams" and "media groups" could be helpful in drawing
parallels between both proposals, but it wouldn't be entirely accurate, for
the following reasons:
	- A media descriptor ("media group") in my proposal is not
	  equivalent to a "3DS" group of media streams in the current draft.
	- A media format ("video stream") in my proposal may actually
	  include multiple video streams, either through frame packing, or by
	  using 2D+auxiliary data formats.

> I have thought about your proposal, and the following issues arose:
> (A) The usage of the payload type requires a different payload type for
> each video stream. This is OK for streams that use a dynamic payload type.
> However, certain RTP streams have fixed payload types. For example,
> MPEG-2 video over MPEG Transport Stream over RTP has fixed payload type
> 33. With your proposal, it is not possible to do 3D simulcast of L- and R-
> streams in MPEG-2.

This is something I hadn't taken into account. However, I'm not sure if it
could be a problem. As far as I can tell, all a static payload type implies
is that an encoding (such as MPEG-2) is mapped by default to a given number
(33, in this case). But I haven't been able to find a rule that prevents the
same encoding from being mapped to a dynamic payload type number, after
checking several RFCs about SDP and RTP. Is there any such rule that I may
have missed? Would this remapping of encodings with static payload types
break anything in existing implementations, otherwise?
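
To make the question concrete, here is a minimal Python sketch of the
remapping I have in mind. It assumes the RTP/AVP static assignment 33 = MP2T
and the dynamic range 96-127; the helper names are hypothetical and not
taken from any draft. Whether existing implementations would accept such an
offer is exactly what I'm asking about.

```python
# Hypothetical helpers for illustration only.
STATIC_MP2T_PT = 33                 # static RTP/AVP payload type for MP2T
DYNAMIC_PT_RANGE = range(96, 128)   # dynamic payload type numbers (96-127)


def rtpmap(pt: int, encoding: str, clock_rate: int) -> str:
    """Format an SDP rtpmap attribute line."""
    return f"a=rtpmap:{pt} {encoding}/{clock_rate}"


def remap_to_dynamic(encoding: str, clock_rate: int, used: set):
    """Bind an encoding to the first unused dynamic payload type."""
    for pt in DYNAMIC_PT_RANGE:
        if pt not in used:
            used.add(pt)
            return pt, rtpmap(pt, encoding, clock_rate)
    raise ValueError("no dynamic payload type left")


# Two MP2T media formats in one offer, neither using the static number 33.
used = {STATIC_MP2T_PT}
left_pt, left_line = remap_to_dynamic("MP2T", 90000, used)
right_pt, right_line = remap_to_dynamic("MP2T", 90000, used)
print(left_line)
print(right_line)
```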

Assuming that I've missed something, and static payload types cannot be
remapped, it's an issue that needs to be addressed in my proposal. On the
bright side, it doesn't prevent signalling 3D formats while using codecs
with static payload types - following your example, it WOULD be possible to
do 3D simulcast of L and R streams in MPEG-2, such as in the following SDP:

	v=0
	o=Alice 2890844526 2890842807 IN IP4 131.163.72.4
	s=The technology of 3D-TV
	c=IN IP4 131.164.74.2
	t=0 0
	a=group:3DS 1 2
	m=video 49170 RTP/AVP 33
	a=rtpmap:33 MP2T/90000
	a=3dFormat:33 SC L
	a=mid:1
	m=video 49172 RTP/AVP 33
	a=rtpmap:33 MP2T/90000
	a=3dFormat:33 SC R
	a=mid:2
	a=depend:33 3dv 1:33
	m=audio 52890 RTP/AVP 10
	a=rtpmap:10 L16/16000/2
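
As a rough sketch of how an answerer might read an offer shaped like the
one above, the following Python splits the SDP into media descriptions and
resolves the "depend" line. The offer is retyped inside the sketch with
"a=depend:" and "a=rtpmap:" in colon form and both "3dFormat" lines keyed on
payload type 33, which is an assumption of the sketch. The function names
are hypothetical, and the parser handles only one dependency per line.

```python
from collections import defaultdict

# Offer retyped for this sketch (assumed colon forms, payload type 33).
OFFER = """\
v=0
o=Alice 2890844526 2890842807 IN IP4 131.163.72.4
s=The technology of 3D-TV
c=IN IP4 131.164.74.2
t=0 0
a=group:3DS 1 2
m=video 49170 RTP/AVP 33
a=rtpmap:33 MP2T/90000
a=3dFormat:33 SC L
a=mid:1
m=video 49172 RTP/AVP 33
a=rtpmap:33 MP2T/90000
a=3dFormat:33 SC R
a=mid:2
a=depend:33 3dv 1:33
m=audio 52890 RTP/AVP 10
a=rtpmap:10 L16/16000/2
"""


def parse_media(sdp):
    """Split an SDP body into media descriptions with their attributes."""
    media = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            kind, port, proto, *formats = line[2:].split()
            media.append({"kind": kind, "port": int(port), "proto": proto,
                          "formats": formats, "attrs": defaultdict(list)})
        elif line.startswith("a=") and media:
            # Session-level attributes (before the first m= line) are skipped.
            name, _, value = line[2:].partition(":")
            media[-1]["attrs"][name].append(value.strip())
    return media


def depends_on(media):
    """Map (mid, payload type) to (dependency type, target mid, target pt)."""
    deps = {}
    for m in media:
        mid = m["attrs"].get("mid", [None])[0]
        for value in m["attrs"].get("depend", []):
            fmt, dep_type, target = value.split()
            target_mid, target_fmt = target.split(":")
            deps[(mid, fmt)] = (dep_type, target_mid, target_fmt)
    return deps


media = parse_media(OFFER)
deps = depends_on(media)
print(deps)
```

Here the R format in media description 2 points back at the L format in
media description 1, even though both use payload type 33.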

Where static payload types would have a negative impact is in offering
multiple 3D formats. In the example above, the offerer could only add
another 3D format (say, simulcast 2D+parallax) by adding a new payload type,
but we wouldn't be able to use MP2T encoding for the alternative 3D format.

> (B) It is true, that the number of "m=" lines in your proposal is less
> than in the current draft. But this is only a virtual reduction, as now
> multiple video streams are advertised in a single media line. In
> principle, the same number of video streams is advertised.

This is one case where I think your choice of terminology can be confusing,
so I'll refer to what you call "video streams" as "media formats".

It is true that the number of media formats offered would remain the same
across proposals, so the overall complexity of the SDP would not change.
However, I believe reducing the number of media descriptors (media lines)
makes the negotiation process more efficient, for the following reasons:
	- NAT traversal. The complexity of NAT traversal usually increases
	  with the number of media descriptors in an SDP. As an example, ICE
	  needs to gather candidates for each RTP media stream (i.e. each
	  media descriptor); it is unaffected by the number of media formats
	  in any descriptor.
	- Firewalling. Likewise, each media descriptor in the SDP offer
	  usually requires a unique listening port (plus another for RTCP),
	  so adding more descriptors may complicate firewall configuration.
	- Subsequent offers. Not a huge issue, but under your proposal, if a
	  user agent sends a re-INVITE (e.g. to disable a stream), all
	  discarded media lines from the original offer must still be
	  included, cluttering the SDP (though their attributes may be
	  omitted).
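
The port argument can be sketched in a few lines of Python. The two offers
below are made-up illustrations, not taken from either proposal, and the
count assumes one RTP port plus one RTCP port per media descriptor (no
RTP/RTCP multiplexing). The function name is hypothetical.

```python
def transport_cost(sdp: str) -> dict:
    """Count media descriptors, offered formats and listening ports."""
    m_lines = [line.split() for line in sdp.splitlines()
               if line.startswith("m=")]
    # Fields of an m= line: media, port, proto, then one or more formats.
    formats = sum(len(fields) - 3 for fields in m_lines)
    return {
        "media_descriptors": len(m_lines),
        "formats": formats,
        "ports": 2 * len(m_lines),   # assumed: RTP + RTCP per descriptor
    }


# Four media formats, one per media line.
ONE_FORMAT_PER_M_LINE = "\n".join([
    "m=video 49170 RTP/AVP 96",
    "m=video 49172 RTP/AVP 97",
    "m=video 49174 RTP/AVP 98",
    "m=video 49176 RTP/AVP 99",
])

# The same four media formats, grouped under two media lines.
GROUPED_FORMATS = "\n".join([
    "m=video 49170 RTP/AVP 96 97",
    "m=video 49172 RTP/AVP 98 99",
])

spread = transport_cost(ONE_FORMAT_PER_M_LINE)
grouped = transport_cost(GROUPED_FORMATS)
print(spread)
print(grouped)
```

Both offers advertise the same number of formats, but under these
assumptions the grouped one needs half as many ports.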

> (C) With your proposal, in addition to the "3DS" grouping and the
> "3dFormat" attribute, the "depend" attribute needs also be supported, with
> the particular meaning as you have defined it.

Yes

> (D) Does having less "m=" lines justify the additional complexity of using
> the "depend" attribute?

I think so, but I wouldn't mind some feedback on the matter. Regarding the
costs, I think the complexity introduced by "a=depend" is relatively low,
though I admit I don't have direct experience using it in implementations.
On the other hand, if an implementation were already using "a=depend" for
other purposes, this cost would be negated. Current applications requiring
the attribute include H.264/SVC coding, for example. Also, H.264/MVC, when
it's finished, is highly likely to use "a=depend" as well, and I'd expect
that format to be widely adopted by 3D-capable UAs.

As for the benefits, I've discussed the merits of reducing "m=" lines above.
On top of that, a benefit of my overall approach which I hadn't mentioned
yet is that it's better suited for scenarios with multiple 3D streams (e.g.
multiple displays, or 3D telepresence). In the current draft, though it is
possible for an answerer to accept more than one 3D format at once, there is
no clear way for the offerer to indicate whether it actually intends to use
more than one. An offer could have multiple 3D formats as alternatives to
each other, or as actual 3D streams that could be reproduced simultaneously.
By contrast, in my proposal, each "3DS" group would always correspond to a
separate 3D stream. Of course, this can be addressed through some other kind
of update to the current draft, but that may have its own complexity cost.
Either way, this may be worth discussing as a separate topic.

> (E) What if different video streams in the same media group require
> different SDP attributes?

Of all media-level attributes, the most likely to change from one 3D media
format to the other is "a=fmtp", which is defined for each payload type.
That said, there are a few potentially problematic interactions.

"a=recvonly", "a=sendonly", "a=sendrecv", "a=inactive": It would be rare,
but not impossible, for an offerer to want to specify different stream
directions depending on the 3D media format (maybe it has display
capabilities, but cannot capture, for a certain format). In this specific
case, a simple re-INVITE once the offer/answer process has narrowed down the
3D formats to a single option should work.

"b=": Clearly, different 3D formats may have different bandwidth
requirements, so having many formats under a single media descriptor makes
the "b=" line less accurate than under your proposal. That said, it is
already possible to have a single media descriptor with multiple 2D formats
with very different bandwidth requirements, so this is hardly a new problem.

I haven't found other media-level attributes that could cause trouble. Did
you have any in mind?