Re: [MMUSIC] 3D format negotiation in draft-greevenbosch-mmusic-signal-3d-format

"Pedro Capelastegui" <capelastegui@dit.upm.es> Thu, 03 November 2011 16:34 UTC

From: Pedro Capelastegui <capelastegui@dit.upm.es>
To: 'IETF - MMUSIC' <mmusic@ietf.org>, Bert.Greevenbosch@huawei.com
References: <002801cc9a45$69afeda0$3d0fc8e0$@dit.upm.es>
In-Reply-To: <002801cc9a45$69afeda0$3d0fc8e0$@dit.upm.es>
Date: Thu, 03 Nov 2011 17:34:31 +0100

Hi,

In the previous post, I argued against the 3D format negotiation mechanism
currently used in draft-greevenbosch-mmusic-signal-3d-format, and suggested
associating 3D formats with payload types instead. Here I explain how this
could be achieved.

The process can be summarized as follows: 3D format configuration
parameters are described in media-level attributes (as in the current
draft), but these attributes take a payload type as an additional parameter:

	a=3dFormat:<Payload_Type> <3dformat parameters>
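
As an illustration only, here is a minimal Python sketch of how such an
attribute line might be parsed. The names Format3D and parse_3dformat are
hypothetical, the parameters are kept as opaque tokens, and case-insensitive
matching of the attribute name is an assumption of this sketch, not something
the proposal specifies.

```python
from typing import List, NamedTuple


class Format3D(NamedTuple):
    payload_type: int   # RTP payload type the 3D format is bound to
    params: List[str]   # 3D format parameters, kept as opaque tokens


def parse_3dformat(line: str) -> Format3D:
    """Parse 'a=3dFormat:<Payload_Type> <3dformat parameters>'."""
    prefix = "a=3dformat:"
    stripped = line.strip()
    # Assumption: the attribute name is matched case-insensitively.
    if not stripped.lower().startswith(prefix):
        raise ValueError(f"not a 3dFormat attribute: {line!r}")
    fields = stripped[len(prefix):].split()
    if not fields or not fields[0].isdigit():
        raise ValueError(f"missing payload type: {line!r}")
    return Format3D(int(fields[0]), fields[1:])
```

For instance, parse_3dformat("a=3dFormat:99 2DA C") yields payload type 99
with the parameter tokens "2DA" and "C".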

Each 3D media stream will be composed of one or two media descriptions, each
with one or more 3D format attributes. For each media description, an
answerer would select one of the available media formats, namely the one with
its preferred combination of codec and 3D format. To keep complexity
manageable, the answerer would only be allowed to choose a single media
format per media description when 3D is enabled.
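
The selection step could be sketched roughly as follows. This is an
assumption-laden illustration, not a definitive algorithm: the function name
select_payload_type and the tuple layout are hypothetical, and the answerer's
preferences are modeled as a simple ordered list of (codec, 3D format) pairs.

```python
from typing import List, Optional, Tuple


def select_payload_type(
    offered: List[Tuple[int, str, str]],
    preferences: List[Tuple[str, str]],
) -> Optional[int]:
    """Pick exactly one payload type for a media description.

    offered:     (payload_type, codec, 3d_format) entries from the offer
    preferences: answerer's (codec, 3d_format) pairs, most preferred first
    Returns the chosen payload type, or None if nothing is supported.
    """
    by_combo = {(codec, fmt3d): pt for pt, codec, fmt3d in offered}
    for combo in preferences:
        if combo in by_combo:
            return by_combo[combo]
    return None
```

An answerer that gets None back would presumably reject the media
description or fall back to a non-3D configuration; that behaviour is not
covered by this sketch.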

This still leaves us with the problem of declaring dependencies between
media descriptions within a 3D media stream (e.g. stream B is the depth map
of stream A, or stream B is the right view and stream A is the left one).
This could be signaled with the ‘a=depend’ attribute (see RFC 5583).

As an example, here is how this system would work in the scenario shown in
section 9.4 of draft-greevenbosch-mmusic-signal-3d-format. The offerer is
trying to initiate a session with a 3D video stream which can have 2
possible configurations: 

	(1)	one stream for video and one stream for a parallax map
	(2)	one stream for the left view and one for the right view

This is what the offer SDP would look like:

	v=0
	o=Alice 2890844526 2890842807 IN IP4 131.163.72.4
	s=The technology of 3D-TV
	c=IN IP4 131.164.74.2
	t=0 0
	a=group:3DS 1 2
	m=video 49170 RTP/AVP 99 100
	a=rtpmap:99 H264/90000
	a=3dFormat:99 2DA C
	a=rtpmap:100 H264/90000
	a=3dFormat:100 SC L
	a=mid:1
	m=video 49172 RTP/AVP 99 100
	a=rtpmap:99 H264/90000
	a=3dFormat:99 2DA P
	a=rtpmap:100 H264/90000
	a=3dFormat:100 SC R
	a=mid:2
	a=depend:99 3dv 1:99; 100 3dv 1:100
	m=audio 52890 RTP/AVP 10
	a=rtpmap:10 L16/16000/2

‘3dv’ would be a new dependency-type defined for use with 3D formats. The
‘depend’ attribute is typically used in combination with “DDP” (decoding
dependency) groups, though I think using it with “3DS” groups in this
scenario wouldn’t go against the specification.

This offer SDP has 2 video streams in a 3DS group, each with 2 possible
configurations. The first one can be configured as a ‘center’ view in a
video+auxiliary map scheme, or as a left view in a simulcast scheme. The
second media stream can be configured as a parallax map in a video+aux
configuration, or as a right view in a simulcast configuration. Only certain
payload type combinations are viable, which is signaled in the ‘a=depend’
line: video stream 2 can only be configured as a parallax map if video
stream 1 is the center view, and it can only be configured as a right view
if stream 1 is the left view.
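
The viability rule can be sketched in Python as below. This is a simplified
illustration: parse_depend and is_viable are hypothetical names, the input is
the attribute value (the text following the attribute name), and only one
dependency per payload type is handled, which is narrower than what RFC 5583
allows.

```python
from typing import Dict, Tuple

# dependent payload type -> (dependency type, mid, required payload type)
Dependencies = Dict[int, Tuple[str, str, int]]


def parse_depend(value: str) -> Dependencies:
    """Parse a value such as '99 3dv 1:99; 100 3dv 1:100'."""
    deps: Dependencies = {}
    for entry in value.split(";"):
        fmt, dep_type, target = entry.split()
        mid, _, target_fmt = target.partition(":")
        deps[int(fmt)] = (dep_type, mid, int(target_fmt))
    return deps


def is_viable(deps: Dependencies, chosen: Dict[str, int], dependent_pt: int) -> bool:
    """Check that the stream the dependent payload type relies on was
    configured with the required payload type.

    chosen maps a mid value to the payload type selected for that stream.
    """
    _dep_type, mid, required_pt = deps[dependent_pt]
    return chosen.get(mid) == required_pt
```

With the dependencies from the offer above, payload type 99 in stream 2
(parallax map) is viable only if stream 1 also selected 99 (center view), and
payload type 100 (right view) only if stream 1 selected 100 (left view).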

The SDP answer would look like this, if the answerer chooses the "center
view plus parallax map" option: 

	v=0
	o=Bob 2890844528 2890842809 IN IP4 131.163.72.5
	s=The technology of 3D-TV
	c=IN IP4 131.164.74.3
	t=0 0
	a=group:3DS 1 2
	m=video 48170 RTP/AVP 99
	a=rtpmap:99 H264/90000
	a=3dFormat:99 2DA C
	a=mid:1
	m=video 48172 RTP/AVP 99 
	a=rtpmap:99 H264/90000
	a=3dFormat:99 2DA P
	a=mid:2
	a=depend:99 3dv 1:99
	m=audio 52890 RTP/AVP 10
	a=rtpmap:10 L16/16000/2

Compared to the SDPs used in draft-greevenbosch-mmusic-signal-3d-format,
this offer/answer exchange uses only 3 media descriptions instead of 5.
Also, adding further 3D formats would be simpler with this mechanism.
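
The difference in 'm=' line counts can be checked with a small calculation.
The sketch below assumes each 3D format option needs one or two streams; the
function names are hypothetical and only video media descriptions are
counted.

```python
from typing import List


def m_lines_one_format_per_description(streams_per_format: List[int]) -> int:
    """Current draft model: every 3D format option gets its own media
    descriptions, so the counts add up."""
    return sum(streams_per_format)


def m_lines_format_per_payload_type(streams_per_format: List[int]) -> int:
    """Payload-type model: options share media descriptions, so only the
    largest stream count among the options is needed."""
    return max(streams_per_format)


# Scenario from section 9.4: two options, each using two streams.
options = [2, 2]
video_current = m_lines_one_format_per_description(options)
video_proposed = m_lines_format_per_payload_type(options)
```

Adding the single audio description to each result gives the 5 and 3 media
descriptions mentioned above.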

Regards,
Pedro

-----Original Message-----
From: mmusic-bounces@ietf.org [mailto:mmusic-bounces@ietf.org] On Behalf Of
Pedro Capelastegui
Sent: Thursday, November 03, 2011 5:27 PM
To: IETF - MMUSIC; Bert.Greevenbosch@huawei.com
Subject: [MMUSIC] 3D format negotiation in
draft-greevenbosch-mmusic-signal-3d-format

Hi,

Here are some further comments on
draft-greevenbosch-mmusic-signal-3d-format. These are focused on the 3d
format negotiation process, and specifically how the extension behaves when
multiple 3d configuration options are offered.

My main concern is that, in order to offer alternate 3D format
configurations, the offerer has to include additional 'm=' lines in the SDP.
Currently, each media descriptor can have a single 3D format, and each 3D
format is associated with 1 or 2 media descriptors. This is not an issue in
offers with a single 3D format (see examples 9.1, 9.2 and 9.3), but when
several 3D formats are offered, the number of media descriptors multiplies.
This can be observed in example 9.4, where 4 media descriptors are used in
the offer, though the answerer is expected to reject all but two. In
general, in order to provide N 3D format options for a 3D video
stream, between N and 2N media descriptors will be required in the offered
SDP, of which only one or two will be used.

I think this proliferation of media descriptions has several drawbacks. For
one, it goes against the general philosophy of SDP negotiation. Normally,
when different possible configurations exist for a media stream, they are
shown as different media formats within a single media descriptor. Using
multiple media descriptors that actually refer to the same media stream (the
3D video stream, or a subset of it), and having the answerer reject all but
the ones with the desired configuration breaks this model. In addition, this
excess of 'm=' lines may have a negative effect when combined with
techniques such as ICE, and could be confusing for 3d-unaware nodes.

On the other hand, this is only a problem if user agents are expected to
select between many 3D formats in a typical session. Unfortunately, I think
this will be the case. At the very least, a UA starting a 3D session should
offer a single 3D format plus the option not to use 3D. Apart from that, a
3D user agent intended to be interoperable with other UAs should support
several 3D formats, including video+depth and transmission of 2 views.

The solution, in my opinion, would be to shift away from the current model
and instead associate each 3D format with a payload type within a media
description. I will discuss how this can be done in a separate mail.

Regards,

Pedro
