Re: [MMUSIC] Comments on draft-greevenbosch-mmusic-signal-3d-format-01

Bert Greevenbosch <Bert.Greevenbosch@huawei.com> Fri, 21 October 2011 02:20 UTC

Date: Fri, 21 Oct 2011 02:19:54 +0000
From: Bert Greevenbosch <Bert.Greevenbosch@huawei.com>
To: "capelastegui@dit.upm.es" <capelastegui@dit.upm.es>
Cc: IETF - MMUSIC <mmusic@ietf.org>
Subject: Re: [MMUSIC] Comments on draft-greevenbosch-mmusic-signal-3d-format-01

Hi Pedro,

Thank you very much for your thorough review of the draft. Please find my responses inline.

Best regards,
Bert


From: Pedro Capelastegui [mailto:capelastegui@dit.upm.es]
Sent: 15 October 2011 00:18
To: 'IETF - MMUSIC'; Bert Greevenbosch
Subject: Comments on draft-greevenbosch-mmusic-signal-3d-format-01


Hi,



I have reviewed draft-greevenbosch-mmusic-signal-3d-format-01, and I have the following comments.



                - Backwards compatibility



When a 3D-enabled device using this specification calls a legacy device, it is very likely that the answerer will accept the session and end up receiving media that it can’t display properly, such as frame-packed video or a depth map. I think the default behaviour here should be that whenever the SDP answer omits the 3dFormat attribute, the offerer only sends 2D video (if available), avoiding further offers if possible. This speeds up session setup with legacy agents, and prevents transmitting useless media.



To implement this change, I would modify a few paragraphs in section 8.



In 8.1 (Frame Packing), replace:



“the offerer MAY update the offer with a 2D stream. If the offerer is the streaming server, it MAY choose to stream the frame packed video as it is.”



with



“the offerer SHOULD treat that media stream as a 2D stream.”



[BG] The original thinking was that when a legacy answerer does not support the SDP attribute but does support 3D (which the user can switch on manually), it could make sense to send the 3D stream anyway. I am fine with your change, plus an informative note pointing out the possibility of manual switching.



In 8.2 (2D and auxiliary as a single stream), replace:



“the offerer MAY update its offer by offering only a 2D video stream.”



with



“SHOULD treat that media stream as a 2D stream.”







In 8.3 (2D and auxiliary as two separate streams), replace:



"If the answerer selects only the auxiliary video, the offerer SHOULD update its offer without the auxiliary video."



with



"If the answerer selects only the auxiliary video, the offerer MAY treat that stream as a 2D video stream. If it does not, the offerer SHOULD update its offer without the auxiliary video."

 

[BG] As long as the auxiliary video uses the same codec as the 2D video (which is likely), this is OK, so "MAY" is appropriate in the new text. I'll adopt it.
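
The fallback rule discussed in this section can be sketched roughly as follows. This is only an illustration in Python, not text from the draft: it assumes the attribute appears in a media section as a line starting with "a=3dFormat:", and the function names and the example value "FP SbS" are hypothetical.

```python
def media_sections(sdp):
    """Split an SDP body into media sections, each starting at an m= line."""
    sections = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            sections.append([line])
        elif sections:
            sections[-1].append(line)
    return sections


def has_3d_format(section):
    # Assumption: the attribute is written as "a=3dFormat:<value>".
    return any(line.startswith("a=3dFormat:") for line in section)


def video_modes_after_answer(answer_sdp):
    """For each video m= line in the answer, decide how the offerer treats it."""
    modes = []
    for section in media_sections(answer_sdp):
        if not section[0].startswith("m=video"):
            continue
        port = section[0].split()[1]
        if port == "0":
            modes.append("rejected")   # stream declined by the answerer
        elif has_3d_format(section):
            modes.append("3D")         # answerer understood the attribute
        else:
            modes.append("2D")         # legacy answerer: treat as a 2D stream
    return modes


legacy_answer = (
    "v=0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
    "m=video 51372 RTP/AVP 96\r\n"
    "a=rtpmap:96 H264/90000\r\n"
)
aware_answer = legacy_answer + "a=3dFormat:FP SbS\r\n"
```

With these inputs, the legacy answer leads the offerer to treat the video stream as 2D without sending a further offer, while the answer that echoes the attribute keeps the stream in 3D.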





                - H.264/MVC



I think that compatibility with the MVC format is a must for this specification. This will probably require defining new types of 3D formats and components. Unfortunately, the payload format for MVC (draft-ietf-payload-rtp-mvc-0) is still unfinished, with most of the SDP section missing…



[BG] The draft currently does not consider MVC, but focuses more on codec-agnostic formats. For example, TaB or SbS frame packing can be used with any 2D video codec to convey 3D. However, certain modes, e.g. frame-sequential frame packing and 2D+depth, require specific codecs, such as AVC with the frame packing arrangement SEI message, or ISO/IEC 23002-3.
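
To illustrate why TaB and SbS packing work with any 2D codec: the codec only sees an ordinary picture, and the receiver splits the decoded picture into two views afterwards. A minimal Python sketch, assuming the left view occupies the left half (SbS) or the top half (TaB); the function name and the plain list-of-rows frame representation are hypothetical.

```python
def split_frame_packed(frame, arrangement):
    """Split a decoded frame (a list of pixel rows) into (left, right) views.

    Assumption: the left view is the left half for "SbS" and the top half
    for "TaB". No upscaling back to full resolution is done here.
    """
    height = len(frame)
    width = len(frame[0])
    if arrangement == "SbS":
        half = width // 2
        left = [row[:half] for row in frame]
        right = [row[half:] for row in frame]
    elif arrangement == "TaB":
        half = height // 2
        left = [list(row) for row in frame[:half]]
        right = [list(row) for row in frame[half:]]
    else:
        raise ValueError("unsupported arrangement: " + arrangement)
    return left, right


packed = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
sbs_left, sbs_right = split_frame_packed(packed, "SbS")
tab_left, tab_right = split_frame_packed(packed, "TaB")
```

A legacy receiver that skips this step simply displays both half-resolution views in one picture, which is the backwards-compatibility problem raised earlier.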



                - Depth Maps in auxiliary streams.



Some transmission options use a single stream containing one view (centre or left) and its associated depth map (or parallax map). However, there is no mention as to how the view and map are packed in that stream. I suppose the specification assumes the use of the auxiliary data format defined in MPEG-C part 3, but this should probably be indicated explicitly.



Also, perhaps the “2DA” format should leave the door open to packing mechanisms other than MPEG-C part 3. I don’t know of any existing alternatives, but it is not unreasonable to expect them at some point in the future.



[BG] Yes, the draft focuses on ISO/IEC 23002-3 for depth maps and their signalling. How to combine the 2D and auxiliary streams into a single stream is out of scope of 23002-3, and is also left out of scope of this draft. I agree that we should keep the door open for other packing mechanisms.



                - Parallax Maps



The draft provides no reference for the use of parallax maps, so it’s probably worth mentioning that they are defined, like depth maps, in [ISO/IEC 23002-3]. Also, I have not found any examples of 3D video applications based on parallax maps in the literature.  How likely is it that this scenario will see actual use?



[BG] I can add the reference to the draft. I agree that depth maps are more common than parallax maps.



                - Frame Sequential packing



I think that the mechanism for identifying L- and R- streams in a frame sequential stream should be clearly explained.



[BG] For AVC frame packing, this can be done with the frame packing arrangement SEI messages. I can add some explanatory text for this. It will be hard to write something general for all codecs, as the signalling of which frames are L and which are R has to be sent synchronously with the video stream.
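
A small Python sketch of why the L/R indication has to travel with the frames themselves. The `is_left` field is a hypothetical per-frame flag standing in for whatever in-band signalling a given codec provides; nothing here is taken from the draft.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    seq: int         # position in the original sequence
    is_left: bool    # hypothetical in-band flag carried with each frame


def demux_by_flag(frames):
    """Assign frames to views using the flag carried with each frame."""
    left = [f.seq for f in frames if f.is_left]
    right = [f.seq for f in frames if not f.is_left]
    return left, right


def demux_by_parity(frames):
    """Assign frames to views by arrival order only (no in-band signalling)."""
    left = [f.seq for i, f in enumerate(frames) if i % 2 == 0]
    right = [f.seq for i, f in enumerate(frames) if i % 2 == 1]
    return left, right


# Sender alternates L, R, L, R, ...; frame 2 is lost in transit.
sent = [Frame(seq=n, is_left=(n % 2 == 0)) for n in range(6)]
received = [f for f in sent if f.seq != 2]

flag_left, flag_right = demux_by_flag(received)
parity_left, parity_right = demux_by_parity(received)
```

After the loss, the flag-based split still puts every frame in the correct view, whereas the order-based split swaps the views from the lost frame onwards.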



                - Multiple 3D streams



On page 14, it says that “The answerer SHOULD select only one 3D format”. This means that, for any given session, only one 3D video stream can be used. I think this is too limiting, since it prevents describing sessions where one agent has multiple 3D displays, among others.



[BG] I am fine with removing this normative statement from the draft.



                - Multiple Depth Maps.



The document assumes that depth maps will only be used in 2D+depth streams. However, I think that the combination of multiple views and depth maps (e.g. L/R views and 2 depth maps) is an interesting scenario that may deserve consideration.



[BG] Do you have any existing format in mind? Or do you want to keep open the possibility of signalling this kind of scenario later?



Regards,

Pedro





---------------------------------------------------------------------------



Pedro Capelastegui

Researcher

Universidad Politécnica de Madrid